More Advanced Joins
Contents
The Employee Database
Let us begin by using an example employee database of a fairly small family business, which does not anticipate expanding in the future.
First, we create the table that will hold all of the employees and their contact information:
CREATE TABLE `Employees` ( `ID` TINYINT(3) UNSIGNED NOT NULL AUTO_INCREMENT, `First_Name` VARCHAR(25) NOT NULL, `Last_Name` VARCHAR(25) NOT NULL, `Position` VARCHAR(25) NOT NULL, `Home_Address` VARCHAR(50) NOT NULL, `Home_Phone` VARCHAR(12) NOT NULL, PRIMARY KEY (`ID`) ) TYPE=MyISAM;
Next, we add a few employees to the table:
INSERT INTO `Employees` (`First_Name`, `Last_Name`, `Position`, `Home_Address`, `Home_Phone`) VALUES ('Mustapha', 'Mond', 'Chief Executive Officer', '692 Promiscuous Plaza', '326-555-3492'), ('Henry', 'Foster', 'Store Manager', '314 Savage Circle', '326-555-3847'), ('Bernard', 'Marx', 'Cashier', '1240 Ambient Avenue', '326-555-8456'), ('Lenina', 'Crowne', 'Cashier', '281 Bumblepuppy Boulevard', '328-555-2349'), ('Fanny', 'Crowne', 'Restocker', '1023 Bokanovsky Lane', '326-555-6329'), ('Helmholtz', 'Watson', 'Janitor', '944 Soma Court', '329-555-2478');
Now, we create a second table, containing the hours which each employee clocked in and out during the week:
CREATE TABLE `Hours` ( `ID` TINYINT(3) UNSIGNED NOT NULL, `Clock_In` DATETIME NOT NULL, `Clock_Out` DATETIME NOT NULL ) TYPE=MyISAM;
Finally, although it is a lot of information, we add a full week of hours for each of the employees into the second table that we created:
INSERT INTO `Hours` VALUES ('1', '2005-08-08 07:00:42', '2005-08-08 17:01:36'), ('1', '2005-08-09 07:01:34', '2005-08-09 17:10:11'), ('1', '2005-08-10 06:59:56', '2005-08-10 17:09:29'), ('1', '2005-08-11 07:00:17', '2005-08-11 17:00:47'), ('1', '2005-08-12 07:02:29', '2005-08-12 16:59:12'), ('2', '2005-08-08 07:00:25', '2005-08-08 17:03:13'), ('2', '2005-08-09 07:00:57', '2005-08-09 17:05:09'), ('2', '2005-08-10 06:58:43', '2005-08-10 16:58:24'), ('2', '2005-08-11 07:01:58', '2005-08-11 17:00:45'), ('2', '2005-08-12 07:02:12', '2005-08-12 16:58:57'), ('3', '2005-08-08 07:00:12', '2005-08-08 17:01:32'), ('3', '2005-08-09 07:01:10', '2005-08-09 17:00:26'), ('3', '2005-08-10 06:59:53', '2005-08-10 17:02:53'), ('3', '2005-08-11 07:01:15', '2005-08-11 17:04:23'), ('3', '2005-08-12 07:00:51', '2005-08-12 16:57:52'), ('4', '2005-08-08 06:54:37', '2005-08-08 17:01:23'), ('4', '2005-08-09 06:58:23', '2005-08-09 17:00:54'), ('4', '2005-08-10 06:59:14', '2005-08-10 17:00:12'), ('4', '2005-08-11 07:00:49', '2005-08-11 17:00:34'), ('4', '2005-08-12 07:01:09', '2005-08-12 16:58:29'), ('5', '2005-08-08 07:00:04', '2005-08-08 17:01:43'), ('5', '2005-08-09 07:02:12', '2005-08-09 17:02:13'), ('5', '2005-08-10 06:59:39', '2005-08-10 17:03:37'), ('5', '2005-08-11 07:01:26', '2005-08-11 17:00:03'), ('5', '2005-08-12 07:02:15', '2005-08-12 16:59:02'), ('6', '2005-08-08 07:00:12', '2005-08-08 17:01:02'), ('6', '2005-08-09 07:03:44', '2005-08-09 17:00:00'), ('6', '2005-08-10 06:54:19', '2005-08-10 17:03:31'), ('6', '2005-08-11 07:00:05', '2005-08-11 17:02:57'), ('6', '2005-08-12 07:02:07', '2005-08-12 16:58:23');
Working with the Employee Database
Now that we have a cleanly structured database to work with, let us begin this tutorial by stepping up one notch from the last tutorial and filtering our information a little.
Filtering by Name
Earlier in the week, an anonymous employee reported that Helmholtz came into work almost four minutes late; to verify this, we will begin our investigation by filtering out employees whose first names are "Helmholtz":
SELECT `Employees`.`First_Name`, `Employees`.`Last_Name`, `Hours`.`Clock_In`, `Hours`.`Clock_Out` FROM `Employees` INNER JOIN `Hours` ON `Employees`.`ID` = `Hours`.`ID` WHERE `Employees`.`First_Name` = 'Helmholtz';
The result:
+------------+-----------+---------------------+---------------------+ | First_Name | Last_Name | Clock_In | Clock_Out | +------------+-----------+---------------------+---------------------+ | Helmholtz | Watson | 2005-08-08 07:00:12 | 2005-08-08 17:01:02 | | Helmholtz | Watson | 2005-08-09 07:03:44 | 2005-08-09 17:00:00 | | Helmholtz | Watson | 2005-08-10 06:54:19 | 2005-08-10 17:03:31 | | Helmholtz | Watson | 2005-08-11 07:00:05 | 2005-08-11 17:02:57 | | Helmholtz | Watson | 2005-08-12 07:02:07 | 2005-08-12 16:58:23 | +------------+-----------+---------------------+---------------------+ 5 rows in set (0.00 sec)
This is obviously more information than we care to trudge through, considering we only care about when he arrived past 7:00:59 on any given day within this week; thus, we need to add a couple more conditions to our WHERE clause.
Filtering by Name, Date and Time
In the following example, we will filter out all of the times which Helmholtz clocked in that were before 7:01:00 and during the work week that lasted from the 8th to the 12th of August:
SELECT `Employees`.`First_Name`, `Employees`.`Last_Name`, `Hours`.`Clock_In`, `Hours`.`Clock_Out` FROM `Employees` INNER JOIN `Hours` ON `Employees`.`ID` = `Hours`.`ID` WHERE `Employees`.`First_Name` = 'Helmholtz' AND DATE_FORMAT(`Hours`.`Clock_In`, '%Y-%m-%d') >= '2005-08-08' AND DATE_FORMAT(`Hours`.`Clock_In`, '%Y-%m-%d') <= '2005-08-12' AND DATE_FORMAT(`Hours`.`Clock_In`, '%H:%i:%S') > '07:00:59';
The result:
+------------+-----------+---------------------+---------------------+ | First_Name | Last_Name | Clock_In | Clock_Out | +------------+-----------+---------------------+---------------------+ | Helmholtz | Watson | 2005-08-09 07:03:44 | 2005-08-09 17:00:00 | | Helmholtz | Watson | 2005-08-12 07:02:07 | 2005-08-12 16:58:23 | +------------+-----------+---------------------+---------------------+ 2 rows in set (0.00 sec)
We have now, by merely adding a few more conditions, eliminated all of the irrelevant information; Helmholtz was late to work on the 9th and the 12th of August.
Displaying Total Work Hours per Day
Suppose you would like to—based on the information stored in both of our tables in the employee database—develop a quick list of the total hours each employee has worked for each day recorded; a simple way to estimate the time each employee worked per day is exemplified below:
SELECT `Employees`.`ID`, `Employees`.`First_Name`, `Employees`.`Last_Name`, `Hours`.`Clock_In`, `Hours`.`Clock_Out`, DATE_FORMAT(`Hours`.`Clock_Out`, '%T')-DATE_FORMAT(`Hours`.`Clock_In`, '%T') AS 'Total_Hours' FROM `Employees` INNER JOIN `Hours` ON `Employees`.`ID` = `Hours`.`ID`;
The result (limited by 10):
+----+------------+-----------+---------------------+---------------------+-------------+ | ID | First_Name | Last_Name | Clock_In | Clock_Out | Total_Hours | +----+------------+-----------+---------------------+---------------------+-------------+ | 1 | Mustapha | Mond | 2005-08-08 07:00:42 | 2005-08-08 17:01:36 | 10 | | 1 | Mustapha | Mond | 2005-08-09 07:01:34 | 2005-08-09 17:10:11 | 10 | | 1 | Mustapha | Mond | 2005-08-10 06:59:56 | 2005-08-10 17:09:29 | 11 | | 1 | Mustapha | Mond | 2005-08-11 07:00:17 | 2005-08-11 17:00:47 | 10 | | 1 | Mustapha | Mond | 2005-08-12 07:02:29 | 2005-08-12 16:59:12 | 9 | | 2 | Henry | Foster | 2005-08-08 07:00:25 | 2005-08-08 17:03:13 | 10 | | 2 | Henry | Foster | 2005-08-09 07:00:57 | 2005-08-09 17:05:09 | 10 | | 2 | Henry | Foster | 2005-08-10 06:58:43 | 2005-08-10 16:58:24 | 10 | | 2 | Henry | Foster | 2005-08-11 07:01:58 | 2005-08-11 17:00:45 | 10 | | 2 | Henry | Foster | 2005-08-12 07:02:12 | 2005-08-12 16:58:57 | 9 | +----+------------+-----------+---------------------+---------------------+-------------+ 10 rows in set (0.00 sec)