Chapter 1
SQL By Example
John Russo
SQL by Example uses one case study to teach the reader basic structured
query language (SQL) skills. The author has tested the case study in the
classroom with thousands of students. While other SQL texts tend to use
examples from many different data sets, the author has found that once
students get used to one case study, they learn the material at a much
faster rate. The text begins with an introduction to the case study and trains
the reader to think like the query processing engine for a relational
database management system. Once the reader has a grasp of the case study,
SQL programming constructs are introduced with examples from it. To
reinforce concepts, each chapter has several exercises with solutions
provided on the book’s website. SQL by Example is designed both for those
who have never worked with SQL and for those with some experience. It is
modular in that each chapter can be approached individually or as part of
a sequence, giving readers flexibility in the way that they learn or
refresh concepts. This also makes the book a useful reference to return to
as readers hone their SQL skills on the job.
KeyWords
7 Grouping Data
7.1 Introduction
7.2 Objectives
7.3 The Group by Clause
7.4 Calculated Fields
7.5 Group by with the Having Clause
7.6 Using Count Distinct
7.7 Count Distinct and Outer Joins
7.8 Inline Views
7.9 Summary
About the Author
Index
List of Figures
Figure 2.11. Results of query for items with weight > 500.
Figure 3.1. Query of all items weighing more than 500 pounds using > operator.
Figure 3.2. Query of all items of type WP and weight > 500.
Figure 3.3. Query of all items of type WP or BL.
Figure 3.4. Query of all items that weigh more than 2,000 pounds and are not building materials.
Figure 3.5. Query from Figure 3.4 with != operator.
Figure 3.6. Query of shipments that originated in Boston and had captain 001-24 as captain.
Figure 3.7. Query of John Smith’s shipments from Boston or Seattle.
Figure 3.8a. Query using the between operator.
Figure 3.8b. Query using a compound conditional.
Figure 3.9. Query of all items weighing more than 500 pounds using between operator.
Figure 3.10a. Query of all captains whose last name begins with an s.
Figure 3.10b. Query of all captains whose last name ends with e using wildcard.
Figure 3.11. Query of all captains whose last name contains an e.
Figure 3.12a. Example query using or operator.
Figure 3.12b. Example query using in statement.
Figure 3.13. Query to sort captain table by last name.
Figure 3.14. Query to sort captain table by last name in descending order.
Figure 4.1. Ship table.
Figure 4.2. Query to join captain and shipment tables.
Figure 4.3. Query to join shipment and captain tables using ANSI 1999 standard SQL.
Figure 4.4. Query to join manufacturer to ship table to find manufacturer that built ship 37.
Figure 4.5. Query to find different classes of ships built by manufacturers in California.
Figure 4.6. Query to find all shipments that require more than 5 days of travel.
I would first like to thank the thousands of students who have successfully
used the Shore to Shore Shipping case study over the years. Through their
feedback, the case study has been refined and made even more useful.
I would also like to thank my former institution, Wentworth Institute
of Technology, as well as my current college, Landmark College, for
their support during the writing of this book. I would also like to thank
my mentor and friend, Meggin McIntosh, who gave me invaluable guidance
to bring this project to completion. In addition, I would like to thank
my mentor and friend, Monte Unger, who helped me to realize my desire
to write.
Introduction
Key Features
Pathways of Learning
Topic             Chapters
Basic SQL syntax  2 and 3
Joining tables    4 and 5
Sub-queries       6
Grouping          7
The scripts for the case study, as well as solutions to the On Your Own
exercises and scripts to run the example queries, can be found at
www.profrusso.com/SQL_BY_EXAMPLE. Instructions for loading the scripts
can also be found there.
CHAPTER 1
1.1 Overview
In this chapter, I will first introduce a case study that we will use through-
out our discussion of SQL. I have designed this case study after numer-
ous iterations of teaching SQL and have found that students learn best by
using one set of tables with which they become very familiar. Once the
case study has been presented, we will then go through several sample
queries by hand. The objective of this is to learn to think through queries
before writing the actual SQL. We will then move on to a discussion of
SQL syntax and present some examples from the case study.
1.2 Objectives
1. Ship
(a) Each ship has a ship number, class, capacity, date of purchase
and manufacturer ID.
(b) Each ship manufacturer has a manufacturer ID, name, city,
state, representative ID and a bidding preference.
2. Captain
• Each captain has an ID, a first name, last name, license grade
and a date of birth.
3. Item
• Every item that is shipped has an item number, a type, a descrip-
tion and a weight.
4. Shipment
When a shipment is sent out, a shipment manifest is generated, as
shown in Figure 1.1.
(a) The heading contains the shipment id, order date, origin, destination,
expected date of arrival, ship number and captain.
(b) The body of the manifest contains line items, each of which represents
a component of the shipment. Each line contains an item number,
type, description, weight and quantity. The total weight for a line is
calculated by multiplying the weight by the quantity.
[Figure 1.1. Shipment manifest. Heading fields visible: SHIPMENT_ID: 11-0006, SHIPMENT DATE: 09/17/2017, DESTINATION: SEATTLE]
(c) The footing of the manifest contains the total weight for the entire
shipment. This is compared against the capacity of the ship to
ensure that a ship is not overloaded.
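The manifest arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this sketch, and the unit weight used for item 2125 is a hypothetical value. The 300-pound weight of item 2123, the two quantities (taken from shipment 09-0003 in Figure 1.9) and the 50,000-pound capacity of ship 11 come from the case study.

```python
def line_weight(unit_weight, quantity):
    """Total weight of one manifest line: unit weight times quantity."""
    return unit_weight * quantity


def shipment_weight(lines):
    """Footing of the manifest: the sum of all line weights."""
    return sum(line_weight(weight, qty) for _item_no, weight, qty in lines)


def is_overloaded(lines, ship_capacity):
    """Compare the shipment total against the capacity of the ship."""
    return shipment_weight(lines) > ship_capacity


# (item_no, unit weight in pounds, quantity)
# Item 2123 weighs 300 pounds per unit (from the text).
# The 150-pound weight for item 2125 is a hypothetical value.
lines = [(2123, 300, 34), (2125, 150, 100)]

total = shipment_weight(lines)
overloaded = is_overloaded(lines, 50000)  # ship 11 has a capacity of 50,000 pounds
print(total, overloaded)
```

Neither the line totals nor the shipment total needs to be stored, because both can be recomputed from the unit weight and the quantity whenever they are needed.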
(a) For every shipment, the database stores the shipment id, the date that
the shipment left port, the origin, the destination, the ship number and
the captain. Additional information about the captain is stored with the
captain information. The expected arrival date is not stored but is
calculated from the shipment date and the normal travel time between the
origin and destination. Although it is not shown on the shipment manifest,
the arrival date of the shipment is recorded and stored in the database
when it arrives at its destination.
(b) For each line item, the database stores the shipment id, item number
and quantity. The type, description and weight are stored with the item
information. The total weight is not stored, since it can be calculated
from the weight and the quantity.
(c) The shipment total weight is not stored in the database. It can be
calculated whenever it is needed for a report or query.
Figures 1.2 to 1.8 show sample data for Shore to Shore. Let’s take a
look around and explore the data a little bit. Once you have a thorough
understanding of the data, we will begin to look at how the tables are
related and how information can be obtained by combining tables. But
first, let’s begin by examining each table, its columns and our sample data.
[Figures: sample data for the MANUFACTURER and SHIP tables]
1.4 Captain
[Figures: sample data for the ITEM, SHIPMENT and DISTANCE tables]
or she can pilot. This column is called License_Grade. We can look across
the table from Capt_ID to License_Grade and determine that the captain
with an ID of 002-14 has a license grade of 2. Since we also want to keep
track of a captain’s name, we have two columns to store the captain’s first
name (Fname) and last name (Lname). Looking again to our data, you
can see that captain 004-02 is named Marcia Nesmith. Finally, we wish to
store the date of birth for each captain in a column named DOB. Marcia
Nesmith’s date of birth is May 1, 1957.
Now, just to make sure that you understand the data in the table,
answer the following review questions. We are not asking you to write
SQL queries, but rather to just answer the questions by looking at the
captain table.
[Figure: sample data for the SHIPMENT_LINE table]
1.5 Manufacturer
There are seven manufacturers in our table, each with a unique manufac-
turer ID numbered from 210 to 216. Manufacturer 212 is named Master
Dynamics and is located in Bridgeport, Connecticut.
The Shore to Shore Shipping Case Study • 9
1.6 Ship
There are also 10 ships in our ship table. Each ship has a unique ship
number, class, capacity (in pounds), purchase date, and manufacturer ID.
For example, ship number 11 is a class 2 ship with a capacity of 50,000
pounds. It was purchased on January 30, 1971 and was manufactured by
manufacturer ID 210 (United Ship Builders).
Notice that we do not store specific information about the manufacturer
in the ship table. This information is stored separately in the manufacturer
table. Later on, you will learn about joining tables together to perform
multitable queries. To help you understand this better, let’s try
gathering some information from both the ship and manufacturer tables.
1.6.1 Example Queries
Solution:
First, we will look in the ship table. The manufacturer ID for ship number
37 is 216. Next, look at the manufacturer table.
Manufacturer ID 216 is Union Corp.
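The two lookups in this solution can be mirrored in code. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table layout and the column name manufacturer_id are assumptions made for illustration, and only a few rows mentioned in this chapter are loaded rather than the full sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, assumed table layouts; only rows mentioned in the text.
    CREATE TABLE manufacturer (manufacturer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ship (ship_no INTEGER PRIMARY KEY, manufacturer_id INTEGER);
    INSERT INTO manufacturer VALUES (211, 'General Ship Builders'), (216, 'Union Corp.');
    INSERT INTO ship VALUES (35, 211), (37, 216), (39, 211);
""")

# Step 1: look in the ship table for the manufacturer ID of ship 37.
manufacturer_id = conn.execute(
    "SELECT manufacturer_id FROM ship WHERE ship_no = ?", (37,)
).fetchone()[0]

# Step 2: look up that ID in the manufacturer table.
manufacturer_name = conn.execute(
    "SELECT name FROM manufacturer WHERE manufacturer_id = ?", (manufacturer_id,)
).fetchone()[0]

print(manufacturer_id, manufacturer_name)
```

Each step reads exactly one table, which is the same order of work as the by-hand solution: ship table first, manufacturer table second.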
Solution:
Since the ship table does not contain the name of the manufacturer, but
only the manufacturer ID, we need to look at the manufacturer table first to
find the ID for General Ship Builders, which is 211. Next, we can examine
the ship table and see that ships 35 and 39 have a manufacturer ID of 211.
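Here the lookup runs in the opposite direction: name to ID, then ID to ships. One way to sketch it, again with Python's sqlite3 module, is a sub-query that finds the ID and an outer query that uses it. The column names are assumed for illustration, and only rows mentioned in this chapter are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, assumed table layouts; only rows mentioned in the text.
    CREATE TABLE manufacturer (manufacturer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ship (ship_no INTEGER PRIMARY KEY, manufacturer_id INTEGER);
    INSERT INTO manufacturer VALUES
        (211, 'General Ship Builders'), (215, 'Best Industries'), (216, 'Union Corp.');
    INSERT INTO ship VALUES (16, 215), (35, 211), (37, 216), (39, 211);
""")

# The inner query finds the manufacturer ID for the given name (211);
# the outer query lists the ships that carry that ID.
ships = [row[0] for row in conn.execute("""
    SELECT ship_no
    FROM ship
    WHERE manufacturer_id = (SELECT manufacturer_id
                             FROM manufacturer
                             WHERE name = 'General Ship Builders')
    ORDER BY ship_no
""")]

print(ships)
```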
Example 1.3 List all the different classes of ships that were built by
manufacturers located in California (State = CA):
Solution:
For this query, we simply want to list the distinct classes.
Since we do not have specific information about the manufacturer in the
ship table, we must join both tables together. First, let’s look at the
manufacturer table. We can see that the only manufacturer in California is
General Ship Builders. The manufacturer ID is 211. Next, let’s look at the
ship table. We can see that ships 35 and 39 both have 211 as the
manufacturer ID. Notice that ship 35 has a class of 2 and ship 39 has a class of 1.
Therefore, manufacturers located in California built class 1 and class 2 ships.
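As a rough sketch of the same reasoning, the tables can be joined and the duplicate classes removed with DISTINCT. The example below uses Python's sqlite3 module. Column names are assumptions, and ship 99 is a hypothetical row added only so that the State filter has something to exclude; the remaining rows come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, assumed table layouts.
    CREATE TABLE manufacturer (manufacturer_id INTEGER PRIMARY KEY, name TEXT, state TEXT);
    CREATE TABLE ship (ship_no INTEGER PRIMARY KEY, class INTEGER, manufacturer_id INTEGER);
    INSERT INTO manufacturer VALUES
        (211, 'General Ship Builders', 'CA'),
        (212, 'Master Dynamics', 'CT');
    -- Ships 35 and 39 are from the text; ship 99 is hypothetical.
    INSERT INTO ship VALUES (35, 2, 211), (39, 1, 211), (99, 3, 212);
""")

# Join each ship to its manufacturer, keep only California manufacturers,
# and list each class once.
classes = [row[0] for row in conn.execute("""
    SELECT DISTINCT s.class
    FROM ship s
    JOIN manufacturer m ON m.manufacturer_id = s.manufacturer_id
    WHERE m.state = 'CA'
    ORDER BY s.class
""")]

print(classes)
```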
For each exercise, write out the answer as well as how you obtained the
result.
1.7 Item
There are 16 items, each with a unique item number. Let’s look at item
2123, which is of type FP and has a description of Rice. One unit weighs
300 pounds. The item types distinguish the categories of items. For
example, FP is short for food products, BL for building materials, and
so on.
1.8 Distance
The Distance table is used to keep track of the distance between any given
origin and destination. Looking at the distance table, you can see that the
distance between Boston and Brazil is 2,500 miles and takes four days.
The number of days is used to calculate an estimate of the arrival date.
1.9 Shipment
is 004-02. A quick glance over at the captain table tells us that the captain’s
name is Marcia Nesmith. We might be curious to see when this shipment
is expected to arrive. If we look at the Distance table, we can see that
Boston to Seattle normally takes seven days. This shipment should have
arrived on September 22, 2018. Management reports could be generated
from calculations such as this to flag problems.
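The date arithmetic behind such a check is small enough to sketch directly. The helper names below are made up for this illustration. The sample values come from the case study: the Boston to Brazil trip normally takes four days, and Figure 1.9 lists shipment 09-0005 as leaving Boston on September 19, 2016 and arriving in Brazil on September 23, 2016.

```python
from datetime import date, timedelta


def expected_arrival(shipment_date, travel_days):
    """Shipment date plus the normal number of travel days."""
    return shipment_date + timedelta(days=travel_days)


def days_overdue(shipment_date, travel_days, arrival_date):
    """Days past the expected arrival date; 0 if on time or early."""
    late_by = (arrival_date - expected_arrival(shipment_date, travel_days)).days
    return max(late_by, 0)


# Shipment 09-0005: Boston to Brazil, four days of normal travel time.
left_port = date(2016, 9, 19)
arrived = date(2016, 9, 23)

expected = expected_arrival(left_port, 4)
overdue = days_overdue(left_port, 4, arrived)
print(expected, overdue)
```

Because the expected arrival date can always be derived this way, the database only needs to store the shipment date and the actual arrival date.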
Since the Shipment table can be linked to the Ship table by ship_no,
the Captain table by capt_id and the Distance table by origin and desti-
nation, we can find out a lot of useful information about each shipment
based upon information contained in the other tables. Once again, we will
present several queries and explain how to generate the results. Of course,
you will also have the opportunity to explore the tables on your own.
1.9.1 Example Queries:
Solution:
We can examine the origin column and determine that eight shipments
originated in Boston: 09-0002, 09-0005, 10-0001, 10-0003, 10-0004,
11-0001, 11-0003, 11-0005.
Example 2.2 List all shipments that require more than five days of travel.
Solution:
First, we need to examine the Distance table and find all combinations
of origin and destination that have a value greater than five in the days
column.
We can see that the following combinations meet this requirement:
Origin Destination
BOSTON SEATTLE
LONDON SEATTLE
SEATTLE BOSTON
SEATTLE LONDON
Next, let’s look at the Shipment table and find all shipments that have
any of the Origin and Destination combinations shown in the preceding
table. We can see that shipments 11-0003 and 11-0005 go from Boston to
Seattle, shipments 09-0004 and 11-0006 go from London to Seattle,
shipments 10-0002 and 11-0004 go from Seattle to London, and shipment
09-0001 goes from Seattle to Boston. Therefore, the results of our query
are the following shipments:
09-0001
09-0004
10-0002
11-0003
11-0004
11-0005
11-0006
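The two steps above (filter the Distance table, then match shipments on both origin and destination) can be sketched as a join on two columns. The example uses Python's sqlite3 module with assumed column names. The routes and shipment IDs come from the text, as do the travel times for Boston to Seattle (seven days) and Boston to Brazil (four days). The value 6 used for the other three routes is a placeholder assumption, since the text only says those routes take more than five days.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, assumed table layouts.
    CREATE TABLE distance (origin TEXT, destination TEXT, days INTEGER);
    CREATE TABLE shipment (shipment_id TEXT PRIMARY KEY, origin TEXT, destination TEXT);
    INSERT INTO distance VALUES
        ('BOSTON',  'BRAZIL',  4),   -- from the text
        ('BOSTON',  'SEATTLE', 7),   -- from the text
        ('LONDON',  'SEATTLE', 6),   -- placeholder value (> 5)
        ('SEATTLE', 'BOSTON',  6),   -- placeholder value (> 5)
        ('SEATTLE', 'LONDON',  6);   -- placeholder value (> 5)
    INSERT INTO shipment VALUES
        ('09-0001', 'SEATTLE', 'BOSTON'),
        ('09-0004', 'LONDON',  'SEATTLE'),
        ('09-0005', 'BOSTON',  'BRAZIL'),
        ('10-0002', 'SEATTLE', 'LONDON'),
        ('11-0003', 'BOSTON',  'SEATTLE'),
        ('11-0004', 'SEATTLE', 'LONDON'),
        ('11-0005', 'BOSTON',  'SEATTLE'),
        ('11-0006', 'LONDON',  'SEATTLE');
""")

# A shipment matches a distance row only when BOTH origin and destination agree.
long_trips = [row[0] for row in conn.execute("""
    SELECT s.shipment_id
    FROM shipment s
    JOIN distance d
      ON d.origin = s.origin AND d.destination = s.destination
    WHERE d.days > 5
    ORDER BY s.shipment_id
""")]

print(long_trips)
```

Shipment 09-0005 is loaded on purpose: its route takes only four days, so it drops out of the result.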
Example 2.3 List all shipments that had John Smith as the captain.
Solution:
Since the shipment table only contains the captain ID, we need to first look
up the captain ID for John Smith in the Captain table. His ID is 001-24.
Now, let’s look back at the shipment table. We can see that he was the
captain for shipment 10-0004 and shipment 11-0004.
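A sketch of the same lookup as a join between the Captain and Shipment tables follows, using Python's sqlite3 module. The column names capt_id, fname and lname follow the names used in this chapter; the simplified table layouts are assumptions. The captain for shipment 09-0001 is taken from Figure 1.9.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, assumed table layouts; only rows mentioned in the text.
    CREATE TABLE captain (capt_id TEXT PRIMARY KEY, fname TEXT, lname TEXT);
    CREATE TABLE shipment (shipment_id TEXT PRIMARY KEY, capt_id TEXT);
    INSERT INTO captain VALUES
        ('001-24', 'John', 'Smith'),
        ('004-02', 'Marcia', 'Nesmith');
    INSERT INTO shipment VALUES
        ('09-0001', '001-25'),
        ('10-0004', '001-24'),
        ('11-0004', '001-24');
""")

# The name lives in the captain table, so join on capt_id and filter by name.
smith_shipments = [row[0] for row in conn.execute("""
    SELECT s.shipment_id
    FROM shipment s
    JOIN captain c ON c.capt_id = s.capt_id
    WHERE c.fname = 'John' AND c.lname = 'Smith'
    ORDER BY s.shipment_id
""")]

print(smith_shipments)
```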
Example 2.4 List all shipments that were carried by a ship manufactured
by General Ship Builders.
Solution:
This query is a bit more complicated and will require two steps.
The first step is to determine the ship numbers for all ships
manufactured by General Ship Builders. This is the same query as example 1.2.
The result of example 1.2 was ships 35 and 39.
The next step is to look at the Shipment table and find all shipments
with a ship number of 35 or 39. Upon careful examination, we determine
that shipments 10-0004 and 11-0001 were carried by ship number 39.
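The two steps can be sketched as nested sub-queries, where the result of the first step feeds the second. This is an illustrative sketch using Python's sqlite3 module with assumed column names; only rows mentioned in this chapter are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, assumed table layouts; only rows mentioned in the text.
    CREATE TABLE manufacturer (manufacturer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ship (ship_no INTEGER PRIMARY KEY, manufacturer_id INTEGER);
    CREATE TABLE shipment (shipment_id TEXT PRIMARY KEY, ship_no INTEGER);
    INSERT INTO manufacturer VALUES (211, 'General Ship Builders'), (215, 'Best Industries');
    INSERT INTO ship VALUES (16, 215), (35, 211), (39, 211);
    INSERT INTO shipment VALUES ('10-0004', 39), ('11-0001', 39), ('11-0004', 16);
""")

# Step 1 (inner queries): the ships built by General Ship Builders (35 and 39).
# Step 2 (outer query): the shipments carried by any of those ships.
shipments = [row[0] for row in conn.execute("""
    SELECT shipment_id
    FROM shipment
    WHERE ship_no IN (SELECT ship_no
                      FROM ship
                      WHERE manufacturer_id = (SELECT manufacturer_id
                                               FROM manufacturer
                                               WHERE name = 'General Ship Builders'))
    ORDER BY shipment_id
""")]

print(shipments)
```

Ship 35 qualifies in step 1 but carried none of the loaded shipments, so only shipments on ship 39 appear in the result.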
For each exercise, write out the answer as well as how you obtained the
result.
3.1 List all shipments that were greater than 3,000 miles.
3.2 List all shipments that were greater than 3,000 miles and had Sal
Levine as the captain.
3.3 Find the total number of shipments that originated in Boston.
3.4 List all shipments that originated in Boston and were carried by ships
built by General Ship Builders.
3.5 (Advanced) List all the shipments that arrived late. (Hint: only look at
shipments with an arrival date; assume that a blank arrival date means
that the shipment has not arrived yet. The expected arrival date can be
calculated by adding the days from the distance table to the shipment date.)
1.10 Shipment_Line
Example 3.1 List all shipments that contained item number 3223.
Solution:
We can look at the item_no column and determine that the following
shipments contained item 3223: 09-0001, 09-0002, 09-0004, 11-0004.
Example 3.2 List all shipments that contain food products (type=FP).
Solution:
First, we need to determine all item numbers of items that are type FP.
SHIPMENT_ID  SHIP_NO  CAPT_ID  SHIPMENT_DATE       ORIGIN   DESTINATION  ARRIVAL_DATE        ITEM_NO  QUANTITY
09-0001      25       001-25   March 12, 2016      SEATTLE  BOSTON       March 19, 2016      3223     100
                                                                                             3297     87
                                                                                             3299     34
09-0002      1        004-03   April 15, 2016      BOSTON   SINGAPORE    April 20, 2016      3212     100
                                                                                             3223     50
09-0003      11       003-01   June 01, 2016       BRAZIL   BOSTON       June 07, 2016       2101     432
                                                                                             2109     1000
                                                                                             2123     34
                                                                                             2125     100
09-0004      5        002-14   July 10, 2016       LONDON   SEATTLE      July 22, 2016       3212     10
                                                                                             3223     5
09-0005      1        002-15   September 19, 2016  BOSTON   BRAZIL       September 23, 2016  3297     42
                                                                                             7821     5
                                                                                             7823     45
10-0001      1        001-23   January 15, 2017    BOSTON   LONDON       January 21, 2017    7821     10
                                                                                             7829     3
                                                                                             7830     100
Figure 1.9. Line items included in shipment table.
Upon examination of the Item table, we find that item numbers 2101,
2109, 2123, and 2125 are of this type.
Next, looking at the Item_no column of the Shipment_Line table, we
find that shipment 10-0004 contains items 2101, 2109, and 2125. Ship-
ment 11-0005 contains items 2101 and 2109. Shipment 11-0006 contains
only 2125 and shipment 09-0003 contains all of the items that are of type
food product.
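The same two steps (find the FP items, then find the shipments that contain any of them) can be sketched as a join between the Item and Shipment_Line tables, using Python's sqlite3 module. Column names are assumptions. The FP items and the shipment lines come from the text and Figure 1.9, with one exception: the type 'BL' shown for item 3223 is an assumption made so that there is a non-food line to filter out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, assumed table layouts.
    CREATE TABLE item (item_no INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE shipment_line (shipment_id TEXT, item_no INTEGER);
    INSERT INTO item VALUES
        (2101, 'FP'), (2109, 'FP'), (2123, 'FP'), (2125, 'FP'),
        (3223, 'BL');  -- the type of item 3223 is an assumption
    INSERT INTO shipment_line VALUES
        ('09-0001', 3223),
        ('09-0003', 2101), ('09-0003', 2109), ('09-0003', 2123), ('09-0003', 2125),
        ('10-0004', 2101), ('10-0004', 2109), ('10-0004', 2125),
        ('11-0005', 2101), ('11-0005', 2109),
        ('11-0006', 2125);
""")

# A shipment with several FP lines would appear several times in the join,
# so DISTINCT is used to list each shipment once.
food_shipments = [row[0] for row in conn.execute("""
    SELECT DISTINCT sl.shipment_id
    FROM shipment_line sl
    JOIN item i ON i.item_no = sl.item_no
    WHERE i.type = 'FP'
    ORDER BY sl.shipment_id
""")]

print(food_shipments)
```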
Example 3.3 List all shipments that contain building materials (Type=BL)
and were shipped on a ship manufactured by Best Industries.
Solution:
Whenever you encounter a fairly complex problem, it is best to break it
down into smaller subproblems and look for problems that you have
seen before. In this case, our first subproblem is to determine the
shipment numbers of all shipments that carried building materials. This is
exactly the same problem as example 3.2, except we are looking for
shipments that contain building materials instead of food products. By
examining the Item and the Shipment_Line tables, we determine that shipments
09-0001, 09-0002, 09-0004, 09-0005, and 11-0004 contained building
materials. Notice that the shipment_line table does not contain any
information about the ship other than the ship number. We must go to the ship
table to determine which ships were manufactured by Best Industries. This
query is the same as example query 1.2, except we are looking for all ships
manufactured by Best Industries. Since the ship table does not contain the
name of the manufacturer, but only the manufacturer ID, we need to look
at the manufacturer table first to find the ID for Best Industries, which is
215. Next, we can examine the ship table and see that ship number 16 has
a manufacturer ID of 215.
Armed with our list of shipments that contain building materials and our
knowledge that ship number 16 was built by Best Industries, we can now
examine the shipment table.
Ship number 16 carried shipment 11-0004, which is in our list of
shipments that contain building materials.
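The final step combines the answers to the two subproblems. A minimal sketch of that combination in plain Python, using sets, is shown below. The variable names are made up for this illustration. The shipment IDs and ship numbers come from the text and from Figure 1.9.

```python
# Subproblem 1: shipments that contain building materials (from the text).
building_material_shipments = {"09-0001", "09-0002", "09-0004", "09-0005", "11-0004"}

# Subproblem 2: ships manufactured by Best Industries (manufacturer ID 215).
best_industries_ships = {16}

# Shipment table lookup: which ship carried each shipment
# (ship numbers from Figure 1.9 and the text).
ship_for_shipment = {
    "09-0001": 25,
    "09-0002": 1,
    "09-0004": 5,
    "09-0005": 1,
    "11-0004": 16,
}

# Keep the building-material shipments whose ship is in the Best Industries set.
result = sorted(
    shipment_id
    for shipment_id in building_material_shipments
    if ship_for_shipment[shipment_id] in best_industries_ships
)

print(result)
```

Solving each subproblem separately and then combining the intermediate lists is the same strategy a multi-table query expresses in one statement.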
We can also represent this graphically. As you go through the text-
book and learn to develop queries, it is often easier to map out the queries
graphically, as in Figure 1.10.
4.2 List all shipments that had John Smith as the captain and contained
building materials.
4.3 List the total weight for shipment number 09-0001.
4.4 Find the shipment that weighed the most.
Now that we have fully described each of the tables in the Shore to Shore
case study, we can begin to think about some practical ways in which
the data would be useful to the organization. Databases are used in most
businesses not just to store information and produce reports, invoices and
other documents, but also to provide management with a means of keep-
ing the organization on target with organizational goals. Let’s examine
one scenario in which the database can be used to assist management.
We will also give you an exercise to try on your own.
The management of Shore to Shore has noticed that there is a problem
with shipments arriving late. Each month, they would like a report to show
shipments that arrived late. A sample report is shown in Figure 1.11.
In order to produce this report, we must think about what information
we are going to need and where it will come from:
1. Heading
The reporting period will be entered by the user, whether it is a
computer operator running the report or a manager who might want
to view it on the screen. The reporting period will be used to deter-
mine what records to include in the detail section of the report.
2. Body
The following information contained in the body of the report will
come directly from the corresponding table as shown in Figure 1.12.
[Figure: report layout with a Heading section (Shore to Shore Shipping Company) and a Footer section]
As you can see, most of the information in the body is directly
obtained from the shipment table except for the following:
Captain’s Name
This data element is obtained by finding the captain ID for the
overdue shipment and then looking up the corresponding last name
and first name in the CAPTAIN table.
Days Overdue
In order to determine if a shipment is overdue, we first must
calculate the expected arrival date and compare this to the actual
arrival date. The expected arrival date is calculated by adding
the number of days that the shipment should normally take to the
shipment date. The number of days that the shipment should take
is found in the distance table. In our sample report, the shipment
[Figure: report layout with heading "Shore to Shore Shipping Company" and footer "Total Shipments: 1"]
5.3 Now, let’s look at the total weight element. When you first examine
the report request, you might think that total weight can be determined
from the shipment table. However, we do not store the total weight.
Looking at the item table, we see that there is a weight for each item.
Now, let’s look at the shipment_line table. Notice that there is a quan-
tity for each item that is shipped in a shipment. We can calculate the
total weight for one item in a shipment by multiplying the weight of
the item by the number of items (quantity). However, this only gives
us a part of the picture. Since a shipment is made up of many items,
how do we calculate the total weight for the shipment?
1.12 Summary
In this chapter, we looked at the Shore to Shore Shipping case study with
an eye for thinking like the query engine of a relational database manage-
ment system (RDBMS) when trying to find data from the tables.