9 SQL and More
R x S:
A   R.B   S.B   C    D
1   2     2     5    6
1   2     4     7    8
1   2     9     10   11
3   4     2     5    6
3   4     4     7    8
3   4     9     10   11
Example
Instance of Student:
ID    firstName   lastName   GPA   Address
111   Joe         Smith      4.0   45 Pine av.
222   Sue         Brown      3.1   71 Main st.
333   Ann         Johns      3.7   39 Bay st.
Instance of Course:
courseNumber   name              noOfCredits
Comp352        Data structures   3
Comp353        Databases         4
SELECT *
FROM Student, Course;
ID    firstName   lastName   GPA   Address       courseNumber   name              noOfCredits
111   Joe         Smith      4.0   45 Pine av.   Comp352        Data structures   3
111   Joe         Smith      4.0   45 Pine av.   Comp353        Databases         4
222   Sue         Brown      3.1   71 Main st.   Comp352        Data structures   3
222   Sue         Brown      3.1   71 Main st.   Comp353        Databases         4
333   Ann         Johns      3.7   39 Bay st.    Comp352        Data structures   3
333   Ann         Johns      3.7   39 Bay st.    Comp353        Databases         4
Example
Instance of Student:
ID    firstName   lastName   GPA   Address
111   Joe         Smith      4.0   45 Pine av.
222   Sue         Brown      3.1   71 Main st.
333   Ann         Johns      3.7   39 Bay st.
Instance of Course:
courseNumber   name              noOfCredits
Comp352        Data structures   3
Comp353        Databases         4
SELECT ID, courseNumber
FROM Student, Course;
ID    courseNumber
111   Comp352
111   Comp353
222   Comp352
222   Comp353
333   Comp352
333   Comp353
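The two product examples above can be replayed with a small script. This is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database; the column types are assumptions, since the slides give only attribute names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student(ID INTEGER, firstName TEXT, lastName TEXT, GPA REAL, address TEXT);
    CREATE TABLE Course(courseNumber TEXT, name TEXT, noOfCredits INTEGER);
    INSERT INTO Student VALUES
        (111, 'Joe', 'Smith', 4.0, '45 Pine av.'),
        (222, 'Sue', 'Brown', 3.1, '71 Main st.'),
        (333, 'Ann', 'Johns', 3.7, '39 Bay st.');
    INSERT INTO Course VALUES
        ('Comp352', 'Data structures', 3),
        ('Comp353', 'Databases', 4);
""")

# No WHERE clause: every Student tuple is paired with every Course tuple.
product = conn.execute("SELECT * FROM Student, Course").fetchall()
print(len(product), len(product[0]))  # number of rows, columns per row

# Projecting onto two columns does not change the number of rows.
pairs = conn.execute("SELECT ID, courseNumber FROM Student, Course").fetchall()
print(sorted(pairs))
```

With 3 students and 2 courses, the product has 3 x 2 rows, each carrying the 5 Student columns followed by the 3 Course columns.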
Example
Relation schemas:
Student (ID, firstName, lastName, address, GPA)
Ugrad (ID, major)
Query:
Find all information available about every undergraduate student
We can try to compute the Cartesian product (×):
SELECT *
FROM Student, Ugrad;
Example
Instance of Student:
ID    firstName   lastName   GPA   Address
111   Joe         Smith      4.0   45 Pine av.
222   Sue         Brown      3.1   71 Main st.
333   Ann         Johns      3.7   39 Bay st.
Instance of Ugrad:
ID    major
111   CS
333   EE
SELECT *
FROM Student, Ugrad;
ID    firstName   lastName   GPA   Address       ID    major
111   Joe         Smith      4.0   45 Pine av.   111   CS
111   Joe         Smith      4.0   45 Pine av.   333   EE
222   Sue         Brown      3.1   71 Main st.   111   CS
222   Sue         Brown      3.1   71 Main st.   333   EE
333   Ann         Johns      3.7   39 Bay st.    111   CS
333   Ann         Johns      3.7   39 Bay st.    333   EE
Which tuples should be in the query result and which shouldn’t?
Example
Instance of Student:
ID    firstName   lastName   GPA   Address
111   Joe         Smith      4.0   45 Pine av.
222   Sue         Brown      3.1   71 Main st.
333   Ann         Johns      3.7   39 Bay st.
Instance of Ugrad:
ID    major
111   CS
333   EE
SELECT *
FROM Student, Ugrad
WHERE Student.ID = Ugrad.ID;
ID    firstName   lastName   GPA   Address       ID    major
111   Joe         Smith      4.0   45 Pine av.   111   CS
333   Ann         Johns      3.7   39 Bay st.    333   EE
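The effect of the WHERE condition can be checked directly. The sketch below assumes Python's built-in sqlite3 module with an in-memory database (column types are assumptions) and compares the plain product with the filtered one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student(ID INTEGER, firstName TEXT, lastName TEXT, GPA REAL, address TEXT);
    CREATE TABLE Ugrad(ID INTEGER, major TEXT);
    INSERT INTO Student VALUES
        (111, 'Joe', 'Smith', 4.0, '45 Pine av.'),
        (222, 'Sue', 'Brown', 3.1, '71 Main st.'),
        (333, 'Ann', 'Johns', 3.7, '39 Bay st.');
    INSERT INTO Ugrad VALUES (111, 'CS'), (333, 'EE');
""")

# Cartesian product: also pairs students with majors that are not theirs.
product = conn.execute("SELECT * FROM Student, Ugrad").fetchall()

# The equality on ID keeps only the pairs that describe the same student.
joined = conn.execute("""
    SELECT *
    FROM Student, Ugrad
    WHERE Student.ID = Ugrad.ID
""").fetchall()

print(len(product), len(joined))
print(sorted((row[0], row[-1]) for row in joined))  # (Student.ID, major)
```

Sue Brown has no tuple in Ugrad, so she does not appear in the filtered result at all.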
Joins in SQL
The above query is an example of a join operation
There are various kinds of joins, and we will study them
in detail later
To join relations R1,…,Rn in SQL:
List all these relations in the FROM clause
Express the conditions in the WHERE clause in order to get the
desired join
Joining Relations
Relation schemas:
Movie (title, year, length, filmType)
Owns (title, year, studioName)
Query:
Find title, length, and studio name of every movie
Query in SQL:
SELECT Movie.title, Movie.length, Owns.studioName
FROM Movie, Owns
WHERE Movie.title = Owns.title AND Movie.year = Owns.year;
Is Owns in Owns.studioName necessary?
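One way to explore the question is to try both spellings. The sketch below assumes Python's built-in sqlite3 module and reuses two sample movies (T1, T2); other systems may word the error differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie(title TEXT, year INTEGER, length INTEGER, filmType TEXT);
    CREATE TABLE Owns(title TEXT, year INTEGER, studioName TEXT);
    INSERT INTO Movie VALUES ('T1', 1990, 124, 'color'), ('T2', 1991, 144, 'color');
    INSERT INTO Owns VALUES ('T1', 1990, 'Disney'), ('T2', 1991, 'MGM');
""")

# length and studioName each occur in only one relation, so no prefix is needed.
rows = conn.execute("""
    SELECT Movie.title, length, studioName
    FROM Movie, Owns
    WHERE Movie.title = Owns.title AND Movie.year = Owns.year
""").fetchall()

# title occurs in both relations, so an unqualified reference is rejected.
try:
    conn.execute("SELECT title FROM Movie, Owns")
    ambiguous = False
except sqlite3.OperationalError:
    ambiguous = True

print(sorted(rows), ambiguous)
```

A prefix is required only when an attribute name occurs in more than one relation of the FROM clause; writing it anyway is harmless and often easier to read.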
Joining Relations
Relation schemas:
Movie (title, year, length, filmType)
Owns (title, year, studioName)
Query:
Find the title and length of every movie produced by Disney
Query in SQL:
SELECT Movie.title, length
FROM Movie, Owns
WHERE Movie.title = Owns.title AND Movie.year = Owns.year
AND studioName = 'Disney';
Joining Relations
Relation schemas:
Movie (title, year, length, filmType)
Owns (title, year, studioName)
StarsIn (title, year, starName)
Query:
Find the title and length of each movie with Julia Roberts,
produced by Disney
Query in SQL:
SELECT Movie.title, Movie.length
FROM Movie, Owns, StarsIn
WHERE Movie.title = Owns.title AND Movie.year = Owns.year
AND Movie.title = StarsIn.title AND Movie.year = StarsIn.year
AND studioName = 'Disney' AND starName = 'Julia Roberts';
Example
Movie
title   year   length   filmType
T1      1990   124      color
T2      1991   144      color
Owns
title   year   studioName
T1      1990   Disney
T2      1991   MGM
StarsIn
title   year   starName
T1      1990   JR
T2      1991   JR
Result
title   length
T1      124
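The small instance above can be run through the three-relation query. This is a sketch assuming Python's built-in sqlite3 module; the star name is kept as the abbreviation JR used in the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie(title TEXT, year INTEGER, length INTEGER, filmType TEXT);
    CREATE TABLE Owns(title TEXT, year INTEGER, studioName TEXT);
    CREATE TABLE StarsIn(title TEXT, year INTEGER, starName TEXT);
    INSERT INTO Movie VALUES ('T1', 1990, 124, 'color'), ('T2', 1991, 144, 'color');
    INSERT INTO Owns VALUES ('T1', 1990, 'Disney'), ('T2', 1991, 'MGM');
    INSERT INTO StarsIn VALUES ('T1', 1990, 'JR'), ('T2', 1991, 'JR');
""")

# Two join conditions tie the three relations together on (title, year);
# the last line filters on studio and star.
result = conn.execute("""
    SELECT Movie.title, Movie.length
    FROM Movie, Owns, StarsIn
    WHERE Movie.title = Owns.title AND Movie.year = Owns.year
      AND Movie.title = StarsIn.title AND Movie.year = StarsIn.year
      AND studioName = 'Disney' AND starName = 'JR'
""").fetchall()

print(result)
```

T2 also stars JR but is owned by MGM, so only T1 survives both filters.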
Query in SQL:
SELECT Exec.name
FROM Exec, Movie, StarsIn
WHERE Exec.cert# = Movie.producerC# AND
Movie.title = StarsIn.title AND
Movie.year = StarsIn.year AND
starName = 'Harrison Ford';
Correlated Subqueries
Relation schema:
Movie(title, year, length, filmType, studioName, producerC#)
Query:
Find movie titles that appear more than once
Query in SQL:
SELECT title
FROM Movie Old
WHERE year < ANY (SELECT year
FROM Movie
WHERE title = Old.title);
Note the scopes of the variables in this query.
Correlated Subqueries
Query in SQL
SELECT title
FROM Movie Old
WHERE year < ANY (SELECT year
FROM Movie
WHERE title = Old.title);
The condition in the outer WHERE is true only if there is a movie with the same
title as Old.title that has a later year
The query will produce a title one fewer times than there are movies with that title
What would be the result if we used “<>” instead of “<”?
For a movie title appearing 3 times, we would get 3 copies of the title in the output
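The counting argument can be checked on a small instance. This sketch assumes Python's built-in sqlite3 module. SQLite does not accept the ANY quantifier, so the condition is rewritten with EXISTS, which expresses the same test for this query; the sample titles and years are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie(title TEXT, year INTEGER);  -- other attributes omitted
    INSERT INTO Movie VALUES
        ('King Kong', 1933), ('King Kong', 1976), ('King Kong', 2005),
        ('Gladiator', 2000);
""")

# year < ANY (subquery) holds when some same-titled movie has a later year.
later = conn.execute("""
    SELECT title
    FROM Movie Old
    WHERE EXISTS (SELECT * FROM Movie
                  WHERE title = Old.title AND year > Old.year)
""").fetchall()

# Replacing < by <> accepts any same-titled movie from a different year.
different = conn.execute("""
    SELECT title
    FROM Movie Old
    WHERE EXISTS (SELECT * FROM Movie
                  WHERE title = Old.title AND year <> Old.year)
""").fetchall()

print(later)
print(different)
```

Inside the subquery, the unqualified title and year refer to the inner Movie, while Old.title and Old.year refer to the tuple of the outer query. The title that appears three times is returned twice by the first query and three times by the second.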
Correlated Subqueries
Old
#Producer   Studio Name   Film Type   Length   Year   Title
2           Disney        Color       90       1995   Star Wars
1           MTM           Color       60       1996   Gladiator
3           Boly          Color       90       2000   Star wars
4           Disney        Color       45       2003   Brave heart
Movie
#Producer   Studio Name   Film Type   Length   Year   Title
3           MTM           Color       50       2002   Star wars
2           Disney        Color       90       1995   Star Wars
4           Disney        Color       110      2005   Brave heart
1           MTM           Color       60       1996   Gladiator
3           Boly          Color       90       2000   Star wars
4           Disney        Color       45       2003   Brave heart
Aggregation in SQL
SQL provides five operators that apply to a column of
a relation and produce “some kind of summary”
These operators are called aggregations
These operators are used by applying them to a
scalar-valued expression, typically a column name, in
a SELECT clause
Aggregation Operators
SUM
the sum of values in the column
AVG
the average of values in the column
MIN
the least value in the column
MAX
the greatest value in the column
COUNT
the number of values in the column, including the duplicates, unless
the keyword DISTINCT is used explicitly
Example
Relation schema:
Exec(name, address, cert#, netWorth)
Query:
Find the average net worth of all movie executives
Query in SQL:
SELECT AVG(netWorth)
FROM Exec;
The sum of “all” values in the column netWorth divided by
the number of these values
In general, if a tuple appears n times in a relation, it will be
counted n times when computing the average
Example
Relation schema:
Exec (name, address, cert#, netWorth)
Query:
How many tuples are there in the Exec relation?
Query in SQL:
SELECT COUNT(*)
FROM Exec;
The use of * as a parameter is unique to COUNT
With a column name as the parameter, as in COUNT(name), the operator takes the values
from column name, and then counts the number of values there
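How duplicates affect AVG and COUNT can be seen on a tiny instance. This sketch assumes Python's built-in sqlite3 module; the executives and net worths are made up, and cert# is double-quoted because SQLite needs quoting for an identifier containing #.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Exec(name TEXT, address TEXT, "cert#" INTEGER, netWorth INTEGER);
    INSERT INTO Exec VALUES
        ('Exec A', 'Addr 1', 1, 100),
        ('Exec B', 'Addr 2', 2, 100),
        ('Exec C', 'Addr 3', 3, 400);
""")

# Without DISTINCT, the value 100 contributes twice to the average.
avg_all, avg_distinct = conn.execute(
    "SELECT AVG(netWorth), AVG(DISTINCT netWorth) FROM Exec"
).fetchone()

# COUNT(*) counts tuples; COUNT(DISTINCT netWorth) counts different values.
count_all, count_col, count_distinct = conn.execute(
    "SELECT COUNT(*), COUNT(netWorth), COUNT(DISTINCT netWorth) FROM Exec"
).fetchone()

print(avg_all, avg_distinct)
print(count_all, count_col, count_distinct)
```

Here the plain average is (100 + 100 + 400) / 3, while the DISTINCT variant averages only 100 and 400.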
Aggregation -- Grouping
Often we need to consider the tuples in an SQL query in
groups, with regard to the value of some other column(s)
Example: suppose we want to compute:
Total length in minutes of movies produced by each studio:
Movie(title, year, length, filmType, studioName, producerC#)
We must group the tuples in the Movie relation according to
their studio, and get the sum of the length values within each
group; the result would be something like:
studio SUM(length)
Disney 12345
MGM 54321
… …
Aggregation -- Grouping
Relation schema:
Movie(title, year, length, filmType, studioName, producerC#)
Query: What is the total length in minutes of the movies produced by each studio?
Query in SQL:
SELECT studioName, SUM(length)
FROM Movie
GROUP BY studioName;
Any aggregation used in the SELECT clause is applied
only within groups
Only those attributes mentioned in the GROUP BY clause may
appear unaggregated in the SELECT clause
Can we use GROUP BY without using aggregation? (Yes/No)
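Grouping can be tried on a few sample tuples. This sketch assumes Python's built-in sqlite3 module; the movies and lengths are made up, and producerC# is double-quoted because of the # character.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie(title TEXT, year INTEGER, length INTEGER, filmType TEXT,
                       studioName TEXT, "producerC#" INTEGER);
    INSERT INTO Movie VALUES
        ('M1', 1990, 100, 'color', 'Disney', 1),
        ('M2', 1991,  90, 'color', 'Disney', 2),
        ('M3', 1992, 120, 'color', 'MGM',    3);
""")

# SUM is computed separately inside each studioName group.
totals = conn.execute("""
    SELECT studioName, SUM(length)
    FROM Movie
    GROUP BY studioName
""").fetchall()

# Grouping with no aggregate: one output row per group.
studios = conn.execute("""
    SELECT studioName
    FROM Movie
    GROUP BY studioName
""").fetchall()

print(dict(totals))
print(sorted(studios))
```

The second query runs without any aggregate and returns each studio once, much like SELECT DISTINCT studioName would.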
Aggregation -- Grouping
Relation schema:
Movie(title, year, length, filmType, studioName, producerC#)
Exec(name, address, cert#, netWorth)
Query:
For each producer (name), list the total length of the films produced
Query in SQL:
SELECT Exec.name, SUM(Movie.length)
FROM Exec, Movie
WHERE Movie.producerC# = Exec.cert#
GROUP BY Exec.name;
Aggregation – HAVING clause
We might be interested in not all but some groups of tuples
that satisfy certain conditions
We can follow a GROUP BY clause with a HAVING clause
HAVING is followed by some conditions about the group
A HAVING clause normally follows a GROUP BY clause; without GROUP BY, the
whole relation is treated as a single group
Aggregation – HAVING clause
Relation schema:
Movie (title, year, length, filmType, studioName, producerC#)
Exec(name, address, cert#, netWorth)
Query:
For those producers who made at least one film prior to 1930, list the
total length of the films produced
Query in SQL:
SELECT Exec.name, SUM(Movie.length)
FROM Exec, Movie
WHERE producerC# = cert#
GROUP BY Exec.name
HAVING MIN(Movie.year) < 1930;
Aggregation – HAVING clause
A condition in HAVING chooses groups based on a property of the group
A condition in WHERE chooses movies based on a property of each movie tuple
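The difference between filtering groups and filtering tuples shows up in the totals. This sketch assumes Python's built-in sqlite3 module; the producers and films are made up, and the second query is a hypothetical variant that moves the year test into WHERE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Exec(name TEXT, address TEXT, "cert#" INTEGER, netWorth INTEGER);
    CREATE TABLE Movie(title TEXT, year INTEGER, length INTEGER, filmType TEXT,
                       studioName TEXT, "producerC#" INTEGER);
    INSERT INTO Exec VALUES ('P1', 'Addr 1', 1, 10), ('P2', 'Addr 2', 2, 20);
    INSERT INTO Movie VALUES
        ('Old film',   1925,  80, 'b/w',   'S1', 1),
        ('Later film', 1940, 100, 'color', 'S1', 1),
        ('Other film', 1950,  90, 'color', 'S2', 2);
""")

# HAVING: keep whole groups whose earliest film predates 1930.
by_group = conn.execute("""
    SELECT Exec.name, SUM(Movie.length)
    FROM Exec, Movie
    WHERE "producerC#" = "cert#"
    GROUP BY Exec.name
    HAVING MIN(Movie.year) < 1930
""").fetchall()

# WHERE: discard individual tuples from 1930 on before grouping.
by_tuple = conn.execute("""
    SELECT Exec.name, SUM(Movie.length)
    FROM Exec, Movie
    WHERE "producerC#" = "cert#" AND Movie.year < 1930
    GROUP BY Exec.name
""").fetchall()

print(by_group)
print(by_tuple)
```

With HAVING, P1 qualifies as a group and all of P1's films are summed; with the test in WHERE, only P1's film from before 1930 contributes to the sum.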
In general:
ORDER BY A1 ASC, B DESC, C ASC;
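A multi-key ORDER BY sorts on the first attribute and uses each later attribute only to break ties. The sketch below assumes Python's built-in sqlite3 module; the relation T and its tuples are hypothetical, chosen so that each key matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T(A1 INTEGER, B TEXT, C INTEGER);
    INSERT INTO T VALUES (1, 'x', 2), (1, 'y', 1), (0, 'z', 5), (1, 'y', 0);
""")

# A1 ascending first; ties on A1 broken by B descending; remaining ties by C ascending.
ordered = conn.execute(
    "SELECT * FROM T ORDER BY A1 ASC, B DESC, C ASC"
).fetchall()

print(ordered)
```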
Database Modifications
SQL & Database Modifications?
Next we will look at SQL statements that do not return a result,
but rather change the state of the database
There are three types of such SQL statements/transactions:
Insert tuples into a relation
Delete certain tuples from a relation
Update values of certain components of certain existing tuples
We refer to these types of operations collectively as database
modifications, and refer to such requests as transactions
Insertion
The insertion statement consists of:
The keywords INSERT INTO
The name of a relation R
A parenthesized list of attributes of the relation R
The keyword VALUES
A tuple expression, that is, a parenthesized list of concrete values,
one for each attribute in the attribute list
The form of an insert statement:
INSERT INTO R(A1, …, An) VALUES (v1, …, vn);
A tuple is created and added, where vi is the value of attribute Ai,
for i = 1, 2, …, n
Insertion
Relation schema:
StarsIn (title, year, starName)
Update the database:
Add “Sydney Greenstreet” to the list of stars of The Maltese Falcon
In SQL:
INSERT INTO StarsIn (title, year, starName)
VALUES ('The Maltese Falcon', 1942, 'Sydney Greenstreet');
Another formulation of this query:
INSERT INTO StarsIn
VALUES ('The Maltese Falcon', 1942, 'Sydney Greenstreet');
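The two formulations differ in how values are matched to attributes. This sketch assumes Python's built-in sqlite3 module; the first statement deliberately lists the attributes in a different order to show that the list, not the schema, decides the matching.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StarsIn(title TEXT, year INTEGER, starName TEXT)")

# With an attribute list, values are matched to the listed attributes in order.
conn.execute("""
    INSERT INTO StarsIn(starName, title, year)
    VALUES ('Sydney Greenstreet', 'The Maltese Falcon', 1942)
""")

# Without a list, values must follow the attribute order of the schema.
conn.execute("""
    INSERT INTO StarsIn
    VALUES ('The Maltese Falcon', 1942, 'Sydney Greenstreet')
""")

rows = conn.execute("SELECT title, year, starName FROM StarsIn").fetchall()
print(rows)
```

Both statements store the same tuple, so the relation now holds it twice.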
Insertion
The previous insertion statement was very simple
It added only one tuple into a relation
Instead of using explicit values for one tuple, we can compute
a set of tuples to be inserted using a subquery
This subquery replaces the keyword VALUES and the tuple
expression in the INSERT statement
Insertion
Database schema:
Studio(name, address, presC#)
Movie(title, year, length, filmType, studioName, producerC#)
Update the database:
Add to Studio, all studio names mentioned in the Movie relation
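One possible formulation of this update replaces VALUES with a subquery. The sketch below assumes Python's built-in sqlite3 module and made-up sample tuples; the NOT IN test, which skips studios already present, is one design choice among several.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Studio(name TEXT, address TEXT, "presC#" INTEGER);
    CREATE TABLE Movie(title TEXT, year INTEGER, length INTEGER, filmType TEXT,
                       studioName TEXT, "producerC#" INTEGER);
    INSERT INTO Studio VALUES ('Disney', 'Addr 1', 1);
    INSERT INTO Movie VALUES
        ('M1', 1990, 100, 'color', 'Disney', 1),
        ('M2', 1991,  90, 'color', 'MGM',    2),
        ('M3', 1992, 120, 'color', 'MGM',    3);
""")

# The subquery computes the set of tuples to insert: each studio name
# mentioned in Movie that is not yet in Studio, once.
conn.execute("""
    INSERT INTO Studio(name)
    SELECT DISTINCT studioName
    FROM Movie
    WHERE studioName NOT IN (SELECT name FROM Studio)
""")

studios = conn.execute("SELECT name, address FROM Studio ORDER BY name").fetchall()
print(studios)
```

Only name is supplied, so the other attributes of the new tuple are left NULL, which Python shows as None.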