Relational Model and Algebra: Introduction To Databases Compsci 316 Fall 2014
Simplicity is a virtue!
Example
User (uid, name, age, pop)
Group (gid, name)
Member (uid, gid)
[Figure: sample rows of the User, Group, and Member tables]
Ordering of rows doesn’t matter (even though output is always in some order)
Example
• Schema
• User (uid int, name string, age int, pop float)
• Group (gid string, name string)
• Member (uid int, gid string)
• Instance
• User: {〈…〉, 〈…〉, …}, each row a 〈uid, name, age, pop〉 tuple
• Group: {〈…〉, 〈…〉, …}, each row a 〈gid, name〉 tuple
• Member: {〈…〉, 〈…〉, …}, each row a 〈uid, gid〉 tuple
Relational algebra
A language for querying relational data
based on “operators”
• Core operators:
• Selection, projection, cross product, union, difference,
and renaming
• Additional, derived operators:
• Join, natural join, intersection, etc.
• Compose operators to make complex queries
Selection
• Input: a table R
• Notation: $\sigma_p R$
• $p$ is called a selection condition (or predicate)
• Purpose: filter rows according to some criteria
• Output: same columns as R, but only the rows of R that satisfy $p$
Selection example
• Users with popularity higher than 0.5
$\sigma_{pop > 0.5}\, User$
More on selection
• Selection condition can include any column of R, constants, comparisons (=, ≤, etc.), and Boolean connectives (∧: and, ∨: or, ¬: not)
• Example: users with popularity at least 0.9 and age under 10 or above 12
$\sigma_{pop \geq 0.9 \wedge (age < 10 \vee age > 12)}\, User$
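Selection can be sketched in Python as a filter over a set of rows. This is a minimal illustration, not part of the slides: the helper `select` and the sample rows are hypothetical, chosen only to exercise the two conditions above.

```python
from collections import namedtuple

# Hypothetical sample rows; the values are illustrative, not from the slides.
User = namedtuple("User", ["uid", "name", "age", "pop"])
users = {
    User(1, "Ann", 9, 0.95),
    User(2, "Bob", 11, 0.97),
    User(3, "Cy", 13, 0.92),
    User(4, "Lisa", 8, 0.40),
}

def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate; columns are unchanged."""
    return {row for row in relation if predicate(row)}

# Users with popularity higher than 0.5
popular = select(users, lambda u: u.pop > 0.5)

# Popularity at least 0.9 and (age under 10 or above 12)
popular_young_or_old = select(
    users, lambda u: u.pop >= 0.9 and (u.age < 10 or u.age > 12)
)
```

The parentheses around the age test matter: without them, `and` would bind tighter than `or` and the condition would mean something else.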
Projection
• Input: a table R
• Notation: $\pi_L R$
• $L$ is a list of columns in R
• Purpose: output chosen columns
• Output: same rows, but only the columns in $L$
Projection example
• IDs and names of all users
$\pi_{uid, name}\, User$
More on projection
• Duplicate output rows are removed (by definition)
• Example: user ages
$\pi_{age}\, User$
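A minimal Python sketch of projection, assuming a relation is stored as a set of tuples plus a separate tuple of column names. The helper `project` and the sample rows are hypothetical. Because the result is a set, duplicate output rows disappear automatically.

```python
# Hypothetical sample data; two users share the same age on purpose.
COLUMNS = ("uid", "name", "age", "pop")
users = {
    (1, "Ann", 9, 0.95),
    (2, "Bob", 11, 0.97),
    (3, "Cy", 9, 0.92),
}

def project(relation, columns, keep):
    """Projection: keep only the listed columns; the set drops duplicate rows."""
    positions = [columns.index(name) for name in keep]
    return {tuple(row[i] for i in positions) for row in relation}

ids_and_names = project(users, COLUMNS, ["uid", "name"])  # one row per user
ages = project(users, COLUMNS, ["age"])  # fewer rows than users: age 9 appears once
```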
Cross product
• Input: two tables R and S
• Notation: $R \times S$
• Purpose: pair rows from two tables
• Output: for each row $r$ in R and each row $s$ in S, output a row $rs$ (the concatenation of $r$ and $s$)
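The cross product can be sketched with `itertools.product`, concatenating each pair of tuples. The helper `cross` and the sample rows are hypothetical.

```python
from itertools import product

def cross(r, s):
    """Cross product: concatenate every row of r with every row of s."""
    return {a + b for a, b in product(r, s)}

# Hypothetical sample data: (uid, name) and (uid, gid)
users = {(1, "Ann"), (2, "Bob")}
members = {(1, "g1"), (2, "g1"), (2, "g2")}

pairs = cross(users, members)  # 2 x 3 = 6 rows, each with 4 columns
```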
Join example
• Info about users, plus IDs of their groups
$User \bowtie_{User.uid = Member.uid} Member$
[Figure: the join result, shown as the rows of $User \times Member$ that satisfy $User.uid = Member.uid$]
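A small Python sketch of this join, assuming the usual reading of a join as a cross product followed by a selection on the join condition. The helper `theta_join` and the sample rows are hypothetical.

```python
def theta_join(r, s, condition):
    """Join: the rows of the cross product r x s that satisfy the condition."""
    return {a + b for a in r for b in s if condition(a, b)}

# Hypothetical sample data: (uid, name, age, pop) and (uid, gid)
users = {(1, "Ann", 9, 0.95), (2, "Bob", 11, 0.5)}
members = {(1, "g1"), (2, "g1"), (2, "g2")}

# User joined with Member on User.uid = Member.uid
joined = theta_join(users, members, lambda u, m: u[0] == m[0])
```

Both uid columns survive in the output, one from each input, which is what distinguishes this join from a natural join.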
Union
• Input: two tables R and S
• Notation: $R \cup S$
• R and S must have identical schema
• Output:
• Has the same schema as R and S
• Contains all rows in R and all rows in S (with duplicate rows removed)
Difference
• Input: two tables R and S
• Notation: $R - S$
• R and S must have identical schema
• Output:
• Has the same schema as R and S
• Contains all rows in R that are not in S
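With relations stored as Python sets of tuples, union and difference correspond directly to the built-in set operators `|` and `-`. The sample rows are hypothetical, and both sets are assumed to share the schema (uid, gid).

```python
# Hypothetical sample data; both relations have the schema (uid, gid).
r = {(1, "g1"), (2, "g1")}
s = {(2, "g1"), (3, "g2")}

union = r | s        # all rows of r and all rows of s; the shared row appears once
difference = r - s   # rows of r that are not in s
```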
Renaming
• Input: a table R
• Notation: $\rho_S R$, $\rho_{(A_1, A_2, \ldots)} R$, or $\rho_{S(A_1, A_2, \ldots)} R$
• Purpose: “rename” a table and/or its columns
• Output: a table with the same rows as R, but called differently
• Used to
• Avoid confusion caused by identical column names
• Create identical column names for natural joins
• As with all other relational operators, it doesn’t
modify the database
• Think of the renamed table as a copy of the original
Renaming example
• IDs of users who belong to at least two groups
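$Member \bowtie_? Member$
First attempt (ambiguous, because both inputs use the same column names):
$\pi_{uid}(Member \bowtie_{Member.uid = Member.uid \wedge Member.gid \neq Member.gid} Member)$
With renaming:
$\pi_{uid_1}(\rho_{(uid_1, gid_1)} Member \bowtie_{uid_1 = uid_2 \wedge gid_1 \neq gid_2} \rho_{(uid_2, gid_2)} Member)$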
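[Figure: expression tree for the query above, with $\pi_{uid_1}$ at the root, the join $\bowtie_{uid_1 = uid_2 \wedge gid_1 \neq gid_2}$ below it, and two copies of Member as leaves]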
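The self-join with renaming can be sketched in Python using rows stored as dicts, so that column names are explicit. The helper `rename` and the sample rows are hypothetical. As with the algebra operator, renaming returns a copy and leaves the original table untouched.

```python
def rename(relation, mapping):
    """Renaming: return a copy of the rows with columns renamed; the input is untouched."""
    return [{mapping.get(col, col): val for col, val in row.items()} for row in relation]

# Hypothetical sample data: user 2 belongs to two groups.
member = [
    {"uid": 1, "gid": "g1"},
    {"uid": 2, "gid": "g1"},
    {"uid": 2, "gid": "g2"},
]

m1 = rename(member, {"uid": "uid1", "gid": "gid1"})
m2 = rename(member, {"uid": "uid2", "gid": "gid2"})

# Join on uid1 = uid2 and gid1 != gid2, then project onto uid1.
joined = [
    {**a, **b}
    for a in m1
    for b in m2
    if a["uid1"] == b["uid2"] and a["gid1"] != b["gid2"]
]
in_two_groups = {row["uid1"] for row in joined}
```

After renaming, the condition can refer to `uid1` and `uid2` without ambiguity, which is exactly what the unrenamed first attempt could not do.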
• Many more
• Semijoin, anti-semijoin, quotient, …
An exercise
• Names of users in Lisa’s groups
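Writing a query bottom-up:
• Who’s Lisa? $\sigma_{name = \text{"Lisa"}}\, User$
• Lisa’s groups: $\pi_{gid}(Member \bowtie \sigma_{name = \text{"Lisa"}}\, User)$
• Users in Lisa’s groups: $\pi_{uid}(Member \bowtie \pi_{gid}(Member \bowtie \sigma_{name = \text{"Lisa"}}\, User))$
• Their names: $\pi_{name}(User \bowtie \pi_{uid}(Member \bowtie \pi_{gid}(Member \bowtie \sigma_{name = \text{"Lisa"}}\, User)))$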
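The bottom-up construction can be mirrored step by step with Python set comprehensions. This is only a sketch: the sample rows are hypothetical, and User is cut down to (uid, name) to keep the example short.

```python
# Hypothetical sample data: (uid, name) and (uid, gid)
users = {(1, "Lisa"), (2, "Ann"), (3, "Bob")}
members = {(1, "g1"), (2, "g1"), (3, "g2")}

# Who's Lisa?
lisa_uids = {uid for uid, name in users if name == "Lisa"}

# Lisa's groups
lisas_groups = {gid for uid, gid in members if uid in lisa_uids}

# Users in Lisa's groups
uids_in_groups = {uid for uid, gid in members if gid in lisas_groups}

# Their names
names = {name for uid, name in users if uid in uids_in_groups}
```

Lisa appears in the result herself, since she is a member of her own groups.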
Another exercise
• IDs of groups that Lisa doesn’t belong to
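Writing a query top-down:
$\pi_{gid}\, Group - \pi_{gid}(Member \bowtie \sigma_{name = \text{"Lisa"}}\, User)$
• Left operand: all group IDs, $\pi_{gid}\, Group$
• Right operand: IDs of Lisa’s groups, $\pi_{gid}(Member \bowtie \sigma_{name = \text{"Lisa"}}\, User)$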
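The top-down query is a set difference between two projections, which can be sketched in Python as follows. The sample rows and group names are hypothetical, and User is cut down to (uid, name).

```python
# Hypothetical sample data
users = {(1, "Lisa"), (2, "Ann")}                           # (uid, name)
groups = {("g1", "Book"), ("g2", "Chess"), ("g3", "Band")}  # (gid, name)
members = {(1, "g1"), (2, "g1"), (2, "g2")}                 # (uid, gid)

# Left operand: all group IDs
all_gids = {gid for gid, _name in groups}

# Right operand: IDs of Lisa's groups
lisa_uids = {uid for uid, name in users if name == "Lisa"}
lisas_gids = {gid for uid, gid in members if uid in lisa_uids}

# Difference: groups that Lisa doesn't belong to
not_lisas = all_gids - lisas_gids
```

Starting from Group rather than Member matters: a group with no members at all (here `g3`) still shows up in the answer.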
A trickier exercise
• Who are the most popular?
• Who do NOT have the highest pop rating?
• Whose pop is lower than somebody else’s?
$\pi_{uid}\, User - \pi_{User_1.uid}(\rho_{User_1} User \bowtie_{User_1.pop < User_2.pop} \rho_{User_2} User)$
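The three questions above translate into a self-join followed by a difference. A minimal Python sketch, with hypothetical sample rows and User cut down to (uid, pop):

```python
# Hypothetical sample data: (uid, pop); users 1 and 3 tie for the highest pop.
users = {(1, 0.9), (2, 0.5), (3, 0.9)}

# Whose pop is lower than somebody else's? (self-join on u1.pop < u2.pop)
not_highest = {u1[0] for u1 in users for u2 in users if u1[1] < u2[1]}

# Everybody minus those who do NOT have the highest pop
most_popular = {uid for uid, _pop in users} - not_highest
```

Ties are handled naturally: neither of the two top users has a pop strictly lower than anyone else's, so both remain.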
Monotone operators
• If we add more rows to the input of a RelOp, what happens to the output?
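The question can be explored with a small Python experiment on hypothetical data. Adding rows to the input of a selection can only add rows to its output, whereas adding rows to the second input of a difference can remove rows from the output.

```python
def select_popular(r):
    """Selection with the condition pop > 0.5 on rows of the form (uid, pop)."""
    return {row for row in r if row[1] > 0.5}

# Selection: growing the input never shrinks the output.
r = {(1, 0.9), (2, 0.3)}
r_more = r | {(3, 0.8)}
selection_before = select_popular(r)
selection_after = select_popular(r_more)

# Difference: growing the second input can shrink the output.
a = {1, 2, 3}
b = {1}
b_more = b | {2}
difference_before = a - b
difference_after = a - b_more
```

Here `selection_before` is a subset of `selection_after`, but `difference_before` is not a subset of `difference_after`.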
Relational calculus
• $\{ u.uid \mid u \in User \wedge \neg(\exists u' \in User : u.pop < u'.pop) \}$, or
• $\{ u.uid \mid u \in User \wedge (\forall u' \in User : u.pop \geq u'.pop) \}$
• Relational algebra = “safe” relational calculus
• Every query expressible as a safe relational calculus
query is also expressible as a relational algebra query
• And vice versa
• Example of an “unsafe” relational calculus query
• $\{ u.name \mid \neg(u \in User) \}$
• Cannot evaluate it just by looking at the database
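The two safe calculus queries read almost directly as Python set comprehensions, with `any` playing the role of ∃ and `all` the role of ∀. The sample rows are hypothetical, and User is cut down to (uid, name, pop).

```python
# Hypothetical sample data: (uid, name, pop)
users = {(1, "Ann", 0.9), (2, "Bob", 0.5), (3, "Cy", 0.9)}

# { u.uid | u in User and not (exists u' in User : u.pop < u'.pop) }
with_exists = {u[0] for u in users if not any(u[2] < v[2] for v in users)}

# { u.uid | u in User and (forall u' in User : u.pop >= u'.pop) }
with_forall = {u[0] for u in users if all(u[2] >= v[2] for v in users)}
```

Both comprehensions range only over rows already in `users`, which is what makes them evaluable. The unsafe query has no such finite range to iterate over, so it has no analogous comprehension.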
Turing machine
• A conceptual device that can
execute any computer algorithm
• Approximates what general-
purpose programming languages
can do
• E.g., Python, Java, C++, …
[Figure: Alan Turing (1912-1954)]