5 Relational Algebra 1
Administrative notes
Tutorial 3 Normalization
CPSC 304
Introduction to Database Systems
Formal Relational Languages
Textbook Reference
Database Management Systems: 4 - 4.2
(skip the calculi)
Learning Goals
Identify the basic operators in Relational
Algebra (RA).
Use RA to create queries that include
combining RA operators.
Given an RA query and table schemas
and instances, compute the result of the
query.
Databases: the continuing saga
When last we left databases…
We learned that they’re excellent things
We learned how to conceptually model them
using ER diagrams
We learned how to logically model them using
relational schemas
We knew how to normalize our database
relations
We’re almost ready to use SQL to query it,
but first…
Balance, Daniel-san, is key
The mathematical foundations:
Relational Algebra
Clear way of describing core concepts
Partially procedural: you describe both what you
want and how to compute it, so the order of
operations matters
Relational Query Languages
Allow data manipulation and retrieval from a DB
Relational model supports simple, powerful QLs:
Strong formal foundation based on logic
Allows for much optimization via query optimizer
Query Languages != Programming Languages
QLs not intended for complex calculations
QLs provide easy access to large datasets
Users do not need to know how to navigate through
complicated data structures
Relational Algebra (RA)
All in one place
Basic operations:
Selection (σ): Selects a subset of rows from relation.
Projection (π): Deletes unwanted columns from relation.
Cross-product (×): Allows us to combine two relations.
Set-difference (-): Tuples in relation 1, but not in relation 2.
Union (∪): Tuples in relation 1 or in relation 2 (or both).
Rename (ρ): Assigns a (another) name to a relation
Additional, inessential but useful operations:
Intersection (∩), join (⋈), division (/), assignment (←)
All operators take one or two relations as inputs and give a
new relation as a result
For the purposes of relational algebra, relations are sets
Operations can be composed. (Algebra is “closed”)
Example Movies Database
Movie(MovieID, Title, Year)
StarsIn(MovieID, StarID, Character)
MovieStar(StarID, Name, Gender)
Example Instances
Movie:
MovieID  Title                                          Year
1        Star Wars                                      1977
2        Casablanca                                     1942
3        The Wizard of Oz                               1939
4        Indiana Jones and the Raiders of the Lost Ark  1981
StarsIn:
MovieID  StarID  Character
1        1       Han Solo
4        1       Indiana Jones
2        2       Ilsa Lund
3        3       Dorothy Gale
MovieStar:
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female
Selection (σ (sigma))
Notation: σ_p(r)
p is called the selection predicate
Defined as:
σ_p(r) = { t | t ∈ r and p(t) }
(the set of tuples of r satisfying p)
Where p is a formula in propositional calculus
consisting of:
connectives: ∧ (and), ∨ (or), ¬ (not)
and
predicates:
<attribute> op <attribute> or
<attribute> op <constant>
where op is one of: =, ≠, >, ≥, <, ≤
Selection Example
Movie:
MovieID  Title                                          Year
1        Star Wars                                      1977
2        Casablanca                                     1942
3        The Wizard of Oz                               1939
4        Indiana Jones and the Raiders of the Lost Ark  1981
Selection Example
Find all male stars from the MovieStar table.
σ_{Gender = ‘Male’}(MovieStar)
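The selection above can be sketched in Python. This is a minimal illustration, assuming a relation is modelled as a set of tuples with attributes addressed by position; the name `select` is hypothetical and not part of the slides.

```python
# MovieStar(StarID, Name, Gender), modelled as a set of tuples (an assumption
# of this sketch; sets match the "relations are sets" convention).
MOVIE_STAR = {
    (1, "Harrison Ford", "Male"),
    (2, "Ingrid Bergman", "Female"),
    (3, "Judy Garland", "Female"),
}


def select(relation, predicate):
    """sigma_p(r): keep exactly the tuples t of r for which p(t) holds."""
    return {t for t in relation if predicate(t)}


# sigma_{Gender = 'Male'}(MovieStar); Gender is the third attribute (index 2).
male_stars = select(MOVIE_STAR, lambda t: t[2] == "Male")
print(male_stars)  # {(1, 'Harrison Ford', 'Male')}
```

The predicate is an ordinary Python function, so connectives such as and, or, and not map directly onto the formula p.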
Projection (π (pi))
Notation:
π_{A1, A2, …, Ak}(r)
where A1, …, Ak are attributes (the
projection list) and r is a relation.
The result: a relation of the k attributes
A1, A2, …, Ak obtained from r by erasing
the columns that are not listed
Duplicate rows removed from result
(relations are sets)
Projection Examples
Movie:
MovieID  Title                                          Year
1        Star Wars                                      1977
2        Casablanca                                     1942
3        The Wizard of Oz                               1939
4        Indiana Jones and the Raiders of the Lost Ark  1981
π_{Title, Year}(Movie):
Title                                          Year
Star Wars                                      1977
Casablanca                                     1942
The Wizard of Oz                               1939
Indiana Jones and the Raiders of the Lost Ark  1981
π_{StarID}(StarsIn):
StarID
1
2
3
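Projection, including the removal of duplicate rows, can be sketched the same way. This assumes relations are Python sets of tuples and that the projection list is given as column positions; `project` is a hypothetical helper name.

```python
# StarsIn(MovieID, StarID, Character) as a set of tuples (sketch assumption).
STARS_IN = {
    (1, 1, "Han Solo"),
    (4, 1, "Indiana Jones"),
    (2, 2, "Ilsa Lund"),
    (3, 3, "Dorothy Gale"),
}


def project(relation, positions):
    """pi over the listed column positions; the set drops duplicate rows."""
    return {tuple(t[i] for i in positions) for t in relation}


# pi_{StarID}(StarsIn): StarID is the second attribute (index 1).
star_ids = project(STARS_IN, [1])
print(sorted(star_ids))  # [(1,), (2,), (3,)]
```

StarID 1 appears in two StarsIn tuples but only once in the result, because the result is again a set.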
Clicker Projection Example
Suppose relation R(A,B,C) has the tuples:
A  B  C
1  2  3
4  2  3
4  5  6
2  5  3
1  2  6
Compute the projection π_{C,B}(R), and
identify one of its tuples from the list below.
A. (2,3)
B. (4,2,3)
C. (6,4)
D. (6,5)
E. None of the above
Clicker Projection Example
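Suppose relation R(A,B,C) has the tuples:
A  B  C
1  2  3
4  2  3
4  5  6
2  5  3
1  2  6
Compute the projection π_{C,B}(R), and
identify one of its tuples from the list below.
A. (2,3) Wrong order: the result lists C before B
B. (4,2,3) Three attributes, but the result has only two
C. (6,4) 4 is an A value; no tuple has C = 6 and B = 4
D. (6,5) Correct: comes from the tuple (4,5,6)
E. None of the above Wrong: D is in the result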
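The clicker question can be checked mechanically. The sketch below assumes R is a set of (A, B, C) tuples and computes π_{C,B}(R) by picking column C first and column B second; the variable names are illustrative only.

```python
# R(A, B, C) from the clicker question, as a set of tuples.
R = {(1, 2, 3), (4, 2, 3), (4, 5, 6), (2, 5, 3), (1, 2, 6)}

# pi_{C,B}(R): C (index 2) comes first, then B (index 1).
projection = {(c, b) for (_a, b, c) in R}

# Candidate answers from the slide, keyed by their letter.
candidates = {"A": (2, 3), "B": (4, 2, 3), "C": (6, 4), "D": (6, 5)}
in_result = {label: t in projection for label, t in candidates.items()}

print(sorted(projection))  # [(3, 2), (3, 5), (6, 2), (6, 5)]
print(in_result)           # {'A': False, 'B': False, 'C': False, 'D': True}
```

Five input tuples yield four output tuples, since (1,2,3) and (4,2,3) both project to (3,2).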
Selection and Projection Example
Find the ids of movies made prior to 1950
Movie:
MovieID  Title                                          Year
1        Star Wars                                      1977
2        Casablanca                                     1942
3        The Wizard of Oz                               1939
4        Indiana Jones and the Raiders of the Lost Ark  1981
Result:
MovieID
2
3
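One expression that answers this query is π_{MovieID}(σ_{Year < 1950}(Movie)). The sketch below composes two hypothetical helpers, `select` and `project`, over a set-of-tuples model of Movie, and also shows why the order of the two operators matters: once Year has been projected away, a selection on Year has nothing to test.

```python
# Movie(MovieID, Title, Year) as a set of tuples (sketch assumption).
MOVIE = {
    (1, "Star Wars", 1977),
    (2, "Casablanca", 1942),
    (3, "The Wizard of Oz", 1939),
    (4, "Indiana Jones and the Raiders of the Lost Ark", 1981),
}


def select(relation, predicate):
    """sigma_p(r): keep the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}


def project(relation, positions):
    """pi over the listed column positions, with duplicates removed."""
    return {tuple(t[i] for i in positions) for t in relation}


# Select on Year (index 2) first, then keep only MovieID (index 0).
old_movie_ids = project(select(MOVIE, lambda t: t[2] < 1950), [0])
print(sorted(old_movie_ids))  # [(2,), (3,)]

# Reversed order: after projecting to MovieID the Year column is gone,
# so the predicate fails on every one-column tuple.
try:
    select(project(MOVIE, [0]), lambda t: t[2] < 1950)
    reversed_order_works = True
except IndexError:
    reversed_order_works = False
print(reversed_order_works)  # False
```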
Selection and Projection Example
Would this work?
A) Yes
B) No
MovieID  Title                                          Year
1        Star Wars                                      1977
2        Casablanca                                     1942
3        The Wizard of Oz                               1939
4        Indiana Jones and the Raiders of the Lost Ark  1981
Union, Intersection, Set-Difference
Notation: r ∪ s, r ∩ s, r - s
Defined as:
r ∪ s = { t | t ∈ r or t ∈ s }
r ∩ s = { t | t ∈ r and t ∈ s }
r - s = { t | t ∈ r and t ∉ s }
For these operations to be well-defined:
1. r, s must have the same arity (same number of
attributes)
2. The attribute domains must be compatible
(e.g., 2nd column of r has same domain of values
as the 2nd column of s)
What is the schema of the result?
Union, Intersection, and
Set Difference Examples
MovieStar:
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female
Singer:
StarID  SName         Gender
3       Judy Garland  Female
4       Sam Smith     Non-binary
Name
Sam Smith
Set Operator Example
MovieStar:
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female
Singer:
StarID  Name          Gender
3       Judy Garland  Female
4       Sam Smith     Non-binary
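The three set operators map directly onto Python's set operations. The sketch below is illustrative only: it assumes relations are sets of tuples and approximates "compatible domains" by comparing the Python types in each column, which is a simplification introduced here. The helper names are hypothetical.

```python
MOVIE_STAR = {
    (1, "Harrison Ford", "Male"),
    (2, "Ingrid Bergman", "Female"),
    (3, "Judy Garland", "Female"),
}
SINGER = {
    (3, "Judy Garland", "Female"),
    (4, "Sam Smith", "Non-binary"),
}


def check_union_compatible(r, s):
    """Require the same arity and the same column types in every tuple.

    Python types stand in for attribute domains (an assumption of this sketch).
    """
    signatures = {tuple(type(v) for v in t) for t in r | s}
    if len(signatures) > 1:
        raise ValueError("relations are not union-compatible")


def union(r, s):
    check_union_compatible(r, s)
    return r | s


def intersection(r, s):
    check_union_compatible(r, s)
    return r & s


def difference(r, s):
    check_union_compatible(r, s)
    return r - s


print(len(union(MOVIE_STAR, SINGER)))    # 4
print(intersection(MOVIE_STAR, SINGER))  # {(3, 'Judy Garland', 'Female')}
print(difference(SINGER, MOVIE_STAR))    # {(4, 'Sam Smith', 'Non-binary')}
```

Judy Garland appears in both relations but only once in the union, because the result is a set.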
Cartesian (or Cross)-Product
Notation: r × s
Defined as:
r × s = { tq | t ∈ r and q ∈ s }
It is possible for r and s to have attributes
with the same name, which creates a
naming conflict.
In this case, the attributes are referred to
solely by position.
Cartesian Product Example
MovieStar:
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female
StarsIn:
MovieID  StarID  Character
1        1       Han Solo
4        1       Indiana Jones
2        2       Ilsa Lund
3        3       Dorothy Gale
MovieStar × StarsIn:
1  Name            Gender  MovieID  5  Character
1  Harrison Ford   Male    1        1  Han Solo
2  Ingrid Bergman  Female  1        1  Han Solo
3  Judy Garland    Female  1        1  Han Solo
1  Harrison Ford   Male    4        1  Indiana Jones
2  Ingrid Bergman  Female  4        1  Indiana Jones
3  Judy Garland    Female  4        1  Indiana Jones
…  …               …       …        …  …
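The cross product and its positional naming of conflicting attributes can be sketched as follows. This assumes relations are sets of tuples with schemas kept as separate lists of attribute names; `cross_product` and `product_schema` are hypothetical helper names.

```python
from itertools import product

MOVIE_STAR_ATTRS = ["StarID", "Name", "Gender"]
MOVIE_STAR = {
    (1, "Harrison Ford", "Male"),
    (2, "Ingrid Bergman", "Female"),
    (3, "Judy Garland", "Female"),
}
STARS_IN_ATTRS = ["MovieID", "StarID", "Character"]
STARS_IN = {
    (1, 1, "Han Solo"),
    (4, 1, "Indiana Jones"),
    (2, 2, "Ilsa Lund"),
    (3, 3, "Dorothy Gale"),
}


def cross_product(r, s):
    """r x s: every tuple of r concatenated with every tuple of s."""
    return {t + q for t, q in product(r, s)}


def product_schema(left_attrs, right_attrs):
    """Keep unique names; replace clashing names by their 1-based position."""
    combined = left_attrs + right_attrs
    return [
        name if combined.count(name) == 1 else position
        for position, name in enumerate(combined, start=1)
    ]


rows = cross_product(MOVIE_STAR, STARS_IN)
schema = product_schema(MOVIE_STAR_ATTRS, STARS_IN_ATTRS)
print(len(rows))  # 12
print(schema)     # [1, 'Name', 'Gender', 'MovieID', 5, 'Character']
```

With 3 MovieStar tuples and 4 StarsIn tuples the result has 3 × 4 = 12 rows, and StarID, which occurs in both inputs, is referred to as column 1 and column 5.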
Rename (ρ (rho))
Allows us to name results of relational-algebra expressions.
Notation
ρ(X, E)
returns the expression E under the name X
Rename (ρ (rho))
We can rename the resulting relation and the attributes in
that relation
ρ(GenderlessStars(ID, Nom), π_{StarID, Name}(MovieStar))
MovieStar:
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female
π_{StarID, Name}(MovieStar):
StarID  Name
1       Harrison Ford
2       Ingrid Bergman
3       Judy Garland
GenderlessStars:
ID  Nom
1   Harrison Ford
2   Ingrid Bergman
3   Judy Garland
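Rename changes only the names attached to a relation, never its tuples. The sketch below makes that explicit by carrying the relation name and attribute names alongside the rows; the `Relation` type and the helpers `project` and `rename` are hypothetical constructs for illustration.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """A named relation: a name, attribute names, and a set of rows."""
    name: str
    attrs: tuple
    rows: frozenset


def project(rel, attrs):
    """pi over attribute names; the frozenset removes duplicate rows."""
    positions = [rel.attrs.index(a) for a in attrs]
    rows = frozenset(tuple(row[i] for i in positions) for row in rel.rows)
    return Relation(rel.name, tuple(attrs), rows)


def rename(new_name, rel, new_attrs=None):
    """rho(X, E): same rows, new relation name and optionally new attributes."""
    attrs = rel.attrs if new_attrs is None else tuple(new_attrs)
    if len(attrs) != len(rel.attrs):
        raise ValueError("rename must keep the number of attributes")
    return Relation(new_name, attrs, rel.rows)


movie_star = Relation(
    "MovieStar",
    ("StarID", "Name", "Gender"),
    frozenset({
        (1, "Harrison Ford", "Male"),
        (2, "Ingrid Bergman", "Female"),
        (3, "Judy Garland", "Female"),
    }),
)

# rho(GenderlessStars(ID, Nom), pi_{StarID, Name}(MovieStar))
genderless = rename(
    "GenderlessStars", project(movie_star, ["StarID", "Name"]), ["ID", "Nom"]
)
print(genderless.name, genderless.attrs)  # GenderlessStars ('ID', 'Nom')
```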
ρ Example
MovieStar:
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female
StarsIn:
MovieID  StarID  Character
1        1       Han Solo
4        1       Indiana Jones
2        2       Ilsa Lund
3        3       Dorothy Gale
Joins (⋈)
Condition Join:
R ⋈_c S = σ_c(R × S)
Result schema same as cross-product.
Usually fewer tuples than the cross-product, so we
might be able to compute it more efficiently
Sometimes called a theta-join.
The reference to an attribute of a relation R
can be by position (R.i) or by name
(R.name)
Condition Join Example
MovieStar:
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female
StarsIn:
MovieID  StarID  Character
1        1       Han Solo
4        1       Indiana Jones
2        2       Ilsa Lund
3        3       Dorothy Gale
MovieStar ⋈_{MovieStar.StarID < StarsIn.StarID} StarsIn
MovieStar × StarsIn (first get the cross product)
1  Name            Gender  MovieID  5  Character
1  Harrison Ford   Male    1        1  Han Solo
2  Ingrid Bergman  Female  1        1  Han Solo
3  Judy Garland    Female  1        1  Han Solo
1  Harrison Ford   Male    4        1  Indiana Jones
2  Ingrid Bergman  Female  4        1  Indiana Jones
3  Judy Garland    Female  4        1  Indiana Jones
1  Harrison Ford   Male    2        2  Ilsa Lund
2  Ingrid Bergman  Female  2        2  Ilsa Lund
3  Judy Garland    Female  2        2  Ilsa Lund
1  Harrison Ford   Male    3        3  Dorothy Gale
2  Ingrid Bergman  Female  3        3  Dorothy Gale
3  Judy Garland    Female  3        3  Dorothy Gale
MovieStar ⋈_{MovieStar.StarID < StarsIn.StarID} StarsIn
Now remove rows based on the condition stated above.
Only the rows where column 1 (MovieStar.StarID) is less
than column 5 (StarsIn.StarID) remain:
1  Name            Gender  MovieID  5  Character
1  Harrison Ford   Male    2        2  Ilsa Lund
1  Harrison Ford   Male    3        3  Dorothy Gale
2  Ingrid Bergman  Female  3        3  Dorothy Gale
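The two steps of the condition join (cross product, then selection) can be sketched directly from the definition R ⋈_c S = σ_c(R × S). This assumes relations are sets of tuples with attributes addressed by position; `theta_join` is a hypothetical helper name.

```python
from itertools import product

# MovieStar(StarID, Name, Gender)
MOVIE_STAR = {
    (1, "Harrison Ford", "Male"),
    (2, "Ingrid Bergman", "Female"),
    (3, "Judy Garland", "Female"),
}
# StarsIn(MovieID, StarID, Character)
STARS_IN = {
    (1, 1, "Han Solo"),
    (4, 1, "Indiana Jones"),
    (2, 2, "Ilsa Lund"),
    (3, 3, "Dorothy Gale"),
}


def theta_join(r, s, condition):
    """Build the cross product of r and s, then keep rows meeting the condition."""
    cross = {t + q for t, q in product(r, s)}
    return {row for row in cross if condition(row)}


# MovieStar.StarID is column 1 (index 0); StarsIn.StarID is column 5 (index 4).
joined = theta_join(MOVIE_STAR, STARS_IN, lambda row: row[0] < row[4])
for row in sorted(joined):
    print(row)
# (1, 'Harrison Ford', 'Male', 2, 2, 'Ilsa Lund')
# (1, 'Harrison Ford', 'Male', 3, 3, 'Dorothy Gale')
# (2, 'Ingrid Bergman', 'Female', 3, 3, 'Dorothy Gale')
```

Materializing all 12 cross-product rows before filtering mirrors the definition; a real system would typically avoid building the full product, which is the efficiency point made for joins.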
Condition Join Clicker Example
Compute R ⋈_{R.A < S.C and R.B < S.D} S where:
R(A,B):
A  B
1  2
3  4
5  6
S(B,C,D):
B  C  D
2  4  6
4  6  8
4  7  9
Assume the schema of the result is (A, R.B, S.B, C, D).
Which tuple is in the result?
A. (1,2,2,6,8)
B. (1,2,4,4,6)
C. (5,6,2,4,6)
D. All are valid
E. None are valid
Condition Join Clicker Example
Compute R ⋈_{R.A < S.C and R.B < S.D} S where:
R(A,B):
A  B
1  2
3  4
5  6
S(B,C,D):
B  C  D
2  4  6
4  6  8
4  7  9
Assume the schema of the result is (A, R.B, S.B, C, D).
Which tuple is in the result?
A. (1,2,2,6,8) (2,6,8) would have to be in S
B. (1,2,4,4,6) (4,4,6) would have to be in S
C. (5,6,2,4,6) Violates R.A < S.C and R.B < S.D (5 > 4, and 6 = 6)
D. All are valid
E. None are valid Correct
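The clicker answer can be verified by computing the whole join. The sketch assumes R and S are sets of tuples and that result tuples follow the stated schema (A, R.B, S.B, C, D); the variable names are illustrative.

```python
# R(A, B) and S(B, C, D) from the clicker question.
R = {(1, 2), (3, 4), (5, 6)}
S = {(2, 4, 6), (4, 6, 8), (4, 7, 9)}

# Condition: R.A < S.C and R.B < S.D.
# In r = (A, B): A is r[0], B is r[1]. In s = (B, C, D): C is s[1], D is s[2].
result = {r + s for r in R for s in S if r[0] < s[1] and r[1] < s[2]}

candidates = {
    "A": (1, 2, 2, 6, 8),
    "B": (1, 2, 4, 4, 6),
    "C": (5, 6, 2, 4, 6),
}
in_result = {label: t in result for label, t in candidates.items()}

print(len(result))  # 8
print(in_result)    # {'A': False, 'B': False, 'C': False}
```

Eight of the nine possible pairings satisfy the condition; the only one rejected is (5,6) with (2,4,6), which is exactly candidate C.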
One more thing you may find
helpful: Assignment Operation
Notation: t ← E
assigns the result of expression E to a temporary
relation t.
Used to break complex queries to small steps.
Assignment is always made to a temporary
relation variable.
Example: Write r ∩ s in terms of ∪ and/or -
temp1 ← r - s
result ← r - temp1
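The two assignment steps translate line by line into Python variable assignments. This is a sketch assuming relations are sets of tuples; the sample data reuses the MovieStar and Singer instances from the set-operator slides as r and s.

```python
# r and s are union-compatible relations (here: MovieStar and Singer).
r = {
    (1, "Harrison Ford", "Male"),
    (2, "Ingrid Bergman", "Female"),
    (3, "Judy Garland", "Female"),
}
s = {
    (3, "Judy Garland", "Female"),
    (4, "Sam Smith", "Non-binary"),
}

# temp1 <- r - s : tuples of r that are not in s
temp1 = r - s
# result <- r - temp1 : removing those from r leaves the tuples also in s
result = r - temp1

print(result)           # {(3, 'Judy Garland', 'Female')}
print(result == r & s)  # True
```

The identity r ∩ s = r - (r - s) is why intersection is listed among the inessential operators: it can be built from set-difference alone.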