5 Relational Algebra 1

The document outlines the administrative details and learning goals for CPSC 304, focusing on relational algebra and its operators for database systems. Key topics include selection, projection, and set operations, along with examples using a movies database. Important dates include Milestone 1 due on February 10, 2025, and Midterm 1 on February 24, 2025.

Uploaded by

Abhi Verma
CPSC 304 – Feb 10, 2025

Administrative notes
Tutorial 3 Normalization

Milestone 1 is due on Feb 10th

Midterm 1 is on Feb 24th at 7 pm

CPSC 304
Introduction to Database Systems
Formal Relational Languages

Textbook Reference
Database Management Systems: 4 - 4.2
(skip the calculi)
Learning Goals
Identify the basic operators in Relational
Algebra (RA).
Use RA to create queries that include
combining RA operators.
Given an RA query and table schemas
and instances, compute the result of the
query.

Databases: the continuing saga
When last we left databases…
We learned that they’re excellent things
We learned how to conceptually model them
using ER diagrams
We learned how to logically model them using
relational schemas
We knew how to normalize our database
relations
We’re almost ready to use SQL to query it,
but first…
Balance, Daniel-san, is key
The mathematical foundations:
Relational Algebra
Clear way of describing core concepts
partially procedural: describe what you want
and how you want it, but the order of
operations matters

Relational Query Languages
Allow data manipulation and retrieval from a DB
Relational model supports simple, powerful QLs:
Strong formal foundation based on logic
Allows for much optimization via query optimizer
Query Languages != Programming Languages
QLs not intended for complex calculations
QLs provide easy access to large datasets
Users do not need to know how to navigate through
complicated data structures

Relational Algebra (RA)
All in one place
Basic operations:
Selection (σ): Selects a subset of rows from relation.
Projection (π): Deletes unwanted columns from relation.
Cross-product (×): Allows us to combine two relations.
Set-difference (–): Tuples in relation 1, but not in relation 2.
Union (∪): Tuples in relation 1 or in relation 2 (or both).
Rename (ρ): Assigns a (another) name to a relation.
Additional, inessential but useful operations:
Intersection (∩), join (⋈), division (/), assignment (←)
All operators take one or two relations as inputs and give a
new relation as a result.
For the purposes of relational algebra, relations are sets.
Operations can be composed. (Algebra is “closed”)
Example Movies Database
Movie(MovieID, Title, Year)

StarsIn(MovieID, StarID, Character)

MovieStar(StarID, Name, Gender)

Example Instances
Movie:
MovieID  Title                                          Year
1        Star Wars                                      1977
2        Casablanca                                     1942
3        The Wizard of Oz                               1939
4        Indiana Jones and the Raiders of the Lost Ark  1981

StarsIn:
MovieID  StarID  Character
1        1       Han Solo
4        1       Indiana Jones
2        2       Ilsa Lund
3        3       Dorothy Gale

MovieStar:
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female

Selection (σ (sigma))
Notation: σ_p(r)
p is called the selection predicate
Defined as:
σ_p(r) = {t | t ∈ r and p(t)}
(the set of tuples of r satisfying p)
Where p is a formula in propositional calculus
consisting of:
connectives: ∧ (and), ∨ (or), ¬ (not)
and
predicates:
<attribute> op <attribute> or
<attribute> op <constant>
where op is one of: =, ≠, >, ≥, <, ≤

Selection Example
Movie:
MovieID  Title                                          Year
1        Star Wars                                      1977
2        Casablanca                                     1942
3        The Wizard of Oz                               1939
4        Indiana Jones and the Raiders of the Lost Ark  1981

σ_{Year > 1945}(Movie)

MovieID  Title                                          Year
1        Star Wars                                      1977
4        Indiana Jones and the Raiders of the Lost Ark  1981

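Selection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a relation is modelled as a set of tuples with a separate schema tuple, and the helper name `select` is hypothetical rather than part of the course material.

```python
# Assumed representation: a relation is a set of tuples plus a schema tuple.
MOVIE_SCHEMA = ("MovieID", "Title", "Year")
Movie = {
    (1, "Star Wars", 1977),
    (2, "Casablanca", 1942),
    (3, "The Wizard of Oz", 1939),
    (4, "Indiana Jones and the Raiders of the Lost Ark", 1981),
}

def select(relation, schema, predicate):
    """sigma_p(r): keep the tuples t of r for which p(t) holds."""
    # Each tuple is exposed to the predicate as an attribute -> value dict.
    return {t for t in relation if predicate(dict(zip(schema, t)))}

# sigma_{Year > 1945}(Movie)
recent = select(Movie, MOVIE_SCHEMA, lambda t: t["Year"] > 1945)
print(sorted(recent))
```

The result keeps the full schema of Movie; only the set of rows shrinks.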
Selection Example
Find all male stars from the MovieStar table.

σ_{Gender = ‘Male’}(MovieStar)

StarID  Name           Gender
1       Harrison Ford  Male

Projection (π (pi))
Notation:
π_{A1, A2, …, Ak}(r)
where A1, …, Ak are attributes (the
projection list) and r is a relation.
The result: a relation of the k attributes
A1, A2, …, Ak obtained from r by erasing
the columns that are not listed
Duplicate rows removed from result
(relations are sets)
Projection Examples
Movie:
MovieID  Title                                          Year
1        Star Wars                                      1977
2        Casablanca                                     1942
3        The Wizard of Oz                               1939
4        Indiana Jones and the Raiders of the Lost Ark  1981

π_{Title, Year}(Movie)
Title                                          Year
Star Wars                                      1977
Casablanca                                     1942
The Wizard of Oz                               1939
Indiana Jones and the Raiders of the Lost Ark  1981

π_{Year}(Movie)
Year
1977
1939
1942
1981

What is π_{Title, Year}(σ_{Year > 1945}(Movie))?
Title                                          Year
Star Wars                                      1977
Indiana Jones and the Raiders of the Lost Ark  1981
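The projection examples above can be reproduced with a short Python sketch. It assumes the same set-of-tuples representation of a relation; the helper name `project` is hypothetical.

```python
# Assumed representation: a relation is a set of tuples plus a schema tuple.
MOVIE_SCHEMA = ("MovieID", "Title", "Year")
Movie = {
    (1, "Star Wars", 1977),
    (2, "Casablanca", 1942),
    (3, "The Wizard of Oz", 1939),
    (4, "Indiana Jones and the Raiders of the Lost Ark", 1981),
}

def project(relation, schema, attrs):
    """pi_attrs(r): keep only the listed columns, in the listed order."""
    idx = [schema.index(a) for a in attrs]
    # Building a set removes duplicate rows automatically.
    return {tuple(t[i] for i in idx) for t in relation}

# pi_{Year}(Movie)
years = project(Movie, MOVIE_SCHEMA, ["Year"])

# pi_{Title, Year}(sigma_{Year > 1945}(Movie)): select first, then project.
recent = {t for t in Movie if t[2] > 1945}
recent_titles = project(recent, MOVIE_SCHEMA, ["Title", "Year"])
print(sorted(recent_titles))
```

Because the result is a Python set, the order in which rows are printed carries no meaning, which matches the set semantics of relations.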
Projection Example #2
Find the IDs of actors who have starred in
movies

π_{StarID}(StarsIn)

StarID
1
2
3

Clicker Projection Example
Suppose relation R(A,B,C) has the tuples:
A  B  C
1  2  3
4  2  3
4  5  6
2  5  3
1  2  6
Compute the projection π_{C,B}(R), and
identify one of its tuples from the list below.
A. (2,3)
B. (4,2,3)
C. (6,4)
D. (6,5)
E. None of the above
Clicker Projection Example
Suppose relation R(A,B,C) has the tuples:
A  B  C
1  2  3
4  2  3
4  5  6
2  5  3
1  2  6
Compute the projection π_{C,B}(R), and
identify one of its tuples from the list below.
A. (2,3)    Wrong order
B. (4,2,3)  Not projected
C. (6,4)    Wrong attributes
D. (6,5)    Right
E. None of the above

π_{C,B}(R):
C  B
3  2
6  5
3  5
6  2
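The clicker answer can be checked mechanically. The following is a small sketch, assuming R is written as a Python list of (A, B, C) tuples so that the duplicate elimination done by projection is visible.

```python
# R(A, B, C) as a list, so the input size can be compared with the output size.
R = [(1, 2, 3), (4, 2, 3), (4, 5, 6), (2, 5, 3), (1, 2, 6)]

# pi_{C,B}(R): emit column C first, then column B; the set drops duplicates.
pi_CB = {(c, b) for (_a, b, c) in R}

print(len(R), len(pi_CB))
print((6, 5) in pi_CB, (2, 3) in pi_CB)
```

Five input tuples yield four output tuples because (3, 2) is produced twice; (6, 5) is in the result, while (2, 3) has the columns in the wrong order.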
Selection and Projection Example
Find the ids of movies made prior to 1950
Movie:
MovieID  Title                                          Year
1        Star Wars                                      1977
2        Casablanca                                     1942
3        The Wizard of Oz                               1939
4        Indiana Jones and the Raiders of the Lost Ark  1981

π_{MovieID}(σ_{Year < 1950}(Movie))

MovieID
2
3

Selection and Projection Example
Would this work?
A) Yes
B) No

Movie:
MovieID  Title                                          Year
1        Star Wars                                      1977
2        Casablanca                                     1942
3        The Wizard of Oz                               1939
4        Indiana Jones and the Raiders of the Lost Ark  1981

σ_{Year < 1950}(π_{MovieID}(Movie))

Intended result:
MovieID
2
3

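The order of selection and projection matters here, because a selection predicate can only mention attributes that still exist in its input. The sketch below illustrates this under the assumption that rows are modelled as Python dicts; the helper names `select` and `project` are hypothetical.

```python
# Assumed representation: a relation is a list of attribute -> value dicts.
Movie = [
    {"MovieID": 1, "Title": "Star Wars", "Year": 1977},
    {"MovieID": 2, "Title": "Casablanca", "Year": 1942},
    {"MovieID": 3, "Title": "The Wizard of Oz", "Year": 1939},
    {"MovieID": 4, "Title": "Indiana Jones and the Raiders of the Lost Ark", "Year": 1981},
]

def select(relation, predicate):
    return [t for t in relation if predicate(t)]

def project(relation, attrs):
    seen, out = set(), []
    for t in relation:
        key = tuple(t[a] for a in attrs)
        if key not in seen:  # relations are sets: drop duplicate rows
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out

# Select first, then project: Year is still available to the predicate.
ok = project(select(Movie, lambda t: t["Year"] < 1950), ["MovieID"])

# Project first, then select: Year has already been discarded.
try:
    select(project(Movie, ["MovieID"]), lambda t: t["Year"] < 1950)
    failed = False
except KeyError:
    failed = True

print(ok, failed)
```

In this sketch the second expression raises a `KeyError`, mirroring the fact that σ_{Year < 1950} is not well-defined on a relation whose only attribute is MovieID.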
Union, Intersection, Set-Difference
Notation: r ∪ s, r ∩ s, r – s
Defined as:
r ∪ s = {t | t ∈ r or t ∈ s}
r ∩ s = {t | t ∈ r and t ∈ s}
r – s = {t | t ∈ r and t ∉ s}
For these operations to be well-defined:
1. r, s must have the same arity (same number of
attributes)
2. The attribute domains must be compatible
(e.g., 2nd column of r has same domain of values
as the 2nd column of s)
What is the schema of the result?
Union, Intersection, and
Set Difference Examples
MovieStar:
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female

Singer:
StarID  SName         Gender
3       Judy Garland  Female
4       Sam Smith     Non-binary

MovieStar ∪ Singer
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female
4       Sam Smith       Non-binary

MovieStar ∩ Singer
StarID  Name          Gender
3       Judy Garland  Female

MovieStar – Singer
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
Set Operator Example
MovieStar:
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female

Singer:
StarID  Name          Gender
3       Judy Garland  Female
4       Sam Smith     Non-binary

Find the names of stars that are Singers but
not MovieStars

π_{Name}(Singer – MovieStar)

Name
Sam Smith

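Since relations are sets, Python's built-in set operators give a compact way to experiment with union, intersection, and set-difference. This is a minimal sketch, assuming tuples of the form (StarID, Name, Gender); the helper `same_arity` is hypothetical and checks only the arity condition, not domain compatibility.

```python
MovieStar = {
    (1, "Harrison Ford", "Male"),
    (2, "Ingrid Bergman", "Female"),
    (3, "Judy Garland", "Female"),
}
Singer = {
    (3, "Judy Garland", "Female"),
    (4, "Sam Smith", "Non-binary"),
}

def same_arity(r, s):
    """Check that every tuple in r and s has the same number of attributes."""
    return len({len(t) for t in r | s}) <= 1

assert same_arity(MovieStar, Singer)

union = MovieStar | Singer         # tuples in either relation
intersection = MovieStar & Singer  # tuples in both relations
difference = MovieStar - Singer    # tuples in MovieStar but not in Singer

# pi_{Name}(Singer - MovieStar)
singer_only_names = {name for (_id, name, _gender) in Singer - MovieStar}
print(singer_only_names)
```

Set-difference is not symmetric: `MovieStar - Singer` and `Singer - MovieStar` are different relations, which is why the query for singers who are not movie stars puts Singer on the left.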
Cartesian (or Cross)-Product
Notation: r × s
Defined as:
r × s = {t q | t ∈ r and q ∈ s}
It is possible for r and s to have attributes
with the same name, which creates a
naming conflict.
In this case, the attributes are referred to
solely by position.

Cartesian Product Example
MovieStar StarsIn
MovieStar:
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female

StarsIn:
MovieID  StarID  Character
1        1       Han Solo
4        1       Indiana Jones
2        2       Ilsa Lund
3        3       Dorothy Gale

MovieStar × StarsIn
(the two conflicting StarID columns are referred to by position: 1 and 5)
1  Name            Gender  MovieID  5  Character
1  Harrison Ford   Male    1        1  Han Solo
2  Ingrid Bergman  Female  1        1  Han Solo
3  Judy Garland    Female  1        1  Han Solo
1  Harrison Ford   Male    4        1  Indiana Jones
2  Ingrid Bergman  Female  4        1  Indiana Jones
3  Judy Garland    Female  4        1  Indiana Jones
…  …               …       …        …  …
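The cross-product can be sketched with `itertools.product` from the Python standard library. The set-of-tuples representation is an assumption made for illustration.

```python
from itertools import product

MovieStar = {
    (1, "Harrison Ford", "Male"),
    (2, "Ingrid Bergman", "Female"),
    (3, "Judy Garland", "Female"),
}
StarsIn = {
    (1, 1, "Han Solo"),
    (4, 1, "Indiana Jones"),
    (2, 2, "Ilsa Lund"),
    (3, 3, "Dorothy Gale"),
}

# r x s: concatenate every tuple t of r with every tuple q of s.
cross = {t + q for t, q in product(MovieStar, StarsIn)}

print(len(cross))                 # 3 * 4 tuples
print({len(t) for t in cross})    # every result tuple has 3 + 3 attributes
```

The result has |r| · |s| tuples, which is why cross-products grow quickly and are usually followed by a selection.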
Rename (ρ (rho))
Allows us to name results of relational-algebra expressions.
Notation
ρ(X, E)
returns the expression E under the name X

We can rename part of an expression, e.g.,

ρ((StarID→ID), π_{StarID,Name}(MovieStar))

We can also refer to positions of attributes, e.g.,

ρ((1→ID), π_{StarID,Name}(MovieStar))
is the same as above

Rename (ρ (rho))
We can rename the resulting relation and the attributes in
that relation
ρ(GenderlessStars(ID, Nom), π_{StarID,Name}(MovieStar))

MovieStar:
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female

π_{StarID,Name}(MovieStar)
StarID  Name
1       Harrison Ford
2       Ingrid Bergman
3       Judy Garland

GenderlessStars:
ID  Nom
1   Harrison Ford
2   Ingrid Bergman
3   Judy Garland
ρ Example
MovieStar:
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female

StarsIn:
MovieID  StarID  Character
1        1       Han Solo
4        1       Indiana Jones
2        2       Ilsa Lund
3        3       Dorothy Gale

ρ((1→StarID1, 5→StarID2), MovieStar × StarsIn)

StarID1  Name            Gender  MovieID  StarID2  Character
1        Harrison Ford   Male    1        1        Han Solo
2        Ingrid Bergman  Female  1        1        Han Solo
3        Judy Garland    Female  1        1        Han Solo
1        Harrison Ford   Male    4        1        Indiana Jones
2        Ingrid Bergman  Female  4        1        Indiana Jones
3        Judy Garland    Female  4        1        Indiana Jones
…        …               …       …        …        …
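Renaming only changes the schema, never the tuples. The sketch below illustrates this with a hypothetical `rename` helper that accepts either 1-based positions or attribute names as keys, as in the two notations shown on the rename slides.

```python
def rename(schema, mapping):
    """Return a new schema; mapping keys are 1-based positions or attribute names."""
    new_schema = list(schema)
    for key, new_name in mapping.items():
        # Positions are 1-based in the slides; names are looked up in the schema.
        pos = key - 1 if isinstance(key, int) else new_schema.index(key)
        new_schema[pos] = new_name
    return tuple(new_schema)

# Schema of MovieStar x StarsIn: StarID appears at positions 1 and 5.
cross_schema = ("StarID", "Name", "Gender", "MovieID", "StarID", "Character")
renamed = rename(cross_schema, {1: "StarID1", 5: "StarID2"})
print(renamed)

# Renaming by name works when the name is unambiguous.
print(rename(("StarID", "Name"), {"StarID": "ID"}))
```

In this sketch a lookup by name finds only the first matching column, so the duplicated StarID in the cross-product has to be addressed by position.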
Additional Operations
They can be defined in terms of the
primitive operations
They are added for convenience
They are:
Join (Condition, Equi-, Natural) (⋈)
Division (/)
Assignment (←)

Joins (⋈)

Condition Join:
R ⋈_c S = σ_c(R × S)
Result schema same as cross-product.
At most as many tuples as the cross-product (typically far fewer),
so it might be possible to compute it more efficiently
Sometimes called a theta-join.
The reference to an attribute of a relation R
can be by position (R.i) or by name
(R.name)
Condition Join Example
MovieStar StarsIn
MovieStar:
StarID  Name            Gender
1       Harrison Ford   Male
2       Ingrid Bergman  Female
3       Judy Garland    Female

StarsIn:
MovieID  StarID  Character
1        1       Han Solo
4        1       Indiana Jones
2        2       Ilsa Lund
3        3       Dorothy Gale

MovieStar ⋈_{MovieStar.StarID < StarsIn.StarID} StarsIn

1  Name            Gender  MovieID  5  Character
1  Harrison Ford   Male    2        2  Ilsa Lund
1  Harrison Ford   Male    3        3  Dorothy Gale
2  Ingrid Bergman  Female  3        3  Dorothy Gale

MovieStar ⋈_{MovieStar.StarID < StarsIn.StarID} StarsIn
MovieStar × StarsIn (first get the cross product)
1  Name            Gender  MovieID  5  Character
1  Harrison Ford   Male    1        1  Han Solo
2  Ingrid Bergman  Female  1        1  Han Solo
3  Judy Garland    Female  1        1  Han Solo
1  Harrison Ford   Male    4        1  Indiana Jones
2  Ingrid Bergman  Female  4        1  Indiana Jones
3  Judy Garland    Female  4        1  Indiana Jones
1  Harrison Ford   Male    2        2  Ilsa Lund
2  Ingrid Bergman  Female  2        2  Ilsa Lund
3  Judy Garland    Female  2        2  Ilsa Lund
1  Harrison Ford   Male    3        3  Dorothy Gale
2  Ingrid Bergman  Female  3        3  Dorothy Gale
3  Judy Garland    Female  3        3  Dorothy Gale
MovieStar ⋈_{MovieStar.StarID < StarsIn.StarID} StarsIn
Now remove rows based on the condition stated above
(a row is kept only if column 1 < column 5).
1  Name            Gender  MovieID  5  Character
1  Harrison Ford   Male    1        1  Han Solo       removed
2  Ingrid Bergman  Female  1        1  Han Solo       removed
3  Judy Garland    Female  1        1  Han Solo       removed
1  Harrison Ford   Male    4        1  Indiana Jones  removed
2  Ingrid Bergman  Female  4        1  Indiana Jones  removed
3  Judy Garland    Female  4        1  Indiana Jones  removed
1  Harrison Ford   Male    2        2  Ilsa Lund      kept
2  Ingrid Bergman  Female  2        2  Ilsa Lund      removed
3  Judy Garland    Female  2        2  Ilsa Lund      removed
1  Harrison Ford   Male    3        3  Dorothy Gale   kept
2  Ingrid Bergman  Female  3        3  Dorothy Gale   kept
3  Judy Garland    Female  3        3  Dorothy Gale   removed
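The two steps above (cross product, then filter) translate directly into a short Python sketch of the definition R ⋈_c S = σ_c(R × S). The helper name `theta_join` and the positional access to attributes are assumptions for illustration; a real system would typically avoid materializing the full cross product.

```python
from itertools import product

MovieStar = {
    (1, "Harrison Ford", "Male"),
    (2, "Ingrid Bergman", "Female"),
    (3, "Judy Garland", "Female"),
}
StarsIn = {
    (1, 1, "Han Solo"),
    (4, 1, "Indiana Jones"),
    (2, 2, "Ilsa Lund"),
    (3, 3, "Dorothy Gale"),
}

def theta_join(r, s, condition):
    """R join_c S = sigma_c(R x S): pair up all tuples, keep those satisfying c."""
    return {t + q for t, q in product(r, s) if condition(t, q)}

# MovieStar.StarID is the first field of t; StarsIn.StarID is the second field of q.
result = theta_join(MovieStar, StarsIn, lambda t, q: t[0] < q[1])
for row in sorted(result):
    print(row)
```

Only 3 of the 12 cross-product tuples survive the condition, matching the three rows kept above.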
Condition Join Clicker Example
Compute R ⋈_{R.A < S.C and R.B < S.D} S where:

R(A,B):
A  B
1  2
3  4
5  6

S(B,C,D):
B  C  D
2  4  6
4  6  8
4  7  9

Assume the schema of the result is (A, R.B, S.B, C, D).
Which tuple is in the result?
A. (1,2,2,6,8)
B. (1,2,4,4,6)
C. (5,6,2,4,6)
D. All are valid
E. None are valid
Condition Join Clicker Example
Compute R ⋈_{R.A < S.C and R.B < S.D} S where:

R(A,B):
A  B
1  2
3  4
5  6

S(B,C,D):
B  C  D
2  4  6
4  6  8
4  7  9

Assume the schema of the result is (A, R.B, S.B, C, D).
Which tuple is in the result?
A. (1,2,2,6,8)  (2,6,8) would have to be in S
B. (1,2,4,4,6)  (4,4,6) would have to be in S
C. (5,6,2,4,6)  Violates R.A < S.C and R.B < S.D (5 > 4, and 6 = 6)
D. All are valid
E. None are valid  Correct
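The clicker answer can be verified by computing the whole join. This is a brief sketch, assuming R and S are written as Python sets of tuples and the result tuples follow the stated schema (A, R.B, S.B, C, D).

```python
R = {(1, 2), (3, 4), (5, 6)}           # R(A, B)
S = {(2, 4, 6), (4, 6, 8), (4, 7, 9)}  # S(B, C, D)

# Condition join: R.A < S.C and R.B < S.D.
result = {
    (a, rb, sb, c, d)
    for (a, rb) in R
    for (sb, c, d) in S
    if a < c and rb < d
}

candidates = [(1, 2, 2, 6, 8), (1, 2, 4, 4, 6), (5, 6, 2, 4, 6)]
print(len(result))
print([cand in result for cand in candidates])
```

Eight of the nine possible pairs satisfy the condition; the only pair rejected is (5,6) with (2,4,6), which is option C. Options A and B never appear because their S parts are not tuples of S.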
One more thing you may find
helpful: Assignment Operation
Notation: t ← E
assigns the result of expression E to a temporary
relation t.
Used to break complex queries into small steps.
Assignment is always made to a temporary
relation variable.
Example: Write r ∩ s in terms of ∪ and/or –
temp1 ← r – s
result ← r – temp1
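The two assignment steps can be checked with Python sets. The relations below are illustrative data chosen for this sketch, not taken from the slide.

```python
# Illustrative relations (assumed data) with tuples of the form (StarID, Name).
r = {(1, "Harrison Ford"), (2, "Ingrid Bergman"), (3, "Judy Garland")}
s = {(3, "Judy Garland"), (4, "Sam Smith")}

temp1 = r - s        # temp1 <- r - s: tuples that are only in r
result = r - temp1   # result <- r - temp1: tuples of r that are also in s

print(result == r & s)
```

Removing from r everything that is not in s leaves exactly the tuples common to both, so intersection needs no primitive of its own.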