chapter_03C Rel Algebra and SQL
Relational Query Languages
• Languages for describing queries on a
relational database
• Structured Query Language (SQL)
– Predominant application-level query language
– Declarative
• Relational Algebra
– Intermediate language used within DBMS
– Procedural
What is an Algebra?
• A language based on operators and a domain of values
• Operators map values taken from the domain into
other domain values
• Hence, an expression involving operators and
arguments produces a value in the domain
• When the domain is a set of all relations (and the
operators are as described later), we get the relational
algebra
• We refer to the expression as a query and the value
produced as the query result
Relational Algebra
• Domain: set of relations
• Basic operators: select, project, union, set
difference, Cartesian product
• Derived operators: set intersection, division, join
• Procedural: Relational expression specifies query
by describing an algorithm (the sequence in which
operators are applied) for determining the result of
an expression
The Role of Relational Algebra in a DBMS
Select Operator
• Produce table containing subset of rows of
argument table satisfying condition
    σ_{condition}(relation)
• Example:
    σ_{Hobby=‘stamps’}(Person)

Person
Id    Name  Address    Hobby
1123  John  123 Main   stamps
1123  John  123 Main   coins
5556  Mary  7 Lake Dr  cycling
9876  Bart  5 Pine St  stamps

Result
Id    Name  Address    Hobby
1123  John  123 Main   stamps
9876  Bart  5 Pine St  stamps
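
The selection in this example can be sketched in Python. This is only an illustration, assuming a relation is modeled as a list of dicts; the helper name `select` is hypothetical and not part of any DBMS API.

```python
# Person relation as a list of rows (dicts); data taken from the example above.
person = [
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "stamps"},
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "coins"},
    {"Id": 5556, "Name": "Mary", "Address": "7 Lake Dr", "Hobby": "cycling"},
    {"Id": 9876, "Name": "Bart", "Address": "5 Pine St", "Hobby": "stamps"},
]

def select(relation, condition):
    """Hypothetical helper: keep the rows for which condition(row) is true."""
    return [row for row in relation if condition(row)]

# Selection with the condition Hobby = 'stamps'
stamp_collectors = select(person, lambda row: row["Hobby"] == "stamps")
for row in stamp_collectors:
    print(row["Id"], row["Name"], row["Address"], row["Hobby"])
```

The result keeps all four columns and only the two rows whose Hobby is stamps.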
Selection Condition
• Operators: <, ≤, ≥, >, =, ≠
• Simple selection condition:
– <attribute> operator <constant>
– <attribute> operator <attribute>
• <condition> AND <condition>
• <condition> OR <condition>
• NOT <condition>
Selection Condition - Examples
• σ_{NOT(Hobby=‘cycling’)}(Person)
• σ_{Hobby≠‘cycling’}(Person)
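
The two conditions above select the same rows. A minimal sketch, assuming the Person data from the Select Operator example and a hypothetical `select` helper over lists of dicts:

```python
# Person rows from the Select Operator example.
person = [
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "stamps"},
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "coins"},
    {"Id": 5556, "Name": "Mary", "Address": "7 Lake Dr", "Hobby": "cycling"},
    {"Id": 9876, "Name": "Bart", "Address": "5 Pine St", "Hobby": "stamps"},
]

def select(relation, condition):
    """Hypothetical helper: keep the rows for which condition(row) is true."""
    return [row for row in relation if condition(row)]

# NOT(Hobby = 'cycling')
not_cycling = select(person, lambda row: not (row["Hobby"] == "cycling"))
# Hobby != 'cycling'
neq_cycling = select(person, lambda row: row["Hobby"] != "cycling")

print(not_cycling == neq_cycling)
print(len(not_cycling))
```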
Project Operator
• Produces table containing subset of columns
of argument table
    π_{attribute list}(relation)
• Example:
    π_{Name,Hobby}(Person)
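
A sketch of projection, assuming the same list-of-dicts model of the Person relation; the helper `project` is hypothetical. Because a relation is a set of tuples, the sketch drops duplicate rows that appear once columns are removed.

```python
# Person rows from the Select Operator example.
person = [
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "stamps"},
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "coins"},
    {"Id": 5556, "Name": "Mary", "Address": "7 Lake Dr", "Hobby": "cycling"},
    {"Id": 9876, "Name": "Bart", "Address": "5 Pine St", "Hobby": "stamps"},
]

def project(relation, attributes):
    """Hypothetical helper: keep only the listed columns, dropping duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:          # set semantics: each tuple appears once
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

name_hobby = project(person, ["Name", "Hobby"])   # four distinct rows
names = project(person, ["Name"])                  # John appears only once
print(name_hobby)
print(names)
```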
Expressions
π_{Id, Name}(σ_{Hobby=‘cycling’ OR Hobby=‘coins’}(Person))
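
Since every operator returns a relation, operators can be nested. The expression above can be sketched as a projection applied to the output of a selection; `select` and `project` are hypothetical helpers over lists of dicts, and the data is the Person table from the Select Operator example.

```python
person = [
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "stamps"},
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "coins"},
    {"Id": 5556, "Name": "Mary", "Address": "7 Lake Dr", "Hobby": "cycling"},
    {"Id": 9876, "Name": "Bart", "Address": "5 Pine St", "Hobby": "stamps"},
]

def select(relation, condition):
    """Hypothetical helper: rows satisfying the condition."""
    return [row for row in relation if condition(row)]

def project(relation, attributes):
    """Hypothetical helper: listed columns only, duplicates removed."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

# Inner step: selection; outer step: projection on Id and Name.
inner = select(person, lambda r: r["Hobby"] == "cycling" or r["Hobby"] == "coins")
result = project(inner, ["Id", "Name"])
print(result)
```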
Set Operators
• Relation is a set of tuples, so set operations
should apply: ∪, ∩, − (set difference)
• Result of combining two relations with a set
operator is a relation => all its elements
must be tuples having same structure
• Hence, scope of set operations limited to
union compatible relations
Union Compatible Relations
• Two relations are union compatible if
– Both have same number of columns
– Names of attributes are the same in both
– Attributes with the same name in both relations
have the same domain
• Union compatible relations can be
combined using union, intersection, and set
difference
Example
Tables:
Person (SSN, Name, Address, Hobby)
Professor (Id, Name, Office, Phone)
are not union compatible.
But
π_{Name}(Person) and π_{Name}(Professor)
are union compatible so
π_{Name}(Person) − π_{Name}(Professor)
makes sense.
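
A sketch of the check and the set difference, assuming relations are modeled as Python sets of tuples with a separate attribute list. The helper `union_compatible` is hypothetical and compares only attribute names (domains are not modeled here); the Professor names are made-up sample data.

```python
person_attrs = ("SSN", "Name", "Address", "Hobby")
professor_attrs = ("Id", "Name", "Office", "Phone")

def union_compatible(attrs_r, attrs_s):
    """Hypothetical check: same number of columns and same attribute names.
    Domains are not checked in this sketch."""
    return len(attrs_r) == len(attrs_s) and set(attrs_r) == set(attrs_s)

print(union_compatible(person_attrs, professor_attrs))   # full tables
print(union_compatible(("Name",), ("Name",)))            # after projecting on Name

# Projections on Name as sets of 1-tuples (Professor names are assumed data).
person_names = {("John",), ("Mary",), ("Bart",)}
professor_names = {("Mary",), ("Smith",)}

# Names of persons who are not professors.
only_persons = person_names - professor_names
print(sorted(only_persons))
```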
Cartesian Product
• If R and S are two relations, R × S is the set of all
concatenated tuples <x,y>, where x is a tuple in R
and y is a tuple in S
– R and S need not be union compatible.
– But R and S must have distinct attribute names. Why?
• R × S is expensive to compute. But why?

R          S          R × S
A   B      C   D      A   B   C   D
x1  x2     y1  y2     x1  x2  y1  y2
x3  x4     y3  y4     x1  x2  y3  y4
                      x3  x4  y1  y2
                      x3  x4  y3  y4
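
A sketch of the product on the R and S tables above, again assuming lists of dicts; `cartesian_product` is a hypothetical helper. It hints at both questions: merging rows with a shared attribute name would silently lose a column, and the result has |R| · |S| rows.

```python
r = [{"A": "x1", "B": "x2"}, {"A": "x3", "B": "x4"}]
s = [{"C": "y1", "D": "y2"}, {"C": "y3", "D": "y4"}]

def cartesian_product(r, s):
    """Hypothetical helper: concatenate every row of r with every row of s."""
    if r and s and set(r[0]) & set(s[0]):
        # With a shared name, {**x, **y} would overwrite one of the two values.
        raise ValueError("attribute names must be distinct; rename first")
    return [{**x, **y} for x in r for y in s]

product = cartesian_product(r, s)
for row in product:
    print(row["A"], row["B"], row["C"], row["D"])

# The output size is the product of the input sizes.
print(len(product) == len(r) * len(s))
```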
Renaming
• Result of expression evaluation is a relation
• Attributes of relation must have distinct names.
This is not guaranteed with Cartesian product
• Renaming operator tidies this up. To assign the
names A1, A2, …, An to the attributes of the
n-column relation produced by expression expr, use
    expr [A1, A2, …, An]
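
A sketch of positional renaming, assuming rows are dicts whose key order is the column order (guaranteed for dicts in Python 3.7+). The helper `rename` and the two one-row relations are hypothetical illustration data.

```python
def rename(relation, new_names):
    """Hypothetical helper: assign new_names to the columns, by position."""
    renamed = []
    for row in relation:
        if len(row) != len(new_names):
            raise ValueError("need exactly one new name per column")
        renamed.append(dict(zip(new_names, row.values())))
    return renamed

# Two relations that share the attribute name CrsCode (assumed sample rows).
transcript_part = [{"StudId": 111, "CrsCode": "CSE305"}]
teaching_part = [{"ProfId": 7, "CrsCode": "CSE305"}]

left = rename(transcript_part, ["StudId", "CrsCode1"])
right = rename(teaching_part, ["ProfId", "CrsCode2"])

# After renaming, no attribute name is shared, so a Cartesian product is well defined.
print(set(left[0]) & set(right[0]))
```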
Example
Theta Join – Example
Employee(Name,Id,MngrId,Salary)
Manager(Name,Id,Salary)
Output the names of all employees that earn
more than their managers.
π_{Employee.Name}(Employee ⋈_{MngrId=Manager.Id AND Employee.Salary>Manager.Salary} Manager)
The join yields a table with attributes:
Employee.Name, Employee.Id, Employee.Salary, MngrId,
Manager.Name, Manager.Id, Manager.Salary
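
The theta join can be sketched as a filtered product. The helpers `qualify` and `theta_join` are hypothetical, and all employee and manager rows are made-up sample data. Every attribute is prefixed with its relation name here, including MngrId, to keep names distinct.

```python
# Assumed sample data for Employee(Name, Id, MngrId, Salary) and Manager(Name, Id, Salary).
employee = [
    {"Name": "Ann", "Id": 1, "MngrId": 10, "Salary": 90},
    {"Name": "Bob", "Id": 2, "MngrId": 10, "Salary": 60},
    {"Name": "Cal", "Id": 3, "MngrId": 11, "Salary": 70},
]
manager = [
    {"Name": "Dee", "Id": 10, "Salary": 80},
    {"Name": "Eve", "Id": 11, "Salary": 75},
]

def qualify(relation, prefix):
    """Hypothetical helper: prefix each attribute with the relation name."""
    return [{f"{prefix}.{k}": v for k, v in row.items()} for row in relation]

def theta_join(r, s, condition):
    """Hypothetical helper: pairs of rows from r and s that satisfy the condition."""
    return [{**x, **y} for x in r for y in s if condition({**x, **y})]

joined = theta_join(
    qualify(employee, "Employee"),
    qualify(manager, "Manager"),
    lambda t: t["Employee.MngrId"] == t["Manager.Id"]
    and t["Employee.Salary"] > t["Manager.Salary"],
)
# Final projection on Employee.Name.
names = sorted({t["Employee.Name"] for t in joined})
print(names)
```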
Equijoin - Example
Equijoin: Join condition is a conjunction of equalities.
π_{Name,CrsCode}(Student ⋈_{Id=StudId} σ_{Grade=‘A’}(Transcript))

Student
Id   Name  Addr   Status
111  John  …..    …..
222  Mary  …..    …..
333  Bill  …..    …..
444  Joe   …..    …..

Transcript
StudId  CrsCode  Sem  Grade
111     CSE305   S00  B
222     CSE306   S99  A
333     CSE304   F99  A

Result
Name  CrsCode
Mary  CSE306
Bill  CSE304

The equijoin is used very frequently since it combines
related data in different relations.
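
The equijoin example can be traced step by step in Python. This is a sketch over lists of dicts; the Addr and Status columns are left out because their values are elided in the tables above.

```python
# Student rows (Addr and Status omitted; their values are not given above).
student = [
    {"Id": 111, "Name": "John"},
    {"Id": 222, "Name": "Mary"},
    {"Id": 333, "Name": "Bill"},
    {"Id": 444, "Name": "Joe"},
]
transcript = [
    {"StudId": 111, "CrsCode": "CSE305", "Sem": "S00", "Grade": "B"},
    {"StudId": 222, "CrsCode": "CSE306", "Sem": "S99", "Grade": "A"},
    {"StudId": 333, "CrsCode": "CSE304", "Sem": "F99", "Grade": "A"},
]

# Step 1: select the Transcript rows with Grade = 'A'.
a_grades = [t for t in transcript if t["Grade"] == "A"]

# Step 2: equijoin on Id = StudId.
joined = [{**s, **t} for s in student for t in a_grades if s["Id"] == t["StudId"]]

# Step 3: project on Name and CrsCode.
result = [(row["Name"], row["CrsCode"]) for row in joined]
print(result)
```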
Natural Join
• Special case of equijoin:
– join condition equates all and only those attributes with the
same name (condition doesn’t have to be explicitly stated)
– duplicate columns eliminated from the result
Transcript ⋈ Teaching =
π_{StudId, Transcript.CrsCode, Transcript.Sem, Grade, ProfId}
( Transcript ⋈_{Transcript.CrsCode=Teaching.CrsCode AND Transcript.Sem=Teaching.Sem} Teaching )
[StudId, CrsCode, Sem, Grade, ProfId]
Q: Why is natural join a derived operator? Because…
Natural Join (cont’d)
• More generally:
R ⋈ S = π_{attr-list}(σ_{join-cond}(R × S))
where
attr-list = attributes(R) ∪ attributes(S)
(duplicates are eliminated) and join-cond has
the form:
R.A1 = S.A1 AND … AND R.An = S.An
where
{A1 … An} = attributes(R) ∩ attributes(S)
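
The definition above can be sketched directly: form the product, keep the pairs that agree on all common attributes, and keep each common attribute once. `natural_join` is a hypothetical helper; the Transcript rows come from the equijoin example, and the Teaching rows (including ProfId values) are made-up sample data.

```python
def natural_join(r, s):
    """Hypothetical helper: natural join built from product, selection and projection."""
    if not r or not s:
        return []
    attrs_r, attrs_s = list(r[0]), list(s[0])
    common = [a for a in attrs_r if a in attrs_s]                   # shared attribute names
    attr_list = attrs_r + [a for a in attrs_s if a not in common]   # each name once
    result = []
    for x in r:                                       # R x S
        for y in s:
            if all(x[a] == y[a] for a in common):     # join condition
                merged = {**y, **x}
                result.append({a: merged[a] for a in attr_list})  # projection
    return result

transcript = [
    {"StudId": 111, "CrsCode": "CSE305", "Sem": "S00", "Grade": "B"},
    {"StudId": 222, "CrsCode": "CSE306", "Sem": "S99", "Grade": "A"},
    {"StudId": 333, "CrsCode": "CSE304", "Sem": "F99", "Grade": "A"},
]
# Assumed Teaching(ProfId, CrsCode, Sem) rows.
teaching = [
    {"ProfId": 1, "CrsCode": "CSE305", "Sem": "S00"},
    {"ProfId": 2, "CrsCode": "CSE306", "Sem": "S99"},
    {"ProfId": 3, "CrsCode": "CSE306", "Sem": "F99"},
]

joined = natural_join(transcript, teaching)
for row in joined:
    print(row)
```

Only rows that agree on both CrsCode and Sem are paired, so the third Teaching row matches nothing.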
Natural Join Example
• List all Ids of students who took at least two
different courses:
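
One possible way to evaluate this query, sketched in Python, is to join Transcript with a copy of itself on StudId and keep the pairs whose course codes differ. This is an assumed formulation for illustration, and the rows beyond the three shown in the equijoin example are made-up sample data.

```python
# Transcript rows as (StudId, CrsCode, Sem, Grade); partly assumed sample data.
transcript = [
    (111, "CSE305", "S00", "B"),
    (111, "CSE306", "S99", "A"),
    (222, "CSE306", "S99", "A"),
    (333, "CSE304", "F99", "A"),
    (333, "CSE304", "S00", "B"),   # same course taken twice, not two different courses
]

# Pair each row with every row of the copy; keep pairs with the same student
# and different course codes; then project on StudId (the set removes duplicates).
ids = sorted({
    s1
    for (s1, c1, _, _) in transcript
    for (s2, c2, _, _) in transcript
    if s1 == s2 and c1 != c2
})
print(ids)
```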
Division (cont’d)
Division - Example
• List the Ids of students who have passed all
courses that were taught in spring 2000
• Numerator:
– StudId and CrsCode for every course passed by
every student:
π_{StudId, CrsCode}(σ_{Grade≠‘F’}(Transcript))
• Denominator:
– CrsCode of all courses taught in spring 2000
π_{CrsCode}(σ_{Semester=‘S2000’}(Teaching))
• Result is numerator/denominator
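
A sketch of this division, assuming the numerator is a set of (StudId, CrsCode) pairs and the denominator a set of course codes. The helper `divide` and all rows are hypothetical sample data chosen for illustration.

```python
# Assumed Transcript rows as (StudId, CrsCode, Grade).
transcript = [
    (111, "CSE305", "A"),
    (111, "CSE306", "B"),
    (222, "CSE305", "F"),
    (222, "CSE306", "A"),
    (333, "CSE305", "C"),
]
# Assumed Teaching rows as (CrsCode, Semester).
teaching = [
    ("CSE305", "S2000"),
    ("CSE306", "S2000"),
    ("CSE304", "F1999"),
]

# Numerator: StudId and CrsCode of every passed course (Grade != 'F').
numerator = {(sid, crs) for (sid, crs, grade) in transcript if grade != "F"}
# Denominator: CrsCode of all courses taught in spring 2000.
denominator = {crs for (crs, sem) in teaching if sem == "S2000"}

def divide(numerator, denominator):
    """Hypothetical helper: StudIds paired with every CrsCode in the denominator."""
    candidates = {sid for (sid, _) in numerator}
    return {sid for sid in candidates
            if all((sid, crs) in numerator for crs in denominator)}

passed_all = divide(numerator, denominator)
print(sorted(passed_all))
```

Student 222 is excluded because the failed CSE305 row never enters the numerator, and student 333 has no row for CSE306.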
End of Relational Algebra