Relational Algebra Operations in RDM: Tools Boot Camp
Renaming
ρ_{S(B1,…,Bn)}(R)
A relation identical to R, named S, with its attributes renamed to B1,…,Bn.
Emp(name, dept)
Name Dept
Jack Physics
Tom ICS
ρ_{Emp1(name1,dept1)}(Emp)
Name1 Dept1
Jack Physics
Tom ICS
Emp1(name1, dept1)
Renaming (cont)
List employees who work in the same department as Tom.
Emp(name, dept)
Name Dept
Jack Physics
Tom ICS
Mary ICS

Emp1(name1, dept1)
Name1 Dept1
Jack Physics
Tom ICS
Mary ICS

Emp × Emp1
Name Dept Name1 Dept1
Jack Physics Jack Physics
Jack Physics Tom ICS
Jack Physics Mary ICS
Tom ICS Jack Physics
Tom ICS Tom ICS
Tom ICS Mary ICS
Mary ICS Jack Physics
Mary ICS Tom ICS
Mary ICS Mary ICS

π_{Emp1.name1}(ρ_{Emp1(name1,dept1)}(Emp) ⋈_{Emp1.dept1=Emp.dept} σ_{name=tom}(Emp))

Result
Name1
Tom
Mary
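The renaming query above can be traced step by step in a few lines of Python. This is only a minimal sketch: relations are modelled as lists of dicts, and the helper names (rename, select, theta_join, project) are illustrative assumptions, not part of any library.

```python
# Relations are modelled as lists of dicts (attribute name -> value).
emp = [
    {"name": "Jack", "dept": "Physics"},
    {"name": "Tom", "dept": "ICS"},
    {"name": "Mary", "dept": "ICS"},
]

def rename(relation, mapping):
    """rho: copy `relation`, renaming attributes according to `mapping`."""
    return [{mapping.get(a, a): v for a, v in row.items()} for row in relation]

def select(relation, predicate):
    """sigma: keep the rows for which `predicate` holds."""
    return [row for row in relation if predicate(row)]

def theta_join(left, right, predicate):
    """Theta join: pairs from the Cartesian product that satisfy `predicate`."""
    return [{**l, **r} for l in left for r in right if predicate({**l, **r})]

def project(relation, attrs):
    """pi: keep only `attrs` and drop duplicate rows (set semantics)."""
    seen, out = set(), []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out

# Renaming makes the attribute names distinct, so Emp can be joined with itself.
emp1 = rename(emp, {"name": "name1", "dept": "dept1"})
tom = select(emp, lambda r: r["name"] == "Tom")
joined = theta_join(emp1, tom, lambda r: r["dept1"] == r["dept"])
result = project(joined, ["name1"])
print(result)
```

Without the renaming step, both copies of Emp would expose the same attribute names and the join condition could not tell them apart.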
Outer Joins
Motivation: join can lose information
E.g.: the natural join of R and S loses the information about Tom (in R) and about Mike and Mary (in S), since they do not join with any other tuple.
Such tuples are called dangling tuples.
R
Name Dept
Jack Physics
Tom ICS
S
Name Addr
Jack Irvine
Mike LA
Mary Riverside
Outer join: a natural join that keeps dangling tuples and pads their missing attributes with NULL values. Remember that a natural join is similar to an equi-join on the common attributes.
Three types: left, right, or full
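To make the loss of information concrete, the sketch below computes the natural join of R and S and then lists the tuples that found no partner. The natural_join helper is an illustrative assumption (lists of dicts, non-empty inputs), not a library function.

```python
R = [
    {"name": "Jack", "dept": "Physics"},
    {"name": "Tom", "dept": "ICS"},
]
S = [
    {"name": "Jack", "addr": "Irvine"},
    {"name": "Mike", "addr": "LA"},
    {"name": "Mary", "addr": "Riverside"},
]

def natural_join(left, right):
    """Join on all attributes the two (non-empty) relations have in common."""
    common = set(left[0]) & set(right[0])
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]

joined = natural_join(R, S)

# Dangling tuples: rows whose name never appears in the join result.
matched_names = {row["name"] for row in joined}
dangling_R = [r for r in R if r["name"] not in matched_names]
dangling_S = [s for s in S if s["name"] not in matched_names]

print(joined)
print(dangling_R, dangling_S)
```

Only Jack survives the join; Tom, Mike, and Mary are exactly the tuples an outer join would keep.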
Left Outer Join
R
Name Dept
Jack Physics
Tom ICS

S
Name Addr
Jack Irvine
Mike LA
Mary Riverside

R × S
R.name R.dept S.name S.addr
Jack Physics Jack Irvine
Jack Physics Mike LA
Jack Physics Mary Riverside
Tom ICS Jack Irvine
Tom ICS Mike LA
Tom ICS Mary Riverside

Left outer join: R ⟕ S (assumes equality on the common attribute, Name)
Name Dept Addr
Jack Physics Irvine
Tom ICS NULL

Pad the left dangling tuples with NULL values.
In a LEFT OUTER JOIN, R (the left operand) is the relation from which we wish all rows returned, regardless of whether there is a matching address in the S relation.
Query: list the names, depts, and addresses for all names listed in R.
Right Outer Join
R
Name Dept
Jack Physics
Tom ICS

S
Name Addr
Jack Irvine
Mike LA
Mary Riverside

R × S
R.name R.dept S.name S.addr
Jack Physics Jack Irvine
Jack Physics Mike LA
Jack Physics Mary Riverside
Tom ICS Jack Irvine
Tom ICS Mike LA
Tom ICS Mary Riverside

Right outer join: R ⟖ S (assumes equality on the common attribute, Name)
Name Dept Addr
Jack Physics Irvine
Mike NULL LA
Mary NULL Riverside

Pad the right dangling tuples with NULL values.
In a RIGHT OUTER JOIN, S (the right operand) is the relation from which we wish all rows returned, regardless of whether there is a matching dept in the R relation.
Query: list the names, depts, and addresses for all names listed in S.
Full Outer Join
R
Name Dept
Jack Physics
Tom ICS

S
Name Addr
Jack Irvine
Mike LA
Mary Riverside

R × S
R.name R.dept S.name S.addr
Jack Physics Jack Irvine
Jack Physics Mike LA
Jack Physics Mary Riverside
Tom ICS Jack Irvine
Tom ICS Mike LA
Tom ICS Mary Riverside

Full outer join: R ⟗ S
Name Dept Addr
Jack Physics Irvine
Tom ICS NULL
Mike NULL LA
Mary NULL Riverside

Pad both the left and the right dangling tuples with NULL values.
Query: list the names, depts, and addresses for all names listed in R and S.
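The three outer joins differ only in which dangling tuples they keep, which the following sketch makes explicit. It assumes relations are non-empty lists of dicts and uses Python's None in place of NULL; the function outer_join and its how parameter are hypothetical names chosen for illustration.

```python
def outer_join(left, right, how="full"):
    """Natural outer join of two non-empty lists of dicts; None stands for NULL."""
    if how not in ("left", "right", "full"):
        raise ValueError("how must be 'left', 'right', or 'full'")
    left_attrs, right_attrs = list(left[0]), list(right[0])
    common = [a for a in left_attrs if a in right_attrs]
    all_attrs = left_attrs + [a for a in right_attrs if a not in common]

    def pad(row):
        # Missing attributes become None (the NULL padding).
        return {a: row.get(a) for a in all_attrs}

    def matches(l, r):
        return all(l[a] == r[a] for a in common)

    # Start from the ordinary natural join.
    out = [pad({**l, **r}) for l in left for r in right if matches(l, r)]
    if how in ("left", "full"):
        out += [pad(l) for l in left if not any(matches(l, r) for r in right)]
    if how in ("right", "full"):
        out += [pad(r) for r in right if not any(matches(l, r) for l in left)]
    return out

R = [{"name": "Jack", "dept": "Physics"}, {"name": "Tom", "dept": "ICS"}]
S = [
    {"name": "Jack", "addr": "Irvine"},
    {"name": "Mike", "addr": "LA"},
    {"name": "Mary", "addr": "Riverside"},
]

left_result = outer_join(R, S, "left")
right_result = outer_join(R, S, "right")
full_result = outer_join(R, S, "full")
print(full_result)
```

The full outer join is the matched rows plus both sets of padded dangling tuples, so on this data it has four rows.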
Combining Different Operations
Construct general expressions using basic operations.
Schema of each operation:
∪, ∩, −: same as the schema of the two relations
Selection σ: same as the relation's schema
Projection π: the attributes in the projection
Cartesian product ×: the attributes of both relations; use a prefix to avoid confusion
Theta join ⋈_C: same as ×
Natural join ⋈: the union of the relations' attributes, with common attributes merged
Renaming ρ: the new, renamed attributes
Equivalent Expressions
Expressions might be equivalent.
R ∩ S = R − (R − S)
How about the following?
(R op S) op T = R op (S op T)? Ask this for each binary operator op, e.g. ∪ and −.
π_A(R op S) = π_A(R) op π_A(S)? Ask this for each set operator op: ∪, ∩, and −.
R ⋈ S = π_L(σ_{R.A1=S.A1,…,R.Ak=S.Ak}(R × S))
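Candidate equivalences like the ones above can be probed on small instances: one counterexample is enough to refute a claimed equivalence, while agreement on one instance proves nothing. The sketch below uses Python sets of tuples over a schema (A, B); the relations R, S, T and the helper project_A are made-up illustrations.

```python
R = {(1, "a"), (2, "b")}  # schema (A, B)
S = {(1, "c"), (2, "b")}  # schema (A, B)
T = {(1, "a")}            # schema (A, B)

def project_A(rel):
    """pi_A: keep the first attribute, with set semantics."""
    return {(a,) for (a, _b) in rel}

# Intersection expressed through difference: holds on this instance.
intersection_via_difference = (R & S) == (R - (R - S))

# Associativity: union passes, difference fails on this instance.
union_assoc = ((R | S) | T) == (R | (S | T))
difference_assoc = ((R - S) - T) == (R - (S - T))

# Pushing a projection through a set operator.
proj_union = project_A(R | S) == (project_A(R) | project_A(S))
proj_intersection = project_A(R & S) == (project_A(R) & project_A(S))
proj_difference = project_A(R - S) == (project_A(R) - project_A(S))

print(union_assoc, difference_assoc)
print(proj_union, proj_intersection, proj_difference)
```

On this data the projection only commutes with union: tuples (1, "a") and (1, "c") agree on A but differ on B, which is what breaks the intersection and difference cases.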
Example 1
customer(ssn, name, city)
account(custssn, balance)
List account balances of Tom.
π_balance(σ_{name=tom}(σ_{ssn=custssn}(customer × account)))

Tree representation
π_balance
  σ_{name=tom}
    σ_{ssn=custssn}
      ×
        customer
        account
Example 1(cont)
customer(ssn, name, city)
account(custssn, balance)
List account balances of Tom.
π_balance
  ⋈_{ssn=custssn}
    σ_{name=tom}
      customer
    account
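The two trees for Example 1 describe different evaluation orders for the same query. A minimal sketch, assuming hypothetical sample rows (the slides give no data for customer and account) and illustrative helper names, checks that both orders return the same balances while building intermediate results of different sizes.

```python
# Hypothetical sample data; balances are in thousands.
customer = [
    {"ssn": 111, "name": "tom", "city": "irvine"},
    {"ssn": 222, "name": "mary", "city": "la"},
]
account = [
    {"custssn": 111, "balance": 20},
    {"custssn": 111, "balance": 50},
    {"custssn": 222, "balance": 10},
]

def product(left, right):
    """Cartesian product of two relations with disjoint attribute names."""
    return [{**l, **r} for l in left for r in right]

def select(rel, pred):
    return [row for row in rel if pred(row)]

def project(rel, attrs):
    return {tuple(row[a] for a in attrs) for row in rel}

# Plan 1: both selections sit on top of the full Cartesian product.
full_product = product(customer, account)
plan1 = project(
    select(
        select(full_product, lambda r: r["ssn"] == r["custssn"]),
        lambda r: r["name"] == "tom",
    ),
    ["balance"],
)

# Plan 2: select Tom first, then join the smaller relation with account.
toms = select(customer, lambda r: r["name"] == "tom")
small_product = product(toms, account)
plan2 = project(
    select(small_product, lambda r: r["ssn"] == r["custssn"]), ["balance"]
)

print(plan1, plan2, len(full_product), len(small_product))
```

Both plans agree on the answer, but the second one pairs only Tom's row with the accounts, so its intermediate product is half the size on this data.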
Assignment Operator
Motivation: expressions can be complicated
Introduce names for intermediate relations, using the assignment operator :=
Then a query can be written as a sequential program consisting of a
series of assignments
π_balance(σ_{name=tom}(σ_{ssn=custssn}(customer × account)))
R1(ssn,name,city) := σ_{name=tom}(customer)
R2(ssn,name,city,custssn,balance) := σ_{custssn=ssn}(R1 × account)
Answer(balance) := π_balance(R2)
In SQL, such a sequential program is called a script.
Example 2
Find names of customers in Irvine or having a balance > 50K.
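customer(ssn, name, city)
account(custssn, balance)

π_name(σ_{city=irvine}(customer)) ∪ π_name(σ_{balance>50K}(account ⋈_{custssn=ssn} customer))

∪
  π_name
    σ_{city=irvine}
      customer
  π_name
    σ_{balance>50K}
      ⋈_{custssn=ssn}
        account
        customer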
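The "or" in Example 2 becomes a union of two separately computed name sets. The sketch below illustrates this on hypothetical sample rows (the slides give no data for this example), with each branch of the union written as a set comprehension.

```python
# Hypothetical sample data.
customer = [
    {"ssn": 111, "name": "tom", "city": "irvine"},
    {"ssn": 222, "name": "mary", "city": "la"},
    {"ssn": 333, "name": "jack", "city": "la"},
]
account = [
    {"custssn": 111, "balance": 20_000},
    {"custssn": 222, "balance": 75_000},
    {"custssn": 333, "balance": 10_000},
]

# Left branch: names of customers living in Irvine.
in_irvine = {c["name"] for c in customer if c["city"] == "irvine"}

# Right branch: join account with customer, keep balances above 50K, project the name.
over_50k = {
    c["name"]
    for c in customer
    for a in account
    if a["custssn"] == c["ssn"] and a["balance"] > 50_000
}

# Union of the two branches; both have the same one-attribute schema (name).
answer = in_irvine | over_50k
print(answer)
```

The union is only defined because both branches are projected onto the same schema first.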
Example 3
List the highest balance of all the customers.
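account(custssn, balance)
Custssn Balance
111 20K
222 15K
333 10K

π_balance(account) − π_{acct1.balance}(ρ_acct1(account) ⋈_{acct1.balance<account.balance} account)

−
  π_balance
    account
  π_{acct1.balance}
    ⋈_{acct1.balance<account.balance}
      ρ_acct1(account)
      account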
Example 3
List the highest balance of all the customers.
account(custssn, balance), also used under the name acct1
Custssn Balance
111 20K
222 15K
333 10K

π_balance(account) − π_{acct1.balance}(ρ_acct1(account) ⋈_{acct1.balance<account.balance} account)

ρ_acct1(account) × account
Acct1.custssn Acct1.balance Account.custssn Account.balance
111 20K 111 20K
111 20K 222 15K
111 20K 333 10K
222 15K 111 20K
222 15K 222 15K
222 15K 333 10K
333 10K 111 20K
333 10K 222 15K
333 10K 333 10K

π_{acct1.balance} of the rows with acct1.balance < account.balance
Acct1.balance
15K
10K

π_balance(account)
Account.balance
20K
15K
10K

Result (the difference)
Balance
20K
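The evaluation above can be replayed in Python using the account table from the slide. This is a minimal sketch in which a relation is a list of (custssn, balance) tuples and balances are given in thousands.

```python
account = [(111, 20), (222, 15), (333, 10)]  # (custssn, balance in K)

# pi_balance(account)
all_balances = {bal for (_ssn, bal) in account}

# pi_acct1.balance of the self-join with acct1.balance < account.balance:
# every balance that is lower than at least one other balance.
not_highest = {
    b1
    for (_s1, b1) in account      # plays the role of acct1
    for (_s, b) in account        # plays the role of account
    if b1 < b
}

# The highest balance is whatever is left after removing the non-highest ones.
highest = all_balances - not_highest
print(not_highest, highest)
```

The query never compares values directly for "maximum"; it finds the maximum by subtracting everything that loses at least one comparison.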
Example 3 (cont)
List the highest balance of all the customers.
Is the following expression correct?
π_{account.balance}(ρ_acct1(account) ⋈_{acct1.balance<account.balance} account)

account(custssn, balance)
Custssn Balance
111 20K
222 15K
333 10K
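One way to answer the question is to evaluate the expression on the sample table. The sketch below does that, with account modelled as a list of (custssn, balance) tuples and balances in thousands.

```python
account = [(111, 20), (222, 15), (333, 10)]  # (custssn, balance in K)

# pi_account.balance of the self-join with acct1.balance < account.balance:
# every balance that is HIGHER than at least one other balance.
candidate = {
    b
    for (_s1, b1) in account      # plays the role of acct1
    for (_s, b) in account        # plays the role of account
    if b1 < b
}
print(candidate)
```

On this data the expression returns both 20K and 15K, that is, every balance except the lowest, so it does not isolate the highest balance.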
Example 3 (cont)
How about the lowest balance?
How about the highest balance of customers in Irvine?
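Both follow-up questions reuse the same difference pattern. The sketch below is one possible answer, not the only one; the account rows come from the slide, while the customer rows and their cities are hypothetical, added only so that the Irvine variant has something to filter on.

```python
customer = [  # (ssn, name, city); hypothetical rows
    (111, "tom", "la"),
    (222, "mary", "irvine"),
    (333, "jack", "irvine"),
]
account = [(111, 20), (222, 15), (333, 10)]  # (custssn, balance in K)

# Lowest balance: flip the comparison, i.e. remove every balance that is
# higher than at least one other balance.
balances = {b for (_s, b) in account}
lowest = balances - {b1 for (_s1, b1) in account for (_s, b) in account if b1 > b}

# Highest balance in Irvine: first restrict account to Irvine customers
# (a join with the selected customers), then apply the original pattern.
irvine_accounts = [
    (s, b)
    for (s, b) in account
    for (ssn, _name, city) in customer
    if s == ssn and city == "irvine"
]
irvine_balances = {b for (_s, b) in irvine_accounts}
highest_irvine = irvine_balances - {
    b1 for (_s1, b1) in irvine_accounts for (_s, b) in irvine_accounts if b1 < b
}
print(lowest, highest_irvine)
```

The restriction to Irvine has to happen before the self-join; otherwise balances of customers elsewhere would knock out the Irvine maximum.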
Example 4
List the cities and names of the customers who have the highest balance
of all customers.
π_{name,city}
  ⋈_{custssn=ssn}
    customer
    ⋈_{balance=acct2.balance}   (the customers with this balance)
      account
      ρ_acct2
        −   (highest balance)
          π_balance
            account
          π_{acct1.balance}
            ⋈_{acct1.balance<account.balance}
              ρ_acct1(account)
              account
Example 4: another expression?
List the cities and names of the customers who have the highest balance
of all customers.
π_{name,city}
  ⋈_{custssn=ssn}
    customer
    −
      π_custssn
        account
      π_{acct1.custssn}
        ⋈_{acct1.balance<account.balance}
          ρ_acct1(account)
          account

customer(ssn, name, city)
Ssn Name City
222 Tom irvine

account(custssn, balance)
Custssn Balance
222 20K
222 50K
333 50K

WRONG! Tom has two accounts, and one of them is not the highest, so the difference removes custssn 222 even though Tom's other account holds the highest balance.
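Running both Example 4 expressions on the counterexample data shows where the second one goes wrong. This is a minimal sketch with relations as lists of tuples and balances in thousands; the variable names are illustrative.

```python
customer = [(222, "Tom", "irvine")]            # (ssn, name, city)
account = [(222, 20), (222, 50), (333, 50)]    # (custssn, balance in K)

# Second expression: all custssn values minus those owning SOME account
# whose balance is lower than another balance.
all_ssn = {s for (s, _b) in account}
has_lower_account = {s1 for (s1, b1) in account for (_s, b) in account if b1 < b}
wrong = {
    (name, city)
    for (ssn, name, city) in customer
    if ssn in all_ssn - has_lower_account
}

# First expression: compute the highest balance, then find who holds it.
balances = {b for (_s, b) in account}
highest = balances - {b1 for (_s1, b1) in account for (_s, b) in account if b1 < b}
owners = {s for (s, b) in account if b in highest}
right = {(name, city) for (ssn, name, city) in customer if ssn in owners}

print(wrong, right)
```

Subtracting on custssn discards a customer as soon as any one of their accounts is beaten, whereas subtracting on balance discards only the beaten balances.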
Limitation of Relational Algebra
Some queries cannot be expressed in relational algebra.
Example: recursive queries.
Table R(Parent,Child)
How to find all the ancestors of Tom?
Impossible to write this query in relational algebra.
More expressive languages needed:
E.g., Datalog
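The ancestor query needs a join repeated until no new tuples appear, and the number of repetitions depends on the data. The sketch below shows that fixpoint loop in Python; the rows of R(Parent, Child) are hypothetical, and the function name ancestors is an illustrative choice.

```python
# Hypothetical instance of R(Parent, Child).
parent_child = {
    ("Ann", "Bob"),
    ("Bob", "Carol"),
    ("Carol", "Tom"),
    ("Dave", "Eve"),
}

def ancestors(rel, person):
    """Return all ancestors of `person` by iterating to a fixpoint."""
    # Step 0: the direct parents (a selection followed by a projection).
    found = {p for (p, c) in rel if c == person}
    while True:
        # One more join step: parents of everyone found so far.
        new = {p for (p, c) in rel if c in found} - found
        if not new:
            return found  # nothing new: the fixpoint is reached
        found |= new

result = ancestors(parent_child, "Tom")
print(result)
```

Each pass through the loop is an ordinary join, projection, and union; what a fixed relational algebra expression cannot capture is the loop itself, since the depth of the family tree is not known in advance.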