Relational Algebra Problems

Uploaded by

Putta Swamy
Copyright © All Rights Reserved

Operation           My HTML             Symbol
Projection          PROJECT             π
Selection           SELECT              σ
Renaming            RENAME              ρ
Union               UNION               ∪
Intersection        INTERSECTION        ∩
Assignment          <-                  ←
Cartesian product   X                   ×
Join                JOIN                ⋈
Left outer join     LEFT OUTER JOIN     ⟕
Right outer join    RIGHT OUTER JOIN    ⟖
Full outer join     FULL OUTER JOIN     ⟗
Semijoin            SEMIJOIN            ⋉

Relational Algebra Problems


Suppose there is a banking database that comprises the following
tables:
Customer (Cust_name, Cust_street, Cust_city)
Branch (Branch_name, Branch_city, Assets)
Account (Branch_name, Account_number, Balance)
Loan (Branch_name, Loan_number, Amount)
Depositor (Cust_name, Account_number)
Borrower (Cust_name, Loan_number)
Query: Find the names of all customers who have taken a
loan from the bank and also have an account at the bank.
Solution:
Step 1: Identify the relations required to frame the query.
• The first half of the query (names of customers who have
taken a loan) needs borrower information.
So, Relation 1 --> Borrower.
• The second half of the query needs the customer name and account
number, which can be obtained from the Depositor relation.
Hence, Relation 2 --> Depositor.
Step 2: Identify the columns required from the relations
obtained in Step 1.
Column 1: Cust_name from Borrower
Column 2: Cust_name from Depositor
Step 3: Identify the operator to be used. We need the names of
customers who are present in both the Borrower table and the
Depositor table.
Hence, the operator to be used --> Intersection.
The final query will be:
π Cust_name (Borrower) ∩ π Cust_name (Depositor)
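The three steps can be checked on a small example. The following is a minimal sketch, assuming each relation is modelled as a Python set of tuples; the sample rows and the project helper are hypothetical, and only the Borrower and Depositor schemas come from the text. It evaluates π Cust_name (Borrower) ∩ π Cust_name (Depositor).

```python
# Hypothetical sample rows; only the schemas come from the text.
borrower = {("Alice", "L-11"), ("Bob", "L-12"), ("Carol", "L-13")}     # (Cust_name, Loan_number)
depositor = {("Bob", "A-101"), ("Carol", "A-102"), ("Dave", "A-103")}  # (Cust_name, Account_number)


def project(relation, *positions):
    """Projection: keep the given column positions; the set drops duplicates."""
    return {tuple(row[i] for i in positions) for row in relation}


# Step 2: Cust_name from each relation. Step 3: intersect the two results.
both = project(borrower, 0) & project(depositor, 0)
print(sorted(both))
```

Because both projections have the same single column, they are union-compatible, which is what the intersection operator requires.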

2. Solve all queries below using only select, project,
Cartesian product, and natural join. Do not use theta-join,
set operations, renaming or assignment.
1. Consider the following schema:
Suppliers (sid: integer, sName: string, address: string)
Parts (pid: integer, pname: string, colour: string)
Catalog (sid: integer, pid: integer, cost: real)
Find the names of suppliers who supply some red part.
Step 1: R1 = π pid (σ colour='red' (Parts))
Step 2: R2 = π sid (R1 ⋈ Catalog)
Step 3: R3 = π sName (R2 ⋈ Suppliers)
The required answer is R3.
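The select, project and natural join steps for the red-parts query can be sketched in a few lines. This is an illustrative sketch only: rows are modelled as dictionaries keyed by attribute name, the helper functions select, project and natural_join are hypothetical names, and the sample data is invented.

```python
def select(rel, pred):
    """Selection: keep the rows for which the predicate holds."""
    return [row for row in rel if pred(row)]


def project(rel, attrs):
    """Projection: keep only attrs and drop duplicate rows."""
    out = []
    for row in rel:
        picked = {a: row[a] for a in attrs}
        if picked not in out:
            out.append(picked)
    return out


def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attribute names."""
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in l.keys() & r.keys())]


# Hypothetical sample data following the schema in the text.
suppliers = [{"sid": 1, "sName": "Acme", "address": "1 Main St"},
             {"sid": 2, "sName": "Bolt Co", "address": "221 Packer Ave"},
             {"sid": 3, "sName": "Cog Ltd", "address": "9 Elm Rd"}]
parts = [{"pid": 10, "pname": "nut", "colour": "red"},
         {"pid": 11, "pname": "bolt", "colour": "green"},
         {"pid": 12, "pname": "gear", "colour": "red"}]
catalog = [{"sid": 1, "pid": 10, "cost": 2.5},
           {"sid": 1, "pid": 11, "cost": 3.0},
           {"sid": 2, "pid": 11, "cost": 2.0},
           {"sid": 3, "pid": 12, "cost": 9.0}]

r1 = project(select(parts, lambda p: p["colour"] == "red"), ["pid"])  # Step 1
r2 = project(natural_join(r1, catalog), ["sid"])                      # Step 2
r3 = project(natural_join(r2, suppliers), ["sName"])                  # Step 3
print(r3)
```

Projecting down to pid before joining keeps the intermediate results small, which is why the steps project early rather than joining all three tables first.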
a. Find the names of all red parts.
π pname (σ colour='red' (Parts))
b. Find all prices for parts that are red or green. (A part
may have different prices from different manufacturers.)
π cost ((σ colour='red' ∨ colour='green' (Parts)) ⋈ Catalog)
c. Find the sids of all suppliers who supply a part that is red
or green.
Step 1: R1 = π pid (σ colour='red' ∨ colour='green' (Parts))
Step 2: R2 = π sid (R1 ⋈ Catalog)
This is the same as the red-parts query above, except that we
select parts that are either red or green. Because only the
sids of the suppliers are needed, we can stop after Step 2.
As a single expression:
π sid ((σ colour='red' ∨ colour='green' (Parts)) ⋈ Catalog)
d. Find the sids of all suppliers who supply a part that is red
and green.
Answer: Trick question. Each tuple has only one colour, and
each part has only one tuple (since pid is a key), so no
part can be recorded as both red and green.
e. Find the names of all suppliers who supply a part that is
red or green.
π sName ((π sid ((σ colour='red' ∨ colour='green' (Parts)) ⋈ Catalog)) ⋈ Suppliers)
Find the sids of suppliers who supply some red part or are at
221 Packer Ave.
Sids of suppliers who supply some red part:
Step 1: R1 = π pid (σ colour='red' (Parts))
Step 2: R2 = π sid (R1 ⋈ Catalog)
Sids of suppliers who are at 221 Packer Ave:
Step 1: R3 = π sid (σ address='221 Packer Ave' (Suppliers))
Therefore, the sids of suppliers who supply some red part or are
at 221 Packer Ave are given by R2 ∪ R3.
4) Find the sids of suppliers who supply some red part and
some green part.
A) R1 = π sid (π pid (σ colour='red' (Parts)) ⋈ Catalog)
R2 = π sid (π pid (σ colour='green' (Parts)) ⋈ Catalog)
As in the first question, R1 gives the sids of suppliers who
supply some red part. Similarly, R2 gives the sids of suppliers
who supply some green part.
The required list of sids of suppliers who supply some red part
and some green part is R1 ∩ R2.
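The union and intersection answers can be compared side by side on a tiny data set. The sketch below assumes relations are Python sets of tuples; the rows and the helper name sids_supplying are hypothetical.

```python
# Hypothetical sample data; tuple layouts are noted beside each relation.
parts = {(10, "red"), (11, "green"), (12, "red")}                      # (pid, colour)
catalog = {(1, 10), (1, 11), (2, 11), (3, 12)}                         # (sid, pid)
suppliers = {(1, "1 Main St"), (2, "221 Packer Ave"), (3, "9 Elm Rd")}  # (sid, address)


def sids_supplying(colour):
    """π sid (π pid (σ colour=... (Parts)) ⋈ Catalog), written with set comprehensions."""
    pids = {pid for pid, c in parts if c == colour}
    return {sid for sid, pid in catalog if pid in pids}


red = sids_supplying("red")
green = sids_supplying("green")
at_packer = {sid for sid, address in suppliers if address == "221 Packer Ave"}

red_or_packer = red | at_packer   # union: some red part OR at 221 Packer Ave
red_and_green = red & green       # intersection: some red part AND some green part
print(sorted(red_or_packer), sorted(red_and_green))
```

Note that "some red part and some green part" is an intersection of two supplier sets, not a selection with colour='red' ∧ colour='green', which would always be empty.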
5) Find the sids of suppliers who supply every part.
A) R1 = π sid,pid (Catalog)
R2 = π pid (Parts)
R1 / R2 (division) gives the required list of sids of suppliers
who supply every part.
6) Find the sids of suppliers who supply every red part.
A) This is the same as the previous query, except that R2
contains only red parts.
R1 = π sid,pid (Catalog)
R2 = π pid (σ colour='red' (Parts))
So the required answer is R1 / R2.
7) Find the sids of suppliers who supply every red or green
part.
A) R1 = π sid,pid (Catalog)
R2 = π pid (σ colour='red' ∨ colour='green' (Parts))
R1 / R2 gives the sids of suppliers who supply every part that
is either red or green.
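Division is the least familiar operator in these answers, so a small sketch may help. It assumes R1 is a set of (sid, pid) pairs and R2 a set of pid values; the function name divide and the sample rows are hypothetical.

```python
def divide(r, s):
    """Relational division for binary r: the a-values paired with every b in s."""
    candidates = {a for a, _ in r}
    return {a for a in candidates if all((a, b) in r for b in s)}


# Hypothetical sample data.
parts = {(10, "red"), (11, "green"), (12, "red")}                 # (pid, colour)
catalog = {(1, 10), (1, 11), (1, 12), (2, 11), (3, 10), (3, 12)}  # (sid, pid)

all_pids = {pid for pid, _ in parts}
red_pids = {pid for pid, colour in parts if colour == "red"}
red_or_green_pids = {pid for pid, colour in parts if colour in ("red", "green")}

every_part = divide(catalog, all_pids)                   # query 5
every_red_part = divide(catalog, red_pids)               # query 6
every_red_or_green = divide(catalog, red_or_green_pids)  # query 7
print(every_part, every_red_part, every_red_or_green)
```

In this sample every part is red or green, so queries 5 and 7 return the same suppliers; with a part of a third colour they could differ.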
Examples: Company database
Query 1: Retrieve the name and address of all employees who
work for the ‘Research’ department.
RESEARCH_DEPT ← σ Dname=‘Research’ (DEPARTMENT)
RESEARCH_EMPS ← (RESEARCH_DEPT ⋈ Dnumber=Dno EMPLOYEE)
RESULT ← π Fname, Lname, Address (RESEARCH_EMPS)
As a single in-line expression, this query becomes:
π Fname, Lname, Address (σ Dname=‘Research’ (DEPARTMENT ⋈ Dnumber=Dno EMPLOYEE))
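Query 1 joins on differently named attributes (Dnumber and Dno), so it needs a theta-join rather than a natural join. The following is a rough sketch with dictionary rows; the helper theta_join and all sample rows are hypothetical, and only the attribute names come from the query.

```python
def theta_join(left, right, condition):
    """Theta-join: pair every left row with every right row satisfying the condition."""
    return [{**l, **r} for l in left for r in right if condition(l, r)]


# Hypothetical sample rows.
department = [{"Dname": "Research", "Dnumber": 5},
              {"Dname": "Administration", "Dnumber": 4}]
employee = [{"Fname": "Ana", "Lname": "Diaz", "Address": "12 Oak St", "Dno": 5},
            {"Fname": "Raj", "Lname": "Patel", "Address": "7 Pine Rd", "Dno": 4},
            {"Fname": "Mei", "Lname": "Chen", "Address": "3 Elm Ave", "Dno": 5}]

# RESEARCH_DEPT <- σ Dname='Research' (DEPARTMENT)
research_dept = [d for d in department if d["Dname"] == "Research"]
# RESEARCH_EMPS <- RESEARCH_DEPT ⋈ Dnumber=Dno EMPLOYEE
research_emps = theta_join(research_dept, employee,
                           lambda d, e: d["Dnumber"] == e["Dno"])
# RESULT <- π Fname, Lname, Address (RESEARCH_EMPS); the set removes duplicates
result = {(e["Fname"], e["Lname"], e["Address"]) for e in research_emps}
print(sorted(result))
```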

Query 2. For every project located in ‘Stafford’, list the
project number, the controlling department number, and the
department manager’s last name, address, and birth date.
STAFFORD_PROJS ← σ Plocation=‘Stafford’ (PROJECT)
CONTR_DEPTS ← (STAFFORD_PROJS ⋈ Dnum=Dnumber DEPARTMENT)
PROJ_DEPT_MGRS ← (CONTR_DEPTS ⋈ Mgr_ssn=Ssn EMPLOYEE)
RESULT ← π Pnumber, Dnum, Lname, Address, Bdate (PROJ_DEPT_MGRS)
Query 3. Find the names of employees who work on all the
projects controlled by department number 5.
DEPT5_PROJS ← ρ(Pno) (π Pnumber (σ Dnum=5 (PROJECT)))
EMP_PROJ ← ρ(Ssn, Pno) (π Essn, Pno (WORKS_ON))
RESULT_EMP_SSNS ← EMP_PROJ ÷ DEPT5_PROJS
RESULT ← π Lname, Fname (RESULT_EMP_SSNS * EMPLOYEE)
Query 4. Make a list of project numbers for projects that
involve an employee whose last name is ‘Smith’, either as a
worker or as a manager of the department that controls the
project.
SMITHS(Essn) ← π Ssn (σ Lname=‘Smith’ (EMPLOYEE))
SMITH_WORKER_PROJS ← π Pno (WORKS_ON * SMITHS)
MGRS ← π Lname, Dnumber (EMPLOYEE ⋈ Ssn=Mgr_ssn DEPARTMENT)
SMITH_MANAGED_DEPTS(Dnum) ← π Dnumber (σ Lname=‘Smith’ (MGRS))
SMITH_MGR_PROJS(Pno) ← π Pnumber (SMITH_MANAGED_DEPTS * PROJECT)
RESULT ← (SMITH_WORKER_PROJS ∪ SMITH_MGR_PROJS)
As a single in-line expression, this query becomes:
π Pno (WORKS_ON ⋈ Essn=Ssn (π Ssn (σ Lname=‘Smith’ (EMPLOYEE))))
∪ π Pnumber ((π Dnumber (σ Lname=‘Smith’ (π Lname, Dnumber (EMPLOYEE ⋈ Ssn=Mgr_ssn DEPARTMENT)))) ⋈ Dnumber=Dnum PROJECT)
Query 5. List the names of all employees with two or more
dependents.
Strictly speaking, this query cannot be done in
the basic (original) relational algebra. We have to use
the AGGREGATE FUNCTION operation with the COUNT aggregate
function. We assume that dependents of the same employee
have distinct Dependent_name values.
T1(Ssn, No_of_dependents) ← Essn ℑ COUNT Dependent_name (DEPENDENT)
T2 ← σ No_of_dependents≥2 (T1)
RESULT ← π Lname, Fname (T2 * EMPLOYEE)
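The grouping step in Query 5 can be sketched with a counter. This is an illustrative sketch under stated assumptions: the sample rows are invented, EMPLOYEE is modelled as a dictionary from Ssn to (Lname, Fname), and "two or more dependents" is taken as a count of at least 2.

```python
from collections import Counter

# Hypothetical sample rows.
dependent = [("111", "Ann"), ("111", "Ben"), ("222", "Cy"),
             ("333", "Di"), ("333", "Ed"), ("333", "Flo")]   # (Essn, Dependent_name)
employee = {"111": ("Lee", "Kim"), "222": ("Moss", "Jo"),
            "333": ("Nair", "Dev"), "444": ("Ortiz", "Sam")}  # Ssn -> (Lname, Fname)

# T1(Ssn, No_of_dependents) <- Essn ℑ COUNT Dependent_name (DEPENDENT)
t1 = Counter(essn for essn, _ in dependent)
# T2 <- σ No_of_dependents >= 2 (T1)
t2 = {ssn for ssn, count in t1.items() if count >= 2}
# RESULT <- π Lname, Fname (T2 * EMPLOYEE)
result = sorted(employee[ssn] for ssn in t2)
print(result)
```

Counting rows per Essn matches COUNT Dependent_name only because dependent names are assumed distinct for each employee, as the text states.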
Query 6. Retrieve the names of employees who have no
dependents.
This is an example of the type of query that uses
the MINUS (SET DIFFERENCE) operation.
ALL_EMPS ← πSsn(EMPLOYEE)
EMPS_WITH_DEPS(Ssn) ← πEssn(DEPENDENT)
EMPS_WITHOUT_DEPS ← (ALL_EMPS – EMPS_WITH_DEPS)
RESULT ← πLname, Fname(EMPS_WITHOUT_DEPS * EMPLOYEE)
We first retrieve a relation with all employee Ssns
in ALL_EMPS. Then we create a table with the Ssns of employees
who have at least one dependent in EMPS_WITH_DEPS. Then we
apply the SET DIFFERENCE operation to retrieve the Ssns of
employees with no dependents in EMPS_WITHOUT_DEPS, and finally
join this with EMPLOYEE to retrieve the desired attributes. As
a single in-line expression, this query becomes:
π Lname, Fname ((π Ssn (EMPLOYEE) – ρ Ssn (π Essn (DEPENDENT))) * EMPLOYEE)
Query 7. List the names of managers who have at least one
dependent.
MGRS(Ssn) ← πMgr_ssn(DEPARTMENT)
EMPS_WITH_DEPS(Ssn) ← πEssn(DEPENDENT)
MGRS_WITH_DEPS ← (MGRS ∩ EMPS_WITH_DEPS)
RESULT ← πLname, Fname(MGRS_WITH_DEPS * EMPLOYEE)
In this query, we retrieve the Ssns of managers in MGRS, and
the Ssns of employees with at least one dependent
in EMPS_WITH_DEPS, then we apply the SET INTERSECTION operation
to get the Ssns of managers who have at least one dependent.
As we mentioned earlier, the same query can be specified in
many different ways in relational algebra. In particular, the
operations can often be applied in various orders. In
addition, some operations can be used to replace others; for
example, the INTERSECTION operation in Q7 can be replaced by
a NATURAL JOIN. As an exercise, try to do each of these sample
queries using different operations. We showed how to write
queries as single relational algebra expressions for
queries Q1, Q4, and Q6. Try to write the remaining queries as
single expressions. In Chapters 4 and 5 and in Sections 6.6
and 6.7, we show how these queries are written in other
relational languages.
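The remark that INTERSECTION in Q7 can be replaced by a NATURAL JOIN holds because MGRS and EMPS_WITH_DEPS have exactly the same attribute (Ssn), so joining on all shared attributes keeps precisely the rows present in both. A minimal sketch, with hypothetical rows and a hypothetical natural_join helper:

```python
def natural_join(left, right):
    """Natural join on all shared attribute names, without duplicate rows."""
    out = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                merged = {**l, **r}
                if merged not in out:
                    out.append(merged)
    return out


# Hypothetical sample rows; both relations have the single attribute Ssn.
mgrs = [{"Ssn": "111"}, {"Ssn": "222"}]
emps_with_deps = [{"Ssn": "222"}, {"Ssn": "333"}]

via_join = natural_join(mgrs, emps_with_deps)
via_intersection = [m for m in mgrs if m in emps_with_deps]
print(via_join, via_intersection)
```

The equivalence depends on the two relations having identical attribute names; if they shared no attributes, the natural join would degenerate into a Cartesian product instead.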
Let us consider a sample database
Car(RegNo, Make, ModelYear, Color)
Inspection(RegNo->Car, DateInspected, Period, Evaluation)
Problem((RegNo, DateInspected)->Inspection, ProblemCode)
Driver(RegNo->Car, Name, Accidents)

Example 1: Retrieve information about cars of model year 1996
where faults were found in the inspection for year 1999.
First, we take ‘information about cars’ to mean the values of
all attributes of the relation Car. Information about
inspections is stored in table Inspection, and if faults are
found they are registered in table Problem. Thus we need these
three tables to obtain the information we want.

Only cars of model year 1996 are of interest. The model year
of a car is represented as the value of attribute ModelYear in
table Car. Our first intermediate result consists of tuples
representing cars of model year 1996. This is obtained using
selection:
C1996 = σ ModelYear=1996 (Car)
The cars we are interested in should have been inspected for
the period 1999. Thus we need only the rows that cover that
period. We use selection to retrieve them:
In1999 = σ Period=1999 (Inspection)

Now we have the cars and inspections we are interested in. We
have to connect the rows, so we use the join operation. They
should be joined on the common registration number. Because
the registration number is the only common column, we may use
natural join:
CI = C1996 * In1999
   = σ ModelYear=1996 (Car) * σ Period=1999 (Inspection)

To find out whether faults were found in the inspections, we
need to connect the problem rows with the inspections. We have
already connected the inspection rows to the cars and may now
connect that result with the Problem table. The join should be
based on the common registration number and inspection date.
These are the only common columns in the tables, so we may use
natural join:
CIP = CI * Problem

All the connections have been made. We have cars connected to
inspections and to the problems found in those inspections.
Cars that have not been inspected for year 1999, and cars that
have been inspected but had no problems, are not included in
the intermediate result CIP. We want only the car data as the
result, so we use projection to extract it:
FaultyCar = π RegNo,Make,ModelYear,Color (CIP)
The expression without intermediate results is:
FaultyCar = π RegNo,Make,ModelYear,Color (
(σ ModelYear=1996 (Car) * σ Period=1999 (Inspection)) * Problem)
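The chain of selections and natural joins in Example 1 can be traced on sample data. The sketch below is only illustrative: rows are dictionaries, the helpers natural_join and project are hypothetical, and every row is invented. It assumes Car, Inspection and Problem all name the registration attribute RegNo.

```python
def natural_join(left, right):
    """Join rows that agree on every attribute name the two relations share."""
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in l.keys() & r.keys())]


def project(rel, attrs):
    """Keep only attrs and drop duplicate rows."""
    out = []
    for row in rel:
        picked = {a: row[a] for a in attrs}
        if picked not in out:
            out.append(picked)
    return out


# Hypothetical sample rows.
car = [{"RegNo": "ABC-1", "Make": "Saab", "ModelYear": 1996, "Color": "red"},
       {"RegNo": "XYZ-2", "Make": "Volvo", "ModelYear": 1996, "Color": "blue"},
       {"RegNo": "KLM-3", "Make": "Fiat", "ModelYear": 1998, "Color": "red"}]
inspection = [
    {"RegNo": "ABC-1", "DateInspected": "1999-03-01", "Period": 1999, "Evaluation": "failed"},
    {"RegNo": "XYZ-2", "DateInspected": "1999-04-10", "Period": 1999, "Evaluation": "passed"},
    {"RegNo": "KLM-3", "DateInspected": "1999-05-05", "Period": 1999, "Evaluation": "failed"}]
problem = [{"RegNo": "ABC-1", "DateInspected": "1999-03-01", "ProblemCode": "BRAKES"},
           {"RegNo": "ABC-1", "DateInspected": "1999-03-01", "ProblemCode": "LIGHTS"},
           {"RegNo": "KLM-3", "DateInspected": "1999-05-05", "ProblemCode": "RUST"}]

c1996 = [c for c in car if c["ModelYear"] == 1996]
in1999 = [i for i in inspection if i["Period"] == 1999]
ci = natural_join(c1996, in1999)      # joined on RegNo
cip = natural_join(ci, problem)       # joined on RegNo and DateInspected
faulty_car = project(cip, ["RegNo", "Make", "ModelYear", "Color"])
print(faulty_car)
```

The car ABC-1 has two problem rows, so it appears twice in CIP; the final projection removes the duplicate, as relational projection always does.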
Example 2:
Information need: Driver’s name for the model year 1995 or
older cars that have not been inspected for year 2000.
The driver's name is in table Driver. Inspections are described
in table Inspection and cars in table Car. Thus we need those
three tables.

First, we should find the cars that have not been inspected
for year 2000. We cannot solve this by using the table
Inspection alone, because it contains data about the
inspections that have been done, not about nonexistent
inspections. In particular, selecting rows where Period is not
equal to 2000 returns the inspections of all years other than
2000, not the cars that lack a 2000 inspection. This problem
may be solved by first finding the complement, 'cars that have
been inspected for year 2000'. Actually we need only their
registration numbers.
InR2000 = π RegNo (σ Period=2000 (Inspection))

The cars of model year 1995 and older are retrieved using
select on table Car. Again we need only their registration
numbers.
C1995R = π RegNo (σ ModelYear<=1995 (Car))

We may use the difference operation to get the registration
numbers of those cars that have not been inspected.
CIR = C1995R - InR2000

To find the drivers, we need to join the registration numbers
obtained with the Driver table and then project out the names.
We may use natural join.
Names = π Name (CIR * Driver)
The expression without intermediate results is:
Names =
π Name (Driver * (π RegNo (σ ModelYear<=1995 (Car)) -
π RegNo (σ Period=2000 (Inspection))))
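The difference between "rows where Period is not 2000" and "cars with no inspection for 2000" is easy to see on data. The following sketch uses sets of tuples; all rows are hypothetical, and the tuple layouts are noted in the comments.

```python
# Hypothetical sample rows.
car = {("ABC-1", 1994), ("XYZ-2", 1995), ("KLM-3", 1998)}        # (RegNo, ModelYear)
inspection = {("ABC-1", 1999), ("ABC-1", 2000),
              ("XYZ-2", 1999), ("KLM-3", 1999)}                  # (RegNo, Period)
driver = {("ABC-1", "Aino"), ("XYZ-2", "Ben"), ("KLM-3", "Cleo")}  # (RegNo, Name)

# Naive attempt: σ Period<>2000 (Inspection) still contains ABC-1,
# because ABC-1 also has a 1999 inspection row.
naive = {reg for reg, period in inspection if period != 2000}

# InR2000 = π RegNo (σ Period=2000 (Inspection))
inr2000 = {reg for reg, period in inspection if period == 2000}
# C1995R = π RegNo (σ ModelYear<=1995 (Car))
c1995r = {reg for reg, year in car if year <= 1995}
# CIR = C1995R - InR2000
cir = c1995r - inr2000
# Names = π Name (CIR * Driver)
names = {name for reg, name in driver if reg in cir}
print(sorted(naive), sorted(cir), sorted(names))
```

Only the set difference excludes ABC-1 correctly; the naive selection cannot, since it inspects one row at a time and never sees that a 2000 row exists elsewhere.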
Example 3
Information need: What makes have provided similarly colored
cars as models 1999 and 2000?
The color and make of a car are stored in table Car. To find
out that two cars have the same color, we must construct pairs
of cars sharing the color. The restriction to model years is
done with selection as in the two previous examples. The two
instances of the Car relation are renamed as C1 and C2.
Makes = π Make (C2 := σ ModelYear=2000 (Car)
⋈ C2.Color=C1.Color
C1 := σ ModelYear=1999 (Car))
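Example 3 is a self-join: the same Car relation is used twice under two names and the copies are matched on color. The sketch below is one possible reading, not a definitive one. It assumes C1 holds the 1999 models and C2 the 2000 models, and, since Make occurs on both sides of each pair, it collects the makes from both sides. All rows are hypothetical.

```python
# Hypothetical sample rows: (RegNo, Make, ModelYear, Color).
car = [("R1", "Saab", 1999, "red"),
       ("R2", "Volvo", 2000, "red"),
       ("R3", "Fiat", 1999, "blue"),
       ("R4", "Opel", 2000, "green")]

# Two renamed copies of Car, each restricted by selection (assumed years).
c1 = [row for row in car if row[2] == 1999]
c2 = [row for row in car if row[2] == 2000]

# Theta-join on C2.Color = C1.Color builds the pairs of same-colored cars.
pairs = [(a, b) for a in c1 for b in c2 if a[3] == b[3]]

# Assumption: report the makes from both sides of every matching pair.
makes = {a[1] for a, _ in pairs} | {b[1] for _, b in pairs}
print(sorted(makes))
```

Renaming is what makes the condition C2.Color=C1.Color expressible at all; without distinct names the two Color attributes could not be told apart.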
Example:

Relational algebra query: (Loan × Borrower)

SQL: select * from Loan, Borrower;

Write a relational algebra query and SQL for the relations Loan and
Borrower above that selects all attributes after the cross product of
both relations, where Branch_name is SBI City.

Relational algebra query: σ Branch_name="SBI City" (Loan × Borrower)

SQL: select * from Loan, Borrower where Branch_name='SBI City';

Write a relational algebra query and SQL for the relations Loan and
Borrower above that selects Branch_name, Cust_name, and Amount after the
cross product of both relations, where Branch_name is SBI City.
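One way to try these cross-product queries is with an in-memory SQLite database. This is a sketch under stated assumptions: the column types and sample rows are invented, and the customer name column is called Cust_name as in the Borrower schema given earlier. The last query corresponds to π Branch_name, Cust_name, Amount (σ Branch_name="SBI City" (Loan × Borrower)).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical column types and rows; only the attribute names come from the text.
conn.executescript("""
CREATE TABLE Loan (Branch_name TEXT, Loan_number TEXT, Amount REAL);
CREATE TABLE Borrower (Cust_name TEXT, Loan_number TEXT);
INSERT INTO Loan VALUES ('SBI City', 'L-1', 1000), ('SBI Town', 'L-2', 2000);
INSERT INTO Borrower VALUES ('Asha', 'L-1'), ('Ravi', 'L-2');
""")

# Loan × Borrower: every loan row paired with every borrower row.
cross = conn.execute("SELECT * FROM Loan, Borrower").fetchall()

# Selection on the cross product, then projection onto three attributes.
rows = conn.execute(
    "SELECT Branch_name, Cust_name, Amount "
    "FROM Loan, Borrower "
    "WHERE Branch_name = 'SBI City' "
    "ORDER BY Cust_name"
).fetchall()
print(len(cross), rows)
conn.close()
```

Because this is a plain cross product with no condition linking the two Loan_number columns, the SBI City loan is paired with every borrower, including Ravi, who did not take it. A join on Loan_number would be needed to list only the actual borrowers.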
