rrelational 2
rrelational 2
relation are also kept in the result relation T(X, Y, Z). It is therefore the same as a
FULL OUTER JOIN on the common attributes.
Two tuples t1 in R and t2 in S are said to match if t1[X]=t2[X]. These will be com-
bined (unioned) into a single tuple in t. Tuples in either relation that have no
matching tuple in the other relation are padded with NULL values. For example, an
OUTER UNION can be applied to two relations whose schemas are STUDENT(Name,
Ssn, Department, Advisor) and INSTRUCTOR(Name, Ssn, Department, Rank). Tuples
from the two relations are matched based on having the same combination of values
of the shared attributes—Name, Ssn, Department. The resulting relation,
STUDENT_OR_INSTRUCTOR, will have the following attributes:
All the tuples from both relations are included in the result, but tuples with the same
(Name, Ssn, Department) combination will appear only once in the result. Tuples
appearing only in STUDENT will have a NULL for the Rank attribute, whereas tuples
appearing only in INSTRUCTOR will have a NULL for the Advisor attribute. A tuple
that exists in both relations, which represent a student who is also an instructor, will
have values for all its attributes.11
Notice that the same person may still appear twice in the result. For example, we
could have a graduate student in the Mathematics department who is an instructor
in the Computer Science department. Although the two tuples representing that
person in STUDENT and INSTRUCTOR will have the same (Name, Ssn) values, they
will not agree on the Department value, and so will not be matched. This is because
Department has two different meanings in STUDENT (the department where the per-
son studies) and INSTRUCTOR (the department where the person is employed as an
instructor). If we wanted to apply the OUTER UNION based on the same (Name, Ssn)
combination only, we should rename the Department attribute in each table to reflect
that they have different meanings and designate them as not being part of the
union-compatible attributes. For example, we could rename the attributes as
MajorDept in STUDENT and WorkDept in INSTRUCTOR.
11Note that OUTER UNION is equivalent to a FULL OUTER JOIN if the join attributes are all the com-
mon attributes of the two relations.
172 Chapter 6 The Relational Algebra and Relational Calculus
RESEARCH_DEPT ← σDname=‘Research’(DEPARTMENT)
RESEARCH_EMPS ← (RESEARCH_DEPT Dnumber=DnoEMPLOYEE)
RESULT ← πFname, Lname, Address(RESEARCH_EMPS)
This query could be specified in other ways; for example, the order of the JOIN and
SELECT operations could be reversed, or the JOIN could be replaced by a NATURAL
JOIN after renaming one of the join attributes to match the other join attribute
name.
Query 2. For every project located in ‘Stafford’, list the project number, the
controlling department number, and the department manager’s last name,
address, and birth date.
STAFFORD_PROJS ← σPlocation=‘Stafford’(PROJECT)
CONTR_DEPTS ← (STAFFORD_PROJS Dnum=DnumberDEPARTMENT)
PROJ_DEPT_MGRS ← (CONTR_DEPTS Mgr_ssn=SsnEMPLOYEE)
RESULT ← πPnumber, Dnum, Lname, Address, Bdate(PROJ_DEPT_MGRS)
In this example, we first select the projects located in Stafford, then join them with
their controlling departments, and then join the result with the department man-
agers. Finally, we apply a project operation on the desired attributes.
Query 3. Find the names of employees who work on all the projects controlled
by department number 5.
DEPT5_PROJS ← ρ(Pno)(πPnumber(σDnum=5(PROJECT)))
EMP_PROJ ← ρ(Ssn, Pno)(πEssn, Pno(WORKS_ON))
RESULT_EMP_SSNS ← EMP_PROJ ÷ DEPT5_PROJS
RESULT ← πLname, Fname(RESULT_EMP_SSNS * EMPLOYEE)
In this query, we first create a table DEPT5_PROJS that contains the project numbers
of all projects controlled by department 5. Then we create a table EMP_PROJ that
holds (Ssn, Pno) tuples, and apply the division operation. Notice that we renamed
the attributes so that they will be correctly used in the division operation. Finally, we
join the result of the division, which holds only Ssn values, with the EMPLOYEE
table to retrieve the desired attributes from EMPLOYEE.
Query 4. Make a list of project numbers for projects that involve an employee
whose last name is ‘Smith’, either as a worker or as a manager of the department
that controls the project.
SMITHS(Essn) ← πSsn (σLname=‘Smith’(EMPLOYEE))
SMITH_WORKER_PROJS ← πPno(WORKS_ON * SMITHS)
MGRS ← πLname, Dnumber(EMPLOYEE Ssn=Mgr_ssnDEPARTMENT)
SMITH_MANAGED_DEPTS(Dnum) ← πDnumber (σLname=‘Smith’(MGRS))
SMITH_MGR_PROJS(Pno) ← πPnumber(SMITH_MANAGED_DEPTS * PROJECT)
RESULT ← (SMITH_WORKER_PROJS ∪ SMITH_MGR_PROJS)
6.5 Examples of Queries in Relational Algebra 173
In this query, we retrieved the project numbers for projects that involve an
employee named Smith as a worker in SMITH_WORKER_PROJS. Then we retrieved
the project numbers for projects that involve an employee named Smith as manager
of the department that controls the project in SMITH_MGR_PROJS. Finally, we
applied the UNION operation on SMITH_WORKER_PROJS and
SMITH_MGR_PROJS. As a single in-line expression, this query becomes:
πPno (WORKS_ON Essn=Ssn(πSsn (σLname=‘Smith’(EMPLOYEE))) ∪ πPno
((πDnumber (σLname=‘Smith’(πLname, Dnumber(EMPLOYEE)))
Ssn=Mgr_ssnDEPARTMENT)) Dnumber=DnumPROJECT)
Query 5. List the names of all employees with two or more dependents.
Strictly speaking, this query cannot be done in the basic (original) relational
algebra. We have to use the AGGREGATE FUNCTION operation with the COUNT
aggregate function. We assume that dependents of the same employee have
distinct Dependent_name values.
T1(Ssn, No_of_dependents)← Essn ℑ COUNT Dependent_name(DEPENDENT)
T2 ← σNo_of_dependents>2(T1)
RESULT ← πLname, Fname(T2 * EMPLOYEE)
We first retrieve a relation with all employee Ssns in ALL_EMPS. Then we create a
table with the Ssns of employees who have at least one dependent in
EMPS_WITH_DEPS. Then we apply the SET DIFFERENCE operation to retrieve
employees Ssns with no dependents in EMPS_WITHOUT_DEPS, and finally join this
with EMPLOYEE to retrieve the desired attributes. As a single in-line expression, this
query becomes:
πLname, Fname((πSsn(EMPLOYEE) – ρSsn(πEssn(DEPENDENT))) * EMPLOYEE)
Query 7. List the names of managers who have at least one dependent.
MGRS(Ssn) ← πMgr_ssn(DEPARTMENT)
EMPS_WITH_DEPS(Ssn) ← πEssn(DEPENDENT)
MGRS_WITH_DEPS ← (MGRS ∩ EMPS_WITH_DEPS)
RESULT ← πLname, Fname(MGRS_WITH_DEPS * EMPLOYEE)
In this query, we retrieve the Ssns of managers in MGRS, and the Ssns of employees
with at least one dependent in EMPS_WITH_DEPS, then we apply the SET
INTERSECTION operation to get the Ssns of managers who have at least one
dependent.
174 Chapter 6 The Relational Algebra and Relational Calculus
As we mentioned earlier, the same query can be specified in many different ways in
relational algebra. In particular, the operations can often be applied in various
orders. In addition, some operations can be used to replace others; for example, the
INTERSECTION operation in Q7 can be replaced by a NATURAL JOIN. As an exercise,
try to do each of these sample queries using different operations. 12 We showed how
to write queries as single relational algebra expressions for queries Q1, Q4, and Q6.
Try to write the remaining queries as single expressions. In Chapters 4 and 5 and in
Sections 6.6 and 6.7, we show how these queries are written in other relational
languages.
12When queries are optimized (see Chapter 19), the system will choose a particular sequence of opera-
tions that corresponds to an execution strategy that can be executed efficiently.