
8 part B Relational Database Design

The document discusses the concept of database normalization, focusing on various normal forms including 4NF, BCNF, and 5NF, emphasizing the importance of lossless decompositions and dependency preservation. It provides examples of multivalued dependencies and the issues that arise when decomposing relations, highlighting the need for careful design to avoid information loss. Additionally, it explains how to achieve lossless decompositions and the implications of functional dependencies in database design.

Uploaded by

f20230371

4NF (no multivalued dependencies)

author →→ subject and subject →→ book publisher


List the authors working with TMH (Tata McGraw Hill).
Problem: Silberschatz is displayed twice.
List the subjects published by TMH.
Problem: Database systems is displayed three times.
Display the publisher of Silberschatz.
Problem: TMH is displayed twice.

Subject             Author         Book publisher
Database systems    Korth          Tata McGraw Hill
Database systems    Sudarshan      Tata McGraw Hill
Database systems    Silberschatz   Tata McGraw Hill
Database systems    Elmasri        Pearson Education
Database systems    Navathe        Pearson Education
Operating systems   Silberschatz   Tata McGraw Hill


4NF (no multivalued dependencies)

Subject             Author
Database systems    Korth
Database systems    Sudarshan
Database systems    Silberschatz
Database systems    Elmasri
Database systems    Navathe
Operating systems   Silberschatz


4NF (no multivalued dependencies)

Subject             Book publisher
Database systems    Tata McGraw Hill
Database systems    Pearson Education
Operating systems   Tata McGraw Hill


4NF (no multivalued dependencies)

Author         Book publisher
Korth          Tata McGraw Hill
Sudarshan      Tata McGraw Hill
Silberschatz   Tata McGraw Hill
Elmasri        Pearson Education
Navathe        Pearson Education


Definition of Decomposition
Let R be a relation schema.
A set of relation schemas {R1, R2, …, Rn} is a
decomposition of R if
• R = R1 ∪ R2 ∪ … ∪ Rn
• each Ri is a subset of R (for i = 1, 2, …, n)
Example of Decomposition
For relation R(x, y, z), one possible pair of subsets is
R1(x, z) and R2(y, z).

The union of R1 and R2 gives back R:

R = R1 ∪ R2
Goal of Decomposition
• Eliminate redundancy by decomposing a
relation into several relations in a higher
normal form.
• It is important to check that a decomposition
does not lead to bad design
Decomposition
• Consider our original “bad” attribute set
Stuff(sid, name, serno, subj, cid, exp-grade)

• We could decompose it into

Student(sid, name)
Course(serno, cid)
Subject(cid, subj)

• But this decomposition loses information about
the relationship between students and courses.
Why?
Lossless Join Decomposition
R1, …, Rk is a lossless join decomposition of R w.r.t. an FD set F if
for every instance r of R that satisfies F,
π_R1(r) ⋈ ... ⋈ π_Rk(r) = r
Consider:
sid   name    serno    subj   cid   exp-grade
1     Sam     570103   AI     570   B
23    Nitin   550103   DB     550   A

What if we decompose on
(sid, name) and (serno, subj, cid, exp-grade)?
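The question above can be explored with a small sketch. It assumes relations are held as lists of Python dicts, and the helper names project and natural_join are illustrative, not taken from the slides. Because the two schemas share no attribute, the natural join degenerates into a cross product and returns more tuples than the original instance.

```python
from itertools import product

def project(rows, attrs):
    """Project a list of dict rows onto attrs, removing duplicate tuples."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(seen)]

def natural_join(left, right):
    """Merge every pair of rows that agree on all attributes they share."""
    joined = []
    for l, r in product(left, right):
        shared = set(l) & set(r)
        if all(l[a] == r[a] for a in shared):
            joined.append({**l, **r})
    return joined

ATTRS = ["sid", "name", "serno", "subj", "cid", "exp_grade"]
stuff = [
    dict(zip(ATTRS, (1, "Sam", 570103, "AI", 570, "B"))),
    dict(zip(ATTRS, (23, "Nitin", 550103, "DB", 550, "A"))),
]

# Decompose on (sid, name) and (serno, subj, cid, exp-grade), then rejoin.
students = project(stuff, ["sid", "name"])
courses = project(stuff, ["serno", "subj", "cid", "exp_grade"])
rejoined = natural_join(students, courses)

print(len(stuff), "original tuples,", len(rejoined), "tuples after rejoining")
# prints: 2 original tuples, 4 tuples after rejoining
```

The two extra tuples pair Sam with Nitin's course and vice versa, so the rejoined relation no longer says who took which course.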

Example of Lossy Decomposition
Problem with Decomposition
• Given instances of the decomposed relations,
we may not be able to reconstruct the
corresponding instance of the original relation
– information loss
Lossy decomposition
• In the previous example, rejoining the decomposed relations
produces additional tuples along with the original tuples
• Although there are more tuples, this leads to
less information, because the genuine tuples can no longer be told apart
from the spurious ones
• Because of this loss of information, the decomposition
in the previous example is called a lossy
decomposition or lossy-join decomposition
Lossless Decomposition
A decomposition {R1, R2,…, Rn} of a relation R
is called a lossless decomposition for R if the
natural join of R1, R2,…, Rn produces exactly
the relation R.
Lossless Decomposition
A decomposition is lossless if we can recover:

R(A, B, C)
Decompose
R1(A, B) R2(A, C)
Recover
R’(A, B, C)

Thus, R’ = R
Lossless Decomposition Property
R : relation
F : set of functional dependencies on R
X, Y : decomposition of R
The decomposition is lossless if:
– X ∩ Y → X, that is: all attributes common to both X and Y
functionally determine ALL the attributes in X, OR
– X ∩ Y → Y, that is: all attributes common to both X and Y
functionally determine ALL the attributes in Y
Testing for Lossless Join
R1, R2 is a lossless join decomposition of R with respect to F
iff at least one of the following dependencies is in F+
(R1 ∩ R2) → R1
(R1 ∩ R2) → R2
So for the FD set:
sid → name
serno → cid, exp-grade
cid → subj

Is (sid, name) and (sid, serno, subj, cid, exp-grade) a lossless
decomposition?
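One way to answer this mechanically is to compute the attribute closure of R1 ∩ R2 and check whether it covers R1 or R2. The sketch below assumes FDs are written as (left side, right side) pairs of Python sets; the function names closure and is_lossless_binary are illustrative choices, not part of the slides.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD when its left side is already determined.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """True iff (r1 ∩ r2) → r1 or (r1 ∩ r2) → r2 follows from fds."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure

FDS = [
    ({"sid"}, {"name"}),
    ({"serno"}, {"cid", "exp_grade"}),
    ({"cid"}, {"subj"}),
]

# The decomposition asked about above: the common attribute is sid.
print(is_lossless_binary({"sid", "name"},
                         {"sid", "serno", "subj", "cid", "exp_grade"}, FDS))  # True

# The earlier decomposition with no common attribute at all.
print(is_lossless_binary({"sid", "name"},
                         {"serno", "subj", "cid", "exp_grade"}, FDS))  # False
```

Here sid → name gives (R1 ∩ R2) → R1, so the first decomposition is lossless. The second has an empty intersection, which determines nothing.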

Lossless Decomposition Property
• In other words, if X ∩ Y forms a superkey of either
X or Y, the decomposition of R is a lossless
decomposition
Example
R(A1, A2, A3, A4, A5)
F = { A1 → A3 A5,
      A5 → A1 A4,
      A3 A4 → A2 }
decomposed into
R1 (A1, A2, A3, A5)
R2 (A1, A3, A4)
R3 (A4, A5)
Example (con’t)
Initial table: row Ri holds a(j) in each column Aj that belongs to Ri,
and b(i,j) in every other column.

     A1       A2       A3       A4       A5
R1   a(1)     a(2)     a(3)     b(1,4)   a(5)
R2   a(1)     b(2,2)   a(3)     a(4)     b(2,5)
R3   b(3,1)   b(3,2)   b(3,3)   a(4)     a(5)
Example (con’t)
By FD1: A1 → A3 A5
Rows R1 and R2 agree on A1, so b(2,5) becomes a(5) and
we have a new result table

     A1       A2       A3       A4       A5
R1   a(1)     a(2)     a(3)     b(1,4)   a(5)
R2   a(1)     b(2,2)   a(3)     a(4)     a(5)
R3   b(3,1)   b(3,2)   b(3,3)   a(4)     a(5)
Example (con’t)
By FD2: A5 → A1 A4
All three rows now agree on A5, so b(3,1) becomes a(1), b(1,4) becomes a(4), and
we have a new result table

     A1       A2       A3       A4       A5
R1   a(1)     a(2)     a(3)     a(4)     a(5)
R2   a(1)     b(2,2)   a(3)     a(4)     a(5)
R3   a(1)     b(3,2)   b(3,3)   a(4)     a(5)

Row R1 now contains only a symbols, so the decomposition is lossless.
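The table-filling procedure walked through above (often called the chase) can be sketched in a few lines. This is a minimal illustration under stated assumptions: symbols are encoded as tuples ("a", j) and ("b", i, j), FDs are (lhs, rhs) pairs of sets, and the function name chase_is_lossless is hypothetical.

```python
def chase_is_lossless(attributes, decomposition, fds):
    """Run the table test: True if some row ends up with only 'a' symbols."""
    # Row i gets ("a", j) in the columns of schema i and ("b", i, j) elsewhere.
    rows = [
        {attr: ("a", j) if attr in schema else ("b", i, j)
         for j, attr in enumerate(attributes, start=1)}
        for i, schema in enumerate(decomposition, start=1)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for first in rows:
                for second in rows:
                    if first is second or any(first[a] != second[a] for a in lhs):
                        continue
                    # The rows agree on lhs, so make them agree on rhs too.
                    for attr in rhs:
                        # Tuples sort with "a" symbols first, so they are kept.
                        keep, drop = sorted((first[attr], second[attr]))
                        if keep != drop:
                            for row in rows:
                                if row[attr] == drop:
                                    row[attr] = keep
                            changed = True
    return any(all(sym[0] == "a" for sym in row.values()) for row in rows)

ATTRIBUTES = ["A1", "A2", "A3", "A4", "A5"]
FDS = [
    ({"A1"}, {"A3", "A5"}),
    ({"A5"}, {"A1", "A4"}),
    ({"A3", "A4"}, {"A2"}),
]
DECOMPOSITION = [{"A1", "A2", "A3", "A5"}, {"A1", "A3", "A4"}, {"A4", "A5"}]

print(chase_is_lossless(ATTRIBUTES, DECOMPOSITION, FDS))  # True

# With no FDs at all, R(A, B, C) split into (A, B) and (B, C) is lossy.
print(chase_is_lossless(["A", "B", "C"], [{"A", "B"}, {"B", "C"}], []))  # False
```

Each replacement removes one distinct symbol from the table, so the loop always terminates.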
Conclusions
• Decompositions should always be lossless
• A lossless decomposition ensures that the
information in the original relation can be
accurately reconstructed based on the
information represented in the decomposed
relations.
Dependency Preservation
Ensures we can “easily” check whether an FD X → Y
is violated during an update to a database:

• The projection of an FD set F onto a set of attributes Z,
written F_Z, is
{X → Y | X → Y ∈ F+, X ∪ Y ⊆ Z}
i.e., it is those FDs local to Z’s attributes
• A decomposition R1, …, Rk is dependency preserving if
F+ = (F_R1 ∪ ... ∪ F_Rk)+

The decomposition hasn’t “lost” any essential FDs, so we
can check without doing a join
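Computing F+ and its projections directly is expensive, so a common alternative is to test each FD in F separately: start from its left side and repeatedly add whatever can be derived while staying inside a single Ri. The sketch below uses that alternative test rather than the definition itself. It reuses the Stuff FD set and the two decompositions of Stuff from earlier slides; the function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_preserved(fd, decomposition, fds):
    """Check one FD using only inferences that stay local to some schema."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in decomposition:
            gained = closure(result & schema, fds) & schema
            if not gained <= result:
                result |= gained
                changed = True
    return rhs <= result

def lost_dependencies(decomposition, fds):
    """Return the FDs of fds that the decomposition fails to preserve."""
    return [fd for fd in fds if not is_preserved(fd, decomposition, fds)]

FDS = [
    ({"sid"}, {"name"}),
    ({"serno"}, {"cid", "exp_grade"}),
    ({"cid"}, {"subj"}),
]
two_way = [{"sid", "name"}, {"sid", "serno", "subj", "cid", "exp_grade"}]
three_way = [{"sid", "name"}, {"serno", "cid"}, {"cid", "subj"}]

print(len(lost_dependencies(two_way, FDS)))    # 0
print(len(lost_dependencies(three_way, FDS)))  # 1
```

The three-way split (Student, Course, Subject) drops exp-grade entirely, so serno → cid, exp-grade cannot be checked in any single relation.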
Example of Lossless and
Dependency-Preserving Decompositions
Given relation scheme
R(name, street, city, st, pin, item, price)
and FD set
name → street, city
street, city → st
street, city → pin
name, item → price
Consider the decomposition
R1(name, street, city, st, pin) and R2(name, item, price)
• Is it lossless?
• Is it dependency preserving?
What if we replaced the first FD by name, street → city?

Example of Decompositions in BCNF but
not Dependency-Preserving
Given relation scheme
R(sailorid, boatid, date)
sailorid, boatid → date
(a sailor can reserve a boat for at most one day)
date → boatid
(on a given day at most one boat can be reserved)

R is not in BCNF, since date → boatid holds but date is not a key.
Decomposing R into

(sailorid, date) and (date, boatid)

gives relations in BCNF, but we cannot use this decomposition if
dependencies must be preserved, since it does not preserve the dependency
sailorid, boatid → date
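The BCNF check for this example can be sketched as follows: a schema violates BCNF if some nontrivial FD has a left side whose closure does not cover the whole schema. The sketch assumes FDs as (lhs, rhs) pairs of sets and tests only the FDs passed in; bcnf_violations is an illustrative name.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Nontrivial FDs in fds whose left side is not a superkey of schema."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not schema <= closure(lhs, fds)
    ]

R = {"sailorid", "boatid", "date"}
FDS = [
    ({"sailorid", "boatid"}, {"date"}),
    ({"date"}, {"boatid"}),
]

violations = bcnf_violations(R, FDS)
print(len(violations))                 # 1
print(violations[0][0] == {"date"})    # True

# In the fragment (date, boatid) the only local FD has a key on its left side.
print(bcnf_violations({"date", "boatid"}, [({"date"}, {"boatid"})]))  # []
```

The other fragment, (sailorid, date), has only two attributes, and any two-attribute relation is in BCNF.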
Normal Forms Compared
• BCNF is preferable, but sometimes in conflict with
the goal of dependency preservation
• It’s strictly stronger than 3NF
• BCNF: lossless join decomposition
• 3NF: lossless join, dependency-preserving decomposition
Summary
• We can always decompose into 3NF and get:
  – Lossless join
  – Dependency preservation
• But with BCNF we are only guaranteed lossless joins
• BCNF is stronger than 3NF: every BCNF schema is
also in 3NF
• The BCNF algorithm is nondeterministic, so there is
not a unique decomposition for a given schema R
Fifth normal form (5NF):
Projection-Join Normal form
5NF is related to join dependency, which is the
term used to indicate the property of a relation
scheme that cannot be decomposed losslessly
into two simpler relation schemes, but can be
decomposed losslessly into three or more
simpler relation schemes.
5NF (Projection-join normal form)
q1. Display the list of employees having project P2.
Problem: E2 is displayed twice.
q2. Display the projects under the manager M1.
Problem: P1 is displayed twice.
q3. Display the managers of project P2.
Problem: M2 is displayed twice.
q4. Display the employees working under M1.
Problem: E3 is displayed twice.

Employee no.   Project no.   Manager no.
E1             P1            M1
E1             P2            M2
E2             P2            M2
E2             P2            M4
E3             P1            M1
E3             P3            M1
5NF (Projection-join normal form)

Employee no.   Project no.
E1             P1
E1             P2
E2             P2
E3             P1
E3             P3
5NF (Projection-join normal form)

Project no.   Manager no.
P1            M1
P2            M2
P2            M4
P3            M1
5NF (Projection-join normal form)

Employee no.   Manager no.
E1             M1
E1             M2
E2             M2
E2             M4
E3             M1
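The three projections above can be rejoined to see the join dependency at work. In this sketch each tuple is stored as a frozenset of (attribute, value) pairs; that encoding and the helper names relation, project and natural_join are assumptions made for illustration.

```python
def relation(attrs, tuples):
    """Build a relation as a set of frozensets of (attribute, value) pairs."""
    return {frozenset(zip(attrs, t)) for t in tuples}

def project(rel, attrs):
    """Keep only the given attributes; the set removes duplicate tuples."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rel}

def natural_join(left, right):
    """Merge every pair of tuples that agree on all shared attributes."""
    out = set()
    for l in left:
        for r in right:
            merged = dict(l)
            if all(merged.get(a, v) == v for a, v in r):
                out.add(frozenset({**merged, **dict(r)}.items()))
    return out

assignment = relation(("emp", "proj", "mgr"), [
    ("E1", "P1", "M1"), ("E1", "P2", "M2"), ("E2", "P2", "M2"),
    ("E2", "P2", "M4"), ("E3", "P1", "M1"), ("E3", "P3", "M1"),
])

emp_proj = project(assignment, {"emp", "proj"})
proj_mgr = project(assignment, {"proj", "mgr"})
emp_mgr = project(assignment, {"emp", "mgr"})

# Every two-way join contains spurious tuples (the original has 6).
print(len(natural_join(emp_proj, proj_mgr)))  # 7
print(len(natural_join(emp_proj, emp_mgr)))   # 8
print(len(natural_join(proj_mgr, emp_mgr)))   # 7

# Joining in the third projection removes them again.
three_way = natural_join(natural_join(emp_proj, proj_mgr), emp_mgr)
print(three_way == assignment)  # True
```

For instance, joining the employee-project and project-manager tables alone produces (E1, P2, M4), which is not in the original table; the employee-manager table filters it out.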
These dependencies are the translation of the enterprise’s need that the employees
involved in a given project must have certain expertise. Because of the expertise of
employees, they want to be involved in a given set of projects whose requirements
match their interests.
The expertise of an employee, not needed for
any project to which he or she is assigned, is
not shown in this relation.
• Suppose Smith has expertise in AI as well. As there is no
project in the tables that needs AI, such info was never stored.

• Similarly, suppose there is an employee Brent with expertise in UI and
AI, but there is no project yet in the table that needs this
expertise. Thus the expertise details of Brent are never stored.

• Now if a new project ‘Work Station’ needs to be added which
requires expertise in AI, we cannot find such information from
these two tables.

• Thus we need another table holding the relationship between
employees and expertise.
Now consider the relation scheme NEW_PROJECT_ASSIGNMENT. Perhaps after some
modifications in the enterprise involved, there has been a turnover in employees and
the expertise of new employees requires some changes in the assignment of projects.
Figure 7.8 gives a sample table for a relation defined on the scheme
NEW_PROJECT_ASSIGNMENT. As the figure indicates, we are assigning more than one
employee to a given project. Each employee is assigned a specific role in this project,
requiring knowledge that lies within her or his field of expertise. Thus, project Work
Station, which requires expertise in User Interface, Artificial Intelligence, VLSI
Technology, and Operating Systems, can be carried out by Brent, Mann, and Smith
combined. This flexibility was not exhibited in the data of Figure 7.6.

The relation does not show any functional or multivalued
dependencies; it is an all-key relation and therefore
in fourth normal form.
• Unlike the relation PROJECT-ASSIGNMENT, the relation
NEW_PROJECT_ASSIGNMENT cannot be decomposed
losslessly into two relations.

• However, it can be decomposed losslessly into three relations
(thus 5NF). This decomposition is shown in Figure 7.9.
• Two of these relations, when joined, create a relation that
contains extraneous tuples; thus the corresponding
decomposition is not lossless. These superfluous tuples are
removed when the resulting relation is joined with the third
relation.
• ((a ⋈ b) ⋈ c)
OR
• ((b ⋈ c) ⋈ a)
retrieves all the information in the original table.
Because natural join is commutative and associative, ((a ⋈ c) ⋈ b) gives the
same final result; it is only the intermediate two-way join, such as (a ⋈ c),
that contains extra records which are incorrect.

• Note that the MVDs, similar to those exhibited in Figure 7.6,
are embedded in this example.
