Chapter 5 - B_DBDesign_II

Chapter 5B.

Database
Normalization
COMP3278 Introduction to
Database Management Systems

Department of Computer Science, The University of Hong Kong

Outcome 2. Query Languages

Able to understand and use the languages designed for data access.

Outcome 3. System Design

Able to understand the design of an efficient and reliable database system.

Outcome 4. Application Development

Able to implement a practical application on a real database.
Content
Decomposition
Lossless-join decomposition
Dependency preserving decomposition

Normal form
Boyce-Codd Normal Form (BCNF)

Motivating example
Let’s consider the following specifications
Employees have eid (key), name, parkingLot.
Departments have did (key), dname, budget.
An employee works in exactly one department, since some date.
Employees who work in the same department must park at
the same parkingLot.

[ER diagram: entity Employees (eid, name, parkingLot) is linked through the relationship Works_in (since) to entity Departments (did, dname, budget).]

Motivating example
Reduce to relational tables
Employees( eid, name, parkingLot, did, since)
Foreign key: did references Departments(did)
Departments( did, dname, budget)
Observation: In the Employees table, whenever did is 1, parkingLot must be “A”!
Implication: The constraint “Employees who work in the same department must park at the same parkingLot” is NOT utilized in the design!
There is some redundancy in the Employees table.

Employees
eid  name    parkingLot  did  since
1    Kit     A           1    1/9/2014
2    Ben     B           2    2/4/2010
3    Ernest  B           2    30/5/2011
4    Betty   A           1    22/3/2013
5    David   A           1    4/11/2004
6    Joe     B           2    12/3/2008
7    Mary    B           2    14/7/2009
8    Wandy   A           1    9/8/2008

Departments
did  dname           budget
1    Human Resource  4M
2    Accounting      3.5M

Yes! As parkingLot is functionally dependent on did, we should not put parkingLot in the Employees table.
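The observation about did and parkingLot can be checked mechanically on the sample rows. The following is a minimal sketch, not part of the original slides: the helper name fd_holds and the dictionary layout are assumptions made for illustration. Note that a table instance can only refute a functional dependency; it cannot prove that the dependency holds for all future data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


# (eid, name, parkingLot, did) copied from the Employees sample table
employees = [
    dict(zip(("eid", "name", "parkingLot", "did"), t))
    for t in [(1, "Kit", "A", 1), (2, "Ben", "B", 2), (3, "Ernest", "B", 2),
              (4, "Betty", "A", 1), (5, "David", "A", 1), (6, "Joe", "B", 2),
              (7, "Mary", "B", 2), (8, "Wandy", "A", 1)]
]

print(fd_holds(employees, ["did"], ["parkingLot"]))  # True
print(fd_holds(employees, ["did"], ["name"]))        # False
```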
We are going to learn
Database normalization
The process of organizing the columns and tables of
a relational database to minimize redundancy and
dependency.
To make sure that every relation R is in a “good” form.
If R is not “good”, decompose it into a set of relations {R1,
R2, …, Rn}.
Question: How can we do the decomposition? Are there any guidelines / theories developed to decompose a relation?
Yes! The theories can be explained through functional dependencies.
Normalization goal
We would like to meet the following goals when we
decompose a relation schema R with a set of
functional dependencies F into R1, R2, …, Rn
1. Lossless-join: The decomposition should not result in information loss.
2. Reduce redundancy: The decomposed relations Ri should be in Boyce-Codd Normal Form (BCNF). (There are also other normal forms like 3NF.)
3. Dependency preserving: Avoid the need to join the decomposed relations to check the functional dependencies when new tuples are inserted into the database.
Section 1

Lossless-join
Decomposition

R            F = {B → C}
A  B  C
1  1  3
1  2  2
2  1  3
3  2  2
3  1  3
4  2  2
4  1  3

The functional dependency B → C tells us that for all tuples with the same value in B, there should be at most one corresponding value in C (e.g., if B=1, C=3; if B=2, C=2).

Question: Will decomposing R(A,B,C) into R1(A,B) and R2(A,C) cause information loss?

R1 = π_{A,B}(R)    R2 = π_{A,C}(R)
A  B               A  C
1  1               1  3
1  2               1  2
2  1               2  3
3  2               3  2
3  1               3  3
4  2               4  2
4  1               4  3

Think in this way: Is this decomposition a “lossless-join decomposition”? I.e., is there any information lost if we decompose R in this way?
Illustration 1
Functional dependencies: F = {B → C}
Decompose R into R1 = π_{A,B}(R) and R2 = π_{A,C}(R).

R            R1 = π_{A,B}(R)    R2 = π_{A,C}(R)    R1 ⋈ R2
A  B  C      A  B               A  C               A  B  C
1  1  3      1  1               1  3               1  1  3
1  2  2      1  2               1  2               1  1  2
2  1  3      2  1               2  3               1  2  3
3  2  2      3  2               3  2               1  2  2
3  1  3      3  1               3  3               2  1  3
4  2  2      4  2               4  2               3  2  2
4  1  3      4  1               4  3               3  2  3
                                                   3  1  2
                                                   3  1  3
                                                   4  2  2
                                                   4  2  3
                                                   4  1  2
                                                   4  1  3

To check if the decomposition will cause information loss, let's try to join R1 and R2 and see if we can recover R.
As we see that R1 ⋈ R2 ≠ R (the join has 13 tuples, R has only 7), the decomposition loses information.
This is NOT a lossless-join decomposition. This is a bad decomposition.
Illustration 2
How about decomposing the relation R(A,B,C) into R1(A,B) and R2(B,C)?
Functional dependencies: F = {B → C}

R            R1 = π_{A,B}(R)    R2 = π_{B,C}(R)    R1 ⋈ R2
A  B  C      A  B               B  C               A  B  C
1  1  3      1  1               1  3               1  1  3
1  2  2      1  2               2  2               1  2  2
2  1  3      2  1                                  2  1  3
3  2  2      3  2                                  3  2  2
3  1  3      3  1                                  3  1  3
4  2  2      4  2                                  4  2  2
4  1  3      4  1                                  4  1  3

Well done! Since R1 ⋈ R2 = R, breaking down R into R1 and R2 in this way loses no information.
This decomposition is a lossless-join decomposition.
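The two illustrations can be replayed in a few lines of code. This is a small sketch under stated assumptions: relations are modelled as Python sets of tuples, and the helper names project and natural_join are made up for this example rather than taken from the slides.

```python
def project(schema, rows, attrs):
    """pi_attrs(rows): keep only the listed attributes, dropping duplicates."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(row[i] for i in idx) for row in rows}


def natural_join(schema1, rows1, schema2, rows2):
    """Join two relations on their common attributes."""
    common = [a for a in schema1 if a in schema2]
    extra = [a for a in schema2 if a not in schema1]
    out = set()
    for r1 in rows1:
        for r2 in rows2:
            if all(r1[schema1.index(a)] == r2[schema2.index(a)] for a in common):
                out.add(r1 + tuple(r2[schema2.index(a)] for a in extra))
    return schema1 + extra, out


R_SCHEMA = ["A", "B", "C"]
R = {(1, 1, 3), (1, 2, 2), (2, 1, 3), (3, 2, 2), (3, 1, 3), (4, 2, 2), (4, 1, 3)}

ab = project(R_SCHEMA, R, ["A", "B"])
ac = project(R_SCHEMA, R, ["A", "C"])
bc = project(R_SCHEMA, R, ["B", "C"])

_, joined_on_a = natural_join(["A", "B"], ab, ["A", "C"], ac)  # Illustration 1
_, joined_on_b = natural_join(["A", "B"], ab, ["B", "C"], bc)  # Illustration 2

print(len(joined_on_a), joined_on_a == R)  # 13 False
print(len(joined_on_b), joined_on_b == R)  # 7 True
```

Joining on A produces every tuple of R plus six spurious ones, which is exactly the "information loss": we can no longer tell which tuples were in R.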
Lossless-join decomposition
R(A,B,C), functional dependencies F = {B → C}
What is/are the condition(s) for a decomposition to be lossless-join?

NOT lossless-join decomposition        Lossless-join decomposition
R1 = π_{A,B}(R)  R2 = π_{A,C}(R)       R1 = π_{A,B}(R)  R2 = π_{B,C}(R)
A  B             A  C                  A  B             B  C
1  1             1  3                  1  1             1  3
1  2             1  2                  1  2             2  2
2  1             2  3                  2  1
3  2             3  2                  3  2
3  1             3  3                  3  1
4  2             4  2                  4  2
4  1             4  3                  4  1
Lossless-join decomposition
R(A,B,C), functional dependencies F = {B → C}
NOT lossless-join decomposition: R1 = π_{A,B}(R), R2 = π_{A,C}(R)

R1      R2
A  B    A  C
1  1    1  3
1  2    1  2
2  1    2  3
3  2    3  2
3  1    3  3
4  2    4  2
4  1    4  3

Let's consider the first tuple (1,1,3) in R.
Note that there is only ONE tuple in R1 with A=1, B=1.
Since A → AC is NOT a functional dependency in F+, there can be more than one tuple with A=1 in R2 (e.g., (1,3), (1,2)).
Therefore when we join R1 and R2, more than one tuple will be generated (i.e., (1,1) in R1 combines with (1,3) and (1,2) in R2):

A  B       A  C       A  B  C
1  1   ⋈   1  3   =   1  1  3
           1  2       1  1  2

Observation: The decomposition of R(A,B,C) into R1(A,B) and R2(A,C) is NOT lossless-join because A → AC is NOT in F+, and … (to be explained in the next slide)
Lossless-join decomposition
R(A,B,C), functional dependencies F = {B → C}
NOT lossless-join decomposition: R1 = π_{A,B}(R), R2 = π_{A,C}(R)

Let's consider the first tuple (1,1,3) in R.
Note that there is only ONE tuple in R2 with A=1, C=3.
Since A → AB is NOT a functional dependency in F+, there can be more than one tuple with A=1 in R1 (i.e., (1,1), (1,2)).
Therefore when we join R1 and R2, more than one tuple will be generated (i.e., (1,3) in R2 combines with (1,1) and (1,2) in R1):

A  C       A  B       A  B  C
1  3   ⋈   1  1   =   1  1  3
           1  2       1  2  3

Observation: The decomposition of R(A,B,C) into R1(A,B) and R2(A,C) is NOT lossless-join because A → AC (explained in the previous slide) and A → AB are NOT in F+.
Lossless-join decomposition
R(A,B,C), functional dependencies F = {B → C}
Lossless-join decomposition: R1 = π_{A,B}(R), R2 = π_{B,C}(R)

R1      R2
A  B    B  C
1  1    1  3
1  2    2  2
2  1
3  2
3  1
4  2
4  1

Let's consider the first tuple (1,1,3) in R.
Note that there is only ONE tuple in R1 with A=1, B=1.
Since B → BC is a functional dependency in F+, there is only one tuple with B=1 in R2.
Therefore when we join R1 and R2, there will be ONLY ONE tuple generated, and that must be the corresponding tuple (1,1,3) in R:

A  B       B  C       A  B  C
1  1   ⋈   1  3   =   1  1  3

Observation: The decomposition of R(A,B,C) into R1(A,B) and R2(B,C) is lossless-join because B → BC is in F+.
Testing for lossless-join decomposition
Consider a decomposition of R into R1 and R2.
Schema of R = schema of R1 ∪ schema of R2.
Let schema of R1 ∩ schema of R2 be R1 and R2's common attributes.
A decomposition of R into R1 and R2 is lossless-join if and only if at least one of the following dependencies is in F+:
Schema of R1 ∩ schema of R2 → schema of R1
OR
Schema of R1 ∩ schema of R2 → schema of R2
Example
Question: Given R(A,B,C), F={B→C}, is the following
a lossless join decomposition of R?
R1(A, B) , R2(B, C)
Answer: To see if (R1, R2) is a lossless-join decomposition of R, we do the following:
Find the common attributes of R1 and R2: B
Verify whether any of the FDs below is in F+. If one of them is, the decomposition is lossless-join.
B → R1 (i.e., B → AB?)
B → R2 (i.e., B → BC?)
Since B → BC is in F+ (by the Augmentation rule on B → C), R1 and R2 form a lossless-join decomposition of R.
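The test above can be sketched in code. Instead of enumerating F+, the sketch computes the attribute closure of the common attributes and checks whether it covers R1 or R2, which is equivalent to asking whether the corresponding FD is in F+. The function names and the convention of writing attribute sets as strings of single-letter attributes are assumptions for this illustration.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Binary lossless-join test: (R1 ∩ R2) → R1 or (R1 ∩ R2) → R2 is in F+."""
    common = set(r1) & set(r2)
    reach = closure(common, fds)
    return set(r1) <= reach or set(r2) <= reach


F = [("B", "C")]                     # F = {B → C}
print(is_lossless("AB", "BC", F))    # True
print(is_lossless("AB", "AC", F))    # False
```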
Section 2

Dependency preserving
Decomposition

Slides prepared by - Dr. Chui Chun Kit, http://www.cs.hku.hk/~ckchui/ for students in COMP3278
For other uses, please email : [email protected]
Dependency preserving
When decomposing a relation, we also want to keep
the functional dependencies.
An FD X → Y is preserved in a relation R if R contains all the attributes of X and Y.
If a dependency is lost when R is decomposed into R1 and R2:
When we insert a new record into R1 and R2, we have to compute R1 ⋈ R2 and check whether the new record violates the lost dependency before insertion.
This can be very inefficient because a join is required for every insertion!
Dependency preserving
Consider R(A,B,C,D), F = {A → B, B → CD}
F+ = {A → B, B → CD, A → CD, trivial FDs}
Note that A → CD is in F+ because of the Transitivity axiom.

If R is decomposed to R1(A,B), R2(B,C,D):
F1 = {A → B, trivial FDs}, the projection of F+ on R1
F2 = {B → CD, trivial FDs}, the projection of F+ on R2

R              R1 = π_{A,B}(R)    R2 = π_{B,C,D}(R)
A  B  C  D     A  B               B  C  D
1  1  3  4     1  1               1  3  4
2  1  3  4     2  1               2  2  3
3  2  2  3     3  2
4  1  3  4     4  1

This is a dependency preserving decomposition as:
(F1 ∪ F2)+ = F+
Let us illustrate the implication of dependency preserving in the next slide.
Dependency preserving
Consider R(A,B,C,D), F = {A → B, B → CD}
F+ = {A → B, B → CD, A → CD, trivial FDs}

R              R1 = π_{A,B}(R)    R2 = π_{B,C,D}(R)
A  B  C  D     A  B               B  C  D
1  1  3  4     1  1               1  3  4
2  1  3  4     2  1               2  2  3
3  2  2  3     3  2
4  1  3  4     4  1

Is this a lossless-join decomposition?
Yes! As B → R2 (i.e., B → BCD) holds in F+. That means we can recover R by R1 ⋈ R2.
Why is it dependency preserving? Think about it…
If we insert a new record (A, B, C, D) = (5, 1, 4, 4), it is stored as (A, B) = (5, 1) in R1 and (B, C, D) = (1, 4, 4) in R2.
We need to check if the new record will make the database violate any FDs in F+.
Does such a decomposition allow us to do the validation on R1 and R2 ONLY (without joining R1 and R2)?
Dependency preserving
F+ = {A → B, B → CD, A → CD, trivial FDs}
Inserting tuple (5,1,4,4) violates B → CD.

R (after insertion)    R1 = π_{A,B}(R)    R2 = π_{B,C,D}(R)
A  B  C  D             A  B               B  C  D
1  1  3  4             1  1               1  3  4
2  1  3  4             2  1               2  2  3
3  2  2  3             3  2               1  4  4
4  1  3  4             4  1
5  1  4  4             5  1

The decomposition is dependency preserving as we only need to check:
Does inserting (A, B) = (5, 1) violate any FD in F1 on R1? This involves checking F1 = {A → B}.
Does inserting (B, C, D) = (1, 4, 4) violate any FD in F2 on R2? This involves checking F2 = {B → CD}. (It does: R2 already contains (1, 3, 4).)
We can check F1 on R1 and F2 on R2 only because (F1 ∪ F2)+ = F+.
Although the two validations do not check A → CD directly, A → B is checked in F1 and B → CD is checked in F2, so passing both F1 and F2 implies A → CD.
Dependency preserving
What about decomposing R into R1(A,B), R2(A,C,D)?
F+ = {A → B, B → CD, A → CD, trivial FDs}
F1 = {A → B, trivial FDs}, the projection of F+ on R1
F2 = {A → CD, trivial FDs}, the projection of F+ on R2

R              R1 = π_{A,B}(R)    R2 = π_{A,C,D}(R)
A  B  C  D     A  B               A  C  D
1  1  3  4     1  1               1  3  4
2  1  3  4     2  1               2  3  4
3  2  2  3     3  2               3  2  3
4  1  3  4     4  1               4  3  4

This is NOT a dependency preserving decomposition as:
(F1 ∪ F2)+ ≠ F+
Let us illustrate the implication of NOT dependency preserving in the next slide.
Dependency preserving
What about decomposing R into R1(A,B), R2(A,C,D)?

R              R1 = π_{A,B}(R)    R2 = π_{A,C,D}(R)
A  B  C  D     A  B               A  C  D
1  1  3  4     1  1               1  3  4
2  1  3  4     2  1               2  3  4
3  2  2  3     3  2               3  2  3
4  1  3  4     4  1               4  3  4

Is this a lossless-join decomposition?
Yes! As A → R1 (i.e., A → AB) holds in F+. That means we can recover R by R1 ⋈ R2.
Is it dependency preserving? Think about it…
If we insert a new record (A, B, C, D) = (5, 1, 4, 4), it is stored as (A, B) = (5, 1) in R1 and (A, C, D) = (5, 4, 4) in R2.
We need to check if the new record will make the database violate any FDs in F+.
Does such a decomposition allow us to do the validation on R1 and R2 only (without joining R1 and R2)?
Dependency preserving
F+ = {A → B, B → CD, A → CD, trivial FDs}
Inserting tuple (5,1,4,4) violates B → CD.

R (after insertion)    R1 = π_{A,B}(R)    R2 = π_{A,C,D}(R)
A  B  C  D             A  B               A  C  D
1  1  3  4             1  1               1  3  4
2  1  3  4             2  1               2  3  4
3  2  2  3             3  2               3  2  3
4  1  3  4             4  1               4  3  4
5  1  4  4             5  1               5  4  4

The decomposition is NOT dependency preserving, because we would only check:
Does inserting (A, B) = (5, 1) violate any FD in F1 on R1? This involves checking F1 = {A → B}.
Does inserting (A, C, D) = (5, 4, 4) violate any FD in F2 on R2? This involves checking F2 = {A → CD}.
Although both checks pass, it doesn't mean that we passed all FDs in F! It is because we lost the FD B → CD in the decomposition.
We CANNOT check F1 on R1 and F2 on R2 only, because (F1 ∪ F2)+ ≠ F+.
Decomposing in this way requires joining tables to validate B → CD for EVERY INSERTION!
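The contrast between the two decompositions of R(A,B,C,D) can be replayed on the sample data. This is an illustrative sketch only: the helpers violates and proj are hypothetical names, rows are modelled as dictionaries, and since both decompositions are lossless-join, the full table after insertion stands in for R1 ⋈ R2.

```python
def violates(rows, lhs, rhs):
    """True if two rows agree on lhs but differ on rhs (FD lhs → rhs broken)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return True
    return False


def proj(rows, attrs):
    """Project each row onto the given attributes (duplicates are harmless here)."""
    return [{a: row[a] for a in attrs} for row in rows]


R = [dict(zip("ABCD", t)) for t in
     [(1, 1, 3, 4), (2, 1, 3, 4), (3, 2, 2, 3), (4, 1, 3, 4)]]
after = R + [dict(zip("ABCD", (5, 1, 4, 4)))]   # state after the insertion

# R1(A,B), R2(B,C,D): the local check of F2 = {B → CD} on R2 catches the problem.
caught_locally = violates(proj(after, "BCD"), "B", "CD")

# R1(A,B), R2(A,C,D): both local checks pass...
missed_on_r1 = violates(proj(after, "AB"), "A", "B")
missed_on_r2 = violates(proj(after, "ACD"), "A", "CD")
# ...and B → CD is only detected on the joined relation.
caught_on_join = violates(after, "B", "CD")

print(caught_locally, missed_on_r1, missed_on_r2, caught_on_join)
# True False False True
```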
Dependency preserving
What is the condition for a decomposition to be dependency preserving?

Let F be a set of functional dependencies on R.
Let R1, R2, …, Rn be a decomposition of R.
Let Fi be the set of FDs in F+ that include only attributes in Ri.

A decomposition is dependency preserving if and only if
(F1 ∪ F2 ∪ … ∪ Fn)+ = F+
Example 1
Given R(A, B, C) , F = {A → B , B → C}
Is R1(A, B), R2(B, C) a dependency preserving decomposition?

First we need to find F+, F1 and F2.

F+ = {A → B, B → C, A → C, some trivial FDs}
Note that A → C is in F+ because of the Transitivity axiom.
F1 = {A → B and trivial FDs}
F2 = {B → C and trivial FDs}
Then we check if (F1 ∪ F2)+ = F+ is true.
Since F1 ∪ F2 contains every FD in F, this implies (F1 ∪ F2)+ = F+.

This decomposition is dependency preserving.
Example 2
Given R(A, B, C) , F = {A → B , B → C}
Is R1(A, B), R2(A, C) a dependency preserving decomposition?

First we need to find F+, F1 and F2.

F+ = {A → B, B → C, A → C, some trivial FDs}
Note that A → C is in F+ because of the Transitivity axiom.
F1 = {A → B and trivial FDs}
F2 = {A → C and trivial FDs}
Then we check if (F1 ∪ F2)+ = F+ is true.
Since B → C disappears in R1 and R2 and cannot be derived from A → B and A → C, (F1 ∪ F2)+ ≠ F+.

This decomposition is NOT dependency preserving.
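Both examples can be checked without materializing F+. The sketch below uses a standard textbook shortcut that is not on the slides: for each FD X → Y in F, grow X using only what can be derived inside each Ri, and see whether Y is eventually reached. The function names and the string encoding of attribute sets are assumptions for this illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves_dependencies(decomposition, fds):
    """True if every FD in fds follows from the FDs projected onto the Ri."""
    for lhs, rhs in fds:
        result = set(lhs)
        changed = True
        while changed:
            changed = False
            for ri in decomposition:
                ri = set(ri)
                # What result ∩ Ri determines, restricted to attributes of Ri
                gained = closure(result & ri, fds) & ri
                if not gained <= result:
                    result |= gained
                    changed = True
        if not set(rhs) <= result:
            return False          # this FD needs a join to be checked
    return True


F = [("A", "B"), ("B", "C")]
print(preserves_dependencies(["AB", "BC"], F))  # True  (Example 1)
print(preserves_dependencies(["AB", "AC"], F))  # False (Example 2: B → C is lost)
```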
Section 3

Boyce-Codd
Normal Form

FD and redundancy
Consider the following relation:
Customer(id, name, dptID)
F = { {id} → {name, dptID} }

Customer
id  name   dptID
1   Kit    1
2   David  1
3   Betty  2
4   Helen  2

{id} is a key in Customer, because the attribute closure of {id} (i.e., {id}+ = {id, name, dptID}) covers all attributes of Customer.
Observation: The left-hand side of every non-trivial FD in F is a key of the relation Customer.
This implies that there is no other FD that involves just a subset of the columns in the relation.
This implies that Customer has no redundancy.
FD and redundancy
As another example:
Customer(id, name, dptID, building)
F = { {id} → {name, dptID, building},
      {dptID} → {building} }

Customer
id  name   dptID  building
1   Kit    1      CYC
2   David  1      CYC
3   Betty  2      HW
4   Helen  2      HW

{dptID} → {building} brings redundancy. Why?
Tuples that have the same dptID must have the same building (e.g., dptID=1, building=“CYC”).
But those tuples can have different values in id and name.
For each different id value with the same dptID, building is repeated (redundancy). For example, for the tuples with (id=1, dptID=1) and (id=2, dptID=1), building must equal “CYC” in both.
FD and redundancy
As another example:
Customer(id, name, dptID, building)
F = { {id} → {name, dptID, building},
      {dptID} → {building} }

Customer
id  name   dptID  building
1   Kit    1      CYC
2   David  1      CYC
3   Betty  2      HW
4   Helen  2      HW

How to check?
Check if the attribute set closure of {dptID} covers all attributes in Customer. ({dptID}+ = {dptID, building} ≠ Customer)

Redundancy is related to FDs. If there is a non-trivial FD α → β, where {α}+ does not cover all attributes in R, then we will have redundancy in R!
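The closure check used above is easy to compute. The following is a minimal sketch of the usual fixed-point computation of an attribute closure; the function name and the representation of FDs as pairs of Python sets are assumptions for illustration.

```python
def closure(attrs, fds):
    """Return {attrs}+ : every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:                      # repeat until no FD adds anything new
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


CUSTOMER = {"id", "name", "dptID", "building"}
F = [({"id"}, {"name", "dptID", "building"}),
     ({"dptID"}, {"building"})]

print(closure({"id"}, F) == CUSTOMER)      # True: {id} is a key
print(sorted(closure({"dptID"}, F)))       # ['building', 'dptID']: not a key
```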
Boyce-Codd Normal Form
Summarizing the observations, a relation R has no redundancy, i.e., it is in Boyce-Codd Normal Form (BCNF), if the following is satisfied:
For all FDs in F+ of the form α → β, where α ⊆ R and β ⊆ R, at least one of the following holds:
α → β is trivial (i.e., β ⊆ α). We won't bother with trivial FDs such as A → A, AB → A, etc.
α is a key (superkey) for R, i.e., the attribute set closure of α, represented as {α}+, covers all attributes in R.

In other words, in BCNF, the left-hand side of every non-trivial FD is a key.
How to test for BCNF?
Formally, for verifying if R is in BCNF:
For each non-trivial dependency α → β in F+ (the functional dependency closure), check if {α}+ covers the whole relation (i.e., whether α is a superkey).
If any {α}+ does not cover the whole relation, R is not in BCNF.
Simplified test:
It suffices to check only the dependencies in the given F for violation of BCNF, rather than checking all dependencies in F+.

For example, given R(A,B,C); F = {A → B, B → C}, we only need to check if both {A}+ and {B}+ cover {A,B,C}.
We do not need to derive F+ = {A → B, B → C, A → C, …} and check each FD, because A → C is already considered when computing {A}+.
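The simplified test can be sketched as follows. This is an illustration, not the course's reference code: bcnf_violations is a hypothetical helper name, attribute sets are written as strings of single-letter attributes, and the test is only valid for the original relation R on which F is defined.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """FDs of the given F that are non-trivial and whose left side is not a superkey."""
    rel = set(relation)
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)             # skip trivial FDs
            and not rel <= closure(lhs, fds)]       # {lhs}+ must cover R


F = [("A", "B"), ("B", "C")]
print(bcnf_violations("ABC", F))   # [('B', 'C')]  -> R(A,B,C) is not in BCNF
print(bcnf_violations("AB", [("A", "B")]))   # []  -> in BCNF
```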
How to test for BCNF?
However, if we decompose R into R1 and R2, we cannot use only F to check whether the decomposed relations (i.e., R1 and R2) are in BCNF; we have to use F+ instead.

Illustration
R(A, B, C, D), F = {A → B, B → C}

R (an example instance that satisfies F)
A  B  C  D
1  1  1  1
1  1  1  2
1  1  1  3
1  1  1  4
1  1  1  5

To test if R is in BCNF, it suffices to check only the dependencies in F (but not F+).
Does {A}+ cover all of {A,B,C,D}?
Since {A}+ = {A,B,C} ≠ {A,B,C,D}, R is not in BCNF.
As illustrated through this instance, {A}+ = {A,B,C} ≠ {A,B,C,D} implies redundancy when we have tuples with the same values across {A,B,C} but different values in D.
How to test for BCNF?
To illustrate why we cannot use only F to test decomposed relations for BCNF, let's try to decompose R into R1(A, B) and R2(A, C, D).

Illustration
R(A, B, C, D), F = {A → B, B → C}

R              R1(A, B)    R2(A, C, D)
A  B  C  D     A  B        A  C  D
1  1  1  1     1  1        1  1  1
1  1  1  2                 1  1  2
1  1  1  3                 1  1  3
1  1  1  4                 1  1  4
1  1  1  5                 1  1  5

Is R2(A, C, D) in BCNF?
When we check R2, none of the FDs in F is contained in R2. Does this mean that no non-trivial FDs hold in R2, and that R2 is in BCNF?
No! We need to use F+ to verify whether R2 is in BCNF.
How to test for BCNF?
In R2(A, C, D), A → C is in F+, because A → C can be obtained by the transitivity rule on A → B and B → C.
There is a non-trivial FD A → C in R2 that we have missed!

R              R1(A, B)    R2(A, C, D)
A  B  C  D     A  B        A  C  D
1  1  1  1     1  1        1  1  1
1  1  1  2                 1  1  2
1  1  1  3                 1  1  3
1  1  1  4                 1  1  4
1  1  1  5                 1  1  5

Therefore in R2 we check {A}+ restricted to the attributes of R2: {A,C} ≠ {A,C,D}.
Thus, A is not a key in R2.
R2 is NOT in BCNF.

Conclusion: When we test whether a decomposed relation is in BCNF, we must project F+ onto the relation (e.g., R2), not F!
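One way to sketch "projecting F+ onto a relation" is to take the closure of every subset of the relation's attributes and keep what falls inside the relation. This is exponential in the number of attributes, so it is only meant as an illustration for small schemas; the helper names project_fds and is_bcnf are assumptions, not names from the slides.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def project_fds(relation, fds):
    """Non-trivial FDs X → Y of F+ with X and Y inside relation (one per X)."""
    rel = sorted(set(relation))
    projected = []
    for size in range(1, len(rel) + 1):
        for lhs in combinations(rel, size):
            rhs = (closure(lhs, fds) & set(rel)) - set(lhs)
            if rhs:
                projected.append((set(lhs), rhs))
    return projected


def is_bcnf(relation, fds):
    """Every projected non-trivial FD must have a superkey of relation on the left."""
    rel = set(relation)
    return all(rel <= closure(lhs, fds) for lhs, _ in project_fds(relation, fds))


F = [("A", "B"), ("B", "C")]
print(project_fds("ACD", F))   # includes ({'A'}, {'C'}), which F alone does not show
print(is_bcnf("ACD", F))       # False
print(is_bcnf("AB", F))        # True
```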
Section 4

Normalization

Normalization goal
When we decompose a relation R with a set of
functional dependencies F into R1, R2, …, Rn, we try
to meet the following goals:
1. Lossless-join: The decomposition should not result in information loss.
2. No redundancy: The decomposed relations Ri should be in Boyce-Codd Normal Form (BCNF). (There are also other normal forms.)
3. Dependency preserving: Avoid the need to join the decomposed relations to check the functional dependencies.
Illustration
Consider R(A, B, C), F = {A → B, B → C}. Is R in BCNF?
If not, decompose R into relations that are in BCNF.

R
A  B  C
1  1  2
2  1  2
3  1  2
4  1  2

Is R in BCNF?
No, because {B}+ = {B,C} ≠ {A,B,C}. Since {B}+ does not cover all attributes in R, R is NOT in BCNF.
Think in this way: How should we decompose R such that the decomposition is always lossless-join?
Note: A decomposition is lossless-join if at least one of the following dependencies is in F+:
Schema of R1 ∩ schema of R2 → schema of R1
OR
Schema of R1 ∩ schema of R2 → schema of R2
Illustration
Idea: To make the decomposition always lossless-join, we can pick the FD A → B and form the decomposed relations as:
R1(A,B): the attributes in the L.H.S. and R.H.S. of the FD.
R2(A,C): the attribute(s) in the L.H.S. of the FD, and the remaining attributes that do not appear in R1.
If we decompose the relation R in this way, the following must be true:
Schema of R1 ∩ schema of R2 → schema of R1
Schema of R1 ∩ schema of R2 is A.
A → R1, i.e., A → AB, must be true because R1 consists of the L.H.S. and R.H.S. of the FD A → B in F.
Illustration
F = {A → B, B → C}, F+ = {A → B, B → C, A → C, trivial FDs}

        R1(A, B)    R2(A, C)
Fx      A → B       A → C

R            R1      R2
A  B  C      A  B    A  C
1  1  2      1  1    1  2
2  1  2      2  1    2  2
3  1  2      3  1    3  2
4  1  2      4  1    4  2

Is R1(A, B) in BCNF?
F1 = {A → B, trivial FDs}; it is the projection of F+ on R1.
Since {A}+ = {A,B} = R1, {A} is a key in R1.
Since the left-hand side of every non-trivial FD in F1 is a key, R1 is in BCNF.

Is R2(A, C) in BCNF?
F2 = {A → C, trivial FDs}; it is the projection of F+ on R2.
Since {A}+ = {A,C} = R2, {A} is a key in R2.
Since the left-hand side of every non-trivial FD in F2 is a key, R2 is in BCNF.

Therefore, decomposing R(A, B, C) with F = {A → B, B → C} into R1(A, B) and R2(A, C) results in a lossless-join decomposition (no information lost) and BCNF relations (no redundancy).
Illustration
Is the decomposition dependency preserving?
F = {A → B, B → C}
F1 ∪ F2 = {A → B, A → C}

Since B → C disappears in R1 and R2, (F1 ∪ F2)+ ≠ F+.

The decomposition is NOT dependency preserving.

Note: Although the decomposition is not dependency preserving, it is lossless-join, so we can join R1 and R2 to test B → C.
BCNF decomposition algorithm
result = {R};
done = false;
compute F+;
while (done == false) {
    if (there is a schema Ri in result and Ri is not in BCNF) {
        let α → β be a non-trivial FD that holds on Ri s.t. {α}+ ⊉ Ri;
            // α is not a key of Ri, so α → β causes Ri to violate BCNF
            // (β is taken to be disjoint from α, so α stays in Ri - β)
        result = (result - Ri) ∪ (α, β) ∪ (Ri - β);
            // 1. Delete Ri.
            // 2. Create a relation with only α and β.
            // 3. Create a relation containing Ri but with β removed.
    } else
        done = true;
}

Each Ri is in BCNF, and the decomposition must be lossless-join.
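The pseudocode can be turned into a small runnable sketch. The assumptions are stated loudly: instead of materializing F+, violations are searched by taking closures of attribute subsets (exponential, fine for classroom-sized schemas); the function names are hypothetical; and the result depends on which violating FD is picked first, so other valid BCNF decompositions may exist.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_violation(ri, fds):
    """Return (alpha, beta) with alpha → beta violating BCNF on ri, or None."""
    attrs = sorted(ri)
    for size in range(1, len(attrs)):
        for alpha in combinations(attrs, size):
            reach = closure(alpha, fds)
            beta = (reach & ri) - set(alpha)      # non-trivial part inside ri
            if beta and not ri <= reach:          # alpha is not a superkey of ri
                return frozenset(alpha), frozenset(beta)
    return None


def bcnf_decompose(relation, fds):
    """Split relation until every schema in the result is in BCNF."""
    todo = [frozenset(relation)]
    result = set()
    while todo:
        ri = todo.pop()
        violation = find_violation(ri, fds)
        if violation is None:
            result.add(ri)
        else:
            alpha, beta = violation
            todo.append(alpha | beta)   # relation with only alpha and beta
            todo.append(ri - beta)      # ri with beta removed
    return result


F = [("A", "B"), ("B", "C")]
print(sorted("".join(sorted(s)) for s in bcnf_decompose("ABC", F)))  # ['AB', 'BC']
```

For R(A, B, C) with F = {A → B, B → C}, the sketch picks B → C as the violating FD and returns R1(B, C) and R2(A, B).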
Example 1
Consider R(A, B, C), F = {A → B, B → C}; decompose R into relations that are in BCNF.

Alternative decomposition: To make the decomposition always lossless-join, we can pick the FD B → C and form the decomposed relations as:
R1(B,C): the attributes in the L.H.S. and R.H.S. of the FD.
R2(A,B): the attribute(s) in the L.H.S. of the FD, and the remaining attributes that do not appear in R1.

        R1(B, C)    R2(A, B)
Fx      B → C       A → B

R            R1      R2
A  B  C      B  C    A  B
1  1  2      1  2    1  1
2  1  2              2  1
3  1  2              3  1
4  1  2              4  1
Example 1
F = {A → B, B → C}, F+ = {A → B, B → C, A → C, trivial FDs}
Decomposition: R1(B, C), R2(A, B)

        R1(B, C)    R2(A, B)
Fx      B → C       A → B

R            R1      R2
A  B  C      B  C    A  B
1  1  2      1  2    1  1
2  1  2              2  1
3  1  2              3  1
4  1  2              4  1

Is R1(B, C) in BCNF?
F1 = {B → C, trivial FDs}; it is the projection of F+ on R1.
Since {B}+ = {B,C} = R1, {B} is a key in R1.
Since the left-hand side of every non-trivial FD in F1 is a key, R1 is in BCNF.

Is R2(A, B) in BCNF?
F2 = {A → B, trivial FDs}; it is the projection of F+ on R2.
Since {A}+ restricted to R2 is {A,B} = R2, {A} is a key in R2.
Since the left-hand side of every non-trivial FD in F2 is a key, R2 is in BCNF.
Example 1
Decomposition: R1(B, C) with F1 = {B → C}, R2(A, B) with F2 = {A → B}

Is the decomposition lossless-join?
By construction (the common attribute is B, and B → BC, the schema of R1, is in F+), the decomposition must be lossless-join.

Is the decomposition dependency preserving?
F = {A → B, B → C}
F1 ∪ F2 = {B → C, A → B}
Since F1 ∪ F2 contains every FD in F, this implies (F1 ∪ F2)+ = F+.
The decomposition is dependency preserving.
That means that when we insert a new tuple, if it does not violate F1 in R1 and F2 in R2, it won't violate F+ in R.
Example 2
Consider a relation R in a bank:
R (b_name, b_city, assets, c_name, l_num, amount)
F = { {b_name} → {assets, b_city},
      {l_num} → {amount, b_name},
      {l_num, c_name} → everything }

Each specific value of b_name corresponds to at most one {assets, b_city} value.
Each l_num corresponds to at most one {amount, b_name} value.
Each {l_num, c_name} corresponds to at most one {b_name, b_city, assets, amount} value.

Decomposition
With {b_name} → {assets, b_city}, {b_name}+ ≠ R, so R is not in BCNF.
Decompose R into R1(b_name, assets, b_city) and R2(b_name, c_name, l_num, amount).
Example 2
Is R1(b_name, assets, b_city) in BCNF?
F1 = { {b_name} → {assets, b_city}, trivial FDs } (projection of F+ on R1)
{b_name}+ = {b_name, assets, b_city} = R1, so {b_name} is a key in R1.
Since the left-hand side of every non-trivial FD in F1 is a key in R1, R1 is in BCNF.

Is R2(b_name, c_name, l_num, amount) in BCNF?
F2 = { {l_num} → {amount, b_name},
       {l_num, c_name} → {all attributes} } (projection of F+ on R2)
{l_num}+ restricted to R2 is {l_num, amount, b_name} ≠ R2, so {l_num} is NOT a key in R2.
Since NOT every non-trivial FD in F2 has a key of R2 on its left-hand side, R2 is NOT in BCNF.
Example 2
Picking {l_num} → {amount, b_name}, R2 is further decomposed into:
R3(l_num, amount, b_name)
R4(c_name, l_num)

Is R3(l_num, amount, b_name) in BCNF?
F3 = { {l_num} → {amount, b_name}, trivial FDs }
{l_num}+ restricted to R3 is {l_num, amount, b_name} = R3, so {l_num} is a key in R3.
Since the left-hand side of every non-trivial FD in F3 is a key in R3, R3 is in BCNF.

Example 2
Is R4(c_name, l_num) in BCNF?
F4 = {trivial FDs}
Since F4 contains no non-trivial FD, nothing can violate BCNF, so R4 is in BCNF.
Now, R1, R3 and R4 are in BCNF.
The decomposition is also lossless-join.
Example 2

The decomposition is also dependency preserving.

F1 = { {b_name} → {assets, b_city}, trivial FDs }
F3 = { {l_num} → {amount, b_name}, trivial FDs }
{l_num} → {b_name} … (i)
    by Decomposition of {l_num} → {amount, b_name}
{l_num} → {assets, b_city} … (ii)
    by Transitivity of (i) and {b_name} → {assets, b_city}
{l_num} → {b_name, assets, b_city, amount}
    by Union of F3 and (ii)
{l_num, c_name} → {l_num, c_name, b_name, assets, b_city, amount}
    by Augmentation

Therefore every FD in F can be derived from F1 ∪ F3 ∪ F4, which implies (F1 ∪ F3 ∪ F4)+ = F+.

The decomposition is dependency preserving.
BCNF doesn’t imply dependency preserving
It is not always possible to get a BCNF decomposition that is dependency preserving.
Consider R(A, B, C); F = {AB → C, C → B}

R
A  B  C
1  1  2
2  1  2
1  2  3

There are two candidate keys: {AB} and {AC}.
{AB}+ = {A,B,C} = R
{AC}+ = {A,B,C} = R
R is not in BCNF, since C → B holds and C is not a key.
Any BCNF decomposition of R must fail to preserve AB → C.

Candidate decompositions:

R1(A,B)  R2(B,C)    Not a lossless decomposition
A  B     B  C
1  1     1  2
2  1     2  3
1  2

R1(A,B)  R2(A,C)    Not a lossless decomposition
A  B     A  C
1  1     1  2
2  1     2  2
1  2     1  3

R1(A,C)  R2(B,C)    Lossless, with F1 = {Ø} and F2 = {C → B}, but not dependency preserving
A  C     B  C
1  2     1  2
2  2     2  3
1  3
Motivating example
Back to our motivating example, we have:
Employees( eid, name, parkingLot, did, since)
Departments( did, dname, budget)

“Employees who work in the same department must

park at the same parkingLot.” implies the following FD:
FD: did → parkingLot
Is Employees in BCNF?
{did}+ = {did, parkingLot} ≠ {eid, name, parkingLot, did, since}
Since did is not a key, Employees is NOT in BCNF.
Normalization
Employees( eid, name, parkingLot, did, since) is
decomposed to
Employees2( eid, name, did, since)
Dept_Lots( did, parkingLot)
With Departments( did, dname, budget), the above
two decomposed relations are further refined to
Employees2( eid, name, did, since)
Departments( did, dname, parkingLot, budget)

Good design: parking lots for all employees can be updated by changing their department-specific parkingLot.
Summary
Relational database design goals
Lossless-join
No redundancy (BCNF)
Dependency preservation

It is not always possible to satisfy the three goals.

A lossless join, dependency preserving decomposition into
BCNF may not always be possible.

SQL does not provide a direct way of specifying FDs other than superkeys.
Assertions can be used to check FDs, but this is quite expensive.
Chapter 5B.

END
COMP3278 Introduction to
Database Management Systems

Department of Computer Science, The University of Hong Kong

80% (5)
Mathematics: Quarter 2 - Module 6 Writes Expressions With Rational Exponents As Radicals and Vice Versa
17 pages
IAT Paper Jan-June 22 DMBI DIV A&B Solution
No ratings yet
IAT Paper Jan-June 22 DMBI DIV A&B Solution
10 pages
1. Lect 9 Decomposition
No ratings yet
1. Lect 9 Decomposition
35 pages
Batch 1 Set2
No ratings yet
Batch 1 Set2
7 pages
T1010 - Mathematics N3 April QP 2021
No ratings yet
T1010 - Mathematics N3 April QP 2021
9 pages
Normalization New 1
No ratings yet
Normalization New 1
60 pages
CT3 Set C - Qwerty
No ratings yet
CT3 Set C - Qwerty
4 pages
DBMS Module4 Questions with Answers
No ratings yet
DBMS Module4 Questions with Answers
17 pages
Unit II
No ratings yet
Unit II
48 pages
CS PQMS
No ratings yet
CS PQMS
9 pages
SIMPLIFYING-RATIONAL-EXPRESSIONS-1-1-1-1
No ratings yet
SIMPLIFYING-RATIONAL-EXPRESSIONS-1-1-1-1
9 pages
Pue DBMS 2022-2023
No ratings yet
Pue DBMS 2022-2023
2 pages
Let's Practise: Maths Workbook Coursebook 8
From Everand
Let's Practise: Maths Workbook Coursebook 8
ExcelSoft Technologies Pvt. Ltd.
No ratings yet
Geometric functions in computer aided geometric design
From Everand
Geometric functions in computer aided geometric design
Oscar Ruiz
No ratings yet
R Fast Track Guide - 86 Key Points Every Programmer from Other Languages Should Master
From Everand
R Fast Track Guide - 86 Key Points Every Programmer from Other Languages Should Master
Ginno
No ratings yet
Let's Practise: Maths Workbook Coursebook 6
From Everand
Let's Practise: Maths Workbook Coursebook 6
ExcelSoft Technologies Pvt. Ltd.
No ratings yet
Biography Raymond Catell
No ratings yet
Biography Raymond Catell
8 pages
Contactors and Contactor Assemblies: Sirius
No ratings yet
Contactors and Contactor Assemblies: Sirius
16 pages
Balloon Theory PDF
No ratings yet
Balloon Theory PDF
9 pages
Using Google Earth Engine With R
No ratings yet
Using Google Earth Engine With R
14 pages
Biotechnology and Biochemical Engineering: Prasanna B.D. Sathyanarayana N. Gummadi Praveen V. Vadlani Editors
No ratings yet
Biotechnology and Biochemical Engineering: Prasanna B.D. Sathyanarayana N. Gummadi Praveen V. Vadlani Editors
233 pages
Mid 234
No ratings yet
Mid 234
38 pages
SL Swagger
No ratings yet
SL Swagger
23 pages
Group F: Information Engineering: 4F1 - Control System Design
No ratings yet
Group F: Information Engineering: 4F1 - Control System Design
4 pages
Cloud Computing Note (1)
No ratings yet
Cloud Computing Note (1)
15 pages
Chapter 05 - Electric Motor and Generator
No ratings yet
Chapter 05 - Electric Motor and Generator
75 pages
DLP Math Iv
No ratings yet
DLP Math Iv
4 pages
A Survey of Non - Relational Databases With Big Data: Bansari H. Kotecha Prof. Hetal Joshiyara
No ratings yet
A Survey of Non - Relational Databases With Big Data: Bansari H. Kotecha Prof. Hetal Joshiyara
6 pages
CAT-4003 MRI-M500 Series Intelligent Modules
100% (1)
CAT-4003 MRI-M500 Series Intelligent Modules
2 pages
Phontech 8300 MkII User Manual/installation
No ratings yet
Phontech 8300 MkII User Manual/installation
42 pages
DVR2000E Installation Operation Maintenance Manual PDF
100% (1)
DVR2000E Installation Operation Maintenance Manual PDF
93 pages
Moderation Feedback Report On SBA Physics
No ratings yet
Moderation Feedback Report On SBA Physics
1 page
Seminar Paper
No ratings yet
Seminar Paper
8 pages
CBSE Class 8 Mathematics Worksheet - Mensuration
33% (6)
CBSE Class 8 Mathematics Worksheet - Mensuration
9 pages
DJ-Tech-Tools - Midi-Fighter - 64-User-Guide 2017 - Englisch
No ratings yet
DJ-Tech-Tools - Midi-Fighter - 64-User-Guide 2017 - Englisch
16 pages
XII maths practical QB 2024-25
No ratings yet
XII maths practical QB 2024-25
1 page
Rc-2 Midterm Exam Part 1
No ratings yet
Rc-2 Midterm Exam Part 1
3 pages
LL Seminar Romania WE Are GE
No ratings yet
LL Seminar Romania WE Are GE
62 pages
Carrom: Read Out
No ratings yet
Carrom: Read Out
4 pages
Failure Analysis For Dummies
No ratings yet
Failure Analysis For Dummies
79 pages
Design, Full-Scale Testing and CE Certification of Anti-Seismic Devices According To The New European Norm EN 15129: Elastomeric Isolators
No ratings yet
Design, Full-Scale Testing and CE Certification of Anti-Seismic Devices According To The New European Norm EN 15129: Elastomeric Isolators
10 pages
From Scheele and Berzelius To Müller
No ratings yet
From Scheele and Berzelius To Müller
14 pages
Eletromagnetismo 3
No ratings yet
Eletromagnetismo 3
10 pages