Functional Dependencies and Normalization For Relational Databases
GUIDELINE 2:
Design a schema that does not suffer from the
insertion, deletion and update anomalies.
If there are any anomalies present, then note them
so that applications can be made to take them into
account.
GUIDELINE 4:
The relations should be designed to satisfy the
lossless join condition.
No spurious tuples should be generated by doing
a natural-join of any relations.
Note that:
Property (a) is extremely important and cannot be
sacrificed.
Property (b) is less stringent and may be sacrificed.
AG I
1. Does A → R? That is, is (A)+ ⊇ R?
2. Does G → R? That is, is (G)+ ⊇ R?
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe
Uses of Attribute Closure
There are several uses of the attribute closure algorithm:
Testing for superkey:
To test if α is a superkey, we compute α+ and check if α+ contains all attributes of R.
Testing functional dependencies:
To check if a functional dependency α → β holds (or, in other words, is in F+), just check if β ⊆ α+.
That is, we compute α+ by using attribute closure, and then check if it contains β.
This is a simple, cheap, and very useful test.
Computing the closure of F:
For each γ ⊆ R, we find the closure γ+, and for each S ⊆ γ+, we output a functional dependency γ → S.
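The first two uses above can be sketched in Python. This is a minimal illustration, assuming FDs are stored as (left side, right side) pairs of attribute strings; the function names are hypothetical, and the FD set is the one from this section's worked example, read as AE → C, CG → A, BD → G, GA → E.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its left side is covered and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """attrs is a superkey if its closure contains every attribute of the schema."""
    return closure(attrs, fds) >= set(schema)


def fd_holds(lhs, rhs, fds):
    """lhs -> rhs is in F+ exactly when rhs is a subset of the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)


R = "ABCDEG"
F = [("AE", "C"), ("CG", "A"), ("BD", "G"), ("GA", "E")]

print(sorted(closure("ABD", F)))   # ['A', 'B', 'C', 'D', 'E', 'G']
print(is_superkey("ABD", R, F))    # True
print(fd_holds("BD", "E", F))      # False, since (BD)+ = BDG
```

Each closure call loops over F until nothing changes, so both tests run in time polynomial in the size of F.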
Algorithm for Finding a Key K for R Given a Set F of Functional Dependencies
Find the set K of all candidate keys from the set of superkeys by
removing every superkey that is not minimal
Algorithm for Finding all candidate keys for R (cont.)
Example: Given Q(R), R = {A, B, C, D, E, G} and
F = {f1: AE → C, f2: CG → A, f3: BD → G, f4: GA → E}
Find all candidate keys of Q(R).
- Phase 1:
Compute the closures of the (2^n − 1) non-empty subsets of R (here n = 6, so 63 subsets).
ABCDEG   X ⊆ R   X+ (under F)   Superkey
000001   G       G
110100   ABD     ABCDEG         ABD
...
110110   ABDE    ABCDEG         ABDE
Algorithm for Finding all candidate keys for R (cont.)
- The result of phase 1 is the set S of all superkeys:
S = {ABD, BCD, ABCD, ABDE, BCDE, ABCDE, ABDG,
BCDG, ABCDG, ABDEG, BCDEG, ABCDEG}
- The result of phase 2: choose the minimal sets in S.
K = {ABD, BCD}
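The two phases can be reproduced with a short brute-force sketch in Python. It assumes FDs are (left side, right side) pairs of attribute strings; the helper names are illustrative and not part of the slides.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys_brute_force(schema, fds):
    attrs = sorted(set(schema))
    # Phase 1: keep every non-empty subset whose closure is the whole schema.
    superkeys = [
        frozenset(subset)
        for size in range(1, len(attrs) + 1)
        for subset in combinations(attrs, size)
        if closure(subset, fds) >= set(attrs)
    ]
    # Phase 2: keep only superkeys with no proper subset that is a superkey.
    keys = [s for s in superkeys if not any(t < s for t in superkeys)]
    return superkeys, keys


R = "ABCDEG"
F = [("AE", "C"), ("CG", "A"), ("BD", "G"), ("GA", "E")]

S, K = candidate_keys_brute_force(R, F)
print(len(S))                                # 12
print(sorted("".join(sorted(k)) for k in K)) # ['ABD', 'BCD']
```

The loop visits all 2^n − 1 non-empty subsets, which is why this approach becomes impractical as the number of attributes grows.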
Improved Algorithm for Finding all candidate keys for R
Comment:
If R is large, computing the closures of all (2^n − 1) non-empty subsets of R takes a long time.
The above algorithm is improved as follows:
Given A ∈ R,
- A is called a source attribute if A does not appear on the right-hand
side of any nontrivial FD of F.
The set of all source attributes is denoted by N.
- A is called a destination attribute if A is not a source
attribute and A does not appear on the left-hand side of any
nontrivial FD of F.
The set of all destination attributes is denoted by D.
Improved Algorithm for Finding all candidate keys for R
- The set of attributes that are neither source nor destination
attributes is denoted by L.
Improved Algorithm for Finding all candidate keys for R
Example: Given Q(R), R = {A, B, C, D, E, G} and
F = {f1: AE → C, f2: CG → A, f3: BD → G, f4: GA → E}
Find all candidate keys of Q.
N = {B, D}, D = ∅, L = {A, C, E, G}
- Phase 1:
Compute the closure of X = N ∪ Li for each subset Li of L.
ACEG   Li ⊆ L   X = N ∪ Li   X+ (under F)   Superkey
0000   ∅        BD           BDG
0001   G        BDG          BDG
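The improved search can be sketched in Python as follows. This is an illustration under stated assumptions: FDs are (left side, right side) pairs of attribute strings, the function names are hypothetical, and an attribute that appears on both sides of the same FD is counted only on the left side.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def classify(schema, fds):
    """Split the attributes into source (N), destination (D) and the rest (L)."""
    attrs = set(schema)
    lhs_attrs, rhs_attrs = set(), set()
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        if rhs <= lhs:
            continue  # trivial FD, ignored by the definitions
        lhs_attrs |= lhs
        rhs_attrs |= rhs - lhs
    source = attrs - rhs_attrs
    destination = attrs - source - lhs_attrs
    middle = attrs - source - destination
    return source, destination, middle


def candidate_keys(schema, fds):
    source, _destination, middle = classify(schema, fds)
    all_attrs = set(schema)
    superkeys = []
    # Only subsets of L are enumerated; N is always included, D never is.
    for size in range(len(middle) + 1):
        for extra in combinations(sorted(middle), size):
            x = source | set(extra)
            if closure(x, fds) >= all_attrs:
                superkeys.append(frozenset(x))
    return [s for s in superkeys if not any(t < s for t in superkeys)]


R = "ABCDEG"
F = [("AE", "C"), ("CG", "A"), ("BD", "G"), ("GA", "E")]

N, D, L = classify(R, F)
print(sorted(N), sorted(D), sorted(L))
print(sorted("".join(sorted(k)) for k in candidate_keys(R, F)))  # ['ABD', 'BCD']
```

For this example only the 2^4 = 16 subsets of L are examined instead of the 63 subsets of R.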
Normal form:
Condition using keys and FDs of a relation to
certify whether a relation schema is in a particular
normal form
We cannot reconstruct the original employee relation: we lose information.
r                 Π_{A,B}(r)        Π_{B,C}(r)
A  B  C           A  B              B  C
α  1  A           α  1              1  A
β  2  B           β  2              2  B

Π_{A,B}(r) ⋈ Π_{B,C}(r)
A  B  C
α  1  A
β  2  B
R1 ∩ R2 → R2
R1 ∩ R2 = {B} and B → BC
Dependency preserving
R1 ∩ R2 = {A} and A → AB
Not dependency preserving
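The lossless-join condition used in the two cases above (the common attributes must functionally determine all of R1 or all of R2) can be checked with attribute closure. The sketch below is illustrative only: the schema R = (A, B, C), the FD set F = {A → B, B → C} and the function names are assumptions chosen to be consistent with the lines above, since the slide's own definitions did not survive.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """Two-way test: (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 must hold."""
    common_closure = closure(set(r1) & set(r2), fds)
    return common_closure >= set(r1) or common_closure >= set(r2)


# Assumed example: R = (A, B, C) with F = {A -> B, B -> C}.
F = [("A", "B"), ("B", "C")]

print(is_lossless_binary("AB", "BC", F))  # True:  B -> BC
print(is_lossless_binary("AB", "AC", F))  # True:  A -> AB
print(is_lossless_binary("AC", "BC", F))  # False: C determines neither side
```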
To check whether a dependency α → β is preserved, start with result = α and apply:
Repeat
    for each Ri in the decomposition
        t = (result ∩ Ri)+ ∩ Ri
        result = result ∪ t
Until (result does not change)
If result contains all attributes in β, then the functional
dependency α → β is preserved.
We apply the test on all dependencies in F to check if a
decomposition is dependency preserving
This procedure takes polynomial time, instead of the exponential
time required to compute F+ and (F1 ∪ F2 ∪ … ∪ Fn)+
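The test above translates almost line for line into Python. This is a minimal sketch: the function names are hypothetical, and the schema R = (A, B, C) with F = {A → B, B → C} is an assumed example, not data from the slides.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_preserved(lhs, rhs, decomposition, fds):
    """Check whether lhs -> rhs can be enforced using the relations Ri alone."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            ri = set(ri)
            # Closure is taken with respect to the full F, then restricted to Ri.
            t = closure(result & ri, fds) & ri
            if not t <= result:
                result |= t
                changed = True
    return set(rhs) <= result


def is_dependency_preserving(decomposition, fds):
    return all(is_preserved(lhs, rhs, decomposition, fds) for lhs, rhs in fds)


# Assumed example: R = (A, B, C) with F = {A -> B, B -> C}.
F = [("A", "B"), ("B", "C")]

print(is_dependency_preserving(["AB", "BC"], F))  # True
print(is_dependency_preserving(["AB", "AC"], F))  # False: B -> C is lost
```

Only closures of attribute sets under F are computed, so neither F+ nor the projected dependency sets are ever materialized.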
Goals of Normalization
Let R be a relation scheme with a set F of functional
dependencies.
Decide whether a relation scheme R is in “good”
form.
In the case that a relation scheme R is not in “good”
form, decompose it into a set of relation schemes
{R1, R2, ..., Rn} such that
each relation scheme is in good form
the decomposition is a lossless-join decomposition
Preferably, the decomposition should be dependency
preserving.
4. Normal Forms
2. employee_id → branch_name
dependency
No other attribute is extraneous, so we get FC =
considered
The resultant simplified 3NF schema is:
R1(customer_id, employee_id, type)
R2(customer_id, branch_name, employee_id)
4.4. BCNF (Boyce-Codd Normal Form)
A relation schema R is in Boyce-Codd Normal Form (BCNF) if
whenever a nontrivial FD X → A holds in R, then X is a superkey of R
Each normal form is strictly stronger than the previous one
Every 2NF relation is in 1NF
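The BCNF definition above can be tested with attribute closure. The sketch below is an illustration with hypothetical function names; it checks only the dependencies listed in F, which is sufficient for the original schema R (testing a relation produced by decomposition would require dependencies from F+). The first example reuses the schema and FDs from the candidate-key example in this section.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the nontrivial FDs in `fds` whose left side is not a superkey."""
    all_attrs = set(schema)
    violations = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial FDs never violate BCNF
        if not closure(lhs, fds) >= all_attrs:
            violations.append((lhs, rhs))
    return violations


R = "ABCDEG"
F = [("AE", "C"), ("CG", "A"), ("BD", "G"), ("GA", "E")]

# No left side is a superkey here (the candidate keys are ABD and BCD),
# so every FD in F is reported as a violation.
print(bcnf_violations(R, F))

# A two-attribute schema with A -> B has key A and is in BCNF.
print(bcnf_violations("AB", [("A", "B")]))  # []
```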
β = building, budget
dependency
α → (α+ − α) ∩ Ri
can be shown to hold on Ri, and Ri violates BCNF.
We use above dependency to decompose Ri
class-1.
We replace class-1 by:
Lossless join.
Dependency preservation.