ch7
Database System Concepts - 7th Edition 7.2 ©Silberschatz, Korth and Sudarshan
ER versus Normalization
Features of ER
Entity sets and Relationship sets
Mapping to tables
Features of Normalization
Sets of all attributes used in the database
Distribution of the attributes to various tables
Overview of Normalization
Normalization Goal
Features of Good Relational Designs
Consider the new relation in_dep that combines the instructor and
department tables
Good Form
Decomposition
A Lossy Decomposition
Lossless Decomposition
Lossless Decomposition Example
Decomposition of R = (A, B, C)
Into
R1 = (A, B) R2 = (B, C)
Example of a database instance
Lossless Decomposition
Functional Dependencies
Functional Dependencies (Cont.)
Functional Dependencies Definition
Functional Dependencies Example
A B
1 4
1 6
3 7
On this instance,
B → A holds;
A → B does NOT hold.
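The check on the instance above can be sketched in a few lines of Python. This is only an illustrative sketch: the helper name fd_holds and the list-of-dicts representation of a relation are assumptions, not part of the slides.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds on rows.

    rows is a list of dicts (one per tuple); lhs and rhs are lists of
    attribute names. (Hypothetical helper, for illustration only.)
    """
    seen = {}  # maps each lhs value to the rhs value first seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False  # two tuples agree on lhs but differ on rhs
    return True


# The instance from the slide.
r = [{"A": 1, "B": 4}, {"A": 1, "B": 6}, {"A": 3, "B": 7}]
print(fd_holds(r, ["B"], ["A"]))  # True
print(fd_holds(r, ["A"], ["B"]))  # False
```

Note that such a check only says whether a dependency holds on one particular instance; it cannot establish that the dependency holds on the schema.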
Keys and Functional Dependencies
Use of Functional Dependencies
Closure of a Set of Functional Dependencies
Trivial Functional Dependencies
Lossless Decomposition
Example
R = (A, B, C)
F = {A → B, B → C}
R1 = (A, B), R2 = (B, C)
Lossless decomposition:
R1 ∩ R2 = {B} and B → BC
R3 = (A, B), R4 = (A, C)
Lossless decomposition:
R3 ∩ R4 = {A} and A → AB
Notational Note:
B → BC
is a shorthand notation for
B → {B, C}
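The test used in this example (the common attributes of the two schemas functionally determine one of them) can be sketched with attribute closure. The function names closure and is_lossless, and the encoding of each dependency as a pair of Python sets, are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Closure of the attribute set attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Sufficient test: (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 follows from fds."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure


F = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(is_lossless({"A", "B"}, {"B", "C"}, F))  # True, since B -> BC
print(is_lossless({"A", "B"}, {"A", "C"}, F))  # True, since A -> AB
print(is_lossless({"A", "C"}, {"B", "C"}, F))  # False, C determines neither schema
```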
Dependency Preservation
Example
R = (A, B, C)
F = {A → B, B → C}
R1 = (A, B), R2 = (B, C)
Lossless-join decomposition:
Dependency preserving
R3 = (A, B), R4 = (A, C)
Lossless-join decomposition:
Not dependency preserving
Cannot check B → C without computing R3 ⋈ R4
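One way to test dependency preservation without computing any join is to grow the left-hand side of each dependency using only closures restricted to the individual schemas. The sketch below follows that idea; the names closure, preserves and is_dependency_preserving are hypothetical, and dependencies are encoded as pairs of Python sets.

```python
def closure(attrs, fds):
    """Closure of the attribute set attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserves(fd, schemas, fds):
    """Check whether fd can be enforced using only the individual schemas."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in schemas:
            # Attributes of ri determined by the part of result inside ri.
            gained = closure(result & ri, fds) & ri
            if not gained <= result:
                result |= gained
                changed = True
    return rhs <= result


def is_dependency_preserving(schemas, fds):
    return all(preserves(fd, schemas, fds) for fd in fds)


F = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(is_dependency_preserving([{"A", "B"}, {"B", "C"}], F))  # True
print(is_dependency_preserving([{"A", "B"}, {"A", "C"}], F))  # False: B -> C is lost
```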
University Example
Consider a schema:
dept_advisor(s_ID, i_ID, dept_name)
With functional dependencies:
i_ID → dept_name
s_ID, dept_name → i_ID
In the above design we are forced to repeat the department
name once for each time an instructor participates in a
dept_advisor relationship.
To fix this, we need to decompose dept_advisor
Any decomposition will fail to include, in a single relation, all the attributes in
s_ID, dept_name → i_ID
Thus, the decomposition is NOT dependency preserving
Design Goals when Decomposing a Relation
Normal Forms
Boyce-Codd Normal Form
Boyce-Codd Normal Form Examples
Decomposing a Schema into BCNF
Decomposing a Schema into BCNF (Cont.)
BCNF and Dependency Preservation
Third Normal Form
3NF Example
Consider a schema:
dept_advisor(s_ID, i_ID, dept_name)
With functional dependencies:
i_ID → dept_name
s_ID, dept_name → i_ID
Two candidate keys = {s_ID, dept_name}, {s_ID, i_ID}
We have seen before that dept_advisor is not in BCNF
It is, however, in 3NF
s_ID, dept_name → i_ID: {s_ID, dept_name} is a superkey
i_ID → dept_name and i_ID is NOT a superkey, but:
{dept_name} – {i_ID} = {dept_name} and
dept_name is contained in a candidate key
Redundancy in 3NF
• Repetition of information
• Need to use null values (e.g., to represent the relationship l2, k2
where there is no corresponding value for J)
Comparison of BCNF and 3NF
How good is BCNF?
How good is BCNF? (Cont.)
Higher Normal Forms
It is better to decompose inst_info into:
inst_child:
inst_phone:
This suggests the need for higher normal forms, such as Fourth
Normal Form (4NF).
Functional-Dependency Theory
Functional-Dependency Theory Roadmap
Closure of a Set of Functional Dependencies
Closure of a Set of Functional Dependencies
Example of F+
R = (A, B, C, G, H, I)
F = {A → B
A → C
CG → H
CG → I
B → H}
Some members of F+
A → H
By transitivity from A → B and B → H
AG → I
By augmenting A → C with G, to get AG → CG
and then transitivity with CG → I
CG → HI
By augmenting CG → I to infer CG → CGI,
and augmenting CG → H to infer CGI → HI,
and then transitivity
Closure of Functional Dependencies (Cont.)
Additional rules:
Union rule: If α → β holds and α → γ holds, then α → βγ holds.
Decomposition rule: If α → βγ holds, then α → β holds and
α → γ holds.
Pseudotransitivity rule: If α → β holds and γβ → δ holds, then
αγ → δ holds.
The above rules can be inferred from Armstrong’s axioms.
Procedure for Computing F+
Closure of Attribute Sets
Example of Closure of Attribute Sets
R = (A, B, C, G, H, I)
F = {A → B
A → C
CG → H
CG → I
B → H}
(AG)+
1. result = AG
2. result = ABCG (A → C and A → B, and A ⊆ AG)
3. result = ABCGH (CG → H and CG ⊆ AGBC)
4. result = ABCGHI (CG → I and CG ⊆ AGBCH)
Is AG a candidate key?
Is AG a superkey?
Does AG → R? That is, is (AG)+ ⊇ R?
Is any subset of AG a superkey?
Does A → R? That is, is (A)+ ⊇ R?
Does G → R? That is, is (G)+ ⊇ R?
In general: check for each subset of size n-1
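The closure computation and the candidate-key questions above can be reproduced with a short sketch. The function names and the encoding of dependencies as pairs of Python sets are assumptions for illustration; the minimality test checks only the subsets of size n-1, as the slide suggests.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of the attribute set attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    return schema <= closure(attrs, fds)


def is_candidate_key(attrs, schema, fds):
    """A superkey none of whose subsets of size n-1 is itself a superkey."""
    if not is_superkey(attrs, schema, fds):
        return False
    smaller = combinations(sorted(attrs), len(attrs) - 1)
    return not any(is_superkey(set(sub), schema, fds) for sub in smaller)


R = set("ABCGHI")
F = [({"A"}, {"B"}), ({"A"}, {"C"}), ({"C", "G"}, {"H"}),
     ({"C", "G"}, {"I"}), ({"B"}, {"H"})]

print(sorted(closure({"A", "G"}, F)))         # ['A', 'B', 'C', 'G', 'H', 'I']
print(sorted(closure({"A"}, F)))              # ['A', 'B', 'C', 'H']
print(sorted(closure({"G"}, F)))              # ['G']
print(is_candidate_key({"A", "G"}, R, F))     # True
```

Checking only the subsets of size n-1 is enough because any subset of a non-superkey is also not a superkey.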
Uses of Attribute Closure
Algorithms for Decomposition Using FD
BCNF
BCNF Decomposition Algorithm
result := {R};
done := false;
compute F+;
while (not done) do
  if (there is a schema Ri in result that is not in BCNF)
  then begin
    let α → β be a nontrivial functional dependency that
    holds on Ri such that α → Ri is not in F+,
    and α ∩ β = ∅;
    result := (result – Ri) ∪ (Ri – β) ∪ (α, β);
  end
  else done := true;
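A minimal Python sketch of the loop above follows. It is not an efficient implementation: instead of computing F+, it finds a violating dependency on Ri by trying every proper subset α of Ri and taking β = (α+ ∩ Ri) – α, which is exponential in the size of Ri. The function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of the attribute set attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(ri, fds):
    """Return (alpha, beta) violating BCNF on ri, or None if ri is in BCNF."""
    for size in range(1, len(ri)):
        for combo in combinations(sorted(ri), size):
            alpha = set(combo)
            alpha_plus = closure(alpha, fds)
            beta = (alpha_plus & ri) - alpha   # nontrivial, disjoint from alpha
            if beta and not ri <= alpha_plus:  # alpha is not a superkey of ri
                return alpha, beta
    return None


def bcnf_decompose(schema, fds):
    result = [frozenset(schema)]
    done = False
    while not done:
        done = True
        for ri in result:
            violation = find_violation(ri, fds)
            if violation:
                alpha, beta = violation
                result.remove(ri)
                result.append(frozenset(ri - beta))
                result.append(frozenset(alpha | beta))
                done = False
                break  # restart the scan on the updated result
    return set(result)


F = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(bcnf_decompose({"A", "B", "C"}, F))  # the schemas (A, B) and (B, C)
```

Different choices of the violating dependency can lead to different BCNF decompositions; this sketch simply takes the first one it finds.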
Example of BCNF Decomposition
BCNF Decomposition (Cont.)
course is in BCNF
How do we know this?
building, room_number → capacity holds on class-1
But {building, room_number} is not a superkey for class-1.
We replace class-1 by:
classroom (building, room_number, capacity)
section (course_id, sec_id, semester, year, building,
room_number, time_slot_id)
classroom and section are in BCNF.
3NF
Canonical Cover
3NF Decomposition Algorithm
3NF Decomposition Algorithm (Cont.)
3NF Decomposition: An Example
Relation schema:
cust_banker_branch = (customer_id, employee_id, branch_name, type )
The functional dependencies for this relation schema are:
customer_id, employee_id → branch_name, type
employee_id → branch_name
customer_id, branch_name → employee_id
We first compute a canonical cover
branch_name is extraneous in the r.h.s. of the 1st dependency
No other attribute is extraneous, so we get Fc =
customer_id, employee_id → type
employee_id → branch_name
customer_id, branch_name → employee_id
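The claim that branch_name is extraneous in the right-hand side of the first dependency can be checked with attribute closure: drop the attribute from that right-hand side and see whether the left-hand side still determines it. The sketch below assumes this standard test; the function names and the set-based encoding are illustrative.

```python
def closure(attrs, fds):
    """Closure of the attribute set attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_extraneous_rhs(attr, fd, fds):
    """Test whether attr is extraneous in the right-hand side of fd."""
    lhs, rhs = fd
    # Replace lhs -> rhs by lhs -> (rhs - {attr}) and recompute lhs+.
    reduced = [f for f in fds if f != fd] + [(lhs, rhs - {attr})]
    return attr in closure(lhs, reduced)


F = [
    ({"customer_id", "employee_id"}, {"branch_name", "type"}),
    ({"employee_id"}, {"branch_name"}),
    ({"customer_id", "branch_name"}, {"employee_id"}),
]
print(is_extraneous_rhs("branch_name", F[0], F))  # True
print(is_extraneous_rhs("type", F[0], F))         # False
```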
3NF Decomposition Example (Cont.)
Comparison of BCNF and 3NF
Design Goals
Goal for a relational database design is:
BCNF.
Lossless join.
Dependency preservation.
If we cannot achieve this, we accept one of
Lack of dependency preservation
Redundancy due to use of 3NF
Interestingly, SQL does not provide a direct way of specifying
functional dependencies other than superkeys.
Can specify FDs using assertions, but they are expensive to test,
(and currently not supported by any of the widely used databases!)
Even if we had a dependency preserving decomposition, using SQL
we would not be able to efficiently test a functional dependency whose
left-hand side is not a key.
Multivalued Dependencies
Higher Normal Forms
Multivalued Dependencies (MVD)
MVD -- Tabular representation
Tabular representation of α →→ β
MVD (Cont.)
Example
In our example:
ID →→ child_name
ID →→ phone_number
The above formal definition is supposed to formalize the notion
that given a particular value of Y (ID) it has associated with it a set
of values of Z (child_name) and a set of values of W
(phone_number), and these two sets are in some sense
independent of each other.
Note:
If Y → Z then Y →→ Z
Indeed, we have (in above notation) Z1 = Z2
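The independence described above can be tested on a concrete instance: whenever two tuples agree on Y, the tuple that takes its Z values from the first and its remaining (W) values from the second must also be present. The sketch below assumes Y and Z are disjoint; the function name mvd_holds and the sample ID, names and phone numbers are made up for illustration.

```python
def mvd_holds(rows, attrs, y, z):
    """Check Y ->> Z on rows (list of dicts) over the attribute list attrs."""
    w = [a for a in attrs if a not in y and a not in z]  # remaining attributes

    def proj(row, names):
        return tuple(row[a] for a in names)

    present = {(proj(t, y), proj(t, z), proj(t, w)) for t in rows}
    for t1 in rows:
        for t2 in rows:
            if proj(t1, y) == proj(t2, y):
                # Need a tuple with t1's Z values and t2's W values.
                if (proj(t1, y), proj(t1, z), proj(t2, w)) not in present:
                    return False
    return True


attrs = ["ID", "child_name", "phone_number"]
full = [
    {"ID": 99999, "child_name": c, "phone_number": p}
    for c in ("David", "William")
    for p in ("512-555-1234", "512-555-4321")
]
print(mvd_holds(full, attrs, ["ID"], ["child_name"]))      # True
print(mvd_holds(full[:3], attrs, ["ID"], ["child_name"]))  # False: one combination missing
```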
Use of Multivalued Dependencies
We use multivalued dependencies in two ways:
1. To test relations to determine whether they are legal under a given
set of functional and multivalued dependencies
2. To specify constraints on the set of legal relations. We shall concern
ourselves only with relations that satisfy a given set of functional and
multivalued dependencies.
If a relation r fails to satisfy a given multivalued dependency, we can
construct a relation r′ that does satisfy the multivalued dependency by
adding tuples to r.
Theory of MVDs
Fourth Normal Form
A relation schema R is in 4NF with respect to a set D of functional and
multivalued dependencies if for all multivalued dependencies in D+ of the
form α →→ β, where α ⊆ R and β ⊆ R, at least one of the following holds:
α →→ β is trivial (i.e., β ⊆ α or α ∪ β = R)
α is a superkey for schema R
If a relation is in 4NF it is in BCNF
Restriction of Multivalued Dependencies
4NF Decomposition Algorithm
result := {R};
done := false;
compute D+;
Let Di denote the restriction of D+ to Ri
while (not done)
  if (there is a schema Ri in result that is not in 4NF) then
  begin
    let α →→ β be a nontrivial multivalued dependency that holds
    on Ri such that α → Ri is not in Di, and α ∩ β = ∅;
    result := (result - Ri) ∪ (Ri - β) ∪ (α, β);
  end
  else done := true;
Note: each Ri is in 4NF, and decomposition is lossless-join
Example
R =(A, B, C, G, H, I)
D = {A →→ B
B →→ HI
CG →→ H}
R is not in 4NF since A →→ B and A is not a superkey for R
Decomposition
a) R1 = (A, B) (R1 is in 4NF)
b) R2 = (A, C, G, H, I) (R2 is not in 4NF, decompose into R3 and R4)
c) R3 = (C, G, H) (R3 is in 4NF)
d) R4 = (A, C, G, I) (R4 is not in 4NF, decompose into R5 and R6)
A →→ B and B →→ HI imply A →→ HI (MVD transitivity),
and hence A →→ I (MVD restriction to R4)
e) R5 = (A, I) (R5 is in 4NF)
f) R6 = (A, C, G) (R6 is in 4NF)
Additional Issues
Further Normal Forms
Overall Database Design Process
ER Model and Normalization
When an E-R diagram is carefully designed, identifying all entities
correctly, the tables generated from the E-R diagram should not need
further normalization.
However, in a real (imperfect) design, there can be functional
dependencies from non-key attributes of an entity to other attributes of
the entity
Example: an employee entity with
Attributes
department_name and building,
Functional dependency
department_name → building
Good design would have made department an entity
Functional dependencies from non-key attributes of a relationship set
are possible, but rare; most relationships are binary
Denormalization for Performance
Other Design Issues
Other Design Issues (Cont.)
Modeling Temporal Data
Modeling Temporal Data (Cont.)
In practice, database designers may add start and end time attributes
to relations
E.g., course(course_id, course_title) is replaced by
course(course_id, course_title, start, end)
Constraint: no two tuples can have overlapping valid times
Hard to enforce efficiently
Foreign key references may be to current version of data, or to data at
a point in time
E.g., student transcript should refer to course information at the
time the course was taken
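To make the overlap constraint concrete, here is a small in-memory check. It is only a sketch under stated assumptions: the constraint is taken to apply per course_id, valid times are treated as half-open intervals [start, end), and the function name has_overlap and the sample years are invented. It does not address how a database system would enforce the constraint.

```python
from collections import defaultdict


def has_overlap(tuples):
    """tuples: iterable of (course_id, start, end) with valid time [start, end)."""
    by_course = defaultdict(list)
    for course_id, start, end in tuples:
        by_course[course_id].append((start, end))
    for intervals in by_course.values():
        intervals.sort()  # by start time
        # After sorting, any overlap shows up between neighbouring intervals.
        for (_, prev_end), (next_start, _) in zip(intervals, intervals[1:]):
            if next_start < prev_end:
                return True
    return False


ok = [("CS-101", 2018, 2020), ("CS-101", 2020, 2022), ("BIO-301", 2019, 2021)]
bad = [("CS-101", 2018, 2021), ("CS-101", 2020, 2022)]
print(has_overlap(ok))   # False
print(has_overlap(bad))  # True
```

Even this simple check has to examine all versions of a course together, which hints at why the constraint is hard to enforce efficiently with ordinary key constraints.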
Additional Material
Proof of Correctness of 3NF Decomposition Algorithm
Correctness of 3NF Decomposition Algorithm
Correctness of 3NF Decomposition (Cont.)
Correctness of 3NF Decomposition (Cont.)
Here Ri is generated from a dependency α → β in Fc, and γ → B is any
nontrivial functional dependency on Ri.
Case 1: If B is in β:
If γ is a superkey, the 2nd condition of 3NF is satisfied
Otherwise α must contain some attribute not in γ
Since γ → B is in F+ it must be derivable from Fc, by using attribute
closure on γ.
Attribute closure cannot have used α → β. If it had been used, α must be
contained in the attribute closure of γ, which is not possible, since we
assumed γ is not a superkey.
Now, using α → (β – {B}) and γ → B, we can derive α → B
(since γ ⊆ αβ, and B ∉ γ since γ → B is non-trivial)
Then, B is extraneous in the right-hand side of α → β; which is not
possible since α → β is in Fc.
Thus, if B is in β then γ must be a superkey, and the second
condition of 3NF must be satisfied.
Correctness of 3NF Decomposition (Cont.)
Case 2: B is in α.
Since α is a candidate key, the third alternative in the definition of
3NF is trivially satisfied.
In fact, we cannot show that γ is a superkey.
This shows exactly why the third alternative is present in the
definition of 3NF.
Q.E.D.
Extra
First Normal Form
First Normal Form (Cont.)
End of Chapter 7