Ch10-Functional Dependencies and Normalization For Relational Databases

Uploaded by

Marnie Omar

Normalization for Relational Databases

Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe


Semantics of the Relation Attributes
■ GUIDELINE 1: Only foreign keys should be used to refer
to other entities
■ Entity and relationship attributes should be kept apart as
much as possible.
■ Attributes of different entities (EMPLOYEEs,
DEPARTMENTs, PROJECTs) should not be mixed in the
same relation



Two relation schemas suffering from update
anomalies



Example States for EMP_DEPT and
EMP_PROJ





Functional Dependencies
■ Functional dependencies (FDs)
■ Are used to specify formal measures of the "goodness" of
relational designs

■ FDs and keys are used to define normal forms for relations
■ A set of attributes X functionally determines a set of
attributes Y if the value of X determines a unique value for Y



Examples of FD constraints
■ Social security number determines employee name
■ SSN -> ENAME
■ Project number determines project name and location
■ PNUMBER -> {PNAME, PLOCATION}
■ Employee SSN and project number together determine the hours per
week that the employee works on the project
■ {SSN, PNUMBER} -> HOURS
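
The FD examples above can be tested against a relation state. The following is a minimal Python sketch, not part of the original slides: the function name fd_holds and the sample rows are assumptions made for illustration. A single state can only refute an FD; it cannot prove that the FD holds for every legal state of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical EMP_PROJ-style state (values invented for illustration).
emp_proj = [
    {"SSN": "111", "PNUMBER": 1, "HOURS": 32.5,
     "ENAME": "Smith", "PNAME": "ProductX", "PLOCATION": "Bellaire"},
    {"SSN": "111", "PNUMBER": 2, "HOURS": 7.5,
     "ENAME": "Smith", "PNAME": "ProductY", "PLOCATION": "Sugarland"},
    {"SSN": "222", "PNUMBER": 1, "HOURS": 20.0,
     "ENAME": "Wong", "PNAME": "ProductX", "PLOCATION": "Bellaire"},
]

print(fd_holds(emp_proj, ["SSN"], ["ENAME"]))               # SSN -> ENAME
print(fd_holds(emp_proj, ["PNUMBER"], ["PNAME", "PLOCATION"]))
print(fd_holds(emp_proj, ["SSN", "PNUMBER"], ["HOURS"]))
print(fd_holds(emp_proj, ["SSN"], ["HOURS"]))               # refuted by this state
```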



3 Normal Forms Based on Primary Keys
■ 1. Normalization of Relations
■ 2. Practical Use of Normal Forms
■ 3. Definitions of Keys and Attributes Participating in Keys
■ 4. First Normal Form
■ 5. Second Normal Form
■ 6. Third Normal Form



3.1 Normalization of Relations (1)
■ Normalization:
■ The process of decomposing unsatisfactory "bad" relations by
breaking up their attributes into smaller relations

■ Normal form:
■ Condition using keys and FDs of a relation to certify whether a
relation schema is in a particular normal form



Normalization of Relations (2)
■ 2NF, 3NF, BCNF
■ based on keys and FDs of a relation schema
■ 4NF
■ based on keys and multivalued dependencies (MVDs)
■ 5NF
■ based on keys and join dependencies (JDs) (Chapter 11)
■ Additional properties may be needed to ensure a good relational
design (lossless join, dependency preservation; Chapter 11)



3.2 Practical Use of Normal Forms
■ Normalization is carried out in practice so that the
resulting designs are of high quality and meet the
desirable properties
■ The practical utility of these normal forms becomes
questionable when the constraints on which they are
based are hard to understand or to detect
■ The database designers need not normalize to the highest
possible normal form
■ (usually up to 3NF, BCNF or 4NF)
■ Denormalization:
■ The process of storing the join of higher normal form relations as
a base relation—which is in a lower normal form



3.3 Definitions of Keys and Attributes
Participating in Keys (1)
■ A superkey of a relation schema R = {A1, A2, ..., An} is a
set of attributes S ⊆ R with the property that no two distinct
tuples t1 and t2 in any legal relation state r of R will have
t1[S] = t2[S]
■ A key K is a superkey with the additional property that
removal of any attribute from K will cause K not to be a
superkey any more.



Definitions of Keys and Attributes
Participating in Keys (2)
■ If a relation schema has more than one key, each is called a
candidate key.
■ One of the candidate keys is arbitrarily designated to be the
primary key, and the others are called secondary keys.
■ A Prime attribute must be a member of some candidate
key
■ A Nonprime attribute is not a prime attribute—that is, it is
not a member of any candidate key.
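
The key definitions above can be sketched in Python using attribute closure. This is an illustrative brute-force sketch, not part of the original slides: the helper names closure and candidate_keys are hypothetical, and the EMP_PROJ attribute list is assumed from the FD examples given earlier. The subset enumeration is exponential, so it only suits small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """All minimal superkeys, smallest first (brute force over subsets)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            # A superkey determines every attribute; skip supersets of known keys.
            if closure(s, fds) == schema and not any(k <= s for k in keys):
                keys.append(s)
    return keys


# Assumed EMP_PROJ schema, with the three FDs from the examples slide.
schema = {"SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOCATION"}
fds = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]

keys = candidate_keys(schema, fds)
prime = set().union(*keys)      # members of some candidate key
nonprime = schema - prime       # members of no candidate key
print(keys, prime, nonprime)
```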



3.4 First Normal Form
■ Disallows
■ composite attributes
■ multivalued attributes
■ nested relations; attributes whose values for an individual
tuple are non-atomic

■ Considered to be part of the definition of relation



Figure 10.8 Normalization into 1NF



Figure 10.9 Normalizing nested
relations into 1NF



3.5 Second Normal Form (1)
■ Uses the concepts of FDs, primary key
■ Definitions
■ Prime attribute: An attribute that is a member of the primary
key K
■ Full functional dependency: an FD Y -> Z where
removal of any attribute from Y means the FD does not hold
any more
■ Examples:
■ {SSN, PNUMBER} -> HOURS is a full FD since neither SSN
-> HOURS nor PNUMBER -> HOURS holds
■ {SSN, PNUMBER} -> ENAME is not a full FD (it is called
a partial dependency) since SSN -> ENAME also holds



Second Normal Form (2)
■ A relation schema R is in second normal form (2NF) if
every non-prime attribute A in R is fully functionally
dependent on the primary key

■ R can be decomposed into 2NF relations via the process of
2NF normalization
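
The primary-key-based 2NF test can be sketched as a search for partial dependencies. This Python sketch is illustrative only: the helper names are hypothetical, and the EMP_PROJ attribute list and the three-attribute relation at the end are assumed from the earlier FD examples rather than copied from the figures.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(schema, primary_key, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    nonprime = schema - primary_key
    found = {}
    for size in range(1, len(primary_key)):
        for combo in combinations(sorted(primary_key), size):
            determined = closure(set(combo), fds) & nonprime
            if determined:
                found[combo] = determined
    return found


schema = {"SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOCATION"}
key = {"SSN", "PNUMBER"}
fds = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]

# A non-empty result means the relation is not in 2NF.
print(partial_dependencies(schema, key, fds))

# Keeping only the attributes that depend on the whole key removes the problem.
print(partial_dependencies({"SSN", "PNUMBER", "HOURS"}, key, fds))
```

An empty result means every non-key attribute is fully functionally dependent on the primary key, which is the 2NF condition stated above.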



Figure 10.10 Normalizing into 2NF and
3NF



Figure 10.11 Normalization into 2NF and
3NF



3.6 Third Normal Form (1)
■ Definition:
■ Transitive functional dependency: an FD X -> Z that
can be derived from two FDs X -> Y and Y -> Z
■ Examples:
■ SSN -> DMGRSSN is a transitive FD
■ Since SSN -> DNUMBER and DNUMBER ->
DMGRSSN hold
■ SSN -> ENAME is non-transitive
■ Since there is no set of attributes X where SSN -> X and
X -> ENAME



Third Normal Form (2)
■ A relation schema R is in third normal form (3NF) if it is
in 2NF and no non-prime attribute A in R is transitively
dependent on the primary key
■ R can be decomposed into 3NF relations via the process of
3NF normalization
■ NOTE:
■ In X -> Y and Y -> Z, with X as the primary key, we consider
this a problem only if Y is not a candidate key.
■ When Y is a candidate key, there is no problem with the
transitive dependency.
■ E.g., consider EMP (SSN, Emp#, Salary).
■ Here, SSN -> Emp# -> Salary and Emp# is a candidate key.



Normal Forms Defined Informally
■ 1st normal form
■ All attributes depend on the key
■ 2nd normal form
■ All attributes depend on the whole key
■ 3rd normal form
■ All attributes depend on nothing but the key



4 General Normal Form Definitions (For
Multiple Keys) (1)
■ The above definitions consider the primary key only
■ The following more general definitions take into account
relations with multiple candidate keys
■ A relation schema R is in second normal form (2NF) if
every non-prime attribute A in R is fully functionally
dependent on every key of R



General Normal Form Definitions (2)
■ Definition:
■ Superkey of relation schema R - a set of attributes S of R that
contains a key of R
■ A relation schema R is in third normal form (3NF) if
whenever a FD X -> A holds in R, then either:
■ (a) X is a superkey of R, or
■ (b) A is a prime attribute of R
■ NOTE: Boyce-Codd normal form disallows condition (b)
above
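
The general 3NF condition can be sketched as a check over a list of FDs. This Python sketch is illustrative and rests on stated assumptions: the helper names are hypothetical, the four-attribute EMP_DEPT schema is a simplification built from the FDs named on the Third Normal Form slide, and only the listed FDs are checked rather than every FD they imply.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """All minimal superkeys, smallest first (brute force, small schemas only)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            if closure(s, fds) == schema and not any(k <= s for k in keys):
                keys.append(s)
    return keys


def violations_3nf(schema, fds):
    """FDs X -> A where X is not a superkey and A is not a prime attribute."""
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) >= schema:      # condition (a): X is a superkey
            continue
        for a in sorted(rhs - lhs):          # ignore trivial parts of the FD
            if a not in prime:               # condition (b) fails as well
                bad.append((tuple(sorted(lhs)), a))
    return bad


# Simplified EMP_DEPT (assumed attribute list).
emp_dept = {"SSN", "ENAME", "DNUMBER", "DMGRSSN"}
fds = [
    ({"SSN"}, {"ENAME", "DNUMBER"}),
    ({"DNUMBER"}, {"DMGRSSN"}),
]
print(violations_3nf(emp_dept, fds))

# Moving DMGRSSN out leaves a relation with no violations.
print(violations_3nf({"SSN", "ENAME", "DNUMBER"}, fds[:1]))
```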



5 BCNF (Boyce-Codd Normal Form)
■ A relation schema R is in Boyce-Codd Normal Form
(BCNF) if whenever an FD X -> A holds in R, then X is a
superkey of R
■ Each normal form is strictly stronger than the previous one
■ Every 2NF relation is in 1NF
■ Every 3NF relation is in 2NF
■ Every BCNF relation is in 3NF
■ There exist relations that are in 3NF but not in BCNF
■ The goal is to have each relation in BCNF (or 3NF)



Figure 10.12 Boyce-Codd normal form



Figure 10.13 A relation TEACH that is in
3NF but not in BCNF



Achieving the BCNF by Decomposition
(1)
■ Two FDs exist in the relation TEACH:
■ fd1: { student, course} -> instructor
■ fd2: instructor -> course
■ {student, course} is a candidate key for this relation, and
the dependencies shown follow the pattern in Figure
10.12 (b).
■ So this relation is in 3NF but not in BCNF
■ A relation NOT in BCNF should be decomposed so as to
meet this property, while possibly forgoing the preservation
of all functional dependencies in the decomposed relations.
■ (See Algorithm 11.3)
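
The claim that TEACH is in 3NF but not in BCNF can be checked mechanically. This is a minimal Python sketch with hypothetical helper names, and it is not the textbook's Algorithm 11.3. It tests only the two listed FDs, and the allow_prime_rhs flag switches between the 3NF condition (a prime right-hand side is tolerated) and the BCNF condition (it is not).

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """All minimal superkeys, smallest first (brute force, small schemas only)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            if closure(s, fds) == schema and not any(k <= s for k in keys):
                keys.append(s)
    return keys


def violations(schema, fds, allow_prime_rhs):
    """Violating FDs: 3NF test if allow_prime_rhs is True, BCNF test otherwise."""
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) >= schema:      # left side is a superkey
            continue
        for a in sorted(rhs - lhs):
            if allow_prime_rhs and a in prime:
                continue                     # tolerated by 3NF only
            bad.append((tuple(sorted(lhs)), a))
    return bad


teach = {"student", "course", "instructor"}
fds = [
    ({"student", "course"}, {"instructor"}),   # fd1
    ({"instructor"}, {"course"}),              # fd2
]

print(candidate_keys(teach, fds))
print(violations(teach, fds, allow_prime_rhs=True))    # 3NF check
print(violations(teach, fds, allow_prime_rhs=False))   # BCNF check
```

In this sketch fd2 passes the 3NF test only because course is a prime attribute; instructor alone is not a superkey, so fd2 fails the BCNF test.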



Achieving the BCNF by Decomposition
(2)
■ Three possible decompositions for relation TEACH
■ {student, instructor} and {student, course}

■ {course, instructor } and {course, student}

■ {instructor, course } and {instructor, student}

■ All three decompositions will lose fd1.
■ We have to settle for sacrificing the functional dependency
preservation. But we cannot sacrifice the non-additivity property
after decomposition.
■ Out of the above three, only the 3rd decomposition will not generate
spurious tuples after join (and hence has the non-additivity property).
■ A test to determine whether a binary decomposition (decomposition into
two relations) is non-additive (lossless) is discussed in section
11.1.4 under Property LJ1. Verify that the third decomposition above
meets the property.
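
The verification suggested above can be sketched in Python. The sketch assumes the binary test takes its commonly stated form: a decomposition {R1, R2} is non-additive if the common attributes R1 ∩ R2 functionally determine either R1 - R2 or R2 - R1. The helper names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_nonadditive(r1, r2, fds):
    """Binary lossless-join test: (R1 ∩ R2) -> (R1 - R2) or (R1 ∩ R2) -> (R2 - R1)."""
    determined = closure(r1 & r2, fds)
    return (r1 - r2) <= determined or (r2 - r1) <= determined


fds = [
    ({"student", "course"}, {"instructor"}),   # fd1
    ({"instructor"}, {"course"}),              # fd2
]

# The three decompositions of TEACH, in the order listed above.
decompositions = [
    ({"student", "instructor"}, {"student", "course"}),
    ({"course", "instructor"}, {"course", "student"}),
    ({"instructor", "course"}, {"instructor", "student"}),
]

results = [is_nonadditive(r1, r2, fds) for r1, r2 in decompositions]
print(results)
```

Only the third decomposition passes: its common attribute is instructor, and fd2 gives instructor -> course, which covers R1 - R2.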



Chapter Outline
■ Informal Design Guidelines for Relational Databases
■ Functional Dependencies (FDs)
■ Definition, Inference Rules, Equivalence of Sets of FDs,
Minimal Sets of FDs
■ Normal Forms Based on Primary Keys
■ General Normal Form Definitions (For Multiple Keys)
■ BCNF (Boyce-Codd Normal Form)
