Normalisation
Database Normalization
Proposed by Codd (1972)
Introduced three normal forms: first, second, and
third normal form (1NF, 2NF, 3NF)
A stronger definition of 3NF, called Boyce-Codd
normal form (BCNF), was proposed later
Later still, 4NF and 5NF were proposed
The minimum, and most common, goal is to achieve 3NF.
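Normalization is the process of analyzing the given
Y is functionally dependent on X (written X -> Y) if each value of X
is associated with exactly one value of Y.
Sample relation:
X   Y   Z
10  B2  C2
11  B4  C1
12  B3  C4
13  B1  C1
14  B3  C4
Dependencies to test against the sample:
X -> Y    Y -> Z
Y -> X    Z -> Y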
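The dependencies listed with the sample rows can be tested mechanically. The following is a minimal sketch, assuming the three columns are named X, Y and Z (the column headers do not survive in the slide) and using a hypothetical helper `fd_holds`. Note that sample data can only refute a dependency; it cannot prove that the dependency holds for every legal state of the relation.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, str]

# Sample rows from the slide; the column names X, Y, Z are an assumption.
ROWS: List[Row] = [
    {"X": "10", "Y": "B2", "Z": "C2"},
    {"X": "11", "Y": "B4", "Z": "C1"},
    {"X": "12", "Y": "B3", "Z": "C4"},
    {"X": "13", "Y": "B1", "Z": "C1"},
    {"X": "14", "Y": "B3", "Z": "C4"},
]


def fd_holds(rows: List[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen: Dict[Tuple[str, ...], Tuple[str, ...]] = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, value) != value:
            return False
    return True


for lhs, rhs in [("X", "Y"), ("Y", "Z"), ("Y", "X"), ("Z", "Y")]:
    print(f"{lhs} -> {rhs}: {fd_holds(ROWS, [lhs], [rhs])}")
```

On this sample, Y -> X is violated because B3 appears with both 12 and 14, and Z -> Y is violated because C1 appears with both B4 and B1.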
• Functional dependency is “good”. With functional
dependency, the primary key (Attribute A) determines the
value of all the other non-key attributes (Attributes
B, C, D, etc.)
• Transitive dependency is “bad”. Transitive dependency
exists if the primary/candidate key (Attribute A)
determines non-key Attribute B, and Attribute B
determines non-key Attribute C.
• If a relation schema has more than one key, each is called a
candidate key
• An attribute in a relation schema R is called prime if it is a
member of some candidate key of R
First Normal Form (1NF)
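Relation with a multi-valued attribute (Location):
Deptno  Dname      Location
10      IT         Leeds, Bradford, Kent
20      Research   Hundredfold
30      Marketing  Leeds
Decomposition into 1NF:
Deptno  Dname
10      IT
20      Research
30      Marketing
Deptno  Location
10      Leeds
10      Bradford
10      Kent
20      Hundredfold
30      Leeds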
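The 1NF decomposition can be sketched in code. This is a minimal illustration, assuming department 10 originally carried the three locations Leeds, Bradford and Kent in a single multi-valued Location attribute; the names `DEPT_UNF` and `to_1nf` are hypothetical.

```python
from typing import List, Tuple

# Unnormalized rows: the third field holds a list of locations (assumed shape).
DEPT_UNF: List[Tuple[int, str, List[str]]] = [
    (10, "IT", ["Leeds", "Bradford", "Kent"]),
    (20, "Research", ["Hundredfold"]),
    (30, "Marketing", ["Leeds"]),
]


def to_1nf(
    rows: List[Tuple[int, str, List[str]]]
) -> Tuple[List[Tuple[int, str]], List[Tuple[int, str]]]:
    """Split the multi-valued Location attribute into its own relation."""
    # One row per department: (Deptno, Dname)
    dept = [(deptno, dname) for deptno, dname, _ in rows]
    # One row per (department, location) pair: (Deptno, Location)
    dept_loc = [(deptno, loc) for deptno, _, locs in rows for loc in locs]
    return dept, dept_loc


dept, dept_loc = to_1nf(DEPT_UNF)
print(dept)
print(dept_loc)
```

Every attribute value in both resulting relations is atomic, which is the 1NF requirement.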
Second Normal Form (2NF)
Each non-key attribute must be fully functionally dependent on
the primary key.
• If the primary key is a single attribute, then a relation in 1NF is automatically in 2NF
• The test for 2NF involves testing for FDs whose left-hand-side
attributes are part of the primary key
• Disallow partial dependencies, where non-key attributes depend on
only part of a composite primary key
• In short, remove partial dependencies
2NF improves data integrity.
• Prevents update, insert, and delete anomalies.
2NF
PNo PName PLoc EmpNo EName Salary Address HoursNo
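The 2NF test can be sketched with attribute closures. The slide does not list the dependencies for this schema, so the key {PNo, EmpNo} and the three FDs below are assumptions made only for illustration, and the helpers `closure` and `partial_dependencies` are hypothetical names.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

FD = Tuple[FrozenSet[str], FrozenSet[str]]

# Assumed composite key and FDs (not given in the source).
KEY: FrozenSet[str] = frozenset({"PNo", "EmpNo"})
FDS: List[FD] = [
    (frozenset({"PNo"}), frozenset({"PName", "PLoc"})),
    (frozenset({"EmpNo"}), frozenset({"EName", "Salary", "Address"})),
    (frozenset({"PNo", "EmpNo"}), frozenset({"HoursNo"})),
]
NON_KEY: Set[str] = {"PName", "PLoc", "EName", "Salary", "Address", "HoursNo"}


def closure(attrs: Iterable[str], fds: List[FD]) -> Set[str]:
    """Compute the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(
    key: FrozenSet[str], fds: List[FD], non_key: Set[str]
) -> Dict[FrozenSet[str], Set[str]]:
    """Map each proper subset of the key to the non-key attributes it determines."""
    found: Dict[FrozenSet[str], Set[str]] = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) & non_key
            if determined:
                found[frozenset(subset)] = determined
    return found


for part, attrs in partial_dependencies(KEY, FDS, NON_KEY).items():
    print(sorted(part), "->", sorted(attrs))
```

Under these assumptions, each reported subset and its dependent attributes would move to its own relation, leaving HoursNo with the full key.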
Examples of transitive dependencies:
• Area code attribute based on the City attribute of a customer
• Total price attribute of an order entry based on the quantity
attribute and unit price attribute (calculated value)
Solution:
• Any transitive dependencies are moved into a smaller table.
Transitive Dependency
Given a relation R(EmpNo, EName, Salary, Address)
Assume the following FD holds:
EName -> Address
Note: Both EName and Address are non-key attributes in R. Since
Address depends on the non-prime attribute EName, which in turn depends on the primary
key (EmpNo), a transitive dependency exists.
Decomposition:
R1(EmpNo, EName, Salary)
R2(EName, Address)
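The decomposition into R1 and R2 can be checked on sample data. This is a minimal sketch: the rows are made up for illustration (the slide gives none), and `project` and `natural_join` are hypothetical helpers. It projects R onto the two smaller schemas and joins them back on EName to confirm that no information is lost.

```python
from typing import Dict, List

Row = Dict[str, object]

# Hypothetical rows for R(EmpNo, EName, Salary, Address); values are invented
# and chosen so that EName -> Address holds.
R: List[Row] = [
    {"EmpNo": 1, "EName": "Ali", "Salary": 3000, "Address": "Leeds"},
    {"EmpNo": 2, "EName": "Sara", "Salary": 3500, "Address": "Kent"},
    {"EmpNo": 3, "EName": "Ali", "Salary": 2800, "Address": "Leeds"},
]


def project(rows: List[Row], attrs: List[str]) -> List[Row]:
    """Project rows onto attrs, dropping duplicate tuples and keeping order."""
    seen = set()
    out: List[Row] = []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


def natural_join(left: List[Row], right: List[Row], on: str) -> List[Row]:
    """Join two relations on a single shared attribute."""
    return [{**l, **r} for l in left for r in right if l[on] == r[on]]


R1 = project(R, ["EmpNo", "EName", "Salary"])
R2 = project(R, ["EName", "Address"])
rejoined = natural_join(R1, R2, "EName")

print(len(R2))        # each EName/Address pair is stored once in R2
print(rejoined == R)  # the join reconstructs the original rows
```

After the split, an address is stored once per EName in R2 rather than once per employee row, which is what removes the update anomaly.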
Figure 5.7
Figure 5.8: The Decomposition of a Table Structure to Meet BCNF Requirements
Table 5.2: Sample Data for a BCNF Conversion
Figure 5.9: Decomposition into BCNF