8 Normalization
Normalization
We discuss four normal forms: first, second, third, and
Boyce-Codd normal forms (1NF, 2NF, 3NF, and BCNF).
91.2914 1
There is a sequence to normal forms:
1NF is considered the weakest,
2NF is stronger than 1NF,
3NF is stronger than 2NF, and
BCNF is considered the strongest
Also,
any relation that is in BCNF is in 3NF;
any relation in 3NF is in 2NF; and
any relation in 2NF is in 1NF.
We consider a relation in BCNF to be fully normalized.
A design that has a lower normal form than another design has
more redundancy. Uncontrolled redundancy can lead to data
integrity problems.
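Functional Dependencies
You might see FDs depicted in 3 different ways, for example:
As separate dependencies:
EmpNum → EmpEmail
EmpNum → EmpFname
EmpNum → EmpLname
With one determinant and a combined right-hand side:
EmpNum → EmpEmail, EmpFname, EmpLname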
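As a rough illustration of what a functional dependency means for stored data, the sketch below tests whether a set of rows violates a given FD. The function name `fd_holds` and the sample rows are hypothetical, invented only for this example. A sample can refute an FD but cannot prove that it holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault keeps the first right-hand value seen for each left-hand value
        if seen.setdefault(left, right) != right:
            return False
    return True

# Hypothetical rows, not taken from the slides
employees = [
    {"EmpNum": 123, "EmpFname": "Ann", "EmpLname": "Lee", "EmpEmail": "ann@example.com"},
    {"EmpNum": 333, "EmpFname": "Bob", "EmpLname": "Lee", "EmpEmail": "bob@example.com"},
    {"EmpNum": 679, "EmpFname": "Ann", "EmpLname": "Roy", "EmpEmail": "ann.roy@example.com"},
]

# EmpNum -> EmpEmail, EmpFname, EmpLname is not violated by these rows
print(fd_holds(employees, ["EmpNum"], ["EmpEmail", "EmpFname", "EmpLname"]))  # True
# EmpLname -> EmpFname is violated: two different first names share "Lee"
print(fd_holds(employees, ["EmpLname"], ["EmpFname"]))  # False
```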
Determinant
In the functional dependency EmpNum → EmpEmail, EmpNum is the determinant.
Transitive dependency
EmpNum → DeptNum
DeptNum → DeptName
DeptName is therefore transitively dependent on EmpNum, through DeptNum.
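A transitive dependency can be found mechanically by computing an attribute closure. The sketch below uses the standard closure algorithm with the two dependencies above. The function name `closure` and the tuple encoding of FDs are choices made for this example only.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD once its whole left-hand side is already in the result
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [
    (("EmpNum",), ("DeptNum",)),
    (("DeptNum",), ("DeptName",)),
]

# DeptName appears in the closure of EmpNum only by way of DeptNum
print(sorted(closure({"EmpNum"}, fds)))   # ['DeptName', 'DeptNum', 'EmpNum']
print(sorted(closure({"DeptNum"}, fds)))  # ['DeptName', 'DeptNum']
```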
Partial dependency
A partial dependency exists when an attribute B is
functionally dependent on an attribute A, and A is a
component of a multipart candidate key.
First Normal Form
The following is not in 1NF, because EmpDegrees can hold more than one value:
EmpNum  EmpPhone  EmpDegrees
123     233-9876
333     233-1231  BA, BSc, PhD
679     233-1231  BSc, MSc
In 1NF, the data is split into two relations:
Employee
EmpNum  EmpPhone
123     233-9876
333     233-1231
679     233-1231
EmployeeDegree
EmpNum  EmpDegree
333     BA
333     BSc
333     PhD
679     BSc
679     MSc
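The move to 1NF can be sketched as a small transformation: the multivalued EmpDegrees column is pulled out into its own relation, one row per degree. The function name `to_1nf` and the tuple layout are assumptions made for this example. The values are the ones shown above.

```python
# Rows of the non-1NF table; EmpDegrees is a list (empty for employee 123)
employee_raw = [
    (123, "233-9876", []),
    (333, "233-1231", ["BA", "BSc", "PhD"]),
    (679, "233-1231", ["BSc", "MSc"]),
]

def to_1nf(rows):
    """Split the rows into Employee and EmployeeDegree relations."""
    employee = [(num, phone) for num, phone, _ in rows]
    # One (EmpNum, EmpDegree) row per degree; employees with no degree add no rows
    employee_degree = [(num, deg) for num, _, degrees in rows for deg in degrees]
    return employee, employee_degree

employee, employee_degree = to_1nf(employee_raw)
print(employee)
print(employee_degree)
```

Every value in the two resulting relations is atomic, and EmpNum links a degree back to its employee.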
Boyce-Codd Normal Form
A relation is in BCNF if the relation is in 1NF and all
determinants are candidate keys.
Second Normal Form
A relation is in 2NF if it is in 1NF and every non-key attribute
is fully dependent on each candidate key. (That is, we don't
have any partial functional dependency.)
2NF (and 3NF) both involve the concepts of key and
non-key attributes.
A key attribute is any attribute that is part of a key;
any attribute that is not a key attribute is a non-key attribute.
Relations that are not in BCNF have data redundancies.
A relation in 2NF will not have any partial dependencies.
Consider this InvLine table (in 1NF):
InvNum LineNum ProdNum Qty InvDate
InvNum, LineNum → ProdNum, Qty
InvNum → InvDate
There are two candidate keys.
InvDate is a non-key attribute, and it is dependent on InvNum,
which is only part of a candidate key.
InvLine is not in 2NF, since there is a partial dependency of
InvDate on InvNum.
Since there is a determinant that is not a candidate key,
InvLine is not in BCNF.
InvLine is only in 1NF.
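The 2NF test for InvLine can be sketched in code: find the candidate keys, then look for non-key attributes determined by a proper part of a key. The helper names are hypothetical. The third FD below (InvNum, ProdNum → LineNum) is an assumption that does not appear in the text above; it is added only because it would yield a second candidate key.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

def partial_dependencies(attributes, fds):
    """List (part_of_key, non_key_attribute) pairs that break 2NF."""
    keys = candidate_keys(attributes, fds)
    key_attrs = set().union(*keys)
    found = []
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                dependents = closure(part, fds) - set(part) - key_attrs
                for attr in sorted(dependents):
                    if (part, attr) not in found:
                        found.append((part, attr))
    return found

attributes = ["InvNum", "LineNum", "ProdNum", "Qty", "InvDate"]
fds = [
    (("InvNum", "LineNum"), ("ProdNum", "Qty")),
    (("InvNum",), ("InvDate",)),
    (("InvNum", "ProdNum"), ("LineNum",)),  # ASSUMPTION: not stated in the text
]

keys = candidate_keys(attributes, fds)
partials = partial_dependencies(attributes, fds)
print([sorted(k) for k in keys])  # [['InvNum', 'LineNum'], ['InvNum', 'ProdNum']]
print(partials)                   # [(('InvNum',), 'InvDate')]
```

The brute-force key search is exponential in the number of attributes, which is acceptable only for small classroom examples like this one.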
InvLine
InvNum LineNum ProdNum Qty InvDate
The above relation has redundancies: the invoice date is
repeated on each invoice line.
We can improve the database by decomposing the relation
into two relations:
InvNum LineNum ProdNum Qty
InvNum InvDate
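The decomposition above can be sketched with plain Python sets: project the rows onto the two smaller schemas, then join them back on InvNum to confirm nothing was lost. The sample rows and the variable names `lines` and `invoices` are hypothetical, chosen only for illustration.

```python
# Hypothetical InvLine rows: (InvNum, LineNum, ProdNum, Qty, InvDate)
inv_line = [
    (1001, 1, "P10", 2, "2024-03-01"),
    (1001, 2, "P22", 1, "2024-03-01"),  # the invoice date is repeated
    (1002, 1, "P10", 5, "2024-03-04"),
]

# Project onto the two smaller schemas; a set keeps each invoice date once
lines = {(inv, line, prod, qty) for inv, line, prod, qty, _ in inv_line}
invoices = {(inv, date) for inv, _, _, _, date in inv_line}

# A natural join on InvNum rebuilds the original rows
rejoined = {
    (inv, line, prod, qty, date)
    for inv, line, prod, qty in lines
    for inv2, date in invoices
    if inv == inv2
}

print(len(invoices))               # 2
print(rejoined == set(inv_line))   # True
```

The join returns exactly the original rows because InvNum determines InvDate, so each line matches a single invoice row.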
In 2NF, but not in 3NF, nor in BCNF:
EmployeeDept
ename ssn bdate address dnumber dname
dnumber → dname
Third Normal Form
A relation is in 3NF if the relation is in 1NF and all
determinants of non-key attributes are candidate keys.
That is, for any functional dependency X → Y, where Y is
a non-key attribute (or a set of non-key attributes), X is a
candidate key.
This definition of 3NF differs from BCNF only in the
specification of non-key attributes - 3NF is weaker than
BCNF. (BCNF requires all determinants to be candidate
keys.)
A relation in 3NF will not have any transitive dependencies
of a non-key attribute on a candidate key through another
non-key attribute.
Consider this Employee relation. What are the candidate keys?
EmpNum EmpName DeptNum DeptName
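One way to explore the question is to compute the candidate keys from a set of FDs and then apply the 3NF definition given earlier. In the sketch below, EmpNum → DeptNum and DeptNum → DeptName come from the transitive dependency example, while EmpNum → EmpName is an assumption. The helper names are hypothetical, and the check assumes each FD is written with a minimal left-hand side.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

def violations_3nf(attributes, fds):
    """Return FDs whose determinant is not a candidate key but which
    determine at least one non-key attribute."""
    keys = candidate_keys(attributes, fds)
    key_attrs = set().union(*keys)
    bad = []
    for lhs, rhs in fds:
        nonkey_rhs = set(rhs) - set(lhs) - key_attrs
        if nonkey_rhs and set(lhs) not in keys:
            bad.append((lhs, tuple(sorted(nonkey_rhs))))
    return bad

attributes = ["EmpNum", "EmpName", "DeptNum", "DeptName"]
fds = [
    (("EmpNum",), ("EmpName",)),   # ASSUMPTION: not stated in the text
    (("EmpNum",), ("DeptNum",)),
    (("DeptNum",), ("DeptName",)),
]

keys = candidate_keys(attributes, fds)
bad = violations_3nf(attributes, fds)
print([sorted(k) for k in keys])  # [['EmpNum']]
print(bad)                        # [(('DeptNum',), ('DeptName',))]
```

Under these FDs, EmpNum is the only candidate key, and DeptNum → DeptName is the dependency that keeps Employee out of 3NF.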
In 3NF, but not in BCNF:
student_no course_no instr_no
Decomposed into BCNF:
student_no instr_no
course_no instr_no
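The BCNF test (every determinant is a candidate key) can be sketched as follows. The dependencies for this relation are not shown in the text, so the two FDs below are assumptions: a student takes a course from one instructor, and each instructor teaches one course. The helper names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

def violations_bcnf(attributes, fds):
    """Return non-trivial FDs whose determinant is not a candidate key."""
    keys = candidate_keys(attributes, fds)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and set(lhs) not in keys
    ]

attributes = ["student_no", "course_no", "instr_no"]
fds = [
    (("student_no", "course_no"), ("instr_no",)),  # ASSUMPTION
    (("instr_no",), ("course_no",)),               # ASSUMPTION
]

keys = candidate_keys(attributes, fds)
non_key = set(attributes) - set().union(*keys)
bad = violations_bcnf(attributes, fds)
print([sorted(k) for k in keys])  # [['course_no', 'student_no'], ['instr_no', 'student_no']]
print(non_key)                    # set()
print(bad)                        # [(('instr_no',), ('course_no',))]
```

With these assumed FDs there are no non-key attributes, so the 3NF condition is met, while instr_no → course_no has a determinant that is not a candidate key. In each of the two decomposed relations, the remaining determinants are candidate keys.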