Advanced Database Systems - Lec 4
Normalization
Overall Database Design Process
[Figures: example relations STUDENT, CUSTOMER and EMPLOYEE, annotated with partial and transitive dependencies; table contents not recoverable]
First Normal Form
•Each table cell should contain a single value.
•Each record needs to be unique.
Using SSN alone as the primary key conflicts with the second rule of first normal form, because the same SSN appears in several rows (one per product). Hence
we combine SSN and Product to form a composite primary key that uniquely identifies each row.
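The uniqueness rule can be checked mechanically. The sketch below is only an illustration: it assumes the un-normalized table has the columns (SSN, UserName, Product) and fills it with the SSN, UserName and Product values used in this lecture. The helper name `is_unique_key` is hypothetical.

```python
# Assumed un-normalized table (SSN, UserName, Product), assembled from the
# SSN/UserName and SSN/Product values used in this lecture.
COLUMNS = ("SSN", "UserName", "Product")
rows = [
    ("332345432", "Amy", "M"),
    ("666666666", "Kevin", "A"),
    ("666666666", "Kevin", "B"),
    ("666666666", "Kevin", "C"),
    ("666666666", "Kevin", "D"),
    ("919919919", "Raj", "D"),
]


def is_unique_key(rows, columns, key):
    """Return True if the values of the `key` columns never repeat across rows."""
    positions = [columns.index(name) for name in key]
    seen = set()
    for row in rows:
        value = tuple(row[i] for i in positions)
        if value in seen:
            return False  # two rows share the same key value
        seen.add(value)
    return True


ssn_only = is_unique_key(rows, COLUMNS, ["SSN"])
ssn_product = is_unique_key(rows, COLUMNS, ["SSN", "Product"])
print("SSN alone unique:", ssn_only)
print("(SSN, Product) unique:", ssn_product)
```

With this data, SSN alone repeats for Kevin's rows, while the pair (SSN, Product) does not repeat.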
Second Normal Form
Second normal form (2NF) further addresses the concept
of removing duplicative data:
• Meet all the requirements of the first normal form.
• Remove partial dependencies.
• Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
• Create relationships between these new tables and their predecessors through the use of foreign keys.
2nd NF
• Rule 1- Be in 1NF
• Rule 2- No non-key column depends on only part of the primary key (a single-column primary key satisfies this automatically)
SSN UserName
332345432 Amy
666666666 Kevin
919919919 Raj
SSN Product
332345432 M
666666666 A
666666666 B
666666666 C
666666666 D
919919919 D
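The split into an SSN/UserName table and an SSN/Product table can be sketched in Python. This is an illustration under an assumption: the original table is taken to be (SSN, UserName, Product), rebuilt here from the values in the two tables above. Projecting onto the two column sets and joining back on SSN should reproduce the original rows.

```python
# Assumed original table (SSN, UserName, Product) before the 2NF split.
original = [
    ("332345432", "Amy", "M"),
    ("666666666", "Kevin", "A"),
    ("666666666", "Kevin", "B"),
    ("666666666", "Kevin", "C"),
    ("666666666", "Kevin", "D"),
    ("919919919", "Raj", "D"),
]

# Projection 1: one row per user; SSN is the single-column primary key.
users = sorted({(ssn, name) for ssn, name, _ in original})

# Projection 2: one row per (SSN, Product); SSN is a foreign key into users.
user_products = sorted({(ssn, product) for ssn, _, product in original})

# Natural join on SSN to confirm that no information was lost.
name_by_ssn = dict(users)
rejoined = sorted((ssn, name_by_ssn[ssn], product) for ssn, product in user_products)

print(users)
print(rejoined == sorted(original))
```

The user name is now stored once per SSN instead of once per purchased product, which is the duplication that 2NF targets.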
Third Normal Form
Third normal form (3NF) goes one large step further:
• Meet all the requirements of the second normal form.
• Remove columns that are not directly dependent upon the primary key, i.e. remove transitive dependencies.
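A transitive dependency can be spotted by looking for a non-key column that determines another non-key column. The sketch below is a simplified check that assumes the primary key is the only candidate key. The relation EMPLOYEE(Emp_ID, Dept_ID, Dept_Name) and the column Emp_ID are hypothetical, chosen only for illustration; the function name is also an assumption.

```python
def transitive_dependencies(primary_key, fds):
    """Return FDs whose determinant contains no key column and whose
    right-hand side contains at least one non-key column.

    Simplification: assumes the primary key is the only candidate key.
    """
    pk = set(primary_key)
    found = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        non_key_targets = rhs - pk
        if lhs.isdisjoint(pk) and non_key_targets:
            found.append((lhs, non_key_targets))
    return found


# Hypothetical EMPLOYEE(Emp_ID, Dept_ID, Dept_Name) with primary key Emp_ID.
employee_fds = [
    ({"Emp_ID"}, {"Dept_ID"}),
    ({"Dept_ID"}, {"Dept_Name"}),  # Emp_ID -> Dept_ID -> Dept_Name
]

violations = transitive_dependencies({"Emp_ID"}, employee_fds)
print(violations)
```

Under these assumptions the flagged dependency is Dept_ID -> Dept_Name, which suggests moving Dept_Name into a separate table keyed by Dept_ID.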
Third Normal Form
DB(Patno,PatName,appNo,time,doctor)
Determinants:
Patno -> PatName
Patno,appNo -> time,doctor
time -> appNo
Two options for 1NF primary key selection:
DB(Patno,PatName,appNo,time,doctor) with key (Patno,appNo) (example 1a)
DB(Patno,PatName,appNo,time,doctor) with key (Patno,time) (example 1b)
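The candidate keys follow from the determinants listed above and can be derived with an attribute-closure computation. The sketch below is a brute-force illustration suitable only for small relations; the function names `closure` and `candidate_keys` are assumptions, not part of the lecture.

```python
from itertools import combinations

ATTRS = {"Patno", "PatName", "appNo", "time", "doctor"}
FDS = [
    ({"Patno"}, {"PatName"}),
    ({"Patno", "appNo"}, {"time", "doctor"}),
    ({"time"}, {"appNo"}),
]


def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Enumerate minimal attribute sets whose closure covers the relation."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys


keys = candidate_keys(ATTRS, FDS)
for key in keys:
    print(sorted(key))
```

For these determinants the search finds two minimal keys, (Patno, appNo) and (Patno, time), matching the two options for primary key selection.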
Boyce-Codd Normal Form
Patient No Patient Name Appointment Id Time Doctor
DB(Patno,PatName,appNo,time,doctor)
No repeating groups, so in 1NF
2NF – eliminate partial key dependencies:
DB(Patno,appNo,time,doctor)
R1(Patno,PatName)
3NF – no transitive dependencies, so in 3NF
Now try BCNF.
Boyce-Codd Normal Form
Patient No Patient Name Appointment Id Time Doctor
DB(Patno,appNo,time,doctor)
R1(Patno,PatName)
Is determinant a candidate key?
Patno -> PatName
Patno is present in DB, but not PatName, so irrelevant.
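The same question, "is the determinant a candidate key?", can be asked of every determinant in turn. The sketch below follows the procedure used on this slide in simplified form: a dependency is skipped when it mentions attributes outside the relation, and otherwise its determinant must functionally determine the whole relation. The function names are assumptions.

```python
DB = {"Patno", "appNo", "time", "doctor"}
FDS = [
    ({"Patno"}, {"PatName"}),
    ({"Patno", "appNo"}, {"time", "doctor"}),
    ({"time"}, {"appNo"}),
]


def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the FDs inside `relation` whose determinant is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if not (lhs | rhs) <= relation:
            continue  # mentions attributes outside the relation: irrelevant
        if not relation <= closure(lhs, fds):
            violations.append((lhs, rhs))
    return violations


violations = bcnf_violations(DB, FDS)
print(violations)
```

Under this sketch, Patno -> PatName is skipped as irrelevant, Patno,appNo -> time,doctor passes because its determinant is a key, and time -> appNo is flagged because time alone does not determine the whole relation.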
Goals of Normalization
Normalization guidelines are cumulative. For a database to be
in 2NF, it must first fulfill all the criteria of a 1NF database.
Should I Normalize?
While database normalization is often a good idea, it's not an absolute
requirement. In fact, there are some cases where deliberately violating
the rules of normalization is a good practice.
Let R be a relation scheme with a set F of functional dependencies.
• Decide whether the relation scheme R is in “good” form.
• If R is not in “good” form, decompose it into a set of relation schemes {R1, R2, ..., Rn} such that
  - each relation scheme is in good form
  - the decomposition is a lossless-join decomposition
• Preferably, the decomposition should also be dependency preserving.
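For a decomposition into exactly two schemes, the lossless-join property can be tested with attribute closure: the split is lossless if the common attributes functionally determine all of one side. The sketch below applies this test to the patient example from this lecture, splitting it into DB(Patno,appNo,time,doctor) and R1(Patno,PatName). The function names and the second, deliberately bad split are assumptions added for illustration.

```python
FDS = [
    ({"Patno"}, {"PatName"}),
    ({"Patno", "appNo"}, {"time", "doctor"}),
    ({"time"}, {"appNo"}),
]


def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """Binary test: (R1 ∩ R2) must determine all of R1 or all of R2."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure


# Decomposition used in the lecture's 2NF step.
good = is_lossless_binary({"Patno", "appNo", "time", "doctor"},
                          {"Patno", "PatName"}, FDS)

# Hypothetical split with no shared attribute; the join would be a cross product.
bad = is_lossless_binary({"Patno", "PatName", "doctor"},
                         {"appNo", "time"}, FDS)

print(good, bad)
```

This test covers only two-way splits; decompositions into more than two schemes need a more general procedure.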
Part II
Some Normalization Examples
Example 1: Determine NF
[Figure: BOOK relation; table contents not recoverable]
ORDER
Order_No Product_ID Description
The relation is at least in 1NF.
There is a composite primary key (PK), (Order_No, Product_ID), so partial dependencies are possible. Product_ID, which is only part of the PK, determines Description; hence there is a partial dependency and the relation is not in 2NF. There is no point in checking for transitive dependencies, because a relation that is not in 2NF cannot be in 3NF.
In your solution you will write the following justification:
1) No multivalued (M/V) attributes, therefore at least 1NF
2) There is a partial dependency (Product_ID -> Description), therefore not in 2NF
Conclusion: The relation is in 1NF
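The second step of this justification can be sketched as a small check: a dependency is partial when its determinant is a proper subset of the composite primary key and it determines a non-key attribute. The function name `partial_dependencies` is an assumption; the relation and dependency are the ORDER example above.

```python
def partial_dependencies(primary_key, fds):
    """Return FDs whose determinant is a proper subset of the primary key
    and whose right-hand side contains at least one non-key attribute."""
    pk = set(primary_key)
    found = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        non_key_targets = rhs - pk
        if lhs < pk and non_key_targets:  # `<` means proper subset
            found.append((lhs, non_key_targets))
    return found


# ORDER(Order_No, Product_ID, Description) with PK (Order_No, Product_ID).
order_pk = {"Order_No", "Product_ID"}
order_fds = [({"Product_ID"}, {"Description"})]

partials = partial_dependencies(order_pk, order_fds)
# Assumes 1NF has already been established (no multivalued attributes).
normal_form = "1NF" if partials else "at least 2NF"
print(partials, normal_form)
```

With a single-column primary key the proper-subset test can never succeed, which is why such relations have no partial dependencies.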
[Figures: ORDER and PART relations; STUDENT relation with a composite primary key; table contents not recoverable]
STUDENT
Stud_ID Name
101 Lennon
125 Jonson
[Figures: STUDENT_COURSE relation with a composite primary key; EMPLOYEE relation with a transitive dependency; table contents not recoverable]
DEPARTMENT
Dept_ID Dept_Name
1 Acct
2 Mktg