5 Normalization
CONSTRAINT VIOLATIONS
• Whenever a database is modified, the relational database constraints must not be violated.
• There are three kinds of modification:
Insert: a new tuple is inserted into the relation.
Update: an existing tuple is modified.
Delete: an existing tuple is removed.
• The schema-based constraints that each operation can violate are:
Insert: domain constraint, key constraint, null constraint, entity integrity, referential integrity.
Update: domain constraint, key constraint, null constraint, entity integrity, referential integrity.
Delete: referential integrity.
• SQL rejects any insert, update or delete operation that would violate a schema-based constraint. However,
data anomalies such as loss of data or null values can still occur when referential integrity is enforced. SQL
ensures that the database stays at least in 1NF.
• Data dependencies are not protected in the same way. When a relation contains partial functional dependencies
or transitive dependencies, insert, update and delete operations can leave the data inconsistent, causing data
anomalies. These anomalies cannot be prevented unless the database is normalized.
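The behaviour described above can be tried out with a small sketch. It uses Python's built-in sqlite3 module; the two table definitions (department, employee) are hypothetical and only borrow values from the EMPLOYEE example in these notes. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`, which is an SQLite detail and not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_id TEXT PRIMARY KEY, dept_head TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE employee ("
    " empl_id TEXT PRIMARY KEY,"
    " dept_id TEXT NOT NULL REFERENCES department(dept_id))"
)

def attempt(sql, params=()):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "insert department": attempt("INSERT INTO department VALUES (?, ?)", ("EIE", "Sajitha")),
    "insert employee": attempt("INSERT INTO employee VALUES (?, ?)", ("EE001", "EIE")),
    # Key constraint: EE001 already exists.
    "insert duplicate key": attempt("INSERT INTO employee VALUES (?, ?)", ("EE001", "EIE")),
    # Null constraint: dept_head is NOT NULL.
    "insert null": attempt("INSERT INTO department VALUES (?, NULL)", ("CEE",)),
    # Referential integrity: department XYZ does not exist.
    "insert dangling reference": attempt("INSERT INTO employee VALUES (?, ?)", ("EE005", "XYZ")),
    "update to dangling reference": attempt(
        "UPDATE employee SET dept_id = 'XYZ' WHERE empl_id = 'EE001'"
    ),
    # Referential integrity on delete: EE001 still refers to EIE.
    "delete referenced row": attempt("DELETE FROM department WHERE dept_id = 'EIE'"),
}

for name, accepted in results.items():
    print(f"{name}: {'accepted' if accepted else 'rejected'}")
```

Only the two valid inserts are accepted; every operation that would break a key, null or referential constraint is rejected by the database itself.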
NORMALIZATION
The aims of normalization are:
1. To reduce insert, update and deletion anomalies, that is, to reduce constraint violations during those
operations.
Ex: a) BCNF reduces data anomalies during insert, update and delete operations by not allowing partial
functional dependencies, transitive dependencies, or any other dependency whose determinant is not a
superkey within a relation. b) Even if anomalies occur due to referential integrity, only a small amount of
data is affected, because the data is divided into different tables.
2. To minimize redundancy (repetition of data) and so store data efficiently. Ex: 4NF does not allow non-trivial
multivalued dependencies (unless the determinant is a superkey).
• Informally, the normal form of a relation measures the goodness of its relational design: the higher the normal
form, the better the design.
• A normal form must meet certain conditions.
• During normalization, a given relation is broken down into multiple relations based on multiple factors such
as data dependencies and schema-based constraints.
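An update anomaly, and the way a decomposition removes it, can be sketched in a few lines. The helper `fd_holds` is a hypothetical name introduced here; the rows reuse values from the EMPLOYEE example in these notes, and the new department head "Gihan" is a made-up value used only to trigger the anomaly.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key must be repeated by every later row.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Unnormalized: the department head is repeated for every employee.
employee = [
    {"Empl_ID": "EE001", "Dept_ID": "EIE", "Dept_Head": "Sajitha"},
    {"Empl_ID": "EE005", "Dept_ID": "EIE", "Dept_Head": "Sajitha"},
]
holds_before = fd_holds(employee, ["Dept_ID"], ["Dept_Head"])

# Update anomaly: only one of the two copies is changed.
employee[0]["Dept_Head"] = "Gihan"
holds_after = fd_holds(employee, ["Dept_ID"], ["Dept_Head"])

# Normalized: the head is stored once per department, so one update is enough.
department = {"EIE": "Sajitha"}
department["EIE"] = "Gihan"

print(holds_before, holds_after)  # True False
```

In the normalized layout there is no second copy of the head to forget, so this particular inconsistency cannot arise.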
NORMAL FORMS
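Figure (1NF): a row that stores both White and Grey for VH002 in a single cell is not in 1NF (X 1NF). Storing the
two rows (VH002, White) and (VH002, Grey) gives a relation in 1NF.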
FIRST NORMAL FORM – 1NF
• A composite attribute should be broken into atomic attributes within the same relation.
Ex: The table STUDENT (Student_ID, name, Address (Street, City, Country)) is not in 1NF.
Not in 1NF (X 1NF): STUDENT (Student_ID, name, Address)
In 1NF: STUDENT (Student_ID, name, Street, City, Country)
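The two 1NF repairs, breaking up a composite attribute and giving each value of a multivalued attribute its own row, might be sketched as follows. The function names, the address values and the vehicle column names (Vehicle_ID, Colour) are assumptions made for illustration; only the attribute names of STUDENT and the values VH002, White and Grey come from these notes.

```python
def flatten_composite(row, attribute):
    """Replace one composite attribute (a nested dict) by its atomic parts."""
    flat = {k: v for k, v in row.items() if k != attribute}
    flat.update(row[attribute])
    return flat

def split_multivalued(row, attribute):
    """Return one row per value of a multivalued attribute (a list)."""
    return [{**row, attribute: value} for value in row[attribute]]

# Composite attribute: Address = (Street, City, Country). Values are made up.
student = {
    "Student_ID": "EG_001",
    "name": "Gihan",
    "Address": {"Street": "Main Street", "City": "Kandy", "Country": "Sri Lanka"},
}
student_1nf = flatten_composite(student, "Address")

# Multivalued attribute: one vehicle with two colours.
vehicle = {"Vehicle_ID": "VH002", "Colour": ["White", "Grey"]}
vehicle_1nf = split_multivalued(vehicle, "Colour")

print(list(student_1nf))  # ['Student_ID', 'name', 'Street', 'City', 'Country']
print(vehicle_1nf)
```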
SECOND NORMAL FORM – 2NF
• The conditions required for a relation to be in Second Normal Form (2NF) are:
It is in first normal form.
Every non-prime attribute is fully functionally dependent on every candidate key. That is, no non-prime
attribute is partially dependent on (determined by only part of) a composite key.
Note: If the candidate key consists of a single attribute and that attribute is the primary key, then a table in 1NF
is automatically in 2NF.
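Ex: The relation STUDENT (Stu_ID, Course_ID, Stu_name, Course_name, Project) is in 1NF but not in 2NF: with
the composite key (Stu_ID, Course_ID), Stu_name depends only on Stu_ID and Course_name only on Course_ID.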
Before normalization (1NF):
Stu_ID   Course_ID   Stu_name   Course_name      Project
EG_001   EE4202      Gihan      Database         Airport
EG_001   EE4302      Gihan      Machines         NULL
EG_002   EE3201      Tony       Electronic prj   NULL
EG_003   EE3201      Akalanka   Electronic prj   Fire Alarm

After normalization (2NF):
STUDENT
Stu_ID   Stu_name
EG_001   Gihan
EG_002   Tony
EG_003   Akalanka

COURSE
Course_ID   Course_name
EE4202      Database
EE4302      Machines
EE3201      Electronic prj

STUDENT_COURSE / PROJECT
Stu_ID   Course_ID   Project
EG_001   EE4202      Airport
EG_001   EE4302      NULL
EG_002   EE3201      NULL
EG_003   EE3201      Fire Alarm
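The decomposition in this example is a set of projections, and joining the projections back together should return exactly the original rows. A minimal sketch of that check, using the rows from the example (NULL is written as None; `project` and `natural_join` are hypothetical helper names):

```python
ATTRS = ["Stu_ID", "Course_ID", "Stu_name", "Course_name", "Project"]
original = [dict(zip(ATTRS, values)) for values in [
    ("EG_001", "EE4202", "Gihan", "Database", "Airport"),
    ("EG_001", "EE4302", "Gihan", "Machines", None),
    ("EG_002", "EE3201", "Tony", "Electronic prj", None),
    ("EG_003", "EE3201", "Akalanka", "Electronic prj", "Fire Alarm"),
]]

def project(rows, attrs):
    """Projection onto attrs with duplicate rows removed (first-seen order)."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result

def natural_join(left, right):
    """Combine every pair of rows that agree on all shared attributes."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in set(l) & set(r))
    ]

student = project(original, ["Stu_ID", "Stu_name"])
course = project(original, ["Course_ID", "Course_name"])
student_course = project(original, ["Stu_ID", "Course_ID", "Project"])

# Rebuild the original relation and compare it as a set of rows.
rejoined = natural_join(natural_join(student_course, student), course)
as_set = lambda rows: {frozenset(row.items()) for row in rows}
lossless = as_set(rejoined) == as_set(original)

print(len(student), len(course), len(student_course), lossless)  # 3 3 4 True
```

Each student name and each course name is now stored once, and no information is lost by the split.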
Ex: The relation EMPLOYEE (Empl_ID, Dept_ID, Area, Designation, Dept_Head) is in 2NF but not in 3NF:
Dept_Head is determined by Dept_ID, which is a transitive dependency.
EMPLOYEE (1NF, 2NF, X 3NF)
Empl_ID   Dept_ID   Area          Designation   Dept_Head
EE001     EIE       Software      Lecturer      Sajitha
EE005     EIE       Electronics   Lecturer      Sajitha

Normalization gives three relations in 3NF, with the attributes:
(Empl_ID, Dept_ID, Designation)
(Area, Dept_ID)
(Dept_ID, Dept_Head)
BOYCE CODD NORMAL FORM – BCNF
• The conditions required for a relation to be in Boyce-Codd Normal Form (BCNF) are:
It is in third normal form.
For every non-trivial functional dependency X → Y, X must be a superkey. (A relation in 3NF can fail this
condition only if it has multiple, overlapping candidate keys.)
A functional dependency X → Y is called non-trivial if Y is not a subset of X.
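The superkey condition can be tested mechanically with attribute closures. The sketch below assumes a hypothetical relation R(A, B, C) with the dependencies AB → C and C → B, which is not taken from these notes; it is in 3NF (every attribute is prime, since AB and AC are both candidate keys) but not in BCNF. The function names are likewise hypothetical.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Non-trivial dependencies X -> Y whose left side X is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)                      # non-trivial
        and not set(relation) <= closure(lhs, fds)       # X is not a superkey
    ]

# Hypothetical relation R(A, B, C) with AB -> C and C -> B.
r = ("A", "B", "C")
fds = [(("A", "B"), ("C",)), (("C",), ("B",))]
violations = bcnf_violations(r, fds)

# Decomposing on the violating dependency gives R1(B, C) and R2(A, C).
r1_violations = bcnf_violations(("B", "C"), [(("C",), ("B",))])
r2_violations = bcnf_violations(("A", "C"), [])

print(violations)  # [(('C',), ('B',))]
```

The closure of C is {B, C}, which does not cover R, so C → B is reported; after the split neither smaller relation has a violating dependency. The dependency AB → C can no longer be checked inside a single relation, which is a known cost of this kind of decomposition.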
Ex: The relation EMPLOYEE (Department, Empl_Bday, Empl_name, city, UPF_no, Expert_Area) is in 3NF.
EMPLOYEE
Department   Empl_Bday    Empl_name   city      UPF_no   Expert_Area
Electrical   28/03/1995   Wimal       Negombo   R07651   Electronics
Electrical   13/08/1990   Wimal       Kandy     R05623   Electronics
Electrical   13/08/1990   Piyal       Ampara    R03265   Power
Civil        23/03/1993   Sunil       Kandy     R01410   Environment

Figure: normalization of EMPLOYEE. The normalized tables hold the same four rows, with Expert_Area listed
before UPF_no.
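Multivalued Dependency
Figure: a relation (Emp_name, Proj_name, Dep_name) in which Silva is listed with the projects P1 and P2 and with
Sunil and Kamal contains a non-trivial multivalued dependency (X). After normalization, the relation
(Emp_name, Proj_name) with the rows (Silva, P1) and (Silva, P2) contains only a trivial multivalued dependency.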
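A multivalued dependency X →→ Y holds when, for each value of X, every Y value appears together with every value of the remaining attributes. The sketch below checks this on a four-row table that is assumed here (Silva combined with each of P1, P2 and each of Sunil, Kamal); the figure itself is not fully legible, so these rows and the helper name `mvd_holds` are illustrative assumptions.

```python
from itertools import product

def mvd_holds(rows, x, y):
    """Return True if the multivalued dependency x ->> y holds in rows."""
    z = [a for a in rows[0] if a not in x and a not in y]  # remaining attributes
    groups = {}
    for row in rows:
        key = tuple(row[a] for a in x)
        y_val = tuple(row[a] for a in y)
        z_val = tuple(row[a] for a in z)
        ys, zs, pairs = groups.setdefault(key, (set(), set(), set()))
        ys.add(y_val)
        zs.add(z_val)
        pairs.add((y_val, z_val))
    # For each x value, the (y, z) pairs must be the full cross product.
    return all(pairs == set(product(ys, zs)) for ys, zs, pairs in groups.values())

ATTRS = ["Emp_name", "Proj_name", "Dep_name"]
emp = [dict(zip(ATTRS, values)) for values in [
    ("Silva", "P1", "Sunil"),
    ("Silva", "P1", "Kamal"),
    ("Silva", "P2", "Sunil"),
    ("Silva", "P2", "Kamal"),
]]

full_table_ok = mvd_holds(emp, ["Emp_name"], ["Proj_name"])
# Deleting a single row breaks the dependency: a deletion anomaly.
after_delete_ok = mvd_holds(emp[:-1], ["Emp_name"], ["Proj_name"])

# Normalized: projects and the Dep_name values are kept in separate relations.
emp_projects = sorted({(r["Emp_name"], r["Proj_name"]) for r in emp})
emp_dependents = sorted({(r["Emp_name"], r["Dep_name"]) for r in emp})

print(full_table_ok, after_delete_ok)  # True False
print(emp_projects)
```

The combined table needs four rows to state two independent facts about Silva, while the two normalized relations need two rows each and cannot fall out of step.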