Normalization: Compiled by Samiksha Singla

Normalization is the process of organizing data in a database to minimize redundancy and undesirable dependencies. It involves splitting tables into multiple tables and linking them together through primary and foreign keys. There are various normal forms that organize data to different degrees, from 1NF to 5NF. Higher normal forms leave fewer update anomalies and redundant dependencies but may require more joins to retrieve related data. Normalization improves data integrity, storage efficiency, and scalability.

Normalization

Compiled by
Samiksha Singla
Database Normalization
Database normalization is the process of removing
redundant data from your tables in order to improve storage
efficiency, data integrity, and scalability.
In the relational model, methods exist for quantifying
how efficient a database is. These classifications are
called normal forms (or NF), and there are algorithms
for converting a given database between them.
Normalization generally involves splitting existing
tables into multiple ones, which must be re-joined or
linked each time a query is issued.
Normal Forms
First Normal Form (1NF)

Second Normal Form (2NF)

Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)
[Figure: summary of the normal forms]
- 1NF: functional dependency of nonkey attributes on the primary key; atomic values only
- 2NF: full functional dependency of nonkey attributes on the primary key
- 3NF: no transitive dependency between nonkey attributes
- Boyce-Codd and higher: all determinants are candidate keys; single multivalued dependency

IS 257 – Fall 2006

First Normal Form (1NF)
A relation is in first normal form (1NF) if all its
attribute values are atomic.
That is, a 1NF relation cannot have an attribute value
that is:
- a set of values (multi-valued attribute)
- a set of tuples (nested relation)
A relation that is not in 1NF is an unnormalized
relation.
A non-1NF Relation

Two ways to convert a non-1NF relation to a 1NF relation:

1) Splitting Method - Divide the existing relation into two relations:
non-repeating attributes and repeating attributes.
   - Make a relation consisting of the primary key of the original relation and the
     repeating attributes. Determine a primary key for this new relation.
   - Remove the repeating attributes from the original relation.
2) Flattening Method - Create new tuples for the repeating data combined
with the data that does not repeat.
   - Introduces redundancy that will be later removed by normalization.
   - Determine primary key for this flattened relation.
Converting a non-1NF Relation
to 1NF Using Splitting
Converting a non-1NF Relation
to 1NF Using Flattening
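The two conversion methods can be sketched in a few lines of Python. This is only an illustration: the employee relation, its attribute names, and the helper functions `split` and `flatten` are hypothetical and do not come from the slides.

```python
# Hypothetical unnormalized relation: each employee row carries a list of projects.
employees = [
    {"eno": "E1", "ename": "Smith", "projects": ["P1", "P2"]},
    {"eno": "E2", "ename": "Jones", "projects": ["P2"]},
]

def split(rows, key, repeating):
    """Splitting method: return the relation without the repeating attribute
    and a second relation of (key, repeating value) pairs."""
    base = [{k: v for k, v in row.items() if k != repeating} for row in rows]
    detail = [{key: row[key], repeating: value}
              for row in rows for value in row[repeating]]
    return base, detail

def flatten(rows, repeating):
    """Flattening method: one tuple per repeating value, copying the other attributes."""
    return [{**row, repeating: value} for row in rows for value in row[repeating]]

emp, emp_proj = split(employees, "eno", "projects")
flat = flatten(employees, "projects")
```

In the split result, (eno, projects) would serve as the primary key of the new relation. The flattened result repeats ename for every project, which is the redundancy that later normal forms remove.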
Second Normal Form (2NF)
- A relation is in second normal form (2NF) if it is in 1NF and every
non-primary key (non-prime) attribute is fully functionally
dependent on the primary key.
- Alternative definition from your text: every nonkey column
depends on the whole of each candidate key, not on a proper subset
of any candidate key.
- Violations:
  - Part of key → nonkey
Note: By definition, any 1NF relation whose primary key is a single attribute is
always in 2NF.
- If a relation is not in 2NF, we divide it into separate relations, each
in 2NF, by ensuring that the primary key of each new relation
functionally determines all the attributes in that relation.
Second Normal Form (2NF) Example

fd1 and fd4 are partial functional dependencies.

Normalize to:
- Emp (eno, ename, title, bdate, salary, supereno, dno)
- WorksOn (eno, pno, resp, hours)
- Proj (pno, pname, budget)
Second Normal Form (2NF) Example
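As a rough sketch, the decomposition into Emp, WorksOn, and Proj can be reproduced mechanically. The functional dependencies below are assumptions chosen to match the three resulting relations, since the figure listing the original FDs is not available. The helper names are hypothetical, and the sketch assumes (eno, pno) is the only candidate key.

```python
# Assumed key and FDs for the combined relation (illustrative, not from the slides).
KEY = {"eno", "pno"}
FDS = [
    ({"eno"}, {"ename", "title", "bdate", "salary", "supereno", "dno"}),
    ({"pno"}, {"pname", "budget"}),
    ({"eno", "pno"}, {"resp", "hours"}),
]

def partial_dependencies(key, fds):
    """FDs whose determinant is a proper subset of the key and whose
    right-hand side contains at least one nonkey attribute."""
    return [(lhs, rhs) for lhs, rhs in fds if lhs < key and rhs - key]

def decompose_2nf(fds):
    """One relation per determinant: the determinant plus everything it determines."""
    relations = {}
    for lhs, rhs in fds:
        relations.setdefault(frozenset(lhs), set(lhs)).update(rhs)
    return relations

violations = partial_dependencies(KEY, FDS)
relations = decompose_2nf(FDS)
```

Under these assumptions the two partial dependencies are the ones determined by eno alone and by pno alone, and the three resulting attribute sets correspond to Emp, Proj, and WorksOn.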
Third Normal Form (3NF)
- Third normal form (3NF) is based on the notion of transitive
dependency. A transitive dependency A → C is a FD that can be
inferred from existing FDs A → B and B → C.
- Note that a transitive dependency may involve more than 2 FDs.
- A relation is in third normal form (3NF) if it is in 2NF and there is
no non-primary key (non-prime) attribute that is transitively
dependent on the primary key.
- Alternate definition from your text: A table is in 3NF if it is in 2NF
and each nonkey column depends only on candidate keys, not on
other nonkey columns.
- Violations: Nonkey → Nonkey
- Converting a relation to 3NF from 2NF involves the removal of
transitive dependencies. If a transitive dependency exists, we remove
the transitively dependent attributes from the relation and put them
in a new relation along with a copy of the determinant (LHS of FD).
Third Normal Form (3NF) Example

fd2 results in a transitive dependency eno → salary. Remove it.
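The removal step might look like the following sketch. It assumes that the FD behind the transitive dependency is title → salary. That FD, like the helper names, is an assumption for illustration, because the figure defining fd2 is missing. The sketch also assumes eno is the only candidate key.

```python
# Emp schema from the 2NF step; title -> salary is an assumed FD.
EMP = {"eno", "ename", "title", "bdate", "salary", "supereno", "dno"}
KEY = {"eno"}
FDS = [
    ({"eno"}, {"ename", "title", "bdate", "supereno", "dno"}),
    ({"title"}, {"salary"}),
]

def nonkey_determinants(key, fds):
    """FDs whose determinant contains no key attribute (Nonkey -> Nonkey)."""
    return [(lhs, rhs) for lhs, rhs in fds if not (lhs & key)]

def remove_transitive(schema, key, fds):
    """Move transitively dependent attributes into new relations,
    each with a copy of its determinant."""
    remaining, extracted = set(schema), []
    for lhs, rhs in nonkey_determinants(key, fds):
        remaining -= rhs            # drop the dependent attributes
        extracted.append(lhs | rhs) # new relation: determinant + dependents
    return remaining, extracted

emp, others = remove_transitive(EMP, KEY, FDS)
```

Here salary leaves Emp and lands in a new relation with title, while title stays in Emp so the two relations can still be joined.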

General Definitions of 2NF and 3NF
We have defined 2NF and 3NF in terms of primary
keys. However, a more general definition considers all
candidate keys (not just the primary key we have
chosen).
General definition of 2NF:
- A relation is in 2NF if it is in 1NF and every non-prime
attribute is fully functionally dependent on every candidate
key.
General definition of 3NF:
- A relation is in 3NF if it is in 2NF and no non-prime
attribute is transitively dependent on any candidate
key.
Note that a prime attribute is an attribute that belongs to at least
one candidate key (the primary key is one of the candidate keys).
Boyce-Codd Normal Form (BCNF)
- A relation is in Boyce-Codd normal form (BCNF) if and only if
every determinant is a candidate key.
- The difference between 3NF and BCNF is that 3NF allows a FD X →
Y to remain in the relation if X is a superkey or Y is a prime
attribute. BCNF only allows this FD if X is a superkey.
- Thus, BCNF is more restrictive than 3NF. However, in practice most
relations in 3NF are also in BCNF.
Boyce-Codd Normal Form (BCNF)
Consider the WorksOn relation where we have the
added constraint that given the hours worked, we
know exactly the employee who performed the work
(i.e., eno is functionally dependent on hours:
hours → eno). Then:

Note that we lose the FD eno,pno → resp, hours.
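One way to see why WorksOn violates BCNF under this constraint is to compute attribute closures and test whether each determinant is a superkey. The sketch below uses the two FDs named in the example. The function names `closure` and `bcnf_violations` are hypothetical helpers, not part of the slides.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Nontrivial FDs whose determinant is not a superkey of schema."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != schema]

WORKS_ON = {"eno", "pno", "resp", "hours"}
FDS = [
    ({"eno", "pno"}, {"resp", "hours"}),
    ({"hours"}, {"eno"}),  # the added constraint from the example
]
violations = bcnf_violations(WORKS_ON, FDS)
```

The closure of {hours} is only {hours, eno}, so hours is a determinant that is not a superkey. The relation still satisfies 3NF because eno is a prime attribute.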

Multi-Valued Dependencies
- A multi-valued dependency (MVD) occurs when two independent,
multi-valued attributes are present in the schema.
- A MVD occurs when two independent 1:N relationships are in the
relational schema.
- When these multi-valued attributes are flattened into a 1NF
relation, we must have a tuple for every combination of the values
in the two attributes.
- It may seem strange to do this, as it obviously
increases the number of tuples and the redundancy.
- The reason is that, since the two attributes are independent, it does
not make sense to store some combinations and not others:
all combinations are equally valid. By leaving out some
combinations, we would unintentionally favor one combination over
another, which should not be the case.
Multi-Valued Dependencies Example
Employee may:
- work on many projects
- be in many departments
Multi-Valued Dependencies (MVDs)
A multi-valued dependency (MVD) is a dependency
between attributes A, B, C in a relation such that for
each value of A there is a set of values B and a set of
values C where the set of values B and C are
independent of each other.
A MVD is denoted as A →→ B and A →→ C, or
abbreviated as A →→ B | C.
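The "every combination" requirement can be illustrated with a small sketch. The employee, project, and department values are made up, and the two helper functions are hypothetical names. The check covers the single MVD eno →→ pno | dno on a three-attribute relation.

```python
from itertools import product

# Hypothetical data: employee E1 works on two projects and is in two departments.
projects = {"E1": ["P1", "P2"]}
departments = {"E1": ["D1", "D2"]}

def flatten_independent(projects, departments):
    """1NF relation with one tuple per (project, department) combination."""
    return {(eno, p, d)
            for eno in projects
            for p, d in product(projects[eno], departments[eno])}

def satisfies_mvd(rows):
    """Check eno ->-> pno | dno: for each employee, the stored (pno, dno)
    pairs must equal the full cross product of that employee's values."""
    for eno in {r[0] for r in rows}:
        ps = {p for e, p, _ in rows if e == eno}
        ds = {d for e, _, d in rows if e == eno}
        if {(p, d) for e, p, d in rows if e == eno} != set(product(ps, ds)):
            return False
    return True

emp = flatten_independent(projects, departments)
```

Dropping any single tuple from `emp` makes `satisfies_mvd` return False, which is the "favoring one combination" problem described above.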
Fourth Normal Form (4NF)
- Fourth normal form (4NF) is based on the idea of multi-valued
dependencies.
- A relation is in fourth normal form (4NF) if it is in BCNF and
contains no non-trivial multi-valued dependencies other than those
whose determinant is a superkey.
- Formal definition: A relation schema R is in 4NF with respect to a
set of dependencies F if, for every nontrivial multi-valued
dependency X →→ Y, X is a superkey of R.
- If X →→ Y is a 4NF violation for relation R, we can decompose R
using the same technique as for BCNF:
  - XY is one of the decomposed relations.
  - All attributes of R except Y – X form the other.
Fourth Normal Form (4NF) Example
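A minimal sketch of the decomposition rule, using a hypothetical relation Emp(eno, pno, dno) with the violating MVD eno →→ pno. The data and the helpers `project`, `decompose_4nf`, and `natural_join` are illustrative assumptions, not taken from the slides.

```python
def project(rows, attrs):
    """Projection of a relation (list of dicts) onto attrs, duplicates removed."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}

def decompose_4nf(schema, x, y):
    """Attribute lists for XY and for all attributes except Y - X."""
    xy = [a for a in schema if a in x or a in y]
    rest = [a for a in schema if a not in set(y) - set(x)]
    return xy, rest

def natural_join(r1, r2):
    """Natural join of two projections produced by project()."""
    out = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(tuple(sorted({**d1, **d2}.items())))
    return out

SCHEMA = ["eno", "pno", "dno"]
emp = [{"eno": "E1", "pno": p, "dno": d} for p in ("P1", "P2") for d in ("D1", "D2")]
xy, rest = decompose_4nf(SCHEMA, x=["eno"], y=["pno"])
rejoined = natural_join(project(emp, xy), project(emp, rest))
original = {tuple(sorted(row.items())) for row in emp}
```

The four original tuples shrink to two tuples in each projection, and joining the projections on eno gives back exactly the original relation.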
Lossless-join Dependency
The lossless-join property refers to the fact that
whenever we decompose relations using
normalization we can rejoin the relations to produce
the original relation.
A lossless-join dependency is a property of a
decomposition which ensures that no spurious tuples
are generated when the relations are combined with a natural join.
There are cases where it is necessary to decompose a
relation into more than two relations to guarantee a
lossless-join.
Fifth Normal Form (5NF)
Fifth normal form (5NF) is based on join
dependencies.
A relation is in fifth normal form (5NF) if and only if
every nontrivial join dependency is implied by the
superkeys of R.
A join dependency (JD) denoted by JD(R1, R2, …, Rn)
on relational schema R specifies a constraint on the
states r of R. The constraint states that every legal state
r of R is equal to the join of its projections on R1, R2,
…, Rn. That is, for every such r we have:
    ΠR1(r) ∗ ΠR2(r) ∗ … ∗ ΠRn(r) = r
Fifth Normal Form (5NF) Example
Consider a relation Supply (sname, partName, projName).
Add the additional constraint that:
If project j requires part p
and supplier s supplies part p
and supplier s supplies at least one item to project j,
then supplier s also supplies part p to project j.
Fifth Normal Form (5NF) Example

Let R be in BCNF and let R have no composite keys. Then R is in 5NF

Note that only joining all three relations together will get you back to the original
relation. Joining any two will create spurious tuples!
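The effect can be checked on a small instance. The tuples below are a hypothetical state of Supply (sname, partName, projName) chosen to satisfy the constraint above. They are not the data from the missing figure.

```python
# Hypothetical instance of Supply(sname, partName, projName).
supply = {
    ("s1", "p1", "j2"),
    ("s1", "p2", "j1"),
    ("s2", "p1", "j1"),
    ("s1", "p1", "j1"),
}

# The three binary projections.
sp = {(s, p) for s, p, j in supply}  # supplier supplies part
pj = {(p, j) for s, p, j in supply}  # project requires part
sj = {(s, j) for s, p, j in supply}  # supplier supplies something to project

# Joining only two projections (on partName) produces a spurious tuple.
sp_pj = {(s, p, j) for s, p in sp for p2, j in pj if p == p2}
spurious = sp_pj - supply

# Joining in the third projection filters the spurious tuple out again.
all_three = {t for t in sp_pj if (t[0], t[2]) in sj}
```

With this instance the two-way join adds one tuple that was never in Supply, and the three-way join restores the original relation.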
Normalizing to death
Normalization splits database information across
multiple tables.
To retrieve complete information from a normalized
database, the JOIN operation must be used.
JOIN tends to be expensive in terms of processing
time, and very large joins are very expensive.

THANK YOU
