Normalization

The document discusses database design and normalization. It introduces key concepts like functional dependencies, normal forms, and anomalies. The goals of normalization are to eliminate redundant data and ensure only logically related data is stored together. This is achieved by decomposing relations into smaller relations based on their attributes and dependencies. The document covers normal forms up to fifth normal form and discusses how normalization can improve data quality by reducing anomalies and inconsistencies.

Uploaded by

jprakash25

Database Design

Introduction
Functional Dependencies
Functional Dependency Theory
Normalization
Dependency Preservation
Boyce Codd Normal Form
Multivalued dependencies and Fourth Normal Form
Join Dependencies and Fifth Normal Form
Introduction
Design Goal:
o Decide whether a particular relation R is in good form.
o If a relation R is not in good form, decompose it into a set of
  relations {R1, R2, ..., Rn} such that
  o each relation is in good form
  o the decomposition is a lossless-join decomposition: when a
    relation is decomposed into smaller relations with fewer
    attributes, joining the resulting relations must give back
    exactly the original relation, with no extra (spurious) rows.
    The join operations can be performed in any order.
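
The lossless-join condition above can be tried out on a concrete relation instance by projecting and re-joining. What follows is a minimal Python sketch: the helper names (project, natural_join, is_lossless) and the sample rows are illustrative assumptions, not part of the slides. It tests one instance only, so it can expose a lossy decomposition but cannot prove that a decomposition is lossless for every legal instance.

```python
def project(rows, attrs):
    """Project dict rows onto attrs; the set removes duplicate rows."""
    return {tuple((a, r[a]) for a in attrs) for r in rows}


def natural_join(left, right):
    """Natural join of two sets of rows stored as (attribute, value) tuples."""
    result = set()
    for l in left:
        ld = dict(l)
        for r in right:
            rd = dict(r)
            common = ld.keys() & rd.keys()
            # Rows combine only if they agree on every shared attribute.
            if all(ld[a] == rd[a] for a in common):
                merged = {**ld, **rd}
                result.add(tuple(sorted(merged.items())))
    return result


def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections reproduces exactly the given rows."""
    original = {tuple(sorted(r.items())) for r in rows}
    joined = natural_join(project(rows, attrs1), project(rows, attrs2))
    return joined == original


# Hypothetical sample data (not from the slides).
course_rows = [
    {"Course": "C1", "IName": "Rao", "Room": "R1"},
    {"Course": "C2", "IName": "Rao", "Room": "R1"},
    {"Course": "C3", "IName": "Iyer", "Room": "R2"},
]
marks_rows = [
    {"Student": "S1", "Course": "C1", "Marks": 80},
    {"Student": "S2", "Course": "C2", "Marks": 80},
]

good = is_lossless(course_rows, ["Course", "IName"], ["IName", "Room"])
bad = is_lossless(marks_rows, ["Student", "Marks"], ["Course", "Marks"])
print(good, bad)
```

In the second call the only shared attribute is Marks, which does not determine either side, so the join produces extra rows and the check fails.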
Introduction
A bad design may lead to
Repetition of information, which leads to insert,
delete and update anomalies.
Inability to represent some information
Anomalies: unexpected results from an
operation.
delete: when deleting a value for one attribute, you
inadvertently lose the value for some other attribute
insert: you need to store a value for a particular
attribute but cannot, because another value needed to
create the row (for example the key value) is not yet known
update: to change one value you must find and change every
copy of it, and the copies may be hard to find; a missed copy
leaves the data inconsistent.
Functional Dependency
Constraints on the set of legal relations.
Require that the value for a certain set of
attributes determines uniquely the value for
another set of attributes.
Definition of FD: Given a relation R, a set of
attributes X in R is said to functionally
determine another attribute Y, also in R
(written X -> Y), if and only if each X value is
associated with at most one Y value.

Functional Dependency
Determinant - Attribute X is called a
determinant if it uniquely determines the value of Y
in a given relation or entity.
A determinant need NOT be a key
attribute.
Represented as X -> Y, which means attribute
X determines attribute Y

Example of FD
Employee
SSN          Name         JobType     DeptName
557-78-6587  Lance Smith  Accountant  Salary
214-45-2398  Lance Smith  Engineer    Product
SSN -> Name
Note: Name is functionally dependent on SSN because an
employee's name can be uniquely determined from their
SSN. Name does not determine SSN, because more than
one employee can have the same name.
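
The Employee example can be checked mechanically. The sketch below is a minimal Python illustration; the function name fd_holds is an assumption introduced here, while the two rows are the ones from the Employee table. Checking an instance can only refute a functional dependency; whether SSN -> Name holds as a constraint is a statement about all legal instances.

```python
def fd_holds(rows, lhs, rhs):
    """True if, in these rows, equal lhs values always come with equal rhs values."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen and returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True


employee = [
    {"SSN": "557-78-6587", "Name": "Lance Smith",
     "JobType": "Accountant", "DeptName": "Salary"},
    {"SSN": "214-45-2398", "Name": "Lance Smith",
     "JobType": "Engineer", "DeptName": "Product"},
]

ssn_to_name = fd_holds(employee, ["SSN"], ["Name"])
name_to_ssn = fd_holds(employee, ["Name"], ["SSN"])
print(ssn_to_name, name_to_ssn)
```

The second check fails because the two rows share the name Lance Smith but have different SSNs.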
Keys
key: an attribute (or field) whose value is unique and
can therefore be used to identify the entire tuple
(or record)
key attributes are determinants, but not all
determinants are key attributes. E.g. Marks -> Grade
candidate key: any attribute (or minimal combination
of attributes) that could serve as a key
primary key: the candidate key selected by the database
administrator as the key used for that
relation
composite key: a key made up of two or
more fields
FD Contd..

Consider the following relation :
REPORT (Student#, Course#, CourseName, IName,
Room#, Marks, Grade)
Where:
Student#-Student Number
Course#-Course Number
CourseName -CourseName
IName- Name of the instructor who delivered the course
Room#-Room number which is assigned to respective
instructor
Marks- Marks scored in course Course# by student Student#
Grade- Grade obtained by student Student# in course Course#

FD Contd..
Student# and Course# together (called a composite attribute)
determine EXACTLY ONE value of Marks. This can be
symbolically represented as Student#Course# -> Marks
Other functional dependencies in the above example are:
Course# -> CourseName
Course# -> IName (assuming each course is taught by one
and only one instructor)
IName -> Room# (assuming each instructor has his/her
own, non-shared room)
Marks -> Grade
Formal definition of FD: In a given relation R, X and Y are
attributes. Attribute Y is functionally dependent on attribute X if
each value of X determines exactly one value of Y. This is
represented as: X -> Y
X may be composite in nature.
FD Contd..
A functional dependency is trivial if it is satisfied
by all instances of a relation
Example:
customer_name, loan_number -> customer_name
customer_name -> customer_name
In general, X -> Y is trivial if Y is a subset of X
Full functional dependency: In a given relation
R, X and Y are attributes. Y is fully functionally
dependent on X only if it is not
functionally dependent on any proper subset of X.
X may be composite in nature.
FD Contd..
Full functional dependency:
E.g. Marks is fully functionally dependent on
Student#Course# and not on any proper subset of
Student#Course#.
CourseName is not fully functionally
dependent on Student#Course# because the
subset Course# alone determines the course name.

FD Contd..
Partial dependency:
In a given relation R, X and Y are attributes.
Attribute Y is partially dependent on attribute
X only if it is dependent on a proper subset of X.
X may be composite in nature.
E.g. CourseName, IName and Room# are partially
dependent on the composite attribute
Student#Course# because Course# alone
determines CourseName, IName and Room#.
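
Partial dependencies on a composite key can be found by computing what each proper subset of the key determines. The following Python sketch does this for the REPORT relation; the function names closure and partial_dependencies are assumptions made for illustration, and the dependency list is the one stated for REPORT in these slides.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes determined by attrs under the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each non-key attribute to a proper subset of key that determines it."""
    key = set(key)
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            for attr in closure(subset, fds) - key:
                found.setdefault(attr, set(subset))
    return found


# Functional dependencies of REPORT, each written as (left side, right side).
REPORT_FDS = [
    (("Student#", "Course#"), ("Marks",)),
    (("Course#",), ("CourseName",)),
    (("Course#",), ("IName",)),
    (("IName",), ("Room#",)),
    (("Marks",), ("Grade",)),
]

partial = partial_dependencies(("Student#", "Course#"), REPORT_FDS)
for attr in sorted(partial):
    print(attr, "depends on", sorted(partial[attr]))
```

Marks and Grade do not appear in the result, since neither Student# nor Course# alone determines them.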

FD-Partial Dependency
FD- Transitive Dependency
Transitive Dependency:
Room# depends on IName, which in turn depends on
Course#. Hence Room# transitively depends on
Course#.
Similarly, Grade depends on Marks, which in turn
depends on Student#Course#. Hence Grade
transitively depends on Student#Course#.

Closure
Given a set F of functional dependencies,
there are certain other functional dependencies
that are logically implied by F.
For example: If A -> B and B -> C, then we
can infer that A -> C
The set of all functional dependencies logically
implied by F is the closure of F.
We denote the closure of F by F+.
F+ is a superset of F.
Axioms
Developed by Armstrong in 1974, these are three rules
(axioms) from which all functional dependencies implied
by a given set can be derived; three further rules follow from them.
1. Reflexivity Rule --- If X is a set of attributes and Y is a
subset of X, then X -> Y holds.
That is, each subset of X is functionally dependent on X.
2. Augmentation Rule --- If X -> Y holds and W is a set
of attributes, then WX -> WY holds.
3. Transitivity Rule --- If X -> Y and Y -> Z hold, then
X -> Z holds.
These rules are
sound (generate only functional dependencies that
actually hold) and
complete (generate all functional dependencies that
hold).

Derived Theorems from Axioms
4. Union Rule --- If X -> Y and X -> Z hold, then
X -> YZ holds.
5. Decomposition Rule --- If X -> YZ holds, then so
do X -> Y and X -> Z.
6. Pseudotransitivity Rule --- If X -> Y and WY -> Z
hold, then so does WX -> Z.
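
For reference, each of the three derived rules follows from reflexivity, augmentation and transitivity in a few steps. The derivation below is a standard textbook argument written out in LaTeX; it is added here as a worked illustration and is not taken from the slides.

```latex
\begin{align*}
\textbf{Union.}\quad
  & X \to Y,\ X \to Z                && \text{(given)} \\
  & X \to XY                         && \text{(augment } X \to Y \text{ with } X\text{)} \\
  & XY \to YZ                        && \text{(augment } X \to Z \text{ with } Y\text{)} \\
  & X \to YZ                         && \text{(transitivity)} \\[6pt]
\textbf{Decomposition.}\quad
  & X \to YZ                         && \text{(given)} \\
  & YZ \to Y,\ YZ \to Z              && \text{(reflexivity)} \\
  & X \to Y,\ X \to Z                && \text{(transitivity)} \\[6pt]
\textbf{Pseudotransitivity.}\quad
  & X \to Y,\ WY \to Z               && \text{(given)} \\
  & WX \to WY                        && \text{(augment } X \to Y \text{ with } W\text{)} \\
  & WX \to Z                         && \text{(transitivity)}
\end{align*}
```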

Example
R = (A, B, C, G, H, I)
F = { A -> B
      A -> C
      CG -> H
      CG -> I
      B -> H }
Some members of F+:
A -> H
by transitivity from A -> B and B -> H
AG -> I
by augmenting A -> C with G, to get AG -> CG,
and then transitivity with CG -> I
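
Rather than enumerating every member of F+, membership is usually tested through the attribute closure: X -> Y is in F+ exactly when Y is contained in the set of attributes determined by X. The Python sketch below applies this to the example set F; the function names closure and implies are assumptions, and attributes are written as single letters so that the string "CG" stands for the set {C, G}.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever lhs is already covered and rhs adds something.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs is logically implied by fds (i.e. is a member of F+)."""
    return set(rhs) <= closure(lhs, fds)


# The example set F over R = (A, B, C, G, H, I).
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print(sorted(closure("A", F)))
print(sorted(closure("AG", F)))
print(implies(F, "A", "H"), implies(F, "AG", "I"), implies(F, "A", "I"))
```

The closure of AG contains every attribute of R, so AG is a superkey of R under F.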
Introduction to Normalization
Normalization: Process of decomposing unsatisfactory
"bad" relations by breaking up their attributes into
smaller relations
Goals:
Eliminating redundant data
Ensuring data dependencies make sense (only storing
related data in a table).
These goals reduce the amount of space a database
consumes and ensure that data is logically stored.

Need for Normalization
Minimize data redundancy, i.e. no unnecessary
duplication of data.
Make the database structure flexible, i.e. it should be
possible to add new data values and rows without
reorganizing the database structure.
Keep data consistent throughout the database,
i.e. it should not suffer from the following anomalies:
Insert anomaly
Update anomaly
Delete anomaly

ADVANTAGES OF NORMALIZATION

More efficient data structure.
Avoid redundant fields or columns.
More flexible data structure i.e. we should be able to
add new rows and data values easily
Better understanding of data.
Ensures that distinct tables exist when necessary.
o Easier to maintain data structure i.e. it is easy to
perform operations and complex queries can be easily
handled.
o Minimizes data duplication.
o Close modeling of real world entities, processes and
their relationships.

DISADVANTAGES OF NORMALIZATION

We cannot start building the database before we
know what the user needs.
Normalizing relations to higher normal
forms, i.e. 4NF and 5NF, can degrade performance,
since queries typically need more joins.
Normalizing relations of higher degree is a
time-consuming and difficult process.
Careless decomposition may lead to a bad database
design, which may lead to serious problems.

Normal Form
Initially Codd (1972) presented three normal
forms (1NF, 2NF and 3NF) all based on
functional dependencies among the attributes
of a relation.
Later Boyce and Codd proposed another
normal form called the Boyce-Codd normal
form (BCNF).
The fourth and fifth normal forms are based
on multivalued and join dependencies and
were proposed later.
The primary objective of normalization is to
avoid anomalies.
Normal Forms: Review
Unnormalized  There are multivalued attributes or repeating groups
1NF           No multivalued attributes or repeating groups
2NF           1NF plus no partial dependencies
3NF           2NF plus no transitive dependencies
Example Relation Record

First Normal Form(1NF)
A relation R is said to be in first normal form
(1NF) if and only if all the attributes of the
relation R, are atomic in nature.

That means only one piece of data can be
stored within the field (attribute) of a particular
record (tuple).

Non-atomic values complicate storage and
encourage redundant (repeated) storage of
data
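
To make the atomicity requirement concrete, the sketch below flattens a record whose field holds a list of values into one row per value. It is a minimal Python illustration; the function name to_1nf, the attribute names and the sample record are hypothetical and do not come from the slides.

```python
def to_1nf(records, multi_attr):
    """Return one row per value of multi_attr, so that every field is atomic."""
    rows = []
    for rec in records:
        for value in rec[multi_attr]:
            row = dict(rec)          # copy the remaining fields unchanged
            row[multi_attr] = value  # replace the list with a single value
            rows.append(row)
    return rows


# Hypothetical unnormalized record: Course# holds a repeating group.
unnormalized = [
    {"Student#": 1001, "StudentName": "Asha", "Course#": ["M1", "M4"]},
]

flat = to_1nf(unnormalized, "Course#")
for row in flat:
    print(row)
```

Each output row repeats the student details, which is the redundancy that the later normal forms then remove.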
Example Relation Record
First Normal Form(1NF)
E.g. Student details are repeated for each course
and course details are repeated for each student.
To avoid this, Student Details, Course Details and
Result Details can be further divided.
Student Details is divided into
Student# (Student Number), StudentName and
DateOfBirth.
Course Details is divided into Course#,
CourseName, PreRequisite and DurationInDays.
Result Details is divided into
Student#, Course#, DateOfExam, Marks and Grade.
Student Table: STUDENT(Student#, StudentName, DateOfBirth)
Course Table: COURSE(Course#, CourseName, PreRequisite, DurationInDays)
Result Table: RESULT(Student#, Course#, DateOfExam, Marks, Grade)

Second Normal Form (2NF)

A relation is said to be in Second Normal Form if
and only if:
It is in the first normal form, and
No partial dependency exists between non-key
attributes and key attributes.

Let us revisit the 1NF table structure.
Student# is the key attribute for Student,
Course# is the key attribute for Course,
Student# and Course# together form the composite
key for Result.
Other attributes are non-key attributes.

Second Normal Form (2NF)
To make this table 2NF compliant, we have to remove
all the partial dependencies.
StudentName and DateOfBirth depend only on
Student#.
CourseName, PreRequisite and DurationInDays
depend only on Course#.
DateOfExam depends only on Course#.
To remove this partial dependency we need to split
the Result table into two tables:
1. Result(Student#, Course#, Marks, Grade)
2. Exam(Course#, DateOfExam)
Result and Exam Table
Second Normal Form (2NF)
In the first table (STUDENT), the key attribute is Student#
and all the non-key attributes, StudentName and
DateOfBirth, are fully functionally dependent on the key
attribute.
In the second table (COURSE), Course# is the key
attribute and all the non-key attributes, CourseName,
PreRequisite and DurationInDays, are fully functionally
dependent on the key attribute.
In the third table (RESULT), Student# and Course# together
are the key attributes and all the non-key attributes, Marks and
Grade, are fully functionally dependent on the key attributes.
In the fourth table (EXAM), Course# is the key
attribute and the non-key attribute, DateOfExam, is fully
functionally dependent on the key attribute.
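
The split of the Result table into Result and Exam is a pair of projections. The Python sketch below shows the effect on a small instance; the function name project and every data value (student numbers, course codes, dates, marks) are hypothetical, chosen only to illustrate the step.

```python
def project(rows, attrs):
    """Project rows onto attrs, keeping the first copy of each distinct row."""
    seen = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in seen:
            seen.append(projected)
    return seen


# Hypothetical 1NF Result rows: DateOfExam is repeated for every student of a course.
result_1nf = [
    {"Student#": 1001, "Course#": "M1", "DateOfExam": "2024-03-11", "Marks": 82, "Grade": "A"},
    {"Student#": 1002, "Course#": "M1", "DateOfExam": "2024-03-11", "Marks": 65, "Grade": "C"},
    {"Student#": 1001, "Course#": "M4", "DateOfExam": "2024-03-15", "Marks": 74, "Grade": "B"},
]

result = project(result_1nf, ["Student#", "Course#", "Marks", "Grade"])
exam = project(result_1nf, ["Course#", "DateOfExam"])

print(len(result), "result rows;", len(exam), "exam rows")
```

After the split, each exam date is stored once per course, so changing it means updating a single Exam row.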

Second Normal Form (2NF)

What about anomalies?
At first look it appears like all our anomalies are
taken away!
Now we are storing Student 1003 and M4 record
only once.
We can insert prospective students and courses at
our will.
We will update only once if we need to change any
data in STUDENT, COURSE tables.
We can get rid of any course or student details by
deleting just one row.

Second Normal Form (2NF)
Let us analyse the RESULT Table
We already concluded that:
All attributes are atomic in nature
No partial dependency exists between the key
attributes and non-key attributes
RESULT table is in 2NF

Second Normal Form (2NF)
Assume, at present, as per the university
evaluation policy:
Students who score 80 marks or more are awarded an A grade
Students who score 70 to 79 marks are awarded a B grade
Students who score 60 to 69 marks are awarded a C grade
Students who score 50 to 59 marks are awarded a D grade

The university management, which is
committed to improving the quality of education, wants
to change the existing grading system to a new
grading system. In the present RESULT table
structure,
We don't have an option to introduce new grades
like A+, B- and E
We need to do multiple updates on the existing
records to bring them to the new grading definition
We will not be able to take away the D grade if we
want to.
2NF does not take care of all the anomalies and
inconsistencies.
Third Normal Form 3NF
A relation R is said to be in 3NF if and only if
It is in 2NF
No transitive dependency exists between non-
key attributes and key attributes.
In the RESULT table, Student# and Course# are the
key attributes.
All other attributes except Grade are fully and
non-transitively dependent on the key attributes.
The Grade attribute is dependent on Marks, and
Marks in turn is dependent on Student#Course#.
To bring the table into 3NF we need to remove this
transitive dependency.
Third Normal Form 3NF - Result & Grade Table
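
Once Grade is moved out of RESULT, the grading policy lives in its own small table and a grade is looked up from the marks. The Python sketch below illustrates the idea; the layout of the grade table as (lowest mark, highest mark, grade) rows, the upper bound of 100, the function name grade_for and the revised policy are all assumptions made for this example, while the A to D boundaries are those of the evaluation policy quoted earlier.

```python
# Grade table: one row per grade band, stored separately from the results.
GRADE_TABLE = [
    (80, 100, "A"),
    (70, 79, "B"),
    (60, 69, "C"),
    (50, 59, "D"),
]


def grade_for(marks, grade_table):
    """Look up the grade whose band contains marks; None if no band matches."""
    for low, high, grade in grade_table:
        if low <= marks <= high:
            return grade
    return None


# RESULT rows no longer store Grade (hypothetical sample values).
results = [
    {"Student#": 1001, "Course#": "M1", "Marks": 95},
    {"Student#": 1002, "Course#": "M1", "Marks": 65},
]

# A hypothetical revised policy that introduces A+ touches only the grade table.
NEW_GRADE_TABLE = [(90, 100, "A+"), (80, 89, "A")] + GRADE_TABLE[1:]

for row in results:
    print(row["Marks"], grade_for(row["Marks"], GRADE_TABLE),
          grade_for(row["Marks"], NEW_GRADE_TABLE))
```

No RESULT row is rewritten when the policy changes, which is the update anomaly that 2NF left in place.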
Third Normal Form 3NF
After normalizing the tables to 3NF, we got rid of
the anomalies and inconsistencies described above.
Now we can add new grade systems, update the
existing one and delete the unwanted ones.
Hence Third Normal Form is usually regarded as
sufficient in practice, and most databases which
require efficiency in
INSERT
UPDATE
DELETE
operations are designed in this normal form.
BCNF
A relation is in Boyce-Codd normal form if and only if
every determinant is a candidate key.
It should be noted that most relations that are in 3NF
are also in BCNF. Infrequently, a 3NF relation is not in
BCNF, and this happens only if
(a) the candidate keys in the relation are composite keys
(that is, they are not single attributes),
(b) there is more than one candidate key in the relation,
and
(c) the keys are not disjoint, that is, some attributes in
the keys are common.

Boyce-Codd normal form (BCNF)
A relation is in BCNF if and only if every
determinant is a candidate key.
The difference between 3NF and BCNF is that for a
functional dependency A -> B, 3NF allows this
dependency in a relation if B is a primary-key
attribute and A is not a candidate key.
BCNF, in contrast, insists that for this
dependency to remain in a relation, A must be
a candidate key.

Every relation in BCNF is also in 3NF.
However, a relation in 3NF is not necessarily
in BCNF.

Violation of BCNF is quite rare.

The potential to violate BCNF may occur in a
relation that:
contains two (or more) composite candidate
keys;
the candidate keys overlap, that is have at least
one attribute in common.
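
The BCNF condition can be tested with attribute closures: a non-trivial dependency violates BCNF when the closure of its left side does not cover the whole relation. The Python sketch below is one way to write that check; the function names and the Student/Course/Instructor relation are assumptions introduced for illustration (a commonly used example with two overlapping composite candidate keys), not material from the slides.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial dependencies whose left side is not a superkey."""
    all_attrs = set(all_attrs)
    violations = []
    for lhs, rhs in fds:
        is_trivial = set(rhs) <= set(lhs)
        is_superkey = all_attrs <= closure(lhs, fds)
        if not is_trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations


# Hypothetical relation: candidate keys are {Student, Course} and {Student, Instructor}.
SCI_ATTRS = ("Student", "Course", "Instructor")
SCI_FDS = [
    (("Student", "Course"), ("Instructor",)),
    (("Instructor",), ("Course",)),
]

violations = bcnf_violations(SCI_ATTRS, SCI_FDS)
print(violations)
```

Instructor -> Course is reported because Instructor alone is not a key. The relation is still in 3NF, since Course is part of a candidate key.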

Multi-valued Dependency (MVD)
Dependency between attributes (for example, A,
B, and C) in a relation, such that for each value
of A there is a set of values for B and a set of
values for C. However, the sets of values for B
and C are independent of each other.

The formal definition is given as follows.
Let R be a relation schema and let X and Y be
subsets of R. The multivalued dependency
X ->> Y
(which can be read as "X multidetermines Y") holds on R
if, in any legal relation r(R), for all pairs of tuples t1 and
t2 in r such that t1[X] = t2[X], there exist tuples t3 and t4 in r such that
t1[X] = t2[X] = t3[X] = t4[X]
t3[Y] = t1[Y]
t3[R - Y] = t2[R - Y]
t4[Y] = t2[Y]
t4[R - Y] = t1[R - Y]

There exist anomalies/redundancies in relational schemas that cannot
be captured by FDs.
Example: consider the following table:
There are no (non-trivial) FDs that hold on this scheme; therefore the
scheme (Course, Set-of-teachers, Set-of-books) is in BCNF.
The CTB table contains redundant information
because:
whenever (c, t1, b1) ∈ CTB and (c, t2, b2) ∈ CTB
then also (c, t1, b2) ∈ CTB
and, by symmetry, (c, t2, b1) ∈ CTB
We say that a multivalued dependency (MVD)
C ->> T (and C ->> B as well)
holds on CTB:
given a course, the set of teachers and the set of
books are uniquely
determined and independent.
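
The CTB condition can be checked on an instance: for each course, the (teacher, book) pairs present must be exactly the cross product of that course's teachers and books. The Python sketch below does this; the function name mvd_holds and the sample course, teacher and book values are hypothetical, and the check applies to one instance only.

```python
from collections import defaultdict
from itertools import product


def mvd_holds(rows, attrs, lhs, rhs):
    """True if lhs ->> rhs holds in rows, a list of dicts over attrs."""
    rest = [a for a in attrs if a not in lhs and a not in rhs]
    groups = defaultdict(set)
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        z = tuple(row[a] for a in rest)
        groups[x].add((y, z))
    for pairs in groups.values():
        ys = {y for y, _ in pairs}
        zs = {z for _, z in pairs}
        # Every combination of a rhs value with a remaining value must be present.
        if pairs != set(product(ys, zs)):
            return False
    return True


ATTRS = ["Course", "Teacher", "Book"]

# Hypothetical CTB instance: one course, two teachers, two books.
ctb = [
    {"Course": "DB", "Teacher": "t1", "Book": "b1"},
    {"Course": "DB", "Teacher": "t1", "Book": "b2"},
    {"Course": "DB", "Teacher": "t2", "Book": "b1"},
    {"Course": "DB", "Teacher": "t2", "Book": "b2"},
]

complete = mvd_holds(ctb, ATTRS, ["Course"], ["Teacher"])
incomplete = mvd_holds(ctb[:3], ATTRS, ["Course"], ["Teacher"])
print(complete, incomplete)
```

Dropping the row (DB, t2, b2) breaks the dependency, because t2 is then no longer paired with every book of the course.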
