Normalization and Denormalization

Normalization is the process of organizing data in a database to minimize redundancy and undesirable dependencies. It decomposes tables to eliminate insertion, update, and deletion anomalies. The normal forms 1NF, 2NF, 3NF, BCNF, 4NF, and 5NF are progressively stricter conditions that a table can satisfy. Denormalization deliberately adds redundant data to a normalized design to improve read performance by avoiding costly joins.

Uploaded by

umurita37

What is Normalization?

Normalization is the process of organizing the data in a database.

Normalization is used to minimize redundancy in a relation or set
of relations. It is also used to eliminate undesirable characteristics such as
insertion, update, and deletion anomalies.
Normalization divides a larger table into smaller tables and links them using
relationships.
Each normal form is a set of conditions that reduces redundancy in a database table.
Why do we need Normalization?
• The main reason for normalizing relations is to remove these anomalies.
Failure to eliminate anomalies leaves redundant data in place and can cause data
integrity and other problems as the database grows. Normalization consists of a
series of guidelines that help you create a good database structure.
Data modification anomalies can be categorized into three types:
• Insertion Anomaly: An insertion anomaly occurs when a new tuple cannot be
inserted into a relation because some of the required attribute values are not
yet known at the time of insertion.
• Deletion Anomaly: A deletion anomaly occurs when deleting some data
unintentionally removes other important information from the database.
• Update Anomaly: An update anomaly occurs when changing a single data
value requires multiple rows to be updated. This happens when the same
data item is repeated in several rows; if only some copies are changed, the
data becomes inconsistent.
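The update anomaly can be sketched in a few lines of Python. The table contents below (employee rows that repeat a department phone number) are hypothetical and invented only for illustration.

```python
# Hypothetical unnormalized table: the department phone is repeated per employee.
employees = [
    {"emp_id": 1, "name": "Asha", "dept": "Sales", "dept_phone": "555-0100"},
    {"emp_id": 2, "name": "Ravi", "dept": "Sales", "dept_phone": "555-0100"},
    {"emp_id": 3, "name": "Mina", "dept": "HR", "dept_phone": "555-0200"},
]


def phones_for(rows, dept):
    """Return the set of distinct phone numbers stored for one department."""
    return {row["dept_phone"] for row in rows if row["dept"] == dept}


# Update anomaly: changing the phone in only one row leaves the data inconsistent.
employees[0]["dept_phone"] = "555-0199"
inconsistent = phones_for(employees, "Sales")
print(inconsistent)  # two different numbers for the same department

# Normalized alternative: store each department's phone exactly once.
departments = {"Sales": "555-0100", "HR": "555-0200"}
departments["Sales"] = "555-0199"  # a single update, no stale copies
```

In the normalized layout the phone number lives in one place, so a single assignment is enough to keep it consistent.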
Advantages of Normalization
• Normalization helps to minimize data redundancy.
• Greater overall database organization.
• Data consistency within the database.
• Much more flexible database design.
• Enforces the concept of relational integrity.
Disadvantages of Normalization
• You cannot start building the database before knowing what the user
needs.
• Query performance can degrade as relations are normalized to higher
normal forms (4NF, 5NF), because more joins are needed to reassemble the data.
• It is very time-consuming and difficult to normalize relations of a
higher degree.
• Careless decomposition may lead to a bad database design, leading to
serious problems.
Functional dependency
A functional dependency is a relationship between two sets of attributes of a
relational table in which one set of attributes determines the value of the
other set. It is denoted by X → Y, where X is called the determinant and Y is
called the dependent. It typically exists between the primary key and a non-key
attribute within a table.
For example:
• Assume we have an employee table with attributes: Emp_Id, Emp_Name,
Emp_Address.
• Here the Emp_Id attribute uniquely identifies the Emp_Name attribute of the
employee table, because if we know the Emp_Id, we can tell the employee
name associated with it.
• Functional dependency can be written as:
Emp_Id → Emp_Name
• We can say that Emp_Name is functionally dependent on Emp_Id.
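One way to make this concrete is a small Python check that tests whether a dependency holds in a given set of rows. The function name `holds` and the sample rows are assumptions made for this sketch. Sample data can only refute a dependency, since a functional dependency is a rule about every valid state of the table.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for the employee table.
employee = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Address": "Pune"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Delhi"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_Address": "Delhi"},
]

print(holds(employee, ["Emp_Id"], ["Emp_Name"]))       # Emp_Id -> Emp_Name
print(holds(employee, ["Emp_Name"], ["Emp_Address"]))  # Emp_Name -> Emp_Address
```

The second check fails in this sample because two employees share a name but live at different addresses.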
Types of Functional dependency
1. Trivial functional dependency
A → B is a trivial functional dependency if B is a subset of A.
Dependencies such as A → A and B → B are also trivial.
Example:

Consider a table with two columns Employee_Id and Employee_Name.

{Employee_Id, Employee_Name} → Employee_Id is a trivial
functional dependency because
Employee_Id is a subset of {Employee_Id, Employee_Name}.
Also, Employee_Id → Employee_Id and Employee_Name →
Employee_Name are trivial dependencies.
2. Non-trivial functional dependency
A → B is a non-trivial functional dependency if B is not a subset of A.
When the intersection of A and B is empty, A → B is called completely non-trivial.
• Example:
1. ID → Name
2. Name → DOB
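The distinction between the two types depends only on how the two attribute sets overlap, which can be sketched in Python as follows. The function name `classify` is an assumption for this illustration.

```python
def classify(lhs, rhs):
    """Classify the dependency lhs -> rhs by comparing the attribute sets."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return "trivial"  # the right side is a subset of the left side
    if lhs & rhs:
        return "non-trivial"  # the sides overlap, but rhs adds something new
    return "completely non-trivial"  # the sides share no attribute


print(classify({"Employee_Id", "Employee_Name"}, {"Employee_Id"}))
print(classify({"ID"}, {"Name"}))
```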
Types of Normal Forms:
• Normalization works through a series of stages called normal forms.
The normal forms apply to individual relations. A relation is said to
be in a particular normal form if it satisfies that form's constraints.
First Normal Form (1NF)
• A relation is in 1NF if every attribute contains only atomic values.
• An attribute of a table cannot hold multiple values. It must
hold only a single value.
• First normal form disallows multi-valued attributes, composite attributes,
and their combinations.
• If a relation contains a composite or multi-valued attribute, it
violates 1NF. To fix this, we can create a new row for each
value of the multi-valued attribute to convert the table into 1NF.

• Let’s take an example of a relational table <EmployeeDetail> that contains
the details of the employees of the company.
Table: EmployeeDetail

Here, the Employee Phone Number is a multi-valued attribute. So, this relation is not in 1NF.
To convert this table into 1NF, we make new rows with each Employee Phone Number as a new row as shown
below:
EmployeeDetail
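As a rough illustration of this conversion, the Python sketch below expands a multi-valued phone column into one row per value. The column names (`Emp_Id`, `Emp_Name`, `Emp_Phone`) and the sample rows are hypothetical.

```python
# Hypothetical rows: Emp_Phone holds several values, which violates 1NF.
employee_detail = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Phone": ["555-0101", "555-0102"]},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Phone": ["555-0103"]},
]


def to_1nf(rows, multi_valued):
    """Emit one row per value of the multi-valued column."""
    flat = []
    for row in rows:
        for value in row[multi_valued]:
            # Copy the row, replacing the list with a single atomic value.
            flat.append({**row, multi_valued: value})
    return flat


for row in to_1nf(employee_detail, "Emp_Phone"):
    print(row)
```

After the conversion, Emp_Id alone no longer identifies a row, so the key of the 1NF table becomes the combination of Emp_Id and Emp_Phone.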
Second Normal Form (2NF)

The normalization of 1NF relations to 2NF involves the elimination of partial
dependencies. A partial dependency exists when a non-prime attribute, i.e., an
attribute that is not part of any candidate key, depends on only part of a
candidate key rather than on the whole key. Each non-key field in a table must
depend on the entire key; fields that depend on only part of a composite key
are moved to another table on whose key they fully depend. A 1NF table whose
candidate keys are all single attributes (no composite keys) is automatically
in Second Normal Form.
For a relational table to be in second normal form, it must satisfy the following
rules:
1. The table must be in first normal form.
2. It must not contain any partial dependency, i.e., all non-prime attributes are
fully functionally dependent on every candidate key.
If a partial dependency exists, we can divide the table to remove the partially
dependent attributes and move them to another table where they fit.
Example: Let’s say a school wants to store the data of teachers and the
subjects they teach. They create a table Teacher that looks like this:
Since a teacher can teach more than one subject, the table can have
multiple rows for the same teacher.
To make the table comply with 2NF, we can
decompose it into two tables like this:
Teacher_Details table:
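A decomposition of this kind can be sketched in Python by projecting the table onto two column sets. The columns (`Teacher_Id`, `Subject`, `Teacher_Age`), the sample rows, and the name `teacher_subject` for the second table are assumptions made for illustration.

```python
# Hypothetical Teacher table; the candidate key is (Teacher_Id, Subject).
# Teacher_Age depends on Teacher_Id alone, which is a partial dependency.
teacher = [
    {"Teacher_Id": 111, "Subject": "Maths", "Teacher_Age": 38},
    {"Teacher_Id": 111, "Subject": "Physics", "Teacher_Age": 38},
    {"Teacher_Id": 222, "Subject": "Biology", "Teacher_Age": 38},
]


def project(rows, columns):
    """Project rows onto columns and drop duplicate tuples, keeping order."""
    seen = []
    for row in rows:
        item = {c: row[c] for c in columns}
        if item not in seen:
            seen.append(item)
    return seen


# Each teacher's age is now stored once, keyed by Teacher_Id.
teacher_details = project(teacher, ["Teacher_Id", "Teacher_Age"])
# The subjects taught keep the composite key (Teacher_Id, Subject).
teacher_subject = project(teacher, ["Teacher_Id", "Subject"])

print(teacher_details)
print(teacher_subject)
```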
Third Normal form (3NF)
A table design is said to be in 3NF if both of the following conditions hold:
1. The table must be in 2NF.
2. No non-prime attribute is transitively dependent on a candidate key.
An attribute that is part of one of the candidate keys is known as a prime
attribute. An attribute that is not part of any candidate key is known as a
non-prime attribute.
Partial dependency occurs when a non-prime attribute is
functionally dependent on only part of a candidate key.
Transitive dependency occurs when a non-key
attribute determines another non-key attribute.
Full dependency means that a non-key attribute is functionally dependent on the
whole primary key and not on any proper subset of it.
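A transitive dependency can be exposed by computing the closure of an attribute set, i.e., everything it determines directly or indirectly. The Python sketch below uses the standard closure algorithm; the dependencies Emp_Id → Emp_Zip and Emp_Zip → Emp_City are hypothetical examples.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs, each a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply a dependency once its whole left side is already known.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Hypothetical dependencies: Emp_Id -> Emp_Zip and Emp_Zip -> Emp_City.
fds = [({"Emp_Id"}, {"Emp_Zip"}), ({"Emp_Zip"}, {"Emp_City"})]
print(sorted(closure({"Emp_Id"}, fds)))
```

Here Emp_City is reached from Emp_Id only through the non-key attribute Emp_Zip, which is the transitive dependency that 3NF removes by moving Emp_Zip and Emp_City into their own table.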
Boyce Codd normal form (BCNF)

• It is an advanced version of 3NF, which is why it is also referred to as
3.5NF. BCNF is stricter than 3NF.
• A table complies with BCNF if it is in 3NF and, for
every non-trivial functional dependency X → Y, X is a super key of
the table.
• Example: Suppose there is a company wherein employees work
in more than one department. They store the data like this:
Functional dependencies in the table above:
Emp_Id → Emp_Nationality
Emp_Dept → {Dept_Type, Dept_No_Of_Emp}

Candidate key: {Emp_Id, Emp_Dept}

The table is not in BCNF because neither Emp_Id nor Emp_Dept alone is a
super key, yet each appears on the left side of a functional dependency.

To make the table comply with BCNF, we can break it into three
tables like this:
Emp_Nationality table:
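The BCNF test for this example can be sketched in Python using the functional dependencies listed above. The helper names `closure` and `bcnf_violations` are assumptions for this illustration; the attribute names come from the example.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial dependencies whose left side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != attributes
    ]


attributes = {"Emp_Id", "Emp_Nationality", "Emp_Dept", "Dept_Type", "Dept_No_Of_Emp"}
fds = [
    ({"Emp_Id"}, {"Emp_Nationality"}),
    ({"Emp_Dept"}, {"Dept_Type", "Dept_No_Of_Emp"}),
]

violations = bcnf_violations(attributes, fds)
print(len(violations))  # both dependencies violate BCNF
# The pair {Emp_Id, Emp_Dept} determines every attribute, so it is a key.
print(closure({"Emp_Id", "Emp_Dept"}, fds) == attributes)
```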
Fourth normal form (4NF)

• A relation is in 4NF if it is in Boyce Codd normal form and has no
non-trivial multi-valued dependency.
• A multi-valued dependency A →→ B exists if, for a single value of A, multiple
values of B exist and those values are independent of the other attributes
of the relation.
• The given STUDENT table is in 3NF, but COURSE and HOBBY are
two independent entities. Hence, there is no relationship between
COURSE and HOBBY.
• In the STUDENT relation, the student with STU_ID 21 has two
courses, Computer and Math, and two hobbies, Dancing and Singing.
So there is a multi-valued dependency on STU_ID, which leads to
unnecessary repetition of data.
• So to bring the above table into 4NF, we can decompose it into two
tables:
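The repetition and the decomposition can be sketched in Python using the values for STU_ID 21 mentioned above. The names `student_course` and `student_hobby` for the two resulting tables are assumptions.

```python
from itertools import product

# STU_ID 21 with its courses and hobbies, as described in the text above.
courses = {21: ["Computer", "Math"]}
hobbies = {21: ["Dancing", "Singing"]}

# Single STUDENT table: every course must be paired with every hobby.
student = [
    (stu_id, course, hobby)
    for stu_id in courses
    for course, hobby in product(courses[stu_id], hobbies[stu_id])
]

# 4NF decomposition: one table per independent multi-valued fact.
student_course = sorted({(s, c) for s, c, _ in student})
student_hobby = sorted({(s, h) for s, _, h in student})

print(len(student), len(student_course), len(student_hobby))
```

The combined table needs four rows to hold two courses and two hobbies, while the decomposed tables need two rows each and grow additively instead of multiplicatively.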
Fifth normal form (5NF)
• A relation is in 5NF if it is in 4NF and contains no non-trivial join
dependency; any decomposition must also be lossless.
• 5NF is reached when the tables have been decomposed as far as
possible without losing information, in order to avoid redundancy.
• 5NF is also known as Project-join normal form (PJ/NF).
• In the above table, John takes both the Computer and Math classes for
Semester 1 but he doesn't take the Math class for Semester 2. In this
case, a combination of all these fields is required to identify valid data.
• Suppose we add a new semester, Semester 3, but do not yet know
the subject or who will be taking it, so we would leave
Lecturer and Subject as NULL. But all three columns together act as the
primary key, so we can't leave the other two columns blank.
So to bring the above table into 5NF, we can decompose it into three
relations P1, P2 & P3:
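A three-way decomposition and its lossless rejoin can be sketched in Python. The rows below are hypothetical apart from John's classes, which follow the description above, and the choice of which column pair goes into each of P1, P2, and P3 is an assumption.

```python
# Hypothetical rows (Semester, Subject, Lecturer); only John's rows follow the text.
original = {
    ("Semester 1", "Computer", "Anshika"),
    ("Semester 1", "Computer", "John"),
    ("Semester 1", "Math", "John"),
    ("Semester 2", "Math", "Akash"),
    ("Semester 1", "Chemistry", "Praveen"),
}

# Three binary projections of the original relation.
p1 = {(sem, sub) for sem, sub, _ in original}  # Semester, Subject
p2 = {(sub, lec) for _, sub, lec in original}  # Subject, Lecturer
p3 = {(sem, lec) for sem, _, lec in original}  # Semester, Lecturer

# Natural join of P1, P2 and P3.
rejoined = {
    (sem, sub, lec)
    for sem, sub in p1
    for sub2, lec in p2
    if sub2 == sub and (sem, lec) in p3
}

# Joining only P1 and P2 produces spurious rows that P3 filters out.
two_way = {(sem, sub, lec) for sem, sub in p1 for sub2, lec in p2 if sub2 == sub}

print(rejoined == original)
print(len(two_way) - len(original))  # number of spurious rows without P3
```

For this sample data, all three projections are needed: dropping any one of them lets combinations such as John teaching Math in Semester 2 reappear.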
Denormalization
Denormalization is a database optimization technique in which
we add redundant data to one or more tables. This can help us
avoid costly joins in a relational database. Note
that denormalization does not mean ‘reversing normalization’ or
‘not to normalize’. It is an optimization technique that is applied
after normalization.
Denormalization is a database design technique that involves
intentionally introducing redundancy into a relational database
by incorporating data from related tables into a single table. The
primary goal of denormalization is to optimize query
performance and simplify data retrieval at the cost of increased
storage requirements and a potential increase in data update
complexity.
Some key points and considerations related to
denormalization:
• Performance Optimization: Denormalization can significantly improve the performance of read-intensive database
operations because it reduces the need for complex JOIN operations and enables faster data retrieval. It can be
especially beneficial for systems where queries are common and need to be executed quickly.
• Reduced JOINs: In normalized database designs, data is often distributed across multiple tables, requiring JOIN
operations to retrieve related information. Denormalization combines related data into a single table, eliminating the
need for many JOINs.
• Simplified Queries: Denormalized databases often lead to simpler and more straightforward queries, as they eliminate
the need to traverse multiple tables to retrieve data.
• Data Duplication: Denormalization involves duplicating data, which can lead to data redundancy and increased storage
requirements. This can lead to data integrity issues if not properly managed.
• Data Update Complexity: Because data is duplicated in denormalized tables, updates can become more complex. When
a piece of information needs to change, it must be updated in multiple places to maintain consistency.
• Maintenance Challenges: Denormalized databases may require more effort to maintain, as they can become inconsistent
if updates are not carefully managed.
• Use Cases: Denormalization is often used in data warehousing, reporting, and analytical systems where the focus is on
fast data retrieval and reporting. It is less suitable for transactional systems where data integrity and consistency are
paramount.
• Balancing Act: Deciding whether to denormalize or not is a trade-off. Database designers need to strike a balance
between the performance benefits of denormalization and the increased complexity and potential risks it introduces.
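The trade-off described in these points can be sketched in Python with in-memory tables. The `customers` and `orders` tables, their columns, and the helper `rename_customer` are hypothetical and serve only to show where the redundant copies appear.

```python
# Hypothetical normalized tables.
customers = {1: {"name": "Asha"}, 2: {"name": "Ravi"}}
orders = [
    {"order_id": 10, "customer_id": 1, "total": 250},
    {"order_id": 11, "customer_id": 1, "total": 90},
    {"order_id": 12, "customer_id": 2, "total": 40},
]

# Denormalized copy: the customer name is stored with every order,
# so reads need no lookup (the in-memory stand-in for a JOIN).
orders_denorm = [
    {**order, "customer_name": customers[order["customer_id"]]["name"]}
    for order in orders
]


def rename_customer(customer_id, new_name):
    """Apply a rename to the source table and to every redundant copy."""
    customers[customer_id]["name"] = new_name
    touched = 0
    for row in orders_denorm:
        if row["customer_id"] == customer_id:
            row["customer_name"] = new_name
            touched += 1
    return touched


print(rename_customer(1, "Asha K."))  # number of redundant rows rewritten
```

Reads against `orders_denorm` are simpler, but every rename now has to touch each order row for that customer, and skipping any of them leaves the data inconsistent.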
Pros of Denormalization:
• Retrieving data is faster since we do fewer joins.
• Queries to retrieve data can be simpler (and therefore less likely to have
bugs), since we need to look at fewer tables.
Cons of Denormalization:
• Updates and inserts are more expensive.
• Denormalization can make update and insert code harder to write.
• Data may be inconsistent.
• Data redundancy necessitates more storage.
Advantages of Denormalization:
• Improved Query Performance: Denormalization can improve query
performance by reducing the number of joins required to retrieve data.
• Reduced Complexity: By combining related data into fewer tables,
denormalization can simplify the database schema and make it easier to
manage.
• Easier Maintenance and Updates: With fewer tables, the schema itself can be
easier to maintain, although keeping the duplicated values consistent takes
more work, as the disadvantages listed later make clear.
• Improved Read Performance: Denormalization can improve read
performance by making it easier to access data.
• Better Scalability: Denormalization can improve the scalability of a
database system by reducing the number of tables and improving the
overall performance.
Disadvantages of Denormalization:
• Reduced Data Integrity: By adding redundant data, denormalization can
reduce data integrity and increase the risk of inconsistencies.
• Increased Complexity: While denormalization can simplify the database
schema in some cases, it can also increase complexity by introducing
redundant data.
• Increased Storage Requirements: By adding redundant data,
denormalization can increase storage requirements and increase the cost
of maintaining the database.
• Increased Update and Maintenance Complexity: Denormalization can
increase the complexity of updating and maintaining the database by
introducing redundant data.
• Limited Flexibility: Denormalization can reduce the flexibility of a database
system by introducing redundant data and making it harder to modify the
schema.
