2023_IT_22IT405_U3-LM10

The document outlines the principles of good relational database design, emphasizing features such as the representation of entities, reduction of null values, elimination of redundancy, and prevention of modification anomalies. It discusses the importance of normalization to eliminate data redundancy and anomalies, detailing various normal forms from First Normal Form (1NF) to Sixth Normal Form (6NF). Each normal form builds upon the previous one to ensure data integrity and logical organization within the database.

Uploaded by

dakshata277
22IT305 – DATABASE MANAGEMENT SYSTEMS

Unit III & LP 2 – FUNCTIONAL DEPENDENCIES AND NORMAL FORMS

1. FEATURES OF GOOD RELATIONAL DESIGN


I. Relation for every entity

II. Fewer NULL values

III. No spurious tuples

IV. No Redundancy

V. No modification anomaly

1.1 Relation for Every Entity

• Informally, each tuple in a relation should represent one entity or relationship
instance. (This applies to individual relations and their attributes.)
• Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs)
should not be mixed in the same relation.
• Only foreign keys should be used to refer to other entities.
• Entity and relationship attributes should be kept apart as much as possible.

1.2 Fewer NULL Values

• Relations should be designed so that their tuples have as few NULL values
as possible.
• Attributes that are frequently NULL can be placed in separate relations (together
with the primary key).

Reasons for nulls:

• Attribute not applicable or invalid
• Attribute value unknown (may exist)
• Value known to exist, but unavailable

1.3 Decomposition (Spurious Tuples):

• A bad relational database design may produce erroneous results for certain
JOIN operations.
• The "lossless join" property guarantees meaningful results for join
operations.
• Relations should be designed to satisfy the lossless join condition.
• No spurious tuples should be generated by a natural join of any of the relations.
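
The difference between a lossy and a lossless decomposition can be sketched in a few lines of Python. The relation, its column names (emp, project, location) and the sample rows are assumptions made only for this illustration; the helper functions project and natural_join are likewise hypothetical, not part of any library.

```python
def row(**columns):
    """Represent one tuple as a hashable set of (column, value) pairs."""
    return frozenset(columns.items())

def project(rows, cols):
    """Relational projection: keep only the named columns, dropping duplicates."""
    return {frozenset((c, dict(r)[c]) for c in cols) for r in rows}

def natural_join(left, right):
    """Natural join: combine rows that agree on every shared column."""
    out = set()
    for l in left:
        for r in right:
            a, b = dict(l), dict(r)
            if all(a[k] == b[k] for k in a.keys() & b.keys()):
                out.add(frozenset({**a, **b}.items()))
    return out

# Hypothetical data: two employees work on different projects in the same city.
works_on = {
    row(emp="Smith", project="ProductX", location="Houston"),
    row(emp="Wong", project="ProductZ", location="Houston"),
}

# Bad design: the two parts share only the non-key column "location".
bad = natural_join(project(works_on, ["emp", "location"]),
                   project(works_on, ["project", "location"]))
spurious = bad - works_on  # tuples that were never in the original relation

# Better design: the shared column "project" determines "location" in this data.
good = natural_join(project(works_on, ["emp", "project"]),
                    project(works_on, ["project", "location"]))

print(len(spurious), good == works_on)
```

Joining on location pairs every employee with every project in the same city, which is where the spurious tuples come from; joining on project reproduces the original relation exactly.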

1.4 Data Redundancy

Redundancy is storing the same data item in more than one place.

Redundancy creates several problems, such as the following:

1. Extra storage space: storing the same data in many places takes a large amount of
disk space.
2. The same data has to be entered more than once during insertion.
3. Data has to be deleted from more than one place during deletion.
4. Data has to be modified in more than one place.
5. Anomalies may occur in the database if insertions, deletions, or modifications are not
carried out properly in every place. This makes the database inconsistent and unreliable.

1.5 Modification Anomaly:

• Every schema must guarantee that the data stays consistent when it is
modified.
• Types of anomaly:

1. Update Anomalies
2. Deletion Anomalies
3. Insert Anomalies

2. NORMALIZATION
Normalization is a systematic approach to decomposing (breaking down)
tables in order to eliminate data redundancy (repetition) and undesirable characteristics such as
insertion, update, and deletion anomalies.

2.1 Why Do We Need Normalization in DBMS?

Normalization is required for:
• Eliminating redundant (repeated) data and thereby protecting data integrity, because
repeated data increases the chance of inconsistent data.
• Keeping data consistent by storing each fact in one table and
referencing it everywhere else.
• Storage optimization, although this matters less these days because database
storage is cheap.
• Breaking down large tables into smaller, related tables, which makes the
database structure more scalable and adaptable.
• Ensuring that data dependencies make sense, i.e. that data is stored logically.

2.2 Problems without Normalization in DBMS

If a table is not properly normalized and has data redundancy (repetition), it will
not only take up extra storage space but will also make it difficult to handle and
update the data in the database without losing data.
Insertion, update, and deletion anomalies are very frequent if the database is not
normalized.
To understand these anomalies, let us take the example of a Student table.

The Student table holds data for four Computer Science students.

The data for the fields branch, hod (Head of Department), and office_tel is repeated
for every student who is in the same branch of the college; this is data redundancy.

2.2.1 Insertion Anomaly in DBMS

• For a new admission, the student's data cannot be inserted until the student opts
for a branch; otherwise the branch information has to be set
to NULL.
• Also, if we have to insert data for 100 students of the same branch, the branch
information is repeated for all 100 students.
• These scenarios are insertion anomalies.
• If the same data has to be repeated in every row, it is better to keep that data
separately and reference it from each row.
• So in the Student table, we can keep the branch information separately and store only
the branch_id in the student table, where branch_id is used to look up the branch
information.
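
The separation described above can be sketched with Python's built-in sqlite3 module. The column names branch, hod and office_tel come from the Student table; the columns student_id and name, the sample rows and the telephone number are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE branch (
        branch_id  INTEGER PRIMARY KEY,
        branch     TEXT NOT NULL,
        hod        TEXT,
        office_tel TEXT
    );
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        -- a single nullable reference replaces three repeated branch columns
        branch_id  INTEGER REFERENCES branch(branch_id)
    );
""")

# Branch details are stored exactly once (hypothetical values).
conn.execute("INSERT INTO branch VALUES (1, 'CSE', 'Mr. X', '53337')")

# Students only carry the branch_id; student C has not chosen a branch yet.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "A", 1), (2, "B", 1), (3, "C", None)],
)

branch_rows = conn.execute("SELECT COUNT(*) FROM branch").fetchone()[0]
joined = conn.execute(
    "SELECT s.name, b.hod FROM student s "
    "LEFT JOIN branch b ON b.branch_id = s.branch_id "
    "ORDER BY s.student_id"
).fetchall()
print(branch_rows, joined)
```

However many students join the same branch, the branch row is entered once, and the branch details are recovered with a join when they are needed.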

2.2.2 Update Anomaly in DBMS

• What if Mr. X leaves the college, or is no longer the HOD of the Computer
Science department? In that case, all the student records of that branch have to be updated,
and if we miss any record by mistake, the data becomes inconsistent.
• This is an update anomaly: every affected record in the
table has to be updated just because one piece of information changed.
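
A minimal sketch of this situation in plain Python follows. Mr. X is taken from the example above; the student names, the replacement "Mr. Y" and the dictionary layout are assumptions for illustration only.

```python
# Un-normalized: the HOD is repeated in every student row.
students = [
    {"name": "A", "branch": "CSE", "hod": "Mr. X"},
    {"name": "B", "branch": "CSE", "hod": "Mr. X"},
    {"name": "C", "branch": "CSE", "hod": "Mr. X"},
]

# A careless update that reaches only the first two rows.
for s in students[:2]:
    s["hod"] = "Mr. Y"

hods_for_cse = {s["hod"] for s in students if s["branch"] == "CSE"}
inconsistent = len(hods_for_cse) > 1  # the same branch now reports two HODs

# Normalized: the HOD is stored once per branch, so one update is enough.
branches = {"CSE": {"hod": "Mr. X"}}
students_nf = [{"name": n, "branch": "CSE"} for n in ("A", "B", "C")]
branches["CSE"]["hod"] = "Mr. Y"
hods_nf = {branches[s["branch"]]["hod"] for s in students_nf}

print(inconsistent, hods_nf)
```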

2.2.3 Deletion Anomaly in DBMS

• In our Student table, two different pieces of information are kept together:
the student information and the branch information.
• So if only a single student is enrolled in a branch, and that student leaves the
college or the student's entry is deleted for some other reason, we lose the
branch information too.
• So in a DBMS we should never keep two different entities together in one table, which in the
above example are Student and Branch.
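
The loss of branch information can be sketched in the same style. The branch "Mechanical", its HOD "Mr. Z" and the student names are hypothetical values chosen for the illustration.

```python
# Un-normalized: branch facts live only inside student rows.
students = [
    {"name": "A", "branch": "CSE", "hod": "Mr. X"},
    {"name": "D", "branch": "Mechanical", "hod": "Mr. Z"},  # the only Mechanical student
]

# Student D leaves the college and the row is deleted.
students = [s for s in students if s["name"] != "D"]
branch_still_known = any(s["branch"] == "Mechanical" for s in students)

# Normalized: branches are kept in their own structure and survive the deletion.
branches = {"CSE": {"hod": "Mr. X"}, "Mechanical": {"hod": "Mr. Z"}}
students_nf = [{"name": "A", "branch": "CSE"}]  # D has already been removed
hod_after_delete = branches["Mechanical"]["hod"]

print(branch_still_known, hod_after_delete)
```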

2.3 Types of DBMS Normal forms

Normalization rules are divided into the following normal forms:

1. First Normal Form
2. Second Normal Form
3. Third Normal Form
4. BCNF
5. Fourth Normal Form
6. Fifth Normal Form
7. Sixth Normal Form

2.3.1 First Normal Form (1NF)

1NF requires that each column in a table contains atomic values and that each row
is uniquely identified. This means that a table cannot have repeating groups or arrays as
columns, and each row must have a unique primary key.
Example
Consider a table that lists customers and their phone numbers, where a single Phone
Numbers column holds all the numbers of a customer.

This violates 1NF because the Phone Numbers column contains repeating groups.

To normalize this table to 1NF, we can split the Phone Numbers column into
separate rows, one phone number per row, and add a separate primary key column.
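
As a rough illustration of this step, the following sketch splits a multi-valued phone column into one row per number. The customer names, the phone numbers, the column names and the helper to_1nf are all hypothetical.

```python
# Hypothetical un-normalized rows: phone_numbers holds several values at once.
customers = [
    {"customer_id": 1, "name": "Alice", "phone_numbers": "555-0101, 555-0102"},
    {"customer_id": 2, "name": "Bob", "phone_numbers": "555-0199"},
]

def to_1nf(rows):
    """Emit one row per phone number and add a surrogate primary key (row_id)."""
    out = []
    for r in rows:
        for phone in r["phone_numbers"].split(","):
            out.append({
                "row_id": len(out) + 1,          # new primary key column
                "customer_id": r["customer_id"],
                "name": r["name"],
                "phone_number": phone.strip(),   # exactly one atomic value
            })
    return out

customer_phones = to_1nf(customers)
for r in customer_phones:
    print(r)
```

The customer name is still repeated for each of a customer's numbers; removing that repetition is the job of the higher normal forms.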

2.3.2 Second Normal Form (2NF)


2NF builds on 1NF by requiring that each non-key column in a table is
fully functionally dependent on the whole primary key. This means that a table should not have
partial dependencies, where a non-key column depends on only part of a composite primary
key.
Example
Consider a table that lists orders and their line items, whose primary key includes
the Customer ID column.

This violates 2NF because the Customer Name column depends on only part of the
primary key (Customer ID). To normalize this table to 2NF, we can split it into two tables,
moving Customer Name into a table keyed by Customer ID.
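
The partial dependency and the split can be sketched as follows. The sample rows, the order_id and quantity columns and the helper fd_holds are assumptions for illustration; note that checking sample data can only show that a dependency is not contradicted, whereas the real dependencies come from the meaning of the data.

```python
def fd_holds(rows, lhs, rhs):
    """True if, in these rows, equal values on lhs always imply equal values on rhs."""
    seen = {}
    for r in rows:
        key = tuple(r[c] for c in lhs)
        val = tuple(r[c] for c in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical table with the assumed composite key (order_id, customer_id).
orders = [
    {"order_id": 1, "customer_id": 10, "customer_name": "Alice", "quantity": 2},
    {"order_id": 2, "customer_id": 10, "customer_name": "Alice", "quantity": 1},
    {"order_id": 3, "customer_id": 11, "customer_name": "Bob", "quantity": 5},
]

# customer_name depends on customer_id alone: a partial dependency.
partial = fd_holds(orders, ["customer_id"], ["customer_name"])

# Split: customer facts go to their own table, keyed by customer_id.
customers = {r["customer_id"]: r["customer_name"] for r in orders}
order_rows = [{k: r[k] for k in ("order_id", "customer_id", "quantity")} for r in orders]

print(partial, customers)
```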

2.3.3 Third Normal Form (3NF)

3NF builds on 2NF by requiring that no non-key column in a table is
transitively dependent on the primary key. This means that a table should not have
transitive dependencies, where a non-key column depends on another non-key
column.
Example
To explain 3NF further, consider a table of customer orders whose primary key is
"Order ID" and which also stores "Customer ID" and "Customer City".

In this example, the non-key column "Customer City" is transitively
dependent on the primary key. That is, it depends on "Customer ID", which is not part of
the primary key, instead of depending directly on the primary key "Order ID". To bring
this table to 3NF, we can split it into two tables, one for orders and one for customers.

Now the "Customer City" column is no longer transitively dependent on the
primary key of the orders table; it sits in a separate table where it depends directly on that
table's primary key, "Customer ID". This makes both tables 3NF-compliant.
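
A small sqlite3 sketch of the two-table result is given below. The table and column names follow the example above in snake_case; the city names and the ids are hypothetical sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id   INTEGER PRIMARY KEY,
        customer_city TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
    );
    INSERT INTO customers VALUES (10, 'Chennai'), (11, 'Madurai');
    INSERT INTO orders VALUES (1, 10), (2, 10), (3, 11);
""")

# The city is stored once per customer, so a move is a single-row update.
conn.execute("UPDATE customers SET customer_city = 'Salem' WHERE customer_id = 10")

rows = conn.execute(
    "SELECT o.order_id, c.customer_city "
    "FROM orders o JOIN customers c ON c.customer_id = o.customer_id "
    "ORDER BY o.order_id"
).fetchall()
print(rows)
```

Both orders of customer 10 see the new city after one update, which is the practical benefit of removing the transitive dependency.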

2.3.4 Boyce-Codd Normal Form (BCNF)

BCNF is a stricter form of 3NF; the two differ mainly for tables with more than one
(overlapping) candidate key. BCNF requires that for every non-trivial functional dependency
X → Y in a table, the determinant X is a candidate key (more precisely, a superkey). This
means that no column, key or non-key, may depend on anything other than a candidate key.
BCNF thereby eliminates the redundancy caused by functional dependencies.
Example
A table is in BCNF if each determinant is a candidate key. In other words, the left-hand
side of every non-trivial functional dependency in the table must be a candidate key. For example,
consider a table that lists information about books and their authors, including the columns
"Author ID", "Author Name" and "Author Nationality".

In this example, the functional dependency from "Author ID" to "Author
Name" violates BCNF because "Author ID" is not a candidate key of the table. To bring this
table to BCNF, we can split it into two tables, one for books and one for authors.

Now the "Author Name" and "Author Nationality" columns depend only on
"Author ID", the key of the authors table, and both tables are in BCNF.
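
The "every determinant is a key" test can be sketched with the standard attribute-closure computation. The column names book_id and title, and the set of functional dependencies, are assumptions made for this illustration; the helpers closure and bcnf_violations are hypothetical names.

```python
def closure(attrs, fds):
    """All attributes determined by `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Non-trivial dependencies of `relation` whose left-hand side is not a superkey."""
    return [
        (lhs, rhs) for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(relation)
    ]

# Assumed dependencies for the books-and-authors table.
fd_book = ({"book_id"}, {"title", "author_id"})
fd_author = ({"author_id"}, {"author_name", "author_nationality"})

books = {"book_id", "title", "author_id", "author_name", "author_nationality"}
violations = bcnf_violations(books, [fd_book, fd_author])

# After the split, each table is checked against its own dependency.
book_table = {"book_id", "title", "author_id"}
author_table = {"author_id", "author_name", "author_nationality"}

print(violations)
print(bcnf_violations(book_table, [fd_book]), bcnf_violations(author_table, [fd_author]))
```

Only the author dependency is reported for the combined table, because author_id does not determine the book columns; after the split neither table has a violating dependency.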

2.3.5 Fourth Normal Form (4NF)

4NF builds on BCNF by requiring that a table has no non-trivial multi-valued
dependencies other than those on a key. A multi-valued dependency X ↠ Y occurs when each
value of X is associated with a set of Y values that is independent of the remaining columns
of the table. For example, a table of customer orders with columns for order ID,
customer ID and order items violates 4NF: the items form a multi-valued fact about the
order, so the customer ID has to be repeated for every item of the order.

Similarly, a table that lists orders and their products, with columns for order ID,
product ID, and product details, violates 4NF because the product details are repeated for
every order that contains the product.
Example
Consider a table of orders and products with columns for order ID, product ID,
product name and product description.

In this table, the product name and description are determined by the product ID alone,
not by the order ID, so they are repeated in every order row for that product. Since every
functional dependency is also a multi-valued dependency and product ID is not a key of the
table, the table is not in 4NF. To bring the table into 4NF, we can split
it into three tables, for example orders, products, and the order-product pairs that link them.
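
For the customer-orders table with order ID, customer ID and order items mentioned above, a multi-valued dependency can be tested on sample data as sketched below. The rows and the helper mvd_holds are hypothetical, and a check on sample rows can only show that a dependency is not contradicted by them.

```python
from itertools import product

def mvd_holds(rows, x, y):
    """Check X ->> Y on sample rows: for each X value, every observed Y value
    must appear together with every observed value of the remaining columns."""
    z = [c for c in rows[0] if c not in x and c not in y]
    groups = {}
    for r in rows:
        key = tuple(r[c] for c in x)
        ys, zs, pairs = groups.setdefault(key, (set(), set(), set()))
        yv = tuple(r[c] for c in y)
        zv = tuple(r[c] for c in z)
        ys.add(yv)
        zs.add(zv)
        pairs.add((yv, zv))
    return all(pairs == set(product(ys, zs)) for ys, zs, pairs in groups.values())

# Hypothetical rows: the customer is repeated for every item of an order.
rows = [
    {"order_id": 1, "customer_id": 10, "item": "pen"},
    {"order_id": 1, "customer_id": 10, "item": "ink"},
    {"order_id": 2, "customer_id": 11, "item": "pen"},
]

items_independent = mvd_holds(rows, ["order_id"], ["item"])
reverse = mvd_holds(rows, ["item"], ["order_id"])

# Decomposition along the dependency: one table per independent fact.
order_customer = {(r["order_id"], r["customer_id"]) for r in rows}
order_item = {(r["order_id"], r["item"]) for r in rows}

print(items_independent, reverse, order_customer, order_item)
```

Because order_id is not a key of the three-column table (an order has several rows), the dependency order_id ->> item is the kind that 4NF removes by splitting the table.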

2.3.6 Fifth Normal Form (5NF)

• A relation is in 5NF if it is in 4NF and contains no non-trivial join dependency, that is,
it cannot be decomposed losslessly into smaller relations any further.
• 5NF is reached when the tables have been broken into as many smaller tables as possible,
without losing information, in order to avoid redundancy.
• 5NF is also known as Project-join normal form (PJ/NF).

Example

Consider a table with the columns Subject, Lecturer and Semester. In this table, John
takes both the Computer and Math classes for Semester 1, but he does not take the Math class
for Semester 2. In this case, the combination of all three fields is required
to identify valid data.

Suppose we add a new semester, Semester 3, but do not yet know the subject
or who will be taking it, so we would have to leave Lecturer and Subject as NULL. But all
three columns together act as the primary key, so we cannot leave the other two columns blank.

So to bring the above table into 5NF, we can decompose it into three relations P1, P2 and
P3:
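
The three-way decomposition can be sketched as follows. Only John and his classes come from the example; the other lecturers, the Chemistry row, and the choice of which column pairs form P1, P2 and P3 are assumptions made for illustration.

```python
# Rows are (subject, lecturer, semester); all names except John are hypothetical.
rows = {
    ("Computer", "Asha", "Semester 1"),
    ("Computer", "John", "Semester 1"),
    ("Math", "John", "Semester 1"),
    ("Math", "Ravi", "Semester 2"),
    ("Chemistry", "Kumar", "Semester 1"),
}

# Assumed projections: P1(semester, subject), P2(subject, lecturer), P3(semester, lecturer).
p1 = {(sem, sub) for sub, lec, sem in rows}
p2 = {(sub, lec) for sub, lec, sem in rows}
p3 = {(sem, lec) for sub, lec, sem in rows}

# Joining only two projections produces extra (spurious) combinations.
two_way = {(sub, lec, sem) for sem, sub in p1 for sub2, lec in p2 if sub == sub2}

# Joining the third projection as well filters them out again.
three_way = {(sub, lec, sem) for sub, lec, sem in two_way if (sem, lec) in p3}

print(len(two_way), three_way == rows)
```

With these sample rows, no pair of projections is enough on its own, but all three together reproduce the original table, which is the situation 5NF is concerned with.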

2.3.7 Sixth Normal Form (6NF)

In 6NF, the relation variable is decomposed into irreducible components. A relation
is in 6NF if and only if it is in 5NF and every join dependency on the relation is trivial.
Example

Let us decompose the table -
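
A minimal sketch of such a decomposition on a hypothetical employee table follows: every non-key column becomes its own key-to-value relation. The table, its columns and the helper to_6nf are assumptions for illustration, not taken from the handout.

```python
def to_6nf(rows, key):
    """Split rows into one relation per non-key column: {column: {key_value: value}}.
    Missing values are simply left out, so no NULLs need to be stored."""
    parts = {}
    for r in rows:
        for col, val in r.items():
            if col != key and val is not None:
                parts.setdefault(col, {})[r[key]] = val
    return parts

# Hypothetical table: employee 2 has no department assigned yet.
employees = [
    {"emp_id": 1, "name": "A", "dept": "IT"},
    {"emp_id": 2, "name": "B", "dept": None},
]

parts = to_6nf(employees, "emp_id")
print(parts)
```

Each resulting relation holds the key plus exactly one other attribute, so it cannot be decomposed any further without losing information.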

