Module 3 DBMS 4th Sem B.Tech CSE

The document discusses the importance of database normalization, detailing its processes and the need to minimize redundancy to avoid data anomalies such as insertion, deletion, and update issues. It explains various types of functional dependencies and normal forms, emphasizing how they help in organizing data efficiently. Additionally, it provides examples of normalization techniques to illustrate the concepts of functional and multivalued dependencies, as well as join dependencies.

Uploaded by

abcan0000
Copyright
© © All Rights Reserved

DATABASE MANAGEMENT SYSTEM
MODULE III
L. S. Kalundia
Assistant Professor, Dept. of Computer Sc. & Engg.
Amity University, Jharkhand
MODULE III: RELATIONAL DATABASE DESIGN

Normalization using Functional Dependency, Multivalued dependency and Join dependency.


DATABASE NORMALIZATION
❑Normalization is a process of organizing data in a database to reduce redundancy and
improve data consistency.
❑Normalization is the process of minimizing redundancy in a relation or set of relations.
Redundancy in a relation may cause insertion, deletion and update anomalies.

Need for Normalization?


If a table is not properly normalized and has data redundancy, it will not only take up extra
data storage space but also make it difficult to handle and update the database.
1. Data modification anomalies
Any form of change in a table can lead to errors or inconsistencies in other tables if not
handled carefully. These changes can either be adding new data to a database, updating the
data, or deleting records, which can lead to unintended loss of data.
Data modification anomalies can be categorized into three types:
A) Insertion Anomaly:
An insertion anomaly occurs when a new tuple cannot be inserted into a relation because some other, unrelated data is missing.

•Students and Courses:

Student_ID  Student_Name  Course_ID  Course_Name  Instructor
101         Raj           CSE101     DBMS         Dr. Sharma
102         Riya          CSE102     OOP          Dr. Mehta

•Problem: If a new instructor Dr. Singh is hired to teach a new course CSE103,
but no student has enrolled in that course yet, we cannot insert the instructor's
details because the table structure requires a Student_ID for each row.

•Solution: Normalize the table by creating a separate Instructor table.


B) Deletion Anomaly:
The delete anomaly refers to the situation where the deletion of data results in the unintended
loss of some other important data.

Example: School Database where student and teacher information is stored in the same table:

Student_ID  Student_Name  Class  Teacher_Name  Subject
101         Rohan         5A     Mr. Sharma    Math
102         Priya         5A     Mr. Sharma    Math
103         Aryan         5B     Ms. Verma     Science

Issue:
If all students from Class 5A leave the school, their records are deleted.
•Unintended Loss: Since Mr. Sharma's details are stored only in student records, deleting all students from Class
5A also removes information about Mr. Sharma, even though he is still a teacher at the school.
•Problem: The school loses track of Mr. Sharma just because no students are currently assigned to him.

Solution:
•Create separate tables for Teachers and Students, ensuring that deleting students does not remove teacher
data.
C) Update Anomaly: An update anomaly occurs when changing a single fact requires
multiple rows to be updated, because the same value is stored redundantly.

Company Database where employee and department details are stored in the same table:
Employee_ID  Employee_Name  Department_ID  Department_Name  Manager_Name
101          Rahul          D1             HR               Mr. Sharma
102          Priya          D1             HR               Mr. Sharma
103          Aman           D2             IT               Ms. Verma

Issue:
If Mr. Sharma (HR Manager) is replaced by Mr. Gupta, we must update every row where HR department is
listed.
•Unintended Consequence: Since Mr. Sharma’s name appears multiple times, updating the manager’s name
requires modifying multiple records.
•Risk of Inconsistency: If we miss updating one row, some employees will still show Mr. Sharma as their
manager while others will have Mr. Gupta.

Solution:
•Normalize the database by storing Departments and Employees in separate tables.
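
The update anomaly and its fix can be sketched in a few lines of Python. This is only an illustration using the rows from the example; the variable names (flat, departments, employees) are assumptions of the sketch, not part of any particular DBMS.

```python
# Flat table from the example: the manager name is repeated on every employee row.
flat = [
    {"Employee_ID": 101, "Employee_Name": "Rahul", "Department_ID": "D1",
     "Department_Name": "HR", "Manager_Name": "Mr. Sharma"},
    {"Employee_ID": 102, "Employee_Name": "Priya", "Department_ID": "D1",
     "Department_Name": "HR", "Manager_Name": "Mr. Sharma"},
    {"Employee_ID": 103, "Employee_Name": "Aman", "Department_ID": "D2",
     "Department_Name": "IT", "Manager_Name": "Ms. Verma"},
]

# A careless update that changes only the first HR row.
flat[0]["Manager_Name"] = "Mr. Gupta"
hr_managers_flat = {r["Manager_Name"] for r in flat if r["Department_ID"] == "D1"}

# Normalized layout: each department fact is stored exactly once.
departments = {
    "D1": {"Department_Name": "HR", "Manager_Name": "Mr. Sharma"},
    "D2": {"Department_Name": "IT", "Manager_Name": "Ms. Verma"},
}
employees = [
    {"Employee_ID": 101, "Employee_Name": "Rahul", "Department_ID": "D1"},
    {"Employee_ID": 102, "Employee_Name": "Priya", "Department_ID": "D1"},
    {"Employee_ID": 103, "Employee_Name": "Aman", "Department_ID": "D2"},
]

# One update is enough; every HR employee sees the new manager through Department_ID.
departments["D1"]["Manager_Name"] = "Mr. Gupta"
hr_managers_norm = {
    departments[e["Department_ID"]]["Manager_Name"]
    for e in employees
    if e["Department_ID"] == "D1"
}
```

After the partial update the flat table reports two different HR managers, while the normalized layout reports only one.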
2. Difficulty in managing relationships:
It becomes more challenging to maintain complex relationships in an unnormalized structure.

Unnormalized Table (Student_Course Table)

Student_ID  Student_Name  Course_1  Course_2  Course_3
201         Alex          Math      Science   English
202         Ben           History   NULL      NULL
203         Charlie       Math      English   NULL

Problem (Difficulty in Managing Relationships)


1.Rigid Structure: If a student takes more than three courses, we have to add more columns (Course_4,
Course_5, etc.).
2.Wasted Space: If a student takes only one course, the extra columns remain empty (NULL values).
3.Difficult Queries: Finding students enrolled in "Math" is difficult because it appears in multiple columns.

Solution:
•Normalize the database by creating separate Tables for Students & Courses.
•This reduces redundancy and simplifies queries.
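
As a rough sketch of this solution, the wide Student_Course rows can be turned into a students table and an enrollments table with one row per (student, course) pair. The names wide, students and enrollments are assumptions made for the illustration.

```python
# Wide rows from the Student_Course table; None stands for NULL.
wide = [
    (201, "Alex", ["Math", "Science", "English"]),
    (202, "Ben", ["History", None, None]),
    (203, "Charlie", ["Math", "English", None]),
]

# Students table: one row per student.
students = {sid: name for sid, name, _ in wide}

# Enrollments table: one row per (student, course) pair, no NULL padding.
enrollments = [
    (sid, course)
    for sid, _, courses in wide
    for course in courses
    if course is not None
]

# "Who takes Math?" is now a single filter on one column.
math_students = sorted(students[sid] for sid, course in enrollments if course == "Math")
```

A fourth or fifth course becomes just another row in enrollments, so no new columns are needed.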
3. Data Dependency issue:
Other factors that drive the need for normalization are partial dependencies and transitive
dependencies. Both store the same fact in more than one row, which leads to data redundancy
and to insertion, deletion and update anomalies.
FUNCTIONAL DEPENDENCY IN DBMS
❑The core concepts behind database
normalization are based on functional
dependencies.
❑A Functional Dependency is a relationship
between attributes (characteristics) of a table in which the value of
one attribute, or set of attributes, determines the value of another.
❑Functional dependencies help in identifying
unique relationships between attributes. (If we know A, we can determine B)

❑ Example
Consider a table Student:

Roll_No  Name   Dept
101      Raj    CSE
102      Aman   ECE
103      Pooja  CSE

Roll_No → Name
Roll_No uniquely determines the student's name.
Name depends completely on Roll_No.
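
A minimal Python sketch of what "Roll_No → Name" means on the rows above: no two rows may agree on the left side and differ on the right side. The helper name fd_holds is an assumption of this sketch. Checking sample rows can only refute a dependency; whether it holds in general is a design decision about the data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this particular set of rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first time a key appears we remember its value; later rows must match it.
        if seen.setdefault(key, val) != val:
            return False
    return True

student = [
    {"Roll_No": 101, "Name": "Raj", "Dept": "CSE"},
    {"Roll_No": 102, "Name": "Aman", "Dept": "ECE"},
    {"Roll_No": 103, "Name": "Pooja", "Dept": "CSE"},
]

roll_determines_name = fd_holds(student, ["Roll_No"], ["Name"])
dept_determines_name = fd_holds(student, ["Dept"], ["Name"])
```

Here Roll_No → Name holds, while Dept → Name fails because CSE appears with both Raj and Pooja.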
Consider a relation with four attributes A, B, C and D

R (ABCD)

A → BCD
For the first functional dependency A → BCD, attributes B,
C and D are functionally dependent on attribute A.

B → CD
For the second functional dependency B → CD, attributes C and D
are functionally dependent on attribute B.
Types of Functional Dependencies in DBMS
1. Trivial Functional Dependency
2. Non-Trivial Functional Dependency
3. Full Functional Dependency
4. Partial Functional Dependency
5. Transitive Functional Dependency
6. Multivalued Dependency (MVD)
7. Join Dependency

1. Trivial Functional Dependency


A functional dependency X → Y is said to be trivial if the dependent attribute (Y) is a subset of
the determinant (X).
Example:

Roll_No  Name  Age
101      Raj   20
102      Aman  22

• {Roll_No, Name} → Name This is trivial because Name is already a part of the determinant
(Roll_No, Name), so it does not provide any new information about the relationship between attributes.
• {Roll_No, Age} → Age
Impact on Normalization : Since a trivial FD does not provide new information about relationships between
attributes, it does not cause redundancy in the database. Trivial FDs do not violate normal forms, do not
directly impact the normalization process and are ignored during decomposition.
2. Non-Trivial Functional Dependency
A functional dependency X → Y is non-trivial if the dependent attribute (Y) is not a subset of the
determinant (X). In other words, Y contains attributes that do not already exist in X.

Example :

Roll_No  Name  Age
101      Raj   20
102      Aman  22

•Roll_No → Name (Non-trivial, because Name is not part of Roll_No)
•Roll_No → Age (Non-trivial, because Age is not part of Roll_No)
Impact on Normalization : Non-trivial functional dependencies play a key role in normalization
because they help identify redundancy. If a non-trivial dependency causes redundancy, decomposition
is required to remove anomalies.
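
The trivial versus non-trivial distinction reduces to a subset test, sketched here in Python. The function name is_trivial is an assumption of this illustration.

```python
def is_trivial(lhs, rhs):
    """X -> Y is trivial exactly when Y is a subset of X."""
    return set(rhs) <= set(lhs)

trivial_example = is_trivial({"Roll_No", "Name"}, {"Name"})   # Name is inside the determinant
non_trivial_example = is_trivial({"Roll_No"}, {"Name"})       # Name is not part of Roll_No
```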

3. Full Functional Dependency


A functional dependency X → Y is fully functional if Y depends on the entire set of attributes in X,
and removing any attribute from X breaks the dependency.
Example: Marks table:

Roll_No  Course_ID  Marks
101      CS101      85
101      CS102      90

•{Roll_No, Course_ID} → Marks (Fully functional dependency).
Marks is fully functionally dependent on (Roll_No, Course_ID).
•If we remove Course_ID, then Roll_No alone cannot determine Marks.
Impact on Normalization : Full functional dependency is important in 2NF (Second Normal Form).
4. Partial Functional Dependency
A functional dependency X → Y is partial if a part of X (instead of the whole X) can still determine Y.
Example: Marks table:

Roll_No  Course_ID  Marks  Student_Name
101      CS101      85     Raj
102      CS102      90     Aman

•{Roll_No, Course_ID} → Marks (Full Dependency)
•But, Roll_No → Student_Name (Partial Dependency)
Here, Student_Name depends only on Roll_No, not on Course_ID, making it a partial dependency.

Impact on Normalization (Breaking 2NF Rule) - Partial Functional Dependency leads to redundancy
and should be removed in 2NF.
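
Given a composite key and a list of declared dependencies, partial dependencies can be picked out mechanically. The sketch below assumes a single candidate key and a hypothetical helper partial_dependencies; it reports every dependency whose determinant is a proper subset of the key and whose dependent contains a non-key attribute.

```python
def partial_dependencies(key, fds):
    """Return declared FDs that depend on only part of the (single) candidate key."""
    key = frozenset(key)
    found = []
    for lhs, rhs in fds:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        non_key_part = rhs - key
        # lhs < key means "proper subset": part of the key, but not all of it.
        if lhs < key and non_key_part:
            found.append((sorted(lhs), sorted(non_key_part)))
    return found

key = {"Roll_No", "Course_ID"}
fds = [
    ({"Roll_No", "Course_ID"}, {"Marks"}),   # full dependency
    ({"Roll_No"}, {"Student_Name"}),         # partial dependency
]
partials = partial_dependencies(key, fds)
```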

5. Transitive Functional Dependency


A functional dependency X → Z is transitive if there is an intermediate attribute Y such that
X → Y and Y → Z (and Y does not determine X). This means that Z is indirectly dependent on X through Y.
Example: Student table:

Roll_No  Dept  HOD
101      CSE   Dr. Rao
102      ECE   Dr. Singh

•Roll_No → Dept (Each Roll_No belongs to one department)
•Dept → HOD (Each department has a unique HOD name)
•By transitivity: Roll_No → HOD

Impact on Normalization : Causes redundancy and is removed in 3NF (Third Normal Form).
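
The "by transitivity" step can be reproduced with the standard attribute closure computation, which the slides do not name explicitly. The sketch below (function name closure is an assumption) keeps applying dependencies until nothing new can be derived.

```python
def closure(attrs, fds):
    """Return every attribute that can be derived from attrs using the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the whole left side, we also know the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [
    ({"Roll_No"}, {"Dept"}),
    ({"Dept"}, {"HOD"}),
]
roll_closure = closure({"Roll_No"}, fds)
dept_closure = closure({"Dept"}, fds)
```

HOD appears in the closure of Roll_No only through Dept, which is the transitive dependency Roll_No → HOD.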
6. Multivalued Dependency (MVD)
• A Multivalued Dependency (MVD) exists in a relation R with attributes X, Y and Z when, for a single value of X,
there is a set of values of Y that is independent of the values of the third attribute Z.

• It is denoted as X ↠ Y (read as "X multidetermines Y"): one value of X relates to many values of Y,
independently of the values of Z (where Z = all other attributes in the table apart from X and Y).

Example: Table Movie(Movie_ID, Actor, Director)
•A movie can have multiple actors
•A movie can have multiple directors
•But actors and directors are independent of each other

Movie_ID  Actor   Director
M101      SRK     Rohit Shetty
M101      Salman  Rohit Shetty
M101      SRK     Karan Johar
M101      Salman  Karan Johar

So we have two multivalued dependencies:
•Movie_ID →→ Actor (A movie can have multiple actors).
For Movie_ID = M101, Actor = {SRK, Salman}
•Movie_ID →→ Director (A movie can have multiple directors).
This means that Actors and Directors are independent for a given Movie_ID.
Impact on Normalization : A non-trivial multivalued dependency violates 4NF (Fourth Normal Form) when its
determinant is not a superkey, because every combination of the independent values must be stored.
To achieve 4NF, we remove such MVDs by decomposing the table.

Solution (Decomposition into 4NF):To remove redundancy, split the table:


1.Movie_Actors(Movie_ID, Actor)
2.Movie_Directors(Movie_ID, Director)
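
A small Python sketch of both ideas, assuming a three-column table stored as a set of tuples and a hypothetical helper mvd_holds: the MVD holds when, for each Movie_ID, the stored (Actor, Director) pairs are exactly all combinations of that movie's actors and directors. The same data is then split and re-joined to confirm nothing is lost.

```python
from itertools import product

def mvd_holds(rows, x, y, z):
    """Check X ->> Y on 3-column tuples; x, y, z are column positions."""
    groups = {}
    for row in rows:
        groups.setdefault(row[x], set()).add((row[y], row[z]))
    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        # Independence means every Y value is paired with every Z value.
        if pairs != set(product(ys, zs)):
            return False
    return True

movie = {
    ("M101", "SRK", "Rohit Shetty"),
    ("M101", "Salman", "Rohit Shetty"),
    ("M101", "SRK", "Karan Johar"),
    ("M101", "Salman", "Karan Johar"),
}

# 4NF decomposition: one table per independent list.
movie_actors = {(m, a) for m, a, _ in movie}
movie_directors = {(m, d) for m, _, d in movie}

# Joining back on Movie_ID reproduces the original rows.
rejoined = {
    (m, a, d)
    for m, a in movie_actors
    for m2, d in movie_directors
    if m == m2
}
```

The four original rows shrink to two rows in each decomposed table, and the join on Movie_ID returns exactly the original four.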
Example
Consider a relation Student(SID, Course, Hobby)
•A student can enroll in multiple courses.
•A student can have multiple hobbies.
•Courses and hobbies are independent.

If a student can have many hobbies and many courses, and hobbies don't depend on courses,
that's a multivalued dependency.

Data:

SID  Course   Hobby
1    Math     Painting
1    Math     Music
1    English  Painting
1    English  Music

Here,
•For SID = 1, Courses = {Math, English}
•For SID = 1, Hobbies = {Painting, Music}

Because Course and Hobby are independent, the combination of all possible Course-Hobby pairs appears.
This implies:
SID ↠ Course
SID ↠ Hobby
7. Join Dependency
A Join Dependency (JD) occurs when a relation can be decomposed into smaller tables
while preserving the original information without losing data.

Example: Table Project (Emp_ID, Project_ID, Skill)

Emp_ID Project_ID Skill


101 P1 Java
101 P1 Python
102 P2 SQL

A Join Dependency exists if we can split this table into smaller relations:
1.Emp_Projects (Emp_ID, Project_ID) - Stores which employees are assigned to which projects.
2.Emp_Skills (Emp_ID, Skill) - Stores which skills each employee has.

Such that when we join these tables back on Emp_ID, we obtain the original Project table without any
loss of information.

Impact on Normalization : Ensures lossless decomposition and is used in 5NF (Fifth Normal Form).
Types of Normal Forms (Database Normalization Techniques)

Normalization works through a series of stages called normal forms. The normal forms apply to
individual relations. A relation is said to be in a particular normal form if it satisfies that form's constraints.

There are six normal forms commonly used. Each Normal Form (NF) is a step in this process,
with stricter rules as we move higher.

Normal Form  Key Rule                              Eliminates / Removes
1NF          Atomic values, no repeating groups    Repeating data
2NF          No partial dependencies               Partial dependencies
3NF          No transitive dependencies            Transitive dependencies
BCNF         Every determinant is a candidate key  Functional dependencies whose determinant is not a candidate key
4NF          No multivalued dependencies           Multivalued dependencies
5NF          No join dependencies                  Join dependencies
Primary Key and Non-key attributes
1NF (First Normal Form)
Rules:
Atomic values: Each column must have atomic (indivisible) values.
Uniqueness: Each column should store unique data.
No repeating groups: A table should not have sets of values in a single column.

Example (Before 1NF - Repeating Groups)

Student_ID Name Subjects


101 Raj Math, Physics
102 Aman Chemistry

Problem: "Subjects" column has multiple values (not atomic).

After 1NF (Atomic values in separate rows)

Student_ID  Name  Subject
101         Raj   Math
101         Raj   Physics
102         Aman  Chemistry

Now, the table follows 1NF.
Employee table

emp_id  emp_name      emp_mobile  emp_skills
1       John Tick     9999957773  Python, JavaScript
2       Darth Trader  8888853337  HTML, CSS, JavaScript
3       Rony Shark    7777720008  Java, Linux, C++

Apply First Normal form.

OPTION 1
Create Separate tables for Employee and Employee Skills

emp_id  emp_name      emp_mobile
1       John Tick     9999957773
2       Darth Trader  8888853337
3       Rony Shark    7777720008

emp_id  emp_skill
1       Python
1       JavaScript
2       HTML
2       CSS
2       JavaScript
3       Java
3       Linux
3       C++

OPTION 2
Add Multiple rows for Multiple skills

emp_id  emp_name      emp_mobile  emp_skill
1       John Tick     9999957773  Python
1       John Tick     9999957773  JavaScript
2       Darth Trader  8888853337  HTML
2       Darth Trader  8888853337  CSS
2       Darth Trader  8888853337  JavaScript
3       Rony Shark    7777720008  Java
3       Rony Shark    7777720008  Linux
3       Rony Shark    7777720008  C++
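
Both options can be produced from the original Employee rows by splitting the comma-separated emp_skills value. The Python sketch below is illustrative only; the variable names employees, employee_skills and flat_rows are assumptions, and mobile numbers are kept as strings.

```python
# Original rows: emp_skills holds several values in one cell (not atomic).
employee = [
    (1, "John Tick", "9999957773", "Python, JavaScript"),
    (2, "Darth Trader", "8888853337", "HTML, CSS, JavaScript"),
    (3, "Rony Shark", "7777720008", "Java, Linux, C++"),
]

# Option 1: separate Employee and Employee Skills tables.
employees = [(eid, name, mobile) for eid, name, mobile, _ in employee]
employee_skills = [
    (eid, skill.strip())
    for eid, _, _, skills in employee
    for skill in skills.split(",")
]

# Option 2: one row per skill, repeating the employee details.
flat_rows = [
    (eid, name, mobile, skill.strip())
    for eid, name, mobile, skills in employee
    for skill in skills.split(",")
]
```

Option 1 stores each name and mobile number once, whereas Option 2 repeats them on every skill row.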


2NF (Second Normal Form)
Rules:
Must be in 1NF
No Partial Dependencies – A non-key attribute must depend on the whole primary key (a primary key that
is made up of two or more columns), not just a part of it. That is, it should be fully functionally dependent on the key.

Example (Before 2NF - Partial Dependency Exists)

Roll_No  Course_ID  Marks  Student_Name
101      CS101      85     Raj
102      CS102      90     Aman

Key attributes – Roll_No, Course_ID
Non-key attributes – Marks, Student_Name

Problem:
•{Roll_No, Course_ID} → Marks (Correct Functional Dependency)
•BUT Roll_No → Student_Name (Partial Dependency - Student_Name depends only on Roll_No,
not Course_ID)

After 2NF (Splitting Tables to Remove Partial Dependency)

Student Table:
Roll_No  Student_Name
101      Raj
102      Aman

Marks Table:
Roll_No  Course_ID  Marks
101      CS101      85
102      CS102      90

Now, the tables follow 2NF.
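
The 2NF split can be sketched as two projections of the original rows, followed by a natural join to confirm that no information is lost. The helpers project and natural_join are assumptions of this sketch, written out by hand rather than taken from a library.

```python
rows = [
    {"Roll_No": 101, "Course_ID": "CS101", "Marks": 85, "Student_Name": "Raj"},
    {"Roll_No": 102, "Course_ID": "CS102", "Marks": 90, "Student_Name": "Aman"},
]

def project(rows, attrs):
    """Keep only attrs and drop duplicate rows (a relation is a set of tuples)."""
    seen, out = set(), []
    for r in rows:
        t = tuple(r[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out

def natural_join(left, right):
    """Combine rows that agree on every attribute name the two tables share."""
    out = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                out.append({**l, **r})
    return out

student = project(rows, ["Roll_No", "Student_Name"])      # Roll_No -> Student_Name
marks = project(rows, ["Roll_No", "Course_ID", "Marks"])  # {Roll_No, Course_ID} -> Marks
rejoined = natural_join(marks, student)
```

Joining the Marks and Student tables on Roll_No gives back the original rows, so the decomposition is lossless.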
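Example :
Consider the following table. Its primary key is {StudentId, ProjectId}.

StudentId  ProjectId  StudentName  ProjectName
1          P2         John         IOT
2          P1         Claire       Cloud
3          P7         Clara        IOT
4          P3         Abhk         Cloud

The functional dependencies given are -
StudentId → StudentName
ProjectId → ProjectName

As both represent partial dependencies, we decompose the table as follows -

StudentId  ProjectId  StudentName
1          P2         John
2          P1         Claire
3          P7         Clara
4          P3         Abhk

ProjectId  ProjectName
P2         IOT
P1         Cloud
P7         IOT
P3         Cloud

Here ProjectId is mentioned in both tables to set up a relationship between them.
This step removes the partial dependency ProjectId → ProjectName. The first table still contains the
partial dependency StudentId → StudentName, so to reach 2NF StudentName must likewise be moved into
its own table keyed by StudentId, leaving (StudentId, ProjectId) as the linking table.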


3NF (Third Normal Form)
Rules:
Must be in 2NF
No Transitive Dependencies – A non-key attribute should not depend
on another non-key attribute.

Transitive Dependencies
A → B and B → C, so A → C. Here A is the primary key.
To convert it into 3NF, we create two relations -
R1(A, B), where A is the primary key and
R2(B, C), where B is the primary key.

Example (Before 3NF - Transitive Dependency Exists)

Roll_No  Student_Name  Dept  HOD
101      Raj           CSE   Dr. Rao
102      Aman          ECE   Dr. Singh

Problem:
•Roll_No → Dept
•Dept → HOD
•Transitive Dependency: Roll_No → HOD

After 3NF (Removing Transitive Dependency by creating a separate table)

Student Table:
Roll_No  Student_Name  Dept
101      Raj           CSE
102      Aman          ECE

Department Table:
Dept  HOD
CSE   Dr. Rao
ECE   Dr. Singh

Now, the tables follow 3NF.
Prime attribute - Student ID (STU_ID). The rest are non-prime.

But STU_ID → ZIPCODE and ZIPCODE → STU_STATE are the two functional dependencies that exist.

Since ZIPCODE is a non-prime attribute, STU_STATE indirectly depends on STU_ID. Therefore,
STU_ID → STU_STATE also exists, which is a Transitive Functional Dependency.
BCNF (Boyce-Codd Normal Form)
Boyce-Codd Normal Form (BCNF) is a stricter version of 3NF.
Rules:
Must be in 3NF
Each determinant (left side) must be a candidate key.

Example (Before BCNF - FD Issue Exists)

Professor  Course  Department
Dr. Rao    DBMS    CSE
Dr. Singh  OS      ECE
Dr. Rao    DS      CSE

Problem:
•Professor → Department (FD exists, but Professor is not a candidate key)

After BCNF (Splitting into separate tables)

Professor Table:
Professor  Department
Dr. Rao    CSE
Dr. Singh  ECE

Course Table:
Professor  Course
Dr. Rao    DBMS
Dr. Rao    DS
Dr. Singh  OS

Now, the tables follow BCNF.
Example: Student_Course

student_id  course_id  instructor
101         CSE101     Dr. Sharma
102         CSE101     Dr. Sharma
103         CSE102     Dr. Mehta
104         CSE102     Dr. Mehta

Functional dependencies:
student_id, course_id → instructor
instructor → course_id

Why It Violates BCNF:
•FD instructor → course_id violates BCNF because instructor is not a super key (candidate key).

Step-by-Step BCNF Decomposition:
We break the table into two tables:

1. Instructor_Course (instructor → course_id)

instructor  course_id
Dr. Sharma  CSE101
Dr. Mehta   CSE102

2. Student_Instructor (student_id, instructor)

student_id  instructor
101         Dr. Sharma
102         Dr. Sharma
103         Dr. Mehta
104         Dr. Mehta
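
The BCNF test for this example can be sketched with attribute closure: a dependency violates BCNF when it is non-trivial and the closure of its left side does not cover every attribute of the table. The helper names closure and bcnf_violations are assumptions of this sketch.

```python
def closure(attrs, fds):
    """Return every attribute derivable from attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """List non-trivial FDs whose determinant is not a superkey."""
    bad = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial FDs never violate BCNF
        if closure(lhs, fds) != set(all_attrs):
            bad.append((sorted(lhs), sorted(rhs)))
    return bad

attrs = {"student_id", "course_id", "instructor"}
fds = [
    ({"student_id", "course_id"}, {"instructor"}),
    ({"instructor"}, {"course_id"}),
]
violations = bcnf_violations(attrs, fds)
```

Only instructor → course_id is reported: the closure of {instructor} is {instructor, course_id}, which leaves out student_id.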
4NF (Fourth Normal Form)
Rules:
Must be in BCNF
No Multivalued Dependencies (MVDs)

Multivalued Dependencies (MVDs)
If X →→ Y and X →→ Z, and Y and Z are independent (like Actors and Directors in the example below,
which don't depend on each other), the table violates 4NF.
In one sentence:
"Don't force unrelated lists into the same table; split them!"

Example (Before 4NF - Multivalued Dependency Exists)

Movie_ID(X)  Actor(Y)  Director(Z)
M101         SRK       Rohit Shetty
M101         Salman    Rohit Shetty
M101         SRK       Karan Johar
M101         Salman    Karan Johar

Problem:
•Movie_ID →→ Actor
•Movie_ID →→ Director
•Actors and Directors are independent attributes.

After 4NF (Decomposing into two tables)

Movie_Actors Table:
Movie_ID  Actor
M101      SRK
M101      Salman

Movie_Directors Table:
Movie_ID  Director
M101      Rohit Shetty
M101      Karan Johar

Now, the tables follow 4NF.
5NF (Fifth Normal Form) also called the PJNF - Project-Join Normal Form
Rules:
Must be in 4NF
No Join Dependencies - the table cannot be broken down further into smaller tables without losing
information, except by decompositions that follow from its candidate keys. If such a decomposition exists,
split the table to remove redundancy and anomalies; re-joining the decomposed tables must reproduce
exactly the original rows, with no extra or missing data.
Example:
A Project Table (Emp_ID, Project_ID, Skill) might be decomposed into
smaller tables based on its join dependency.
This removes redundancy.

Example (Before 5NF - Join Dependency Exists)

Table: Employee_Project_Skill (Before 5NF)
Assume the rule: an employee uses a skill on a project whenever the employee has that skill,
works on that project, and the project requires that skill.

Emp_ID  Project_ID  Skill
101     P1          Java
101     P1          Python
101     P2          Java
102     P1          Java
102     P1          Python
102     P2          Java

Problem:
•This table suggests that employees, projects, and skills are interdependent.
•However, it creates redundancy.
•Instead of keeping all three attributes together, we should break it into multiple tables.
After 5NF (Removing Join Dependency)
To eliminate redundancy, we decompose the table into three smaller tables:

Table 1: Employee_Project (Which employees are assigned to which projects)

Emp_ID Project_ID
101 P1
101 P2
102 P1
102 P2

Table 2: Employee_Skill (Which skills each employee has)

Emp_ID  Skill
101     Java
101     Python
102     Java
102     Python

Table 3: Project_Skill (Which skills are required for each project)

Project_ID  Skill
P1          Java
P1          Python
P2          Java
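
Whether a three-table decomposition is safe can be checked by projecting, re-joining and comparing with the original rows. The Python sketch below is an illustration under a stated assumption: its input is the six rows (101, P1, Java), (101, P1, Python), (101, P2, Java), (102, P1, Java), (102, P1, Python) and (102, P2, Java). The helper names decompose and join3 are hypothetical. The sketch also shows what happens when (102, P1, Java) is left out of the table.

```python
def decompose(table):
    """Project a set of (emp, project, skill) tuples onto its three column pairs."""
    emp_project = {(e, p) for e, p, _ in table}
    emp_skill = {(e, s) for e, _, s in table}
    project_skill = {(p, s) for _, p, s in table}
    return emp_project, emp_skill, project_skill

def join3(emp_project, emp_skill, project_skill):
    """Natural join of the three projections."""
    return {
        (e, p, s)
        for e, p in emp_project
        for e2, s in emp_skill
        if e == e2 and (p, s) in project_skill
    }

# Assumed input rows for this sketch (see the lead-in).
table = {
    (101, "P1", "Java"), (101, "P1", "Python"), (101, "P2", "Java"),
    (102, "P1", "Java"), (102, "P1", "Python"), (102, "P2", "Java"),
}
lossless = join3(*decompose(table)) == table

# Without (102, P1, Java) the join dependency no longer holds for the data.
without_row = table - {(102, "P1", "Java")}
spurious = join3(*decompose(without_row)) - without_row
```

With all six rows the join returns exactly the original table. With the row removed, the three projections are unchanged, so the join still produces (102, P1, Java) as an extra tuple; a table like that cannot be split three ways without creating extra data.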
END
