DBMS Unit-3

The document discusses database design and normalization. It covers: - Top-down and bottom-up approaches to database design. The bottom-up approach starts with unstructured attributes and uses normalization to improve structure. - Functional dependencies, which define relationships between attributes. Attributes functionally depend on the primary key. The document provides examples of valid and invalid functional dependencies. - Normalization is used to eliminate data redundancy, anomalies, and inconsistencies. It divides relations into smaller, normalized relations. The document describes 1NF through BCNF and provides examples.

Uploaded by

harshitpanwaar

Unit-3

Data Base Design & Normalization


Database design
• There are two ways of approaching database design:
  - Top-down approach
  - Bottom-up approach
• Top-down approach
  - Structure the data at the conceptual level (ER model).
  - Then map it to the logical level (relational model).
  - The final schema is a collection of tables.
• Bottom-up approach
  - Start from an unstructured collection of attributes.
  - Use normalization to improve the structure.
  - The final schema is a collection of tables with minimal redundancy.
• ER design alone does not guarantee minimal redundancy, which is why normalization is required.
Functional Dependency
• A functional dependency (FD) is a relationship between two sets of attributes in which the value of one determines the value of the other. It typically holds between the primary key and the non-key attributes of a table.
• X → Y
• The left side of an FD is known as the determinant; the right side is known as the dependent.
• For example:
• Assume we have an employee table with attributes Emp_Id, Emp_Name, Emp_Address.
• Here the Emp_Id attribute uniquely identifies the Emp_Name attribute of the employee table, because if we know the Emp_Id, we can tell the employee name associated with it.
• The functional dependency can be written as:
• Emp_Id → Emp_Name
From the above table we can conclude some valid functional dependencies:

• roll_no → {name, dept_name, dept_building}: roll_no determines the values of the fields name, dept_name and dept_building, hence this is a valid functional dependency.
• roll_no → dept_name: since roll_no determines the whole set {name, dept_name, dept_building}, it also determines its subset dept_name.
• dept_name → dept_building: dept_name identifies dept_building exactly, since each department is located in a single building (two different departments may still share a building).
• More valid functional dependencies: roll_no → name, {roll_no, name} → {dept_name, dept_building}, etc.
Here are some invalid functional dependencies:

• name → dept_name: students with the same name can have different dept_name values, hence this is not a valid functional dependency.
• dept_building → dept_name: there can be multiple departments in the same building. For example, in the above table departments ME and EC are in the same building B2, hence dept_building → dept_name is an invalid functional dependency.
• More invalid functional dependencies: name → roll_no, {name, dept_name} → roll_no, dept_building → roll_no, etc.
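The valid and invalid dependencies listed above can be checked against a table instance. The sketch below is a minimal illustration: the function name `fd_holds` and the sample rows are assumptions (the original table is not reproduced here), chosen only to agree with the facts stated above, such as ME and EC sharing building B2. Note that an instance can only refute a dependency; it cannot prove that the dependency holds for all possible data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs is not violated by rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a determinant must match all later ones.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows (assumed for illustration, not the original table).
students = [
    {"roll_no": 1, "name": "Amit", "dept_name": "CS", "dept_building": "B1"},
    {"roll_no": 2, "name": "Riya", "dept_name": "ME", "dept_building": "B2"},
    {"roll_no": 3, "name": "Amit", "dept_name": "EC", "dept_building": "B2"},
    {"roll_no": 4, "name": "Riya", "dept_name": "ME", "dept_building": "B2"},
]

print(fd_holds(students, ["roll_no"], ["name", "dept_name", "dept_building"]))  # True
print(fd_holds(students, ["dept_name"], ["dept_building"]))                     # True
print(fd_holds(students, ["name"], ["dept_name"]))                              # False
print(fd_holds(students, ["dept_building"], ["dept_name"]))                     # False
```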
Types of Functional dependency
1. Trivial Functional Dependency
• In Trivial Functional Dependency, a dependent is
always a subset of the determinant. i.e. If X →
Y and Y is the subset of X, then it is called trivial
functional dependency
• Here, {roll_no, name} → name is a trivial functional
dependency, since the dependent name is a subset of
determinant set {roll_no, name}. Similarly, roll_no →
roll_no is also an example of trivial functional
dependency.
2. Non-trivial Functional Dependency
• In Non-trivial functional dependency, the dependent is
strictly not a subset of the determinant. i.e. If X →
Y and Y is not a subset of X, then it is called Non-trivial
functional dependency.
• Here, roll_no → name is a non-trivial functional
dependency, since the dependent name is not a subset
of determinant roll_no. Similarly, {roll_no, name} → age is
also a non-trivial functional dependency, since age is not a
subset of {roll_no, name}
3. Multivalued Functional Dependency
• In a multivalued functional dependency, the attributes of the dependent set do not depend on each other, i.e. if a → {b, c} and there exists no functional dependency between b and c, then it is called a multivalued functional dependency.
• Here, roll_no → {name, age} is a multivalued functional dependency, since the dependents name and age do not depend on each other (i.e. neither name → age nor age → name holds).
• This informal usage differs from the multivalued dependency X →→ Y used to define 4NF, where one value of X is associated with a set of Y values independently of the remaining attributes.
4. Transitive Functional Dependency
• In transitive functional dependency, dependent is
indirectly dependent on determinant. i.e. If a → b & b →
c, then according to axiom of transitivity, a → c. This is
a transitive functional dependency.
• Here, enrol_no → dept and dept →
building_no. Hence, according to the axiom of
transitivity, enrol_no → building_no is a valid
functional dependency. This is an indirect functional
dependency, hence called Transitive functional
dependency.
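The trivial/non-trivial distinction and the transitivity axiom described above can be expressed in a few lines. This is a minimal sketch: the helper names `is_trivial` and `derive_transitive` are assumptions made for illustration, and dependencies are represented as `(determinant, dependent)` pairs of tuples.

```python
def is_trivial(lhs, rhs):
    """An FD lhs -> rhs is trivial when rhs is a subset of lhs."""
    return set(rhs) <= set(lhs)


def derive_transitive(fds):
    """Apply transitivity once: from a -> b and b -> c derive a -> c."""
    derived = []
    for lhs1, rhs1 in fds:
        for lhs2, rhs2 in fds:
            candidate = (lhs1, rhs2)
            if (set(lhs2) <= set(rhs1)
                    and not is_trivial(lhs1, rhs2)
                    and candidate not in fds
                    and candidate not in derived):
                derived.append(candidate)
    return derived


print(is_trivial(["roll_no", "name"], ["name"]))  # True
print(is_trivial(["roll_no"], ["name"]))          # False

# The two dependencies from the transitive example above.
fds = [(("enrol_no",), ("dept",)), (("dept",), ("building_no",))]
print(derive_transitive(fds))  # [(('enrol_no',), ('building_no',))]
```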
Practice question
Attribute Closure
• The attribute closure of an attribute set A, written A+, is the set of all attributes that can be functionally determined from A.
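The closure can be computed by repeatedly applying every dependency whose determinant is already contained in the result, until nothing new is added. The sketch below assumes dependencies are given as `(determinant, dependent)` pairs of sets; the function name `closure` and the use of the roll_no dependencies from earlier are illustrative choices.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs when the whole determinant is already known.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds = [
    ({"roll_no"}, {"name"}),
    ({"roll_no"}, {"dept_name"}),
    ({"dept_name"}, {"dept_building"}),
]

print(sorted(closure({"roll_no"}, fds)))    # ['dept_building', 'dept_name', 'name', 'roll_no']
print(sorted(closure({"dept_name"}, fds)))  # ['dept_building', 'dept_name']
print(sorted(closure({"name"}, fds)))       # ['name']
```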
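Armstrong’s axioms/properties of functional dependencies:
• Reflexivity: if Y ⊆ X, then X → Y.
• Augmentation: if X → Y, then XZ → YZ for any attribute set Z.
• Transitivity: if X → Y and Y → Z, then X → Z.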
Practice questions
Irreducible set of FD (Canonical form)
Super key, candidate key and primary key
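• Super key: a set of attributes that uniquely identifies every tuple of the relation, i.e. its closure contains all attributes of the relation.
• Candidate key: a minimal super key, i.e. no proper subset of it is also a super key. If a candidate key is a proper subset of a super key, then that super key is not a candidate key.
• Primary key: the candidate key chosen to identify tuples in the relation.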
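One way to find candidate keys is to test attribute subsets in order of increasing size and keep those whose closure covers the whole relation, skipping any subset that already contains a smaller key. The sketch below is a brute-force illustration suitable only for small schemas; the relation R(A, B, C, D) and its dependencies are hypothetical, and the function names are assumptions.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the full schema."""
    attributes = sorted(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            candidate = set(combo)
            # A set containing a smaller key is a super key but not minimal.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


# Hypothetical relation R(A, B, C, D) with AB -> C, C -> D, D -> A.
fds = [({"A", "B"}, {"C"}), ({"C"}, {"D"}), ({"D"}, {"A"})]
keys = candidate_keys({"A", "B", "C", "D"}, fds)
print([sorted(k) for k in keys])  # [['A', 'B'], ['B', 'C'], ['B', 'D']]
```

Here {A, B, C} is a super key but not a candidate key, because it contains the candidate key {A, B} as a proper subset.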
Finding number of candidate keys
Normalization
• Idea: in the table student_info we tried to store all the data.
• Result: the entire branch data must be repeated for every student of the branch.
• Redundancy: the same data is stored multiple times in a database.
• Disadvantages:
  - Insertion, deletion and modification anomalies
  - Inconsistency
  - Increase in database size and access time
• Insertion anomaly: certain data cannot be inserted into the database without the presence of other data.
• Deletion anomaly: deleting some unwanted data also deletes other important data.
• Update anomaly: updating a single piece of data requires changing every copy of it.
Normalization
• A large database defined as a single relation may result in data duplication. This repetition of data may result in:
• Very large relations.
• Difficulty maintaining and updating data, since an update involves searching many records in the relation.
• Wastage and poor utilization of disk space and resources.
• An increased likelihood of errors and inconsistencies.
• To handle these problems, we analyze and decompose relations with redundant data into smaller, simpler, well-structured relations that satisfy desirable properties. Normalization is the process of decomposing relations into relations with fewer attributes.
Normalization

• Normalization is the process of organizing the data in the database.
• Normalization is used to minimize the redundancy in a relation or set of relations. It is also used to eliminate undesirable characteristics such as insertion, update, and deletion anomalies.
• Normalization divides a larger table into smaller tables and links them using relationships.
• Normal forms are used to reduce redundancy in database tables.
• The main reason for normalizing relations is to remove these anomalies. Failure to eliminate them leaves data redundancy in place and can cause data integrity and other problems as the database grows. Normalization consists of a series of guidelines that help you create a good database structure.
Disadvantages of Normalization
• You cannot start building the database before knowing what the user needs.
• The performance degrades when normalizing the relations to higher normal forms,
i.e., 4NF, 5NF.
• It is very time-consuming and difficult to normalize relations of a higher degree.
• Careless decomposition may lead to a bad database design, leading to serious
problems.
First Normal Form (1NF)
• A relation is in 1NF if every attribute contains only atomic values.
• An attribute of a table cannot hold multiple values; it must hold a single value.
• First normal form disallows multi-valued attributes, composite attributes, and their combinations.
• Example: the relation EMPLOYEE is not in 1NF because of the multi-valued attribute EMP_PHONE.
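A common way to bring such a relation into 1NF is to emit one row per value of the multi-valued attribute. The sketch below illustrates this; the helper name `to_1nf` and the sample employee rows (IDs, names and phone numbers) are hypothetical, since the original EMPLOYEE table is not reproduced here.

```python
def to_1nf(rows, multi_attr):
    """Expand a list-valued attribute into one row per atomic value."""
    flat = []
    for row in rows:
        for value in row[multi_attr]:
            new_row = dict(row)          # copy the other attributes unchanged
            new_row[multi_attr] = value  # store a single atomic value
            flat.append(new_row)
    return flat


# Hypothetical EMPLOYEE rows with a multi-valued EMP_PHONE attribute.
employee = [
    {"EMP_ID": 1, "EMP_NAME": "Asha", "EMP_PHONE": ["555-0101", "555-0102"]},
    {"EMP_ID": 2, "EMP_NAME": "Ravi", "EMP_PHONE": ["555-0199"]},
]

for row in to_1nf(employee, "EMP_PHONE"):
    print(row)
```

After the expansion, EMP_ID alone no longer identifies a row; in this sketch the pair (EMP_ID, EMP_PHONE) does.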
Second Normal Form (2NF)
• To be in 2NF, a relation must be in 1NF.
• In second normal form, all non-key attributes are fully functionally dependent on the primary key, i.e. no non-key attribute depends on only a part of a composite key.
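A 2NF violation (a partial dependency) can be detected by computing the closure of every proper subset of a candidate key and checking whether it reaches a non-prime attribute. The following is a minimal sketch; the enrolment relation, its dependencies and the function names are hypothetical, and the candidate keys are assumed to be supplied by the caller.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(candidate_keys, fds):
    """List (key part, non-prime attributes it determines) pairs."""
    prime = set().union(*candidate_keys)
    found = []
    for key in candidate_keys:
        for size in range(1, len(key)):  # proper, non-empty subsets only
            for part in combinations(sorted(key), size):
                dependents = closure(part, fds) - prime
                if dependents:
                    found.append((set(part), dependents))
    return found


# Hypothetical relation ENROLMENT(student_id, course_id, student_name, grade).
keys = [{"student_id", "course_id"}]
fds = [
    ({"student_id", "course_id"}, {"grade"}),
    ({"student_id"}, {"student_name"}),
]
print(partial_dependencies(keys, fds))  # [({'student_id'}, {'student_name'})]
```

An empty result means no non-prime attribute depends on part of a key, so a relation that is already in 1NF would also be in 2NF.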
Practice Ques for 2NF
Third Normal Form (3NF)
• A relation is in 3NF if it is in 2NF and does not contain any transitive dependency of a non-prime attribute on a key.
• 3NF is used to reduce data duplication. It is also used to achieve data integrity.
• If the relation is in 2NF and there is no transitive dependency for non-prime attributes, then the relation is in third normal form.
• A relation is in third normal form if it satisfies at least one of the following conditions for every non-trivial functional dependency X → Y:
• X is a super key.
• Y is a prime attribute, i.e. each element of Y is part of some candidate key.
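The two conditions above can be checked mechanically for each listed dependency. The sketch below is an illustration under stated assumptions: the candidate keys are supplied by the caller, only the dependencies in the given list are examined, and the function name `violates_3nf` is hypothetical. It reuses the roll_no dependencies from the earlier example.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violates_3nf(attributes, fds, candidate_keys):
    """Return the listed FDs X -> Y that satisfy neither 3NF condition."""
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        new_attrs = set(rhs) - set(lhs)  # empty for a trivial dependency
        if not new_attrs:
            continue
        x_is_super_key = closure(lhs, fds) >= set(attributes)
        y_is_prime = new_attrs <= prime
        if not x_is_super_key and not y_is_prime:
            violations.append((set(lhs), set(rhs)))
    return violations


attributes = {"roll_no", "name", "dept_name", "dept_building"}
fds = [
    ({"roll_no"}, {"name"}),
    ({"roll_no"}, {"dept_name"}),
    ({"dept_name"}, {"dept_building"}),
]
print(violates_3nf(attributes, fds, [{"roll_no"}]))
# [({'dept_name'}, {'dept_building'})]
```

The reported dependency is the transitive one: dept_building depends on roll_no only through dept_name.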
3NF
• Transitive dependency: a functional dependency A → B is called transitive if both A and B are non-prime attributes, so that B depends on a key only through A.
• A relation is in 3NF if:
• It is in 2NF.
• There is no transitive dependency.
• If B is a prime attribute in every dependency A → B, then the relation has no partial dependency and no transitive dependency, so it is always in 3NF.
Practice Questions
Boyce Codd normal form (BCNF)
• BCNF is an advanced version of 3NF. It is stricter than 3NF.
• A table is in BCNF if, for every non-trivial functional dependency X → Y, X is a super key of the table.
• For BCNF, the table should be in 3NF, and the LHS of every FD must be a super key.
• Example: let's assume there is a company where employees work in more than one department.
• Note: a dependency-preserving decomposition into 3NF always exists, but a decomposition into BCNF may not preserve all dependencies.
• A lossless decomposition is always possible for both 3NF and BCNF.
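The BCNF condition drops the prime-attribute exception of 3NF, so a relation can satisfy 3NF and still fail BCNF. The sketch below illustrates this with a hypothetical relation (student, course, teacher); the relation, its dependencies and the function name `violates_bcnf` are assumptions and are not the company example mentioned above.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violates_bcnf(attributes, fds):
    """Return the listed non-trivial FDs whose determinant is not a super key."""
    violations = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependencies never violate BCNF
        if not closure(lhs, fds) >= set(attributes):
            violations.append((set(lhs), set(rhs)))
    return violations


# Hypothetical relation: each teacher teaches exactly one course.
attributes = {"student", "course", "teacher"}
fds = [
    ({"student", "course"}, {"teacher"}),
    ({"teacher"}, {"course"}),
]
print(violates_bcnf(attributes, fds))  # [({'teacher'}, {'course'})]
```

In this example course is a prime attribute (it belongs to the candidate key {student, course}), so teacher → course is allowed by 3NF but not by BCNF, because teacher is not a super key.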
Fourth normal form (4NF)

• A relation is in 4NF if it is in Boyce-Codd normal form and has no multi-valued dependency.
• For a dependency A →→ B, if a single value of A is associated with multiple values of B, independently of the other attributes, then the dependency is a multi-valued dependency.
• The given STUDENT table is in 3NF, but COURSE and HOBBY are two independent entities. Hence, there is no relationship between COURSE and HOBBY.
• In the STUDENT relation, the student with STU_ID 21 takes two courses, Computer and Math, and has two hobbies, Dancing and Singing. So there is a multi-valued dependency on STU_ID, which leads to unnecessary repetition of data.
• So to bring the above table into 4NF, we can decompose it into two tables:
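The effect of this decomposition can be sketched by projecting STUDENT onto (STU_ID, COURSE) and (STU_ID, HOBBY) and joining the projections back. The rows below are an assumption built from the values mentioned above for student 21, and the names `student_course` and `student_hobby` are illustrative.

```python
def project(rows, attrs):
    """Project rows onto attrs as a set of tuples (duplicates removed)."""
    return {tuple(row[a] for a in attrs) for row in rows}


# Assumed rows: every course of student 21 paired with every hobby.
student = [
    {"STU_ID": 21, "COURSE": "Computer", "HOBBY": "Dancing"},
    {"STU_ID": 21, "COURSE": "Computer", "HOBBY": "Singing"},
    {"STU_ID": 21, "COURSE": "Math", "HOBBY": "Dancing"},
    {"STU_ID": 21, "COURSE": "Math", "HOBBY": "Singing"},
]

student_course = project(student, ["STU_ID", "COURSE"])
student_hobby = project(student, ["STU_ID", "HOBBY"])

# Natural join of the two projections on STU_ID.
rejoined = {
    (stu_id, course, hobby)
    for (stu_id, course) in student_course
    for (other_id, hobby) in student_hobby
    if stu_id == other_id
}

print(len(student_course), len(student_hobby))  # 2 2
print(rejoined == project(student, ["STU_ID", "COURSE", "HOBBY"]))  # True
```

The four original rows are replaced by two rows in each smaller table, and the join reproduces the original relation.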
Fifth normal form (5NF)

• A relation is in 5NF if it is in 4NF, does not contain any join dependency, and joining is lossless.
• 5NF is satisfied when all the tables are broken into as many tables as possible in order to avoid redundancy.
• 5NF is also known as project-join normal form (PJ/NF).
• In the above table, John takes both the Computer and Math classes for Semester 1, but he doesn't take the Math class for Semester 2. In this case, the combination of all these fields is required to identify valid data.
• Suppose we add a new semester, Semester 3, but do not yet know the subject or who will be taking it, so we would leave Lecturer and Subject as NULL. But all three columns together act as the primary key, so we can't leave the other two columns blank.
• So to bring the above table into 5NF, we can decompose it into three relations P1, P2 and P3:
Lossy and lossless decomposition
• Decomposition is a process of dividing a relation into multiple
relations to remove redundancy while maintaining the original data.
• A lossless decomposition of a relation ensures that:
• No information is lost during decomposition.
• If a relation R is divided into two relations R1 and R2 using lossless
decomposition then the natural join of R1 and R2 would return the original
relation R.
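The natural-join condition above can be tested on a concrete instance: project R onto the attributes of R1 and R2, join the projections, and compare the result with R. The sketch below uses a hypothetical relation R(A, B, C) with two rows; the helper names are assumptions. A test on one instance can expose a lossy decomposition, but it does not prove losslessness for all possible data.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(r1, r2):
    """Join tuples that agree on all common attributes."""
    joined = []
    for t1 in r1:
        for t2 in r2:
            common = t1.keys() & t2.keys()
            # With no common attribute this degenerates to a Cartesian product.
            if all(t1[a] == t2[a] for a in common):
                merged = {**t1, **t2}
                if merged not in joined:
                    joined.append(merged)
    return joined


def is_lossless_on(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly rows."""
    joined = natural_join(project(rows, attrs1), project(rows, attrs2))
    return all(t in rows for t in joined) and all(t in joined for t in rows)


# Hypothetical instance of R(A, B, C); B has the same value in both rows.
r = [{"A": 1, "B": 2, "C": 1}, {"A": 2, "B": 2, "C": 2}]

print(is_lossless_on(r, ["A", "B"], ["B", "C"]))  # False: join yields 4 rows
print(is_lossless_on(r, ["A", "B"], ["A", "C"]))  # True: A is unique in r
```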
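Rules of Lossless decomposition
• For a decomposition of R into R1 and R2 to be lossless:
• The union of the attributes of R1 and R2 must equal the attributes of R.
• R1 and R2 must have at least one common attribute.
• The common attribute(s) must be a key (super key) of R1 or of R2.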
Example
• This is a lossy decomposition.
Example
• In this decomposition, all attributes are present, but there is no common attribute between R1 and R2.
• Without a common attribute, R1 and R2 can only be combined with a Cartesian product, which does not give back the original relation.
• This is a lossy decomposition.
• Two rows in the resultant table are extra; they are not present in the original table.
• Here we have decomposed with a common attribute B.
• A natural join is performed to regenerate the original relation.
• This is also a lossy decomposition because the common attribute is not a distinct or key attribute.
Practice questions
Dependency Preserving Decomposition

• If a table R with FD set F is decomposed into two tables R1 and R2 with FD sets F1 and F2, then the decomposition is dependency preserving when:
• F1 ⊆ F+
• F2 ⊆ F+
• (F1 ∪ F2)+ = F+
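Dependency preservation can be tested without computing F+ explicitly. For each dependency X → Y in F, start from X and repeatedly add whatever each sub-relation can derive from the attributes it already shares with the result; the dependency is preserved if Y is reached. The sketch below follows this standard procedure; the relation R(A, B, C), its dependencies and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def fd_is_preserved(lhs, rhs, decomposition, fds):
    """Check whether lhs -> rhs follows from the FDs projected onto the parts."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in decomposition:
            # Attributes derivable inside this sub-relation from what we have.
            gained = closure(result & set(schema), fds) & set(schema)
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result


def is_dependency_preserving(decomposition, fds):
    return all(fd_is_preserved(l, r, decomposition, fds) for l, r in fds)


# Hypothetical relation R(A, B, C) with A -> B and B -> C.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]

print(is_dependency_preserving([{"A", "B"}, {"B", "C"}], fds))  # True
print(is_dependency_preserving([{"A", "B"}, {"A", "C"}], fds))  # False: B -> C is lost
```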
Example
