
Chapter 5 Database Management

The document discusses relational database design, focusing on pitfalls, normalization concepts, and functional dependencies. It covers various normal forms, including First Normal Form (1NF), Second Normal Form (2NF), and others, emphasizing the importance of reducing redundancy and ensuring data consistency. Additionally, it outlines the advantages and disadvantages of normalization and explains different types of functional dependencies in database management.

Uploaded by

pythonwork98
Copyright
© All Rights Reserved

RELATIONAL DATABASE DESIGN

Presented by
Prof. Pooja Mhatre

PRAVIN ROHIDAS PATIL COLLEGE OF ENGINEERING AND TECHNOLOGY
Topics to be covered
✔ Pitfalls in Relational Database Designs
✔ Concept of Normalization
✔ Functional Dependencies
✔ First Normal Form (1NF)
✔ Second Normal Form (2NF)
✔ Third Normal Form (3NF)
✔ Boyce–Codd Normal Form (BCNF)
Pitfalls in Relational Database Designs
1. Inadequate Normalization: Normalization is the process of organizing data into
tables to minimize redundancy and eliminate data anomalies. Failure to normalize
data leaves redundant copies of the same fact, which causes insert, update, and
delete anomalies and, in turn, data inconsistencies.
2. Overuse of NULL values: Using NULL values can make it difficult to query data
and can lead to confusion when interpreting data. Overuse of NULL values can
also result in poor performance.
3. Poor Indexing: Indexing is essential for efficient querying of data. Poorly
designed indexes can result in slow query performance and database bloat.
4. Insufficient Primary and Foreign Keys: Primary and foreign keys establish
relationships between tables and ensure data consistency. Failure to implement
these keys can result in data inconsistencies and poor performance.
5. Denormalization: While denormalization can improve query performance, it can
also lead to data inconsistencies and update anomalies. Denormalization should
be used sparingly and only after careful consideration.
6. Failure to Plan for Growth: A database should be designed with future growth
in mind. Failure to plan for growth can result in poor performance, data
inconsistencies, and costly database redesigns.
7. Lack of Documentation: A lack of documentation can make it difficult to
understand the database design and lead to errors in data analysis and reporting.
Concept of Normalization
• E.F. Codd, the inventor of the relational database model, introduced
the concept of normalization for relational databases in the
1970s.
• Before Codd, the most common method of storing data was in large,
cryptic, and unstructured files, generating plenty of redundancy and
lack of consistency.
• When databases began to emerge, people noticed that stuffing data
into them caused many duplications and anomalies to emerge, like
insert, delete, and update anomalies.
• These anomalies could produce incorrect data reporting, which is
harmful to any business.
• Normalization is a systematic method used in the design of
databases to create neat, well-structured tables in which
each table describes just one subject.
NORMALIZATION
• Normalization is a systematic approach to organize data in
a database to eliminate redundancy, avoid anomalies and
ensure data consistency.
• Normalization is an important process in database design
that helps improve the database’s efficiency, consistency,
and accuracy.
• It makes it easier to manage and maintain the data and
ensures that the database is adaptable to changing
business needs.
• The process involves breaking down large tables into
smaller, well-structured ones and defining relationships
between them.
• This not only reduces the chances of storing duplicate data
but also improves the overall efficiency of the database.
Advantages of Normalization
i. Reduced data redundancy: Normalization helps to
eliminate duplicate data in tables, reducing the amount of
storage space needed and improving database efficiency.
ii. Improved data consistency: Normalization ensures that
data is stored in a consistent and organized manner,
reducing the risk of data inconsistencies and errors.
iii. Simplified database design: Normalization provides
guidelines for organizing tables and data relationships,
making it easier to design and maintain a database.
iv. Improved performance for some operations: Normalized tables
are smaller and narrower, which typically makes updates and
targeted lookups faster. Queries that must join many tables
can be slower, as noted under the disadvantages.
v. Easier database maintenance: Normalization reduces the
complexity of a database by breaking it down into smaller,
more manageable tables, making it easier to add, modify,
and delete data.
Disadvantages of Normalization
i. Normalization can result in increased performance
overhead due to the need for additional join operations
and the potential for slower query execution times.
ii. Normalization can result in the loss of data context, as
data may be split across multiple tables and require
additional joins to retrieve.
iii. Proper implementation of normalization requires expert
knowledge of database design and the normalization
process.
iv. Normalization can increase the complexity of a database
design, especially if the data model is not well
understood or if the normalization process is not carried
out correctly.
Functional Dependencies
• In relational database management, a functional
dependency is a constraint between two sets of attributes
of a relation in which the value of one set determines the
value of the other.
• It is denoted as X → Y, where the attribute set X on the left
side of the arrow is called the determinant, and Y is called
the dependent.
• X → Y holds when any two rows that agree on X also
agree on Y, i.e. each value of X is associated with exactly
one value of Y.
• It is a constraint that describes how attributes in a table
relate to each other.
• If attribute A functionally determines attribute B, we write
this as A → B.
• TABLE

  roll_no  name  dept_name  dept_building
  42       abc   CO         A4
  43       pqr   IT         A3
  44       xyz   CO         A4
  45       xyz   IT         A3
  46       mno   EC         B2
  47       jkl   ME         B2

• From the above table we can conclude some valid
functional dependencies:
  - roll_no → {name, dept_name, dept_building}: roll_no
    determines the values of name, dept_name and
    dept_building, hence this is a valid functional dependency.
  - roll_no → dept_name: since roll_no determines the
    whole set {name, dept_name, dept_building}, it also
    determines its subset dept_name.
  - dept_name → dept_building: each department is housed
    in exactly one building, so dept_name identifies
    dept_building. Two departments may still share a building
    (EC and ME are both in B2); that does not violate this
    dependency.
  - More valid functional dependencies: roll_no → name,
    {roll_no, name} → {dept_name, dept_building}, etc.
• Here are some invalid functional dependencies:
  - name → dept_name: students with the same name can
    have different dept_name (xyz appears in both CO and IT),
    hence this is not a valid functional dependency.
  - dept_building → dept_name: there can be multiple
    departments in the same building. For example, in the above
    table departments ME and EC are in the same building B2,
    hence dept_building → dept_name is an invalid functional
    dependency.
  - More invalid functional dependencies: name → roll_no,
    dept_building → roll_no, and {name, dept_name} → roll_no.
    The last one is not contradicted by these six rows, but two
    students with the same name could join the same department,
    so it cannot be relied on as a constraint.
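
The valid and invalid dependencies listed above can be checked mechanically against the sample rows. The following is a minimal sketch in Python; the helper name holds_fd and the list-of-dicts layout are illustrative assumptions, not part of the slides. Note that such a check can only show that the data does not contradict a dependency; whether a dependency is a real constraint depends on the meaning of the attributes.

```python
def holds_fd(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, val) != val:
            return False
    return True


COLUMNS = ("roll_no", "name", "dept_name", "dept_building")
DATA = [
    (42, "abc", "CO", "A4"),
    (43, "pqr", "IT", "A3"),
    (44, "xyz", "CO", "A4"),
    (45, "xyz", "IT", "A3"),
    (46, "mno", "EC", "B2"),
    (47, "jkl", "ME", "B2"),
]
students = [dict(zip(COLUMNS, r)) for r in DATA]

print(holds_fd(students, ["roll_no"], ["name", "dept_name", "dept_building"]))  # True
print(holds_fd(students, ["dept_name"], ["dept_building"]))  # True
print(holds_fd(students, ["name"], ["dept_name"]))           # False
print(holds_fd(students, ["dept_building"], ["dept_name"]))  # False
```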
Types of Functional Dependencies in DBMS
1. Trivial functional dependency
2. Non-Trivial functional dependency
3. Multivalued functional dependency
4. Transitive functional dependency
5. Full functional dependency
6. Partial functional dependency
Trivial Functional Dependency
• In Trivial Functional Dependency, a dependent is always a subset of the
determinant. i.e. If X → Y and Y is the subset of X, then it is called trivial
functional dependency.
• Symbolically: A→B is trivial functional dependency if B is a subset of A.
• The following dependencies are also trivial: A→A & B→B
• Example 1:
  ABC → AB
  ABC → A
  ABC → ABC
• Example 2:

  roll_no  name  age
  42       abc   17
  43       pqr   18
  44       xyz   18

• Here, {roll_no, name} → name is a trivial functional dependency, since the
dependent name is a subset of determinant set {roll_no, name}. Similarly,
roll_no → roll_no is also an example of trivial functional dependency.
Non-Trivial functional dependency
• In a non-trivial functional dependency, the dependent is
not a subset of the determinant, i.e. if X → Y and Y is not a
subset of X, then it is called a non-trivial functional dependency.
• Example 1:
  Id → Name
  Name → DOB
• Example 2:

  roll_no  name  age
  42       abc   17
  43       pqr   18
  44       xyz   18

• Here, roll_no → name is a non-trivial functional dependency, since
the dependent name is not a subset of determinant roll_no.
Similarly, {roll_no, name} → age is also a non-trivial functional
dependency, since age is not a subset of {roll_no, name}.
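
The distinction between trivial and non-trivial dependencies is a pure subset test and needs no data at all. A small sketch, assuming attribute sets are given as Python iterables (the function name is_trivial is illustrative):

```python
def is_trivial(lhs, rhs):
    """X -> Y is trivial exactly when Y is a subset of X."""
    return set(rhs) <= set(lhs)


print(is_trivial("ABC", "AB"))                     # True  (ABC -> AB)
print(is_trivial(["roll_no", "name"], ["name"]))   # True
print(is_trivial(["roll_no"], ["name"]))           # False (non-trivial)
print(is_trivial(["roll_no", "name"], ["age"]))    # False (non-trivial)
```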
Multivalued functional dependency

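• In a multivalued dependency, written X →→ Y, each value
of X is associated with a set of Y values, and that set is
independent of the remaining attributes of the relation.
• i.e. if a →→ b and a →→ c hold and there is no
dependency between b and c, then every b value for a
given a appears together with every c value for that a.
• Strictly speaking, this is a generalization of functional
dependency rather than a special case of it.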
• Example:

  bike_model  manuf_year  color
  tu1001      2007        Black
  tu1001      2007        Red
  tu2012      2008        Black
  tu2012      2008        Red
  tu2222      2009        Black
  tu2222      2009        Red


• In this table:
  X: bike_model
  Y: color
  Z: manuf_year

• For each bike model (bike_model):
  - There is a group of colors (color) and a group of
    manufacturing years (manuf_year). In this sample each
    model has two colors and a single manufacturing year.
  - The colors do not depend on the manufacturing year, and
    the manufacturing year does not depend on the colors.
    They are independent.
  - The sets of color and manuf_year are linked only to
    bike_model.
  - That is what makes it a multivalued dependency:
    bike_model →→ color and bike_model →→ manuf_year.

• In this case these two columns are said to be multivalued
dependent on bike_model.
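
The independence described above can be tested on the sample rows: for every bike_model, the rows must contain every combination of that model's colors with that model's manufacturing years. The sketch below uses the standard definition of a multivalued dependency; the function name holds_mvd and the data layout are assumptions made for illustration.

```python
from itertools import product


def holds_mvd(rows, columns, lhs, rhs):
    """Check lhs ->> rhs: within each lhs group, the rhs values and the
    values of the remaining columns must form a full cross product."""
    rest = [c for c in columns if c not in lhs and c not in rhs]
    groups = {}
    for row in rows:
        x = tuple(row[c] for c in lhs)
        y = tuple(row[c] for c in rhs)
        z = tuple(row[c] for c in rest)
        groups.setdefault(x, set()).add((y, z))
    for pairs in groups.values():
        ys = {y for y, _ in pairs}
        zs = {z for _, z in pairs}
        if pairs != set(product(ys, zs)):
            return False
    return True


COLUMNS = ("bike_model", "manuf_year", "color")
bikes = [dict(zip(COLUMNS, r)) for r in [
    ("tu1001", 2007, "Black"), ("tu1001", 2007, "Red"),
    ("tu2012", 2008, "Black"), ("tu2012", 2008, "Red"),
    ("tu2222", 2009, "Black"), ("tu2222", 2009, "Red"),
]]
print(holds_mvd(bikes, COLUMNS, ["bike_model"], ["color"]))  # True

# Hypothetical counter-example: color and year are tied together,
# so the combinations (2007, Red) and (2008, Black) are missing.
tied = [dict(zip(COLUMNS, r)) for r in [
    ("tu1001", 2007, "Black"), ("tu1001", 2008, "Red"),
]]
print(holds_mvd(tied, COLUMNS, ["bike_model"], ["color"]))   # False
```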
Transitive functional dependency

• In a transitive functional dependency, the dependent is indirectly dependent
on the determinant, i.e. if a → b and b → c, then according to the axiom of
transitivity, a → c. This is a transitive functional dependency.
• Example:

  enrol_no  name  dept  building_no
  42        abc   CO    4
  43        pqr   EC    2
  44        xyz   IT    1
  45        abc   EC    2

• Here, enrol_no → dept and dept → building_no. Hence, according to
the axiom of transitivity, enrol_no → building_no is a valid functional
dependency. This is an indirect functional dependency, hence called
a transitive functional dependency.
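
Derived dependencies such as enrol_no → building_no are usually found by computing an attribute closure: start from the determinant and keep adding every attribute reachable through the known dependencies. The sketch below shows that standard algorithm; the function name closure and the tuple-pair encoding of dependencies are illustrative choices.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs.

    fds is a list of (lhs, rhs) pairs, each a tuple of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# The two dependencies stated in the example above
fds = [
    (("enrol_no",), ("dept",)),
    (("dept",), ("building_no",)),
]
print(sorted(closure({"enrol_no"}, fds)))
# ['building_no', 'dept', 'enrol_no']
print("building_no" in closure({"enrol_no"}, fds))  # True: enrol_no -> building_no
```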
Full Functional Dependency

• In a full functional dependency X → Y, Y depends on the
whole of X and not on any proper subset of X: removing
any attribute from X breaks the dependency.
• If a relation R has attributes X, Y, Z with the
dependencies X → Y and X → Z, where X is a single
attribute, both dependencies are full, because X has no
smaller part on which Y or Z could depend.
Partial functional dependency

• In a partial functional dependency, a non-key attribute
depends on a part of the composite key, rather than
the whole key.
• Partial dependency in a relational database occurs
when a non-prime attribute (i.e., not part of any
candidate key) is functionally dependent on only a
part of the primary key, rather than the entire
primary key.
• If a relation R has attributes X, Y, Z, where {X, Y}
is the composite key and Z is a non-key attribute, and
X → Z holds, then X → Z is a partial functional
dependency.
• Example

  Student_ID  Course_ID  Course_Name  Instructor
  1           101        Math         Mr. Smith
  1           102        Science      Ms. Johnson
  2           101        Math         Mr. Smith
  3           103        English      Mr. Brown


• Explanation:
  Candidate Key: {Student_ID, Course_ID}
  Non-Prime Attribute: Course_Name
  Partial Dependency: Course_ID → Course_Name
  (Course_Name depends on only part of the
  primary key, namely Course_ID). In this sample
  data, Instructor likewise depends on Course_ID alone.
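
A partial dependency can be searched for by testing every proper subset of the composite key against each non-key attribute. The sketch below does this on the sample rows; the names holds_fd and partial_dependencies are hypothetical helpers, and the result only reflects what these four rows allow, not the intended business rules.

```python
from itertools import combinations


def holds_fd(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


def partial_dependencies(rows, key, non_key):
    """List (key_part, attribute) pairs where a proper, non-empty
    subset of the composite key already determines the attribute."""
    found = []
    for size in range(1, len(key)):
        for part in combinations(key, size):
            for attr in non_key:
                if holds_fd(rows, part, [attr]):
                    found.append((part, attr))
    return found


COLUMNS = ("Student_ID", "Course_ID", "Course_Name", "Instructor")
enrolment = [dict(zip(COLUMNS, r)) for r in [
    (1, 101, "Math", "Mr. Smith"),
    (1, 102, "Science", "Ms. Johnson"),
    (2, 101, "Math", "Mr. Smith"),
    (3, 103, "English", "Mr. Brown"),
]]

for part, attr in partial_dependencies(
        enrolment, ["Student_ID", "Course_ID"], ["Course_Name", "Instructor"]):
    print(part, "->", attr)
# ('Course_ID',) -> Course_Name
# ('Course_ID',) -> Instructor
```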
Types of Normal Forms
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
First Normal Form (1NF)
• For a table to be in the first normal form, it must
meet the following criteria:
  - A single cell must not hold more than one value
    (atomicity)
  - There must be a primary key for identification
  - No duplicated rows or columns
  - Each column must have only one value for each row
    in the table
1NF EXAMPLE
• EMPLOYEE

  EMP_ID  EMP_NAME  SALARY  CONTACT
  1       ATUL      10000   1234, 4567
  2       VIPUL     20000   5678
  3       SHISHIR   30000   4568

• SOLUTION 1 - (INCREASES REDUNDANCY)

  EMP_ID  EMP_NAME  SALARY  CONTACT
  1       ATUL      10000   1234
  1       ATUL      10000   4567
  2       VIPUL     20000   5678
  3       SHISHIR   30000   4568

• SOLUTION 2 - (PRODUCES NULL VALUES)

  EMP_ID  EMP_NAME  SALARY  CONTACT_1  CONTACT_2
  1       ATUL      10000   1234       4567
  2       VIPUL     20000   5678       NULL
  3       SHISHIR   30000   4568       NULL

• SOLUTION 3 - (DECOMPOSITION TECHNIQUE)

  EMPLOYEE_INFO - TABLE 1

  EMP_ID  EMP_NAME  SALARY
  1       ATUL      10000
  2       VIPUL     20000
  3       SHISHIR   30000

  EMPLOYEE_CONTACT - TABLE 2

  EMP_ID  CONTACT
  1       1234
  1       4567
  2       5678
  3       4568
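
The decomposition technique can be sketched in a few lines of Python: the multi-valued CONTACT cell is split, and each number becomes its own row keyed by EMP_ID. The function name split_contacts and the comma-separated string encoding of the original cell are assumptions made for this illustration.

```python
employees = [
    {"EMP_ID": 1, "EMP_NAME": "ATUL", "SALARY": 10000, "CONTACT": "1234, 4567"},
    {"EMP_ID": 2, "EMP_NAME": "VIPUL", "SALARY": 20000, "CONTACT": "5678"},
    {"EMP_ID": 3, "EMP_NAME": "SHISHIR", "SALARY": 30000, "CONTACT": "4568"},
]


def split_contacts(rows):
    """Split the non-atomic CONTACT column into a separate table."""
    info, contacts = [], []
    for row in rows:
        # Everything except CONTACT stays in the employee table
        info.append({k: v for k, v in row.items() if k != "CONTACT"})
        # One row per contact number, linked back by EMP_ID
        for number in row["CONTACT"].split(","):
            contacts.append({"EMP_ID": row["EMP_ID"], "CONTACT": number.strip()})
    return info, contacts


employee_info, employee_contact = split_contacts(employees)
for row in employee_contact:
    print(row["EMP_ID"], row["CONTACT"])
# 1 1234
# 1 4567
# 2 5678
# 3 4568
```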
Second Normal Form (2NF)
• The 1NF only eliminates repeating groups, not
redundancy. That’s why there is 2NF.
• Second Normal Form (2NF) is based on the concept of
fully functional dependency. It is a way to organize a
database table so that it reduces redundancy and ensures
data consistency.
• A table is said to be in 2NF if it meets the following
criteria:
  - It is already in 1NF.
  - It has no partial dependency. That is, all non-key
    attributes are fully functionally dependent on the
    primary key.
• Functional Dependency
• (A → B) - Means B is functionally dependent on A. For 2NF,
when A is a composite key, B must depend on the whole of A.
2NF EXAMPLE
• EMPLOYEE

  EMP_ID  EMP_NAME  SALARY  CONTACT  JOB_ID  JOB      LOCATION
  1       ATUL      10000   1234     L_001   OFFICER  MUMBAI
  2       VIPUL     20000   5678     L_002   MANAGER  PUNE
  3       SHISHIR   30000   4568     L_003   CLERK    MUMBAI

FIND OUT PARTIAL DEPENDENCIES
(taking {EMP_ID, JOB_ID} as the composite key)

1) EMP_ID → EMP_NAME, SALARY, CONTACT
2) JOB_ID → JOB, LOCATION
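• SOLUTION – DECOMPOSITION
• EMPLOYEE_INFO - TABLE 1

  EMP_ID  EMP_NAME  SALARY  CONTACT
  1       ATUL      10000   1234
  2       VIPUL     20000   5678
  3       SHISHIR   30000   4568

• JOB_INFO - TABLE 2

  JOB_ID  JOB      LOCATION
  L_001   OFFICER  MUMBAI
  L_002   MANAGER  PUNE
  L_003   CLERK    MUMBAI

• These two tables alone no longer record which employee holds
which job. To keep that information, the key pair (EMP_ID, JOB_ID)
must also be stored, for example in a third linking table.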
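
A decomposition is only safe if the original table can be rebuilt by joining the pieces. The sketch below projects the EMPLOYEE table and joins it back. It assumes the composite key {EMP_ID, JOB_ID} and adds a linking table of (EMP_ID, JOB_ID) pairs, called employee_job here, which the slides do not show; the helper names project, natural_join and as_set are likewise illustrative.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


def natural_join(left, right):
    """Join rows that agree on every shared column (cross product if none)."""
    joined = []
    for l in left:
        for r in right:
            if all(l[c] == r[c] for c in l.keys() & r.keys()):
                joined.append({**l, **r})
    return joined


def as_set(rows):
    """Order-independent representation of a table, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


COLUMNS = ("EMP_ID", "EMP_NAME", "SALARY", "CONTACT", "JOB_ID", "JOB", "LOCATION")
employee = [dict(zip(COLUMNS, r)) for r in [
    (1, "ATUL", 10000, "1234", "L_001", "OFFICER", "MUMBAI"),
    (2, "VIPUL", 20000, "5678", "L_002", "MANAGER", "PUNE"),
    (3, "SHISHIR", 30000, "4568", "L_003", "CLERK", "MUMBAI"),
]]

employee_info = project(employee, ["EMP_ID", "EMP_NAME", "SALARY", "CONTACT"])
job_info = project(employee, ["JOB_ID", "JOB", "LOCATION"])
employee_job = project(employee, ["EMP_ID", "JOB_ID"])  # assumed linking table

rebuilt = natural_join(natural_join(employee_info, employee_job), job_info)
print(as_set(rebuilt) == as_set(employee))          # True: lossless
print(len(natural_join(employee_info, job_info)))   # 9: without the link, every pairing appears
```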
Third Normal Form (3NF)
• When a table is in 2NF, it eliminates repeating groups and
partial dependencies, but it does not eliminate transitive dependencies.
• This means a non-prime attribute (an attribute that is not part of any
candidate key) may still depend on another non-prime attribute. This is
what the third normal form (3NF) eliminates.
• A relation is in 3NF if at least one of the following conditions holds for
every non-trivial functional dependency X → Y:
  - X is a super key.
  - Y is a prime attribute (each element of Y is part of some candidate
    key).
• Put another way, a table in 3NF must:
  - be in 2NF
  - have no transitive dependency of a non-prime attribute on a key.
• Transitive Dependency
• (A → B) - Means B is functionally dependent on A.
• (B → C) - Means C is functionally dependent on B.
• INDIRECTLY (A → C) - Means C is transitively dependent on A.
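
The 3NF condition can be checked from a list of dependencies alone. The sketch below finds the candidate keys by brute force, collects the prime attributes, and reports every dependency X → Y where X is not a super key and Y contains a non-prime attribute. The function names are illustrative, and the brute-force key search is only practical for small schemas like the A, B, C example used on this slide.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so not minimal
            if closure(combo, fds) == set(attributes):
                keys.append(combo)
    return keys


def violations_3nf(attributes, fds):
    """Dependencies X -> Y with X not a super key and Y not all prime."""
    prime = {a for key in candidate_keys(attributes, fds) for a in key}
    bad = []
    for lhs, rhs in fds:
        extra = set(rhs) - set(lhs)  # ignore the trivial part
        if not extra:
            continue
        is_superkey = closure(lhs, fds) == set(attributes)
        if not is_superkey and not extra <= prime:
            bad.append((lhs, rhs))
    return bad


attributes = {"A", "B", "C"}
fds = [(("A",), ("B",)), (("B",), ("C",))]
print(candidate_keys(attributes, fds))   # [('A',)]
print(violations_3nf(attributes, fds))   # [(('B',), ('C',))]
```

Here B → C is reported because B is not a super key and C is not prime, which is exactly the transitive dependency A → B → C.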
3NF EXAMPLE
• EMPLOYEE

  EMP_ID  EMP_NAME  SALARY  CONTACT  JOB      LOCATION
  1       ATUL      10000   1234     OFFICER  MUMBAI
  2       VIPUL     20000   5678     MANAGER  PUNE
  3       SHISHIR   30000   4568     CLERK    MUMBAI

• FIND OUT PARTIAL AND TRANSITIVE DEPENDENCIES
1) EMP_ID → EMP_NAME, SALARY, CONTACT
2) EMP_ID → JOB, LOCATION
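• SOLUTION – DECOMPOSITION
• EMPLOYEE_INFO - TABLE 1

  EMP_ID  EMP_NAME  SALARY  CONTACT
  1       ATUL      10000   1234
  2       VIPUL     20000   5678
  3       SHISHIR   30000   4568

• JOB_INFO - TABLE 2

  EMP_ID  JOB      LOCATION
  1       OFFICER  MUMBAI
  2       MANAGER  PUNE
  3       CLERK    MUMBAI

• Both tables are keyed by EMP_ID. If LOCATION were determined by
JOB (EMP_ID → JOB → LOCATION), that transitive dependency would
instead be removed by keeping JOB with the employee and moving
(JOB, LOCATION) into a table of its own.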
Boyce-Codd Normal Form (BCNF)
• 3NF is adequate for most relational databases, but it
may not remove all redundancy: a relation in 3NF can
still contain a functional dependency X → Y in which
X is not a super key, because 3NF permits this when Y
is a prime attribute. This is resolved by Boyce-Codd
Normal Form (BCNF).
• Boyce–Codd Normal Form (BCNF) is based
on functional dependencies and takes into account all
candidate keys in a relation. It is stricter than the
general definition of 3NF.
• Rules for BCNF
Rule 1: The table should be in the 3rd Normal Form.
Rule 2: X should be a superkey for every non-trivial functional
dependency (FD) X → Y in a given relation.
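
Rule 2 can be checked with an attribute closure: a determinant is a superkey exactly when its closure contains every attribute of the relation. The sketch below applies this to a commonly used teaching example that is in 3NF but not in BCNF; the relation (student, course, teacher) and its dependencies are hypothetical and do not come from the slides.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Non-trivial dependencies whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attributes)
    ]


# Hypothetical relation: each teacher teaches one course, and a
# student has one teacher per course.
attributes = {"student", "course", "teacher"}
fds = [
    (("student", "course"), ("teacher",)),
    (("teacher",), ("course",)),
]
print(bcnf_violations(attributes, fds))
# [(('teacher',), ('course',))]
```

teacher → course breaks Rule 2 because teacher alone is not a superkey, even though the relation satisfies 3NF (course is part of the candidate key {student, course}).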
BCNF EXAMPLE
• LOAN_SCHEMA

  LOAN_NO  LOAN_TYPE  APPLICANT_NAME       AMOUNT
  L001     HOME       MR. VAIBHAV PATIL    10,00,000
  L002     CAR        MR. AMIT TADVI       10,00,000
  L003     EDUCATION  MRS. AKANSHA PAL     10,00,000
  L004     GOLD       MS. SUMITRA SAWANT   10,00,000

• POSSIBILITY

  LOAN_NO  LOAN_TYPE  APPLICANT_NAME              AMOUNT
  L001     HOME       MR. VAIBHAV PATIL           5,00,000
  L001     HOME       MRS. SUREKHA VAIBHAV PATIL  5,00,000
  L002     CAR        MR. AMIT TADVI              10,00,000
  L003     EDUCATION  MRS. AKANSHA PAL            10,00,000
  L004     GOLD       MS. SUMITRA SAWANT          10,00,000
• ALL NON-KEY ATTRIBUTES ARE DEPENDENT ON
SUPERKEY / CANDIDATE KEY
• TO NORMALIZE THIS RELATION, APPLY THE FOLLOWING
SOLUTION
• IDENTIFY THE DEPENDENCIES
  - LOAN_NO → LOAN_TYPE, APPLICANT_NAME,
    AMOUNT
  - FIND OUT THE ATTRIBUTE WHICH INCREASES THE
    REDUNDANCY AND DECOMPOSE IT WITH THE KEY:
    [APPLICANT_NAME]
• SOLUTION
• LOAN_INFO - TABLE 1

  LOAN_NO  LOAN_TYPE  AMOUNT
  L001     HOME       10,00,000
  L002     CAR        10,00,000
  L003     EDUCATION  10,00,000
  L004     GOLD       10,00,000

• APPLICANT_INFO - TABLE 2

  LOAN_NO  APPLICANT_NAME
  L001     MR. VAIBHAV PATIL
  L001     MRS. SUREKHA VAIBHAV PATIL
  L002     MR. AMIT TADVI
  L003     MRS. AKANSHA PAL
  L004     MS. SUMITRA SAWANT
End of Chapter - 4
