04 Normalization

The document covers advanced data modeling concepts, focusing on normalization in database systems. It outlines the objectives, advantages, and goals of normalization, including the elimination of data redundancy and anomalies such as insertion, deletion, and modification. Additionally, it discusses functional dependencies, partial and transitive dependencies, and the steps required to achieve first, second, and third normal forms in database design.

Uploaded by

Aljean Sinohin

Advanced Data Modeling Concepts III
NORMALIZATION
IT IM01: Advanced Database Systems

Christine Joyce M. Carlos, MBusAn
Ruth Ann G. Santos, MSIT

Learning Objective
At the end of the lesson, the students should be able to:
a. Identify an appropriate data model for a specific system or application;
b. Create an EER design;
c. Apply the concept of supertype and subtype relations;
d. Derive normalized forms of the database.

Lecture Outline
• Define Normalization and Data Redundancy
• Data Redundancy Anomalies
• Normalization Objectives, Advantages, and Goals
• Identify Dependencies Addressed in Normalization

Conceptual Model to Relational Model

• Conceptual models such as ER diagrams are a good way of designing and representing a database in a flowchart-like form.
• We can generate a relational database schema from the ER diagram by keeping the following in mind:
• Each entity is converted into a table, with all of its attributes becoming fields (columns) in the table.
• Primary keys should be properly set.

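As a minimal sketch of the entity-to-table rule above, the following uses Python's built-in sqlite3 module. The DOCTOR entity and its attributes (doctor_id, name, specialty) are hypothetical and are not taken from the slides.

```python
import sqlite3

# Hypothetical DOCTOR entity: each attribute becomes a column,
# and the identifying attribute becomes the primary key.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE doctor (
        doctor_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        specialty TEXT
    )
    """
)
conn.execute("INSERT INTO doctor VALUES (1, 'Jekyll', 'General Medicine')")

# List the column names that the table ended up with.
columns = [row[1] for row in conn.execute("PRAGMA table_info(doctor)")]
print(columns)
```
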
Normalization

• Normalization is the process of structuring the attributes of a relational database to eliminate data repetition (redundancy).
• Data redundancy occurs when the same piece of data exists in multiple places.

Normalization: Anomalies

• Insertion Anomaly – You cannot add data about a new doctor until that doctor has a patient of their own.

Normalization: Anomalies

• Deletion Anomaly – If Doctor Jekyll has only one patient (David), and that patient's record is deleted accidentally, then the data about Doctor Jekyll is deleted as well.

Normalization: Anomalies

• Modification Anomaly – Suppose Doctor Hyde leaves the hospital and is replaced by Doctor John. We then have to update the records of all of Doctor Hyde's patients to reflect Doctor John's data.

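The three anomalies can be reproduced with a small sketch in Python's sqlite3 module. The table name patient_visit, its columns, the phone numbers, and the extra names (Anna, Mark, Lanyon) are assumptions made for illustration; only the doctors Jekyll, Hyde, and John and the patient David come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical unnormalized table: doctor data exists only inside patient rows.
conn.execute(
    "CREATE TABLE patient_visit ("
    "patient TEXT PRIMARY KEY NOT NULL, doctor TEXT NOT NULL, doctor_phone TEXT)"
)
conn.executemany(
    "INSERT INTO patient_visit VALUES (?, ?, ?)",
    [("David", "Jekyll", "555-0101"),
     ("Anna", "Hyde", "555-0102"),
     ("Mark", "Hyde", "555-0102")],
)

# Insertion anomaly: a doctor without a patient has no key value to be stored under.
try:
    conn.execute("INSERT INTO patient_visit VALUES (NULL, 'Lanyon', '555-0104')")
    insert_blocked = False
except sqlite3.IntegrityError:
    insert_blocked = True

# Modification anomaly: replacing Hyde means touching every one of his patients' rows.
updated = conn.execute(
    "UPDATE patient_visit SET doctor = 'John', doctor_phone = '555-0103' "
    "WHERE doctor = 'Hyde'"
).rowcount

# Deletion anomaly: deleting David also erases the only record of Jekyll.
conn.execute("DELETE FROM patient_visit WHERE patient = 'David'")
remaining = {row[0] for row in conn.execute("SELECT DISTINCT doctor FROM patient_visit")}
print(insert_blocked, updated, remaining)
```

Moving the doctor data into its own table would let each of these three operations touch exactly one row.
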
Normalization: Objectives

• To avoid anomalies in insertion, modification, and deletion.
• To reduce restructuring of tables.
• To make the models more informative.
• To keep the collection of relations neutral to queries despite the fluidity of data.

Advantages of Normalization

• More available data storage, because removing redundancy results in a smaller database.
• Faster maintenance of data because of the fewer indexes.
• A smaller database, which in turn gives faster response times.
• Narrower tables, which make it easier to specify which tables are to be joined.

Goals of Normalization Process

• Each relation (table) represents a single subject.
• Each row/column intersection contains only one value and not a group of values.
• No data item will be unnecessarily stored in more than one table.
• All nonprime attributes in a relation (table) are dependent on the primary key. The data is uniquely identifiable by a primary key value.
• Each relation (table) has no insertion, update, or deletion anomalies, which ensures the integrity and consistency of the data.

Normalization

• Normalization works through a series of stages called normal forms.

Functional Dependence

• The attribute B is functionally dependent on the attribute A if each value of A determines exactly one value of B.

In the Student table,
Roll_No → Student_Name
(A determines B)

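A functional dependency can be tested against sample rows with a short Python sketch. The function name fd_holds and the sample students (including the Course column) are hypothetical. Passing this check only shows that the sample data does not contradict the dependency; whether the dependency truly holds is a rule of the business domain.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if, in these rows, each lhs value maps to exactly one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data for the Student table.
students = [
    {"Roll_No": 1, "Student_Name": "Ana", "Course": "IT"},
    {"Roll_No": 2, "Student_Name": "Ben", "Course": "IT"},
    {"Roll_No": 3, "Student_Name": "Ana", "Course": "CS"},
]

print(fd_holds(students, ["Roll_No"], ["Student_Name"]))  # Roll_No -> Student_Name
print(fd_holds(students, ["Student_Name"], ["Course"]))   # two students named Ana differ
```
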
Functional Dependence

• In normalization, we will deal with two kinds of dependence in particular: Transitive Dependence and Partial Dependence.

Partial Dependency

• A partial dependency exists when the determinant of a functional dependency is only part of the primary key.
• Partial dependencies tend to be straightforward and easy to identify.

If (A, B) → (C, D), B → C, and (A, B) is the primary key, then the functional dependence B → C is a partial dependency because only part of the primary key (B) is needed to determine the value of C.

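The rule above can be sketched as a small Python check. The function name partial_dependencies and the representation of a dependency as a (determinant, dependents) pair of tuples are assumptions made for this illustration.

```python
def partial_dependencies(primary_key, fds):
    """Return the dependencies whose determinant is a proper subset of the primary key."""
    key = frozenset(primary_key)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        # "<" on sets means proper subset; skip dependencies that only point at key attributes.
        if frozenset(lhs) < key and not set(rhs) <= key
    ]


# The example from the slide: (A, B) -> (C, D) and B -> C, with (A, B) as the primary key.
fds = [(("A", "B"), ("C", "D")), (("B",), ("C",))]
print(partial_dependencies(("A", "B"), fds))
```

With a single-attribute primary key the function returns an empty list, since a one-attribute key has no non-empty proper subset that could act as a determinant.
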
Partial Dependency

• In the example table, Employee ID and Task No together form the composite primary key.
• Note that Employee Name can be determined using Employee ID alone, while Task Name can be determined using Task No alone.

Transitive Dependency

• It exists when there is an indirect relationship between two attributes/fields.

If A → B and B → C are valid functional dependencies, the dependency A → C is a transitive dependency because A determines the value of C via B.

Transitive Dependency

• Book → Author
If you know the book's name, you can learn the author's name.
• Author → Author Nationality
If you know the author's name, you can easily identify the author's nationality.

Thus, if you know the book's name, you can also determine the author's nationality (Book → Author Nationality).

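The chain in the book example can be detected mechanically. This is a minimal sketch in which each dependency is a (determinant, dependent) pair of single attribute names; the function name transitive_dependencies is hypothetical.

```python
def transitive_dependencies(fds):
    """Return (a, b, c) chains where a -> b and b -> c are both in the given list."""
    chains = []
    for a, b in fds:
        for b2, c in fds:
            # The dependent of the first dependency is the determinant of the second.
            if b == b2 and c != a:
                chains.append((a, b, c))
    return chains


fds = [("Book", "Author"), ("Author", "Author_Nationality")]
for a, b, c in transitive_dependencies(fds):
    print(f"{a} -> {c} is transitive via {b}")
```
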
Summary of Dependencies

• Functional Dependence
A → B (A determines B): For every value of A, there is exactly one value of B.
• Transitive Dependence
E → Q (Q is transitively dependent on E): If W is functionally dependent on E (E → W), and Q is functionally dependent on W (W → Q), then Q is transitively dependent on E.
Note: Check the non-key fields/columns. If there are fields that can be used to determine other non-key fields, a transitive dependence might be present.

• Partial Dependence
If (A, B) is the primary key and C is functionally dependent on A alone, then C is partially dependent on the primary key.
Note: This only happens when the primary key of a table is made up of more than one attribute (Composite Primary Key).

Restaurant Mgmt Application

• Imagine we're building a restaurant management application. That application needs to store data about the company's employees, and it starts out by creating a table of employees.

Questions to ask about the table:
• What is the Primary Key?
• Functional Dependence: Are all of these columns dependent on and specific to the primary key?
• Transitive Dependence: Do any of the non-primary key fields depend on something other than the primary key?
• Partial Dependence: Do any of the non-primary key fields depend on only part of the primary key?

Student Data Table
Student ID | Student Name | Fees Paid | Date of Birth | Address | Subject 1 | Subject 2 | Subject 3 | Teacher Name | Teacher Address | Course Name
2000-12345 | John Smith | 18-Jul-00 | 4-Aug-91 | 3 Main Street, North Boston 56125 | Economics 1 (Business) | Biology 1 (Science) | | James Peterson | 44 March Way, Glebe 56100 | Economics
2000-23456 | Maria Griffin | 14-May-01 | 10-Sep-92 | 16 Leeds Road, South Boston 56128 | Biology 1 (Science) | Business Intro (Business) | Programming 2 (IT) | James Peterson | 44 March Way, Glebe 56100 | Computer Science
2000-54628 | Susan Johnson | 3-Feb-01 | 13-Jan-91 | 21 Arrow Street, South Boston 56128 | Biology 2 (Science) | | | Sarah Francis | | Medicine
2000-95634 | Matt Long | 29-Apr-02 | 25-Apr-92 | 14 Milk Lane, South Boston 56128 | | | | Shane Cobson | 105 Mist Road, Faulkner 56410 | Dentistry

Questions to ask about the table:
• What is the Primary Key?
• Functional Dependence: Are all of these columns dependent on and specific to the primary key?
• Transitive Dependence: Do any of the non-primary key fields depend on something other than the primary key?

Table Representations for Normalization

• Relational Notation Schema

PROJECT (PROJ_NUM, EMP_NUM, PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS)

• Dependency Diagrams

Relational Notation Schema

• Relational notation is a way of rewriting an E/R diagram in a compact, easily readable textual form.
• This can be done by:
1. Taking the name of each table and its attributes and listing them in a specific order.
2. Always starting with the primary key(s), which are commonly shown underlined. Next, all other attributes are added.
3. Marking any attribute that is a foreign key with a dotted underline.

TABLE_NAME (primary_key, foreign_key, nonkey_attribute1, nonkey_attribute2, nonkey_attribute3)

Dependency Diagram

• A dependency diagram is used to illustrate the dependencies found in a table structure.
• It also provides an overview of the relationships that exist within a table.
• It reduces the risk of overlooking an important dependency.

Dependency Diagram Example

• The dependency diagram indicates that authors are paid royalties for each book they write for a publisher. The amount of the royalty can vary by author, by book, and by edition of the book.

Diagram labels: Primary Keys, NonKey Attributes

Partial Dependencies:
ISBN → Book Title, Publisher, Edition
Author_Num → Last Name

Transitive Dependency: Book Title → Publisher

Normalization Process

Steps in Normalization: First Normal Form (1NF)
• A repeating group occurs when a single attribute contains multiple entries of the same type.
• Removing repeating groups reduces redundancy in a database.

Steps in Normalization: 1NF

Steps to First Normal Form

Step 1. All repeating groups must be removed
• No single attribute may contain multiple entries of data of the same type.

Step 2. A primary key must be determined
• Identify or create a unique key or attribute for the given table.

Step 3. All dependencies in a table must be identified
• Determine all dependencies that exist in the table.

Steps in Normalization: 1NF Step 1

All repeating groups must be removed: no single attribute may contain multiple entries of data of the same type.

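One way to picture Step 1 is with the Subject 1 to Subject 3 columns of the Student Data Table, treated here as a repeating group. The sketch below turns them into one row per student and subject. The function name remove_repeating_group, the underscore column names, and the choice of plain dictionaries are assumptions for illustration.

```python
def remove_repeating_group(rows, key, group_columns, new_column):
    """Emit one row per non-empty value found in the repeating columns."""
    flat = []
    for row in rows:
        for col in group_columns:
            if row.get(col):  # skip empty slots in the repeating group
                flat.append({key: row[key], new_column: row[col]})
    return flat


# Two students from the Student Data Table, reduced to the relevant columns.
students = [
    {"Student_ID": "2000-12345", "Subject_1": "Economics 1 (Business)",
     "Subject_2": "Biology 1 (Science)", "Subject_3": None},
    {"Student_ID": "2000-54628", "Subject_1": "Biology 2 (Science)",
     "Subject_2": None, "Subject_3": None},
]

enrolments = remove_repeating_group(
    students, "Student_ID", ["Subject_1", "Subject_2", "Subject_3"], "Subject"
)
for row in enrolments:
    print(row)
```

In the flattened result, Student_ID alone no longer identifies a row, which is why Step 2 (determining the primary key) follows.
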
Steps in Normalization: 1NF Step 2

A primary key must be determined.

Steps in Normalization: 1NF Step 3

All dependencies in a table must be identified.

Steps in Normalization: Second Normal Form (2NF)

Conversion to 2NF is needed only when the 1NF table has a composite primary key. If the 1NF table has a single-attribute primary key, then the table is automatically in 2NF.

Characteristics of Second Normal Form
• It MUST be in 1NF.
• All attributes are dependent on the whole primary key.
• Partial dependencies do not exist.

Steps to Second Normal Form:

Step 1. Make New Tables to Eliminate Partial Dependencies.
• Specify key components and place them on a separate column.
• Designate each key component as the primary key to new tables.

Step 2. Reassign Corresponding Dependent Attributes
• Most anomalies are removed in the 2NF.

Steps in Normalization: 2NF Step 1

Make New Tables to Eliminate Partial Dependencies.

Steps in Normalization: 2NF Step 2

Reassign Corresponding Dependent Attributes

Steps in Normalization: Third Normal Form (3NF)

A table is in 3NF when:
• It is in 2NF.
• No nonkey attribute is functionally dependent on another nonkey attribute (that is, the table contains no transitive dependencies).

Steps to Third Normal Form:

Step 1. Remove all transitive dependencies
• For every transitive dependency, write its determinant as the primary key of a new table.
• A determinant is an attribute whose value may be used to determine the value of another attribute in the same row.

Step 2. Reassign dependency of attributes
• The goal is to determine the dependencies in a table.

Steps in Normalization: 3NF Step 1

Remove all transitive dependencies

Steps in Normalization: 3NF Step 2

Reassign dependency of attributes

Example: Normalization Process

• The DreamHome Customer Rental Details form holds details about property rented by a given customer.
• To simplify things, we will assume that a renter rents a given property only once and rents only one property at a time.

Example: Normalization Process

Characteristics of First Normal Form
• Key attributes are determined.
• Repeating groups are removed.
• A primary key on which all other attributes depend is specified.

Example: Normalization Process 1NF

Step 1: All repeating groups must be removed
Step 2: A Primary key must be determined
Step 3: All dependencies in a table must be identified

Customer Rental Table: CustNo, CName, PropNo, PAddr, RntSt, RntFnsh, Rent, OwnerNo, OName

Primary Key: (CustNo, PropNo)
Partial Dependencies: CustNo → CName; PropNo → PAddr, Rent, OwnerNo, OName
Transitive Dependency: OwnerNo → OName

Example: Normalization Process 2NF

Characteristics of Second Normal Form
• It MUST be in 1NF.
• All attributes are dependent on the whole primary key.
• Partial dependencies do not exist. A partial dependency exists when a dependency is based on only a part of the primary key; the Customer Rental table below still contains partial dependencies, so it is not yet in 2NF.

Customer Rental Table: CustNo, CName, PropNo, PAddr, RntSt, RntFnsh, Rent, OwnerNo, OName

Diagram labels: Partial Dependencies, Transitive Dependency

Example: Normalization Process 2NF Step 1
• Step 1: Make New Tables to Eliminate Partial Dependencies.

Customer Rental Table: CustNo, CName, PropNo, PAddr, RntSt, RntFnsh, Rent, OwnerNo, OName

New tables:
Rentals Table: CustNo, PropNo, RntSt, RntFnsh
Customer Table: CustNo, CName
Property Owner Table: PropNo, PAddr, Rent, OwnerNo, OName

Example: Normalization Process 2NF Step 2
• Step 2: Reassign Corresponding Dependent Attributes

Rentals Table: CustNo, PropNo, RntSt, RntFnsh
Customer Table: CustNo, CName
Property Owner Table: PropNo, PAddr, Rent, OwnerNo, OName (a transitive dependency remains in this table)

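The 2NF split of the Customer Rental table can be sketched as three projections in Python. The project helper is hypothetical, and the sample rows (names, addresses, dates, and rent amounts) are made-up values for illustration; only the column names come from the slides.

```python
def project(rows, columns):
    """Project rows onto the given columns and drop duplicate tuples, keeping order."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


COLUMNS = ["CustNo", "CName", "PropNo", "PAddr", "RntSt", "RntFnsh", "Rent", "OwnerNo", "OName"]
# Made-up sample rows for the 1NF Customer Rental table.
customer_rental = [dict(zip(COLUMNS, values)) for values in [
    ("CR76", "John", "PG4", "6 Lawrence St", "2000-07-01", "2001-08-31", 350, "CO40", "Tina"),
    ("CR76", "John", "PG16", "5 Novar Dr", "2001-09-01", "2002-09-01", 450, "CO93", "Tony"),
    ("CR56", "Aline", "PG4", "6 Lawrence St", "1999-09-01", "2000-06-10", 350, "CO40", "Tina"),
]]

rentals = project(customer_rental, ["CustNo", "PropNo", "RntSt", "RntFnsh"])
customers = project(customer_rental, ["CustNo", "CName"])
property_owner = project(customer_rental, ["PropNo", "PAddr", "Rent", "OwnerNo", "OName"])
print(len(rentals), len(customers), len(property_owner))
```

Each customer name and each property address is now stored once, while the Rentals rows keep the composite key (CustNo, PropNo) that links them.
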
Example: Normalization Process 3NF

A table is in 3NF when:
• It is in 2NF.
• No nonkey attribute is functionally dependent on another nonkey attribute (that is, the table contains no transitive dependencies).

Rentals Table: CustNo, PropNo, RntSt, RntFnsh
Customer Table: CustNo, CName
Property Owner Table: PropNo, PAddr, Rent, OwnerNo, OName (Transitive Dependency: OwnerNo → OName)

Example: Normalization Process 3NF

• Step 1: Remove all transitive dependencies
• Step 2: Reassign dependency of attributes

Rentals Table: CustNo, PropNo, RntSt, RntFnsh
Customer Table: CustNo, CName
Property Table: PropNo, PAddr, Rent, OwnerNo
Owner Table: OwnerNo, OName

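The final 3NF design can be written as a schema sketch with Python's sqlite3 module. The column types, the NOT NULL and foreign key constraints, the lowercase table names, and the sample values are assumptions; the slides give only the table and column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE customer (CustNo TEXT PRIMARY KEY, CName TEXT NOT NULL);
    CREATE TABLE owner (OwnerNo TEXT PRIMARY KEY, OName TEXT NOT NULL);
    CREATE TABLE property (
        PropNo  TEXT PRIMARY KEY,
        PAddr   TEXT NOT NULL,
        Rent    INTEGER NOT NULL,
        OwnerNo TEXT NOT NULL REFERENCES owner (OwnerNo)
    );
    CREATE TABLE rentals (
        CustNo  TEXT NOT NULL REFERENCES customer (CustNo),
        PropNo  TEXT NOT NULL REFERENCES property (PropNo),
        RntSt   TEXT NOT NULL,
        RntFnsh TEXT,
        PRIMARY KEY (CustNo, PropNo)
    );
    """
)

# Made-up sample data: one owner with two properties.
conn.execute("INSERT INTO owner VALUES ('CO40', 'Tina')")
conn.execute("INSERT INTO property VALUES ('PG4', '6 Lawrence St', 350, 'CO40')")
conn.execute("INSERT INTO property VALUES ('PG36', '2 Manor Rd', 375, 'CO40')")

# With OName stored once, renaming an owner touches a single row.
changed = conn.execute("UPDATE owner SET OName = 'Tina M.' WHERE OwnerNo = 'CO40'").rowcount

# A property cannot reference an owner that does not exist.
try:
    conn.execute("INSERT INTO property VALUES ('PG99', '1 Nowhere Ln', 100, 'CO99')")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
print(changed, fk_enforced)
```
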
Normalization and Denormalization

• You should not assume that the highest level of normalization is always the most desirable.

Higher Normal Form → More Relational Join Operations → More Resources to Respond to Queries

• Denormalization produces a lower normal form that may increase performance but has greater redundancy.
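The trade-off can be seen in a small sqlite3 sketch: answering one question from the 3NF tables takes three joins, while a denormalized copy answers it from a single table at the cost of storing the same values again. The lowercase table names, the rental_report table, and the sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (CustNo TEXT PRIMARY KEY, CName TEXT);
    CREATE TABLE owner (OwnerNo TEXT PRIMARY KEY, OName TEXT);
    CREATE TABLE property (PropNo TEXT PRIMARY KEY, PAddr TEXT, Rent INTEGER, OwnerNo TEXT);
    CREATE TABLE rentals (CustNo TEXT, PropNo TEXT, RntSt TEXT, RntFnsh TEXT,
                          PRIMARY KEY (CustNo, PropNo));
    INSERT INTO customer VALUES ('CR76', 'John');
    INSERT INTO owner VALUES ('CO40', 'Tina');
    INSERT INTO property VALUES ('PG4', '6 Lawrence St', 350, 'CO40');
    INSERT INTO rentals VALUES ('CR76', 'PG4', '2000-07-01', '2001-08-31');
    """
)

# Normalized design: three joins to list who rents which address from which owner.
normalized_query = """
    SELECT c.CName AS CName, p.PAddr AS PAddr, o.OName AS OName
    FROM rentals r
    JOIN customer c ON c.CustNo = r.CustNo
    JOIN property p ON p.PropNo = r.PropNo
    JOIN owner o ON o.OwnerNo = p.OwnerNo
"""

# Denormalized copy: the joined result stored as one wide, redundant table.
conn.execute("CREATE TABLE rental_report AS " + normalized_query)
report = conn.execute("SELECT CName, PAddr, OName FROM rental_report").fetchall()
print(report)
```

The copy has to be refreshed whenever the underlying tables change, which is the redundancy cost that the slide mentions.
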