4-1 - NormalizationWT

Normalization is the process of organizing data to minimize redundancy and dependency. It involves decomposing tables to eliminate anomalies and placing the data into optimal normal forms. The document describes the three normal forms: 1) First normal form (1NF) requires removing any multivalued attributes so that each cell contains a single value. 2) Second normal form (2NF) removes any partial dependencies to ensure that non-key attributes are dependent on the entire primary key. 3) Third normal form (3NF) removes transitive dependencies so that each non-prime attribute is directly dependent on the primary key. Normalization is achieved by decomposing relations into tables in third normal form.

NORMALIZATION

Walk-Through
Steps in Normalization
Normalization can be accomplished in stages, each of which
corresponds to a normal form:

1. First Normal Form (1NF) - Any multivalued attributes (also called
   repeating groups) have been removed

2. Second Normal Form (2NF) - Any partial functional dependencies have
   been removed

3. Third Normal Form (3NF) - Any transitive dependencies have been
   removed

Order ID : 1006                Order Date : Oct 24, 2019

Customer ID      : 22
Customer Name    : Dan’s Furniture
Customer Address : Fullerton, California

Product ID  Product Description  Product Finish  Unit Price  Ordered Quantity
7           Tea Table            Walnut          $450        2
5           TV Stand             Oak             $300        6
4           Porch Swing          Pine            $800        5

Order ID : 1007                Order Date : Oct 31, 2019

Customer ID      : 65
Customer Name    : Furniture Barn
Customer Address : Fort Collins, Colorado

Product ID  Product Description  Product Finish  Unit Price  Ordered Quantity
11          Table                Cherry          $1150       3
4           Porch Swing          Pine            $800        7

First Normal Form (1NF)
• Any multivalued attributes (also called repeating groups) have been removed
• A relation is in 1NF if every attribute holds only atomic (single) values
• Values stored in a column should be of the same domain
• All the attributes in a table should have unique names

Table with multivalued Attributes:
Not in 1st Normal Form

Order ID : 1007                Order Date : Oct 31, 2019

Customer ID      : 65
Customer Name    : Furniture Barn
Customer Address : Fort Collins, Colorado

Product ID  Product Description  Product Finish  Unit Price  Ordered Quantity
11          Table                Cherry          $1150       3
4           Porch Swing          Pine            $800        7

Order  Order        Cust  Cust             Cust             Prod  Product      Prod    Unit   Ordered
ID     Date         ID    Name             Address          ID    Description  Finish  Price  Qty
1006   24 Oct 2019  22    Dan’s Furniture  Fullerton, CA    7     Tea Table    Walnut  $450   2
                                                            5     TV Stand     Oak     $300   6
                                                            4     Porch Swing  Pine    $800   5
1007   31 Oct 2019  65    Furniture Barn   Ft. Collins, CO  11    Table        Cherry  $1150  3
                                                            4     Porch Swing  Pine    $800   7

Table with multivalued Attributes:
Not in 1st Normal Form

Order  Order        Cust  Cust             Cust             Prod  Product      Prod    Unit   Ordered
ID     Date         ID    Name             Address          ID    Description  Finish  Price  Qty
1006   24 Oct 2019  22    Dan’s Furniture  Fullerton, CA    7     Tea Table    Walnut  $450   2
                                                            5     TV Stand     Oak     $300   6
                                                            4     Porch Swing  Pine    $800   5
1007   31 Oct 2019  65    Furniture Barn   Ft. Collins, CO  11    Table        Cherry  $1150  3
                                                            4     Porch Swing  Pine    $800   7

In 1st Normal Form (the order and customer values are repeated for every product line)

Order  Order        Cust  Customer         Customer         Prod  Product      Prod    Unit   Ordered
ID     Date         ID    Name             Address          ID    Description  Finish  Price  Qty
1006   24 Oct 2019  22    Dan’s Furniture  Fullerton, CA    7     Tea Table    Walnut  $450   2
1006   24 Oct 2019  22    Dan’s Furniture  Fullerton, CA    5     TV Stand     Oak     $300   6
1006   24 Oct 2019  22    Dan’s Furniture  Fullerton, CA    4     Porch Swing  Pine    $800   5
1007   31 Oct 2019  65    Furniture Barn   Ft. Collins, CO  11    Table        Cherry  $1150  3
1007   31 Oct 2019  65    Furniture Barn   Ft. Collins, CO  4     Porch Swing  Pine    $800   7
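
The move from the un-normalized order table to the 1NF table can be sketched in a few lines of Python. This is only an illustration: the dictionary layout, the field names, and the helper `to_1nf` are assumptions made for the example, and the dates are written in ISO format for convenience.

```python
# Un-normalized orders: each order carries a nested list of product lines
# (the repeating group). Field names are hypothetical.
orders = [
    {
        "order_id": 1006, "order_date": "2019-10-24", "cust_id": 22,
        "cust_name": "Dan's Furniture", "cust_address": "Fullerton, CA",
        "lines": [
            (7, "Tea Table", "Walnut", 450, 2),
            (5, "TV Stand", "Oak", 300, 6),
            (4, "Porch Swing", "Pine", 800, 5),
        ],
    },
    {
        "order_id": 1007, "order_date": "2019-10-31", "cust_id": 65,
        "cust_name": "Furniture Barn", "cust_address": "Ft. Collins, CO",
        "lines": [
            (11, "Table", "Cherry", 1150, 3),
            (4, "Porch Swing", "Pine", 800, 7),
        ],
    },
]


def to_1nf(orders):
    """Repeat the order-level values once per product line."""
    rows = []
    for order in orders:
        header = {k: v for k, v in order.items() if k != "lines"}
        for prod_id, desc, finish, price, qty in order["lines"]:
            rows.append({**header, "prod_id": prod_id, "prod_desc": desc,
                         "prod_finish": finish, "unit_price": price,
                         "ordered_qty": qty})
    return rows


rows_1nf = to_1nf(orders)
for row in rows_1nf:
    print(row["order_id"], row["prod_id"], row["prod_desc"], row["ordered_qty"])
```

Every cell of `rows_1nf` now holds a single value, and the pair (order_id, prod_id) identifies each row.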
Table with multivalued Attributes:
Not in 1st Normal Form

PatientID  PatientName  PatientAdd   PatientCPNo
P101       Bea          Makati       091777
                                     092677
P102       Liza         Manila       091933
                                     092433
P103       Kim          Mandaluyong  091801
                                     097702
P104       Angel        Pasig        0926413
                                     0926518

In 1st Normal Form

PatientID  PatientName  PatientAdd   PatientCPNo
P101       Bea          Makati       091777
P101       Bea          Makati       092677
P102       Liza         Manila       091933
P102       Liza         Manila       092433
P103       Kim          Mandaluyong  091801
P103       Kim          Mandaluyong  097702
P104       Angel        Pasig        0926413
P104       Angel        Pasig        0926518

Table with multivalued Attributes: (Having Primary Key)
Not in 1st Normal Form

Emp_Project_Details
Emp_ID  EmpName  Proj_No  WorkingHrs
E101    Bea      1        32
                 2        8
E102    Liza     3        40
E103    Kim      1        20
                 2        20
E104    Angel    2        10
                 3        10
                 6        10
                 8        10

1. Remove nested relation attributes into a new relation
2. Propagate the primary key into it
3. Unnest relation into a set of 1NF relations

Decomposed Relations

Employee
Emp_ID  EmpName
E101    Bea
E102    Liza
E103    Kim
E104    Angel

Project
Emp_ID  Proj_No  WorkingHrs
E101    1        32
E101    2        8
E102    3        40
E103    1        20
E103    2        20
E104    2        10
E104    3        10
E104    6        10
E104    8        10
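
The three unnesting steps can be illustrated with a short Python sketch. The nested dictionary used to hold Emp_Project_Details is an assumption made for the example; the values are those of the slide.

```python
# Nested relation: (Emp_ID, EmpName) -> list of (Proj_No, WorkingHrs).
emp_project_details = {
    ("E101", "Bea"): [(1, 32), (2, 8)],
    ("E102", "Liza"): [(3, 40)],
    ("E103", "Kim"): [(1, 20), (2, 20)],
    ("E104", "Angel"): [(2, 10), (3, 10), (6, 10), (8, 10)],
}

# Step 1: the non-nested attributes stay in their own relation.
employee = [(emp_id, name) for (emp_id, name) in emp_project_details]

# Steps 2 and 3: propagate the primary key Emp_ID into the nested
# attributes and emit one flat row per (employee, project) pair.
project = [
    (emp_id, proj_no, hours)
    for (emp_id, _name), assignments in emp_project_details.items()
    for proj_no, hours in assignments
]

print(employee)
print(project)
```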
Second Normal Form (2NF)
• Any partial functional dependencies have been removed
• A relation should be in 1NF

Point to Remember - if a proper subset of a candidate key determines a
non-prime attribute, it is called a partial functional dependency
example 1

R(P, Q, R, S, T)
PQ -> R    partial FD
S -> T     partial FD

step 1. Find the candidate key
PQS is the candidate key (P, Q and S never appear on the right-hand side of an FD)

step 2. Find closure for the identified candidate key
PQS+ = PQS
     = PQSR     (by PQ -> R)
     = PQRST    (by S -> T)
     = {R}

step 3. Identify the prime and non-prime attributes
prime is PQS
non-prime is RT

DECOMPOSE relation R
R1(P, Q, S)
R2(P, Q, R)
R3(S, T)
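
Steps 1 to 3 of this example can be checked mechanically. The sketch below is a minimal attribute-closure routine, assuming single-letter attribute names so that a string such as "PQS" can stand for a set of attributes; the function name `closure` is chosen for this illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its whole left-hand side is already known.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds = [("PQ", "R"), ("S", "T")]
all_attrs = set("PQRST")
key = "PQS"

# PQS determines every attribute, and no attribute can be dropped from it.
is_superkey = closure(key, fds) == all_attrs
is_minimal = all(closure(set(key) - {a}, fds) != all_attrs for a in key)

# An FD is partial when its left-hand side is a proper subset of the key
# and its right-hand side is non-prime (assumes a single candidate key).
partial = [(l, r) for l, r in fds
           if set(l) < set(key) and not set(r) <= set(key)]

print(is_superkey, is_minimal, partial)
```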
example 2

R(P, Q, R, S, T)
P -> Q     partial FD
Q -> T
R -> S     partial FD

step 1. Find the candidate key
PR is the candidate key

step 2. Find closure for the identified candidate key
PR+ = PR
    = PQR
    = PQRST = {R}

step 3. Identify the prime and non-prime attributes
prime is PR
non-prime is QST

DECOMPOSE relation R
R1(P, R)
R2(P, Q, T)
R3(R, S)
example 3

R(A, B, C, D, E, F, G, H, I, J)
AB -> C     partial FD
AD -> GH    partial FD
BD -> EF    partial FD
A -> I      partial FD
H -> J

step 1. Find the candidate key
ABD is the candidate key

step 2. Find closure for the identified candidate key
ABD+ = ABD
     = ABCDGH
     = ABCDEFGHIJ = {R}

step 3. Identify the prime and non-prime attributes
prime is ABD
non-prime is CEFGHIJ

DECOMPOSE relation R
R1(A, B, D)
R2(A, B, C)
R3(A, I)
R4(A, D, G, H, J)
R5(B, D, E, F)
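
For a relation of this size, the candidate key in step 1 can also be found by brute force: try attribute sets from smallest to largest and keep those whose closure is the whole relation. The sketch below assumes single-letter attribute names; `closure` and `candidate_keys` are names chosen for this illustration, and the exhaustive search is only practical for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is every attribute."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys


fds = [("AB", "C"), ("AD", "GH"), ("BD", "EF"), ("A", "I"), ("H", "J")]
keys = candidate_keys("ABCDEFGHIJ", fds)
print(keys)
```

A, B and D never appear on the right-hand side of any FD, so every key must contain them, which is why the search returns a single key.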
example 4

Student
Sid  Sname  Tid  Tname   Grade
1    Bea    3    Nayre   5
2    Angel  2    Dastas  4
3    Ivana  1    Cruz    6

Tid -> Tname         partial FD
Sid -> Sname         partial FD
Sid, Tid -> Grade    FD

step 1. Find the candidate key
(Sid, Tid) is the candidate key

step 2. Find closure for the identified candidate key
(Sid, Tid)+ = Sid, Tid
            = Sid, Sname, Tid, Tname, Grade = {R}

step 3. Identify the prime and non-prime attributes
prime is Sid, Tid
non-prime is Sname, Tname, Grade

DECOMPOSE relation R
R1                 R2             R3
Sid  Tid  Grade    Tid  Tname     Sid  Sname
1    3    5        3    Nayre     1    Bea
2    2    4        2    Dastas    2    Angel
3    1    6        1    Cruz      3    Ivana
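
The decomposition of the Student table is a set of projections, and joining the projections back together should reproduce the original rows. A minimal Python sketch, assuming the table is held as a list of tuples in the column order Sid, Sname, Tid, Tname, Grade:

```python
student = [
    (1, "Bea", 3, "Nayre", 5),
    (2, "Angel", 2, "Dastas", 4),
    (3, "Ivana", 1, "Cruz", 6),
]

# Project onto each decomposed relation; sets drop duplicate rows.
r1 = sorted({(sid, tid, grade) for sid, _, tid, _, grade in student})
r2 = sorted({(tid, tname) for _, _, tid, tname, _ in student})
r3 = sorted({(sid, sname) for sid, sname, _, _, _ in student})

# Rejoin R1 with R2 on Tid and with R3 on Sid.
tname_of = dict(r2)
sname_of = dict(r3)
rejoined = [(sid, sname_of[sid], tid, tname_of[tid], grade)
            for sid, tid, grade in r1]

print(sorted(rejoined) == sorted(student))
```

Because Tid determines Tname and Sid determines Sname, the lookups in the join are unambiguous and no rows are lost or invented.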


Going to the 2nd Normal Form (2NF)

Order  Order  Cust  Customer  Customer  Prod  Product      Product  Unit   Ordered
ID     Date   ID    Name      Address   ID    Description  Finish   Price  Qty

Full Dependencies
Order_ID, Product_ID -> Order_Quantity

Partial Dependencies
Order_ID -> Order_Date, Customer_ID, Customer_Name, Customer_Address
Product_ID -> Product_Description, Product_Finish, Unit_Price

Second Normal Form
• 1NF PLUS No partial dependencies

Order ID    Order Date    Customer ID    Customer Name    Customer Address

Product ID    Product Description    Product Finish    Unit Price

Order ID    Product ID    Ordered Quantity

Third Normal Form (3NF)
• A relation should be in 2NF
• Any transitive dependencies have been removed

Transitive dependency - a functional dependency between the primary key and one or more non-key attributes that
are dependent on the primary key via another non-key attribute

This means that if we have a primary key A and non-key attributes B and C, where B is directly dependent on A and C
is dependent on B, then C is transitively dependent on A.

Point to Remember: for a dependency x -> y

Partial Dependency:     prime -> non-prime
Transitive Dependency:  non-prime -> non-prime

• Left-hand side x is part of a candidate key (violates 2NF)
• Left-hand side x is a non-prime attribute (violates 3NF)

example 1

R(A, B, C, D, E)
A -> B     partial FD
B -> E     transitive dependency
C -> D     partial FD

step 1. Find the candidate key
AC is the candidate key

step 2. Find closure for the identified candidate key
AC+ = AC
    = ABCDE = {R}

step 3. Identify the prime and non-prime attributes
prime are A, C
non-prime are BDE

DECOMPOSE relation R
R1(A, C)
R2(A, B)
R3(B, E)
R4(C, D)
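
The Point to Remember rule (which side of the key the left-hand side falls on) can be written as a small classifier. This is a sketch under two assumptions: the relation has a single candidate key, and attributes are single letters so that a string can stand for a set of attributes. The function name `classify` is chosen for this illustration.

```python
def classify(lhs, rhs, key):
    """Label the FD lhs -> rhs relative to a single candidate key."""
    lhs, rhs, key = set(lhs), set(rhs), set(key)
    if key <= lhs:
        return "full"        # the determinant contains the whole key
    if lhs < key and not rhs <= key:
        return "partial"     # part of the key -> non-prime (violates 2NF)
    if not (lhs & key) and not rhs <= key:
        return "transitive"  # non-prime -> non-prime (violates 3NF)
    return "other"


key = "AC"
fds = [("A", "B"), ("B", "E"), ("C", "D")]
labels = {f"{lhs} -> {rhs}": classify(lhs, rhs, key) for lhs, rhs in fds}
print(labels)
```

Each partial or transitive FD then becomes its own relation, giving R2(A, B), R3(B, E) and R4(C, D), with R1(A, C) holding the key.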
example 2

Student
Sname  Major  Dept

Sname -> Major    FD
Sname -> Dept     FD
Major -> Dept     transitive dependency

(Sname is a single-attribute key, so the dependencies on Sname are full rather than partial.)

step 1. Find the candidate key
Sname is the candidate key

step 2. Find closure for the identified candidate key
Sname+ = Sname
       = Sname, Major, Dept = {Student}

step 3. Identify the prime and non-prime attributes
prime is Sname
non-prime are Major, Dept

DECOMPOSE relation R
Student_Major      Major_Dept
Sname  Major       Major  Dept


Going to the 3rd Normal Form

Transitive Dependencies
Customer ID -> Customer Name, Customer Address (Customer ID is a non-key attribute of the first table)

Order ID    Order Date    Customer ID    Customer Name    Customer Address

Product ID    Product Description    Product Finish    Unit Price

Order ID    Product ID    Ordered Quantity

Third Normal Form
• 2NF PLUS No transitive dependencies

Product ID    Product Description    Product Finish    Unit Price

Order ID    Product ID    Ordered Quantity

Dependency Diagram

Order  Order  Cust  Customer  Customer  Prod  Product      Product  Unit   Ordered
ID     Date   ID    Name      Address   ID    Description  Finish   Price  Qty

Full Dependency
Order_ID, Product_ID -> Order_Quantity

Partial Dependencies
Order_ID -> Order_Date, Customer_ID
Product_ID -> Product_Description, Product_Finish, Unit_Price

Transitive Dependencies
Customer_ID -> Customer_Name, Customer_Address
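
Once the dependencies are written out this way, the 3NF tables follow almost directly: each determinant becomes the key of its own relation, with its dependents as the remaining columns. The sketch below assumes the four FDs listed above are the complete set; the helper `relations_from_fds` and the tuple encoding of the FDs are choices made for this illustration.

```python
# FDs from the dependency diagram: (determinant, dependents).
fds = [
    (("Order_ID", "Product_ID"), ("Order_Quantity",)),
    (("Order_ID",), ("Order_Date", "Customer_ID")),
    (("Product_ID",), ("Product_Description", "Product_Finish", "Unit_Price")),
    (("Customer_ID",), ("Customer_Name", "Customer_Address")),
]


def relations_from_fds(fds):
    """Build one relation per determinant: key columns, then dependents."""
    dependents = {}
    for lhs, rhs in fds:
        columns = dependents.setdefault(lhs, [])
        columns.extend(a for a in rhs if a not in columns)
    return [{"key": lhs, "columns": list(lhs) + cols}
            for lhs, cols in dependents.items()]


schema = relations_from_fds(fds)
for relation in schema:
    print(relation["key"], "->", relation["columns"])
```

Customer_ID stays in the relation keyed by Order_ID, where it acts as a foreign key to the relation keyed by Customer_ID.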
(a) The table shown in Figure 1 is susceptible to update anomalies. Provide examples of insertion, deletion, and
modification anomalies.
Answers:
This table is un-normalized, poorly structured, and contains redundant data. Using a bottom-up approach, we
analyze the given table for anomalies. First, the appointment column holds multiple values, which
violates 1NF. Assuming staffNo and patientNo as the candidate keys, many anomalies exist.
Insertion anomalies:
To insert the details of a new patient who makes an appointment with a given dentist, we must also enter the correct details for
the staff member. For example, to insert a new patient's patientNo, patientName and appointment, we must enter the
correct details of the dentist (staffNo, dentistName) so that they are consistent with the existing values for that
dentist, for example S1011.
To enter data for a new patient who has no dentist assigned, we would have to insert NULL into the primary key, which is not allowed.
Deletion anomalies:
If we want to delete a patient named Ian MacKay, for example, two records must be deleted (rows 3 and 4). The same anomaly
appears when we want to delete a dentistName: multiple records must be deleted to maintain data integrity.
When we delete a dentist record, for example Tony Smith, the details of his patients are also lost from the database.
Modification anomalies:
Because the data is redundant, when we want to change the value of one column for a particular dentist, for example the dentistName,
we must update every record in which that dentist appears; otherwise the database becomes inconsistent.
We may also need to modify the appointment schedules, because different dentists have different schedules.
(b) Describe and illustrate the process of normalizing the table shown in Figure 1 to 3NF. State any assumptions you make
about the data shown in this table.

Assumptions: a patient is registered at only one surgery and may have more than one appointment on
a given day. All schedules are fixed for each day and week.
For 1NF, we remove the repeating group (appointment), replace it with new columns (apptDate and apptTime), and assign
the primary key (candidate key). Then we work out the functional dependencies (FDs) and represent the table with a
dependency diagram, as shown below. (NF stands for Normal Form.)
Note: Identifying the FDs is subjective. However, the rule is that they must reflect the real-world situation.
staffNo   apptDate   apptTime   dentistName   patientNo   patientName   surgeryNo

FD1 is already in 2NF. FD2 (which depends only on staffNo) and FD4 (which depends only on staffNo and apptDate)
violate 2NF: these two FDs depend on part of the candidate key, not on the whole key. FD2 can stand on its own,
depending on staffNo, and FD4 can stand on its own, depending on staffNo and apptDate.
FD3 violates 3NF because it is a transitive dependency: surgeryNo and patientName depend on patientNo, while
patientNo depends on staffNo; that is, a non-key attribute depends on another non-key attribute.
staffNo apptDate apptTime dentistName patientNo patientName surgeryNo

staffNo apptDate apptTime patientNo patientName

staffNo apptDate surgeryNo

staffNo dentistName

For 2NF, the table must already be in 1NF and have no partial dependencies. So we remove FD2 and FD4 by splitting them into
new tables, creating foreign keys at the same time.
staffNo apptDate apptTime dentistName patientNo patientName surgeryNo

FK

staffNo apptDate apptTime patientNo

FK
staffNo apptDate surgeryNo

staffNo dentistName

patientNo patientName

Finally, for 3NF we must remove the transitive dependency. In this case we remove FD3 by splitting it into a new table.
The remaining transitive dependency is patientName, which depends on patientNo, so we split it into a new table and
create a foreign key.
staffNo dentistName

Dentist(staffNo, dentistName)

FK
staffNo apptDate surgeryNo

Surgery(staffNo, apptDate, surgeryNo)

patientNo patientName
Patient(patientNo, patientName)

FK
staffNo apptDate apptTime patientNo

Appointment(staffNo, apptDate, apptTime, patientNo)
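
As a rough check of the final design, the four relations can be created in an in-memory SQLite database from Python. This is a sketch under stated assumptions: the column types, the primary keys, and the targets of the foreign keys are guesses based on the FK markers above, and the patient number, surgery number, dates and times are made-up sample values (only S1011, Tony Smith and Ian MacKay come from the exercise text).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Dentist (staffNo TEXT PRIMARY KEY, dentistName TEXT NOT NULL);
CREATE TABLE Patient (patientNo TEXT PRIMARY KEY, patientName TEXT NOT NULL);
CREATE TABLE Surgery (
    staffNo   TEXT NOT NULL REFERENCES Dentist(staffNo),
    apptDate  TEXT NOT NULL,
    surgeryNo TEXT NOT NULL,
    PRIMARY KEY (staffNo, apptDate)
);
CREATE TABLE Appointment (
    staffNo   TEXT NOT NULL,
    apptDate  TEXT NOT NULL,
    apptTime  TEXT NOT NULL,
    patientNo TEXT NOT NULL REFERENCES Patient(patientNo),
    PRIMARY KEY (staffNo, apptDate, apptTime),
    FOREIGN KEY (staffNo, apptDate) REFERENCES Surgery(staffNo, apptDate)
);
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO Dentist VALUES ('S1011', 'Tony Smith')")
conn.execute("INSERT INTO Patient VALUES ('P100', 'Ian MacKay')")
conn.execute("INSERT INTO Surgery VALUES ('S1011', '2019-09-12', 'S15')")
conn.executemany("INSERT INTO Appointment VALUES (?, ?, ?, ?)", [
    ("S1011", "2019-09-12", "10:00", "P100"),
    ("S1011", "2019-09-12", "12:00", "P100"),
])

# Modification: the dentist's name is stored once, so exactly one row changes.
rows_changed = conn.execute(
    "UPDATE Dentist SET dentistName = 'Anthony Smith' WHERE staffNo = 'S1011'"
).rowcount

# Deletion: removing the appointments keeps the patient's own record.
conn.execute("DELETE FROM Appointment WHERE patientNo = 'P100'")
patients_left = conn.execute("SELECT COUNT(*) FROM Patient").fetchone()[0]

# Insertion: an appointment for an unknown patient is rejected.
try:
    conn.execute(
        "INSERT INTO Appointment VALUES ('S1011', '2019-09-12', '14:00', 'P999')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rows_changed, patients_left, rejected)
```

The three checks correspond to the modification, deletion and insertion anomalies from part (a), which no longer occur in this schema.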

https://www.javaguicodexample.com/normalizationexerciseanswer.pdf
