Normalization
What is Normalization
Normalization allows us to organize data so that it:
• Allows faster access (dependencies make sense)
• Takes up less space (less redundancy)
Normal Forms
Normalization is done by transforming data into successive Normal Forms.
There are 5 numbered Normal Forms (1NF through 5NF), but we almost never use 4NF or 5NF.
We will only be concerned with 1NF, 2NF, and 3NF.
For a database to be in a normal
form, it must meet all requirements
of the previous forms:
• E.g., for a database to be in 2NF, it must already be in 1NF. For a database to be in 3NF, it must already be in 1NF and 2NF.
Sample Data
Manager     Employees
Fatma       Sayed, Tariq
Abdulaziz   Tafla, Mohammed
Ali         Sarai, Miriam
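If the Employees column above is read as a comma-separated list, its values are not atomic, which is what 1NF requires. The following is a minimal Python sketch of splitting such a column into one row per employee. The function name `to_atomic` is hypothetical, and the sketch assumes each comma-separated name is a separate employee.

```python
# Sample data as listed above: each manager row holds a comma-separated list.
rows = [
    ("Fatma", "Sayed, Tariq"),
    ("Abdulaziz", "Tafla, Mohammed"),
    ("Ali", "Sarai, Miriam"),
]


def to_atomic(rows):
    """Return one (manager, employee) pair per employee (hypothetical helper)."""
    atomic = []
    for manager, employees in rows:
        # Assumption: a comma separates two different employees.
        for name in employees.split(","):
            atomic.append((manager, name.strip()))
    return atomic


atomic_rows = to_atomic(rows)
for manager, employee in atomic_rows:
    print(manager, employee)
```

Each resulting row holds exactly one manager and one employee, so every field is a single value.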
The customer data is in 1NF: all fields are atomic, and the CustID serves as the primary key.
But let’s pay attention to the City, State, and Zip fields:
City        State   Zip
Tucson      AZ      12345
St. Paul    MN      54355
Chicago     IL      43555
Vancouver   BC      V5N 1M0
St. Paul    MN      54355
Regina      SK      S4T 2V8
Chicago     IL      43555
Winnipeg    MB      M5W 9N7
Regina      SK      S4T 2V9
• There are 2 rows of repeating data: one for Chicago, and one for St. Paul.
• Both have the same city, state, and zip code
The CustID determines all the data in the row, but U.S. Zip codes determine the City and State. (E.g., a given Zip code can only belong to one city and state, so storing the City and State with every Zip code is redundant.)
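One way to see this redundancy is to collect the City and State for each Zip code into a separate lookup. The Python sketch below uses the City, State, and Zip values listed above. The function name `build_zip_table` is hypothetical, and the check only shows that this sample does not contradict the dependency. It does not prove the rule in general.

```python
# City/State/Zip values copied from the sample rows above.
rows = [
    ("Tucson", "AZ", "12345"),
    ("St. Paul", "MN", "54355"),
    ("Chicago", "IL", "43555"),
    ("Vancouver", "BC", "V5N 1M0"),
    ("St. Paul", "MN", "54355"),
    ("Regina", "SK", "S4T 2V8"),
    ("Chicago", "IL", "43555"),
    ("Winnipeg", "MB", "M5W 9N7"),
    ("Regina", "SK", "S4T 2V9"),
]


def build_zip_table(rows):
    """Map each Zip to its single (City, State); fail if a Zip maps to two."""
    zip_table = {}
    for city, state, zip_code in rows:
        # setdefault stores the pair on first sight and returns the stored pair.
        existing = zip_table.setdefault(zip_code, (city, state))
        if existing != (city, state):
            raise ValueError(f"Zip {zip_code} maps to more than one city/state")
    return zip_table


zip_table = build_zip_table(rows)
print(len(rows), "rows ->", len(zip_table), "distinct Zip entries")
```

With the lookup in place, each customer row would only need to store the Zip code, and the City and State could be found through it.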
In this table:
• CustID and ProdID depend on the OrderID and no other column (good)
• Stated another way, “If you know the OrderID, you know the CustID and the ProdID”
So: OrderID → CustID, ProdID
OrderID CustID ProdID Price Quantity Total
1 1001 AB-111 50 1,000 50,000
2 1002 AB-111 60 500 30,000
3 1001 ZA-245 35 100 3,500
4 1003 MB-153 82 25 2,050
5 1004 ZA-245 42 10 420
6 1002 ZA-245 40 50 2,000
7 1001 AB-111 75 100 7,500
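The dependency of CustID and ProdID on OrderID can be checked against the sample rows with a short Python sketch. The helper `determines` is a hypothetical name, and columns are referred to by position. A check like this can only show that the sample data does not violate a dependency, so it should be read as an illustration rather than a proof.

```python
# Rows copied from the table above:
# (OrderID, CustID, ProdID, Price, Quantity, Total)
orders = [
    (1, 1001, "AB-111", 50, 1000, 50000),
    (2, 1002, "AB-111", 60, 500, 30000),
    (3, 1001, "ZA-245", 35, 100, 3500),
    (4, 1003, "MB-153", 82, 25, 2050),
    (5, 1004, "ZA-245", 42, 10, 420),
    (6, 1002, "ZA-245", 40, 50, 2000),
    (7, 1001, "AB-111", 75, 100, 7500),
]


def determines(rows, lhs, rhs):
    """True if rows that agree on the lhs columns always agree on the rhs columns."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs)
        value = tuple(row[i] for i in rhs)
        # First time a key appears, remember its value; later, compare against it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# OrderID (column 0) -> CustID, ProdID (columns 1 and 2)
print(determines(orders, [0], [1, 2]))  # True
# ProdID (column 2) -> Price (column 3) does not hold in this sample
print(determines(orders, [2], [3]))  # False
```

The second check is included only as a contrast: in these rows the same ProdID appears with different prices, so ProdID alone does not determine Price here.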