Normalization: Reminder
Why do we need to normalize?
To avoid redundancy (less storage space needed, and data is consistent)
To avoid update/delete anomalies
Normalization: Reminder
Ssn  c-id   Grade  Name   Address
123  cs331  A      smith  Main
123  cs351  B      smith  Main
123  cs211  A      smith  Main
Clearly, Name and Address are redundant: the relation is larger than necessary, and changing the Address means updating 3 rows.
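The update cost can be made concrete with a minimal sketch. The rows are the ones from the table above; the new address "Forbes" and the split into `student` and `takes` are assumptions chosen for illustration, not part of the slide.

```python
# Rows of the unnormalized relation: (ssn, c_id, grade, name, address)
enrollment = [
    (123, "cs331", "A", "smith", "Main"),
    (123, "cs351", "B", "smith", "Main"),
    (123, "cs211", "A", "smith", "Main"),
]


def update_address(rows, ssn, new_address):
    """Rewrite the address in every row of one student; return (rows, rows_touched)."""
    updated = [
        (s, c, g, n, new_address) if s == ssn else (s, c, g, n, a)
        for (s, c, g, n, a) in rows
    ]
    touched = sum(1 for row in rows if row[0] == ssn)
    return updated, touched


# Hypothetical decomposition: student(ssn -> name, address) and takes(ssn, c_id, grade)
student = {s: (n, a) for (s, _c, _g, n, a) in enrollment}
takes = [(s, c, g) for (s, c, g, _n, _a) in enrollment]

# Single-table design: every enrollment row of the student has to change.
_, touched_flat = update_address(enrollment, 123, "Forbes")

# Decomposed design: the address is stored once, so one entry changes.
student[123] = (student[123][0], "Forbes")
```

In the single-table design the number of rows touched grows with the number of courses taken; in the decomposed design it stays at one.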
Vladimir Vacic, Temple University
Normalization: Reminder
Ssn  c-id   Grade  Name   Address
123  cs331  A      smith  Main
-    -      -      jones  Forbes
Insertion anomaly: we cannot record Jones' address because he is not taking any classes
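A small sketch of the insertion anomaly using Python's built-in `sqlite3` module. The table and column names are assumptions; the key (ssn, c_id) is declared NOT NULL explicitly, because SQLite would otherwise accept NULLs in a composite primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Single-table design: the key is (ssn, c_id), so both values must be present.
conn.execute(
    """CREATE TABLE enrollment (
           ssn     INTEGER NOT NULL,
           c_id    TEXT    NOT NULL,
           grade   TEXT,
           name    TEXT,
           address TEXT,
           PRIMARY KEY (ssn, c_id)
       )"""
)


def try_insert_without_course(connection):
    """Try to store only a name and an address; return True if the row was accepted."""
    try:
        connection.execute(
            "INSERT INTO enrollment (ssn, c_id, grade, name, address) "
            "VALUES (?, ?, ?, ?, ?)",
            (None, None, None, "jones", "Forbes"),
        )
        return True
    except sqlite3.IntegrityError:
        # The key columns are missing, so the row is rejected.
        return False


accepted = try_insert_without_course(conn)
```

With a separate table for students, the same name and address could be stored without any enrollment row.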
Normalization: Reminder
Ssn  c-id   Grade  Name   Address
123  cs331  A      jones  123 Main
124  cs351  B      smith  124 Broad
125  cs211  A      shmoo  42 Penn
Delete anomaly: we cannot delete Shmoo's enrollment without losing his address as well
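The deletion anomaly can be sketched in a few lines. The rows follow the table above; the helper names `drop_enrollment` and `known_addresses` are hypothetical.

```python
# Rows of the single-table design: (ssn, c_id, grade, name, address)
enrollment = [
    (123, "cs331", "A", "jones", "123 Main"),
    (124, "cs351", "B", "smith", "124 Broad"),
    (125, "cs211", "A", "shmoo", "42 Penn"),
]


def drop_enrollment(rows, ssn, c_id):
    """Delete one enrollment, identified by its key (ssn, c_id)."""
    return [r for r in rows if not (r[0] == ssn and r[1] == c_id)]


def known_addresses(rows):
    """Collect the addresses that can still be read off the relation."""
    return {name: address for (_s, _c, _g, name, address) in rows}


before = known_addresses(enrollment)
after = known_addresses(drop_enrollment(enrollment, 125, "cs211"))
```

Because the address lives only in the enrollment row, `after` no longer has an entry for shmoo.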
Normal Forms
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)
(so far conveniently named)
Boyce-Codd Normal Form (BCNF)
Not in 1NF
[Table: First Name column with values Peter, Mary, John, Anne, Michael]
Normalized to 1NF
Not in 1NF
Normalized to 1NF
Not in 1NF
Normalized to 1NF
Not in 2NF
Address            Course_ID  Grade
80 Ericsson Av.    CIS331     A
12 Olafson St.     CIS331     B
192 Freya Blvd.    CIS331     C
212 Reykjavik St.  CIS362     A
Normalized to 2NF
[Figure: tables with columns Student, Address, Course_ID, Grade; Grade values A, B, C, A]
Normalized to 3NF
[Figure: tables with columns Student, Value, Course_ID, Grade]
BCNF: Reminder
Informally: everything depends on the full key, and nothing but the key
Formally: for every FD a → b in F+
a → b is trivial (a is a superset of b), or
a is a superkey (or both)
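The formal condition can be checked mechanically with attribute closures. The following is a minimal sketch, not a tuned algorithm: it enumerates every proper, non-empty subset a of the relation, so it is exponential in the number of attributes. The relation R(A, B, C) and its FDs are hypothetical examples.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_violations(relation, fds):
    """Left-hand sides a that determine something new without being a superkey."""
    relation = frozenset(relation)
    violations = []
    for size in range(1, len(relation)):
        for subset in combinations(sorted(relation), size):
            a = frozenset(subset)
            a_plus = closure(a, fds) & relation
            # a_plus != a: some nontrivial a -> b exists in F+.
            # a_plus != relation: a is not a superkey.
            if a_plus != a and a_plus != relation:
                violations.append(a)
    return violations


# Hypothetical relation R(A, B, C).
R = {"A", "B", "C"}
only_a_to_b = [(frozenset({"A"}), frozenset({"B"}))]
a_to_bc = [(frozenset({"A"}), frozenset({"B", "C"}))]

not_bcnf = bcnf_violations(R, only_a_to_b)  # A -> B holds, but A is not a superkey
in_bcnf = bcnf_violations(R, a_to_bc)       # A is a key, so nothing violates BCNF
```

An empty result means the relation is in BCNF with respect to the given FDs.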
Is it in BCNF?
Student Course Teacher
S T C
Normalization: Examples
Supplier  Name       Status_City  City    Part  Qty
S1        Jones      10           Paris   P3    257
S1        Jones      10           Paris   P1    500
S1        Jones      10           Paris   P4    125
S2        Spiritoso  12           London  P3    (null)
S7        Kohl       10           Paris   P4    342
Normalization: Example
FDs
Supplier → Name
Supplier → City
City → Status_City
Supplier → Status_City
(Supplier, Part) → Qty
Partial Dependencies
(Supplier, Part) → Name
(Supplier, Part) → City
(Supplier, Part) → Status_City
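The partial dependencies can be derived from the FDs rather than spotted by eye. The sketch below assumes the key is (Supplier, Part) and uses only the FDs with Supplier or (Supplier, Part) on the left-hand side; the function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


ATTRS = frozenset({"Supplier", "Name", "Status_City", "City", "Part", "Qty"})
KEY = frozenset({"Supplier", "Part"})
FDS = [
    (frozenset({"Supplier"}), frozenset({"Name"})),
    (frozenset({"Supplier"}), frozenset({"City"})),
    (frozenset({"Supplier"}), frozenset({"Status_City"})),
    (frozenset({"Supplier", "Part"}), frozenset({"Qty"})),
]


def partial_dependencies(key, attrs, fds):
    """Map each non-key attribute to the proper subsets of the key that determine it."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            part = frozenset(subset)
            for attr in sorted((closure(part, fds) & attrs) - key):
                found.setdefault(attr, []).append(part)
    return found


partial = partial_dependencies(KEY, ATTRS, FDS)
```

Name, City and Status_City are each determined by Supplier alone, so they depend on only part of the key; Qty does not appear in the result because it needs the whole key.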
Normalization: Example
Supplier  Name       Status_City
S1        Jones      10
S2        Spiritoso  12
S7        Kohl       10

Supplier  Part  Qty
S1        P1    500
S1        P3    257
S1        P4    125
S2        P3    (null)
S7        P4    342
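A sketch of this step as two projections of the original relation, followed by a join that checks nothing was lost. The rows come from the original supplier table; keeping City with the supplier attributes is an assumption, since that column is not visible in the tables above, and `None` stands in for (null).

```python
COLUMNS = ("Supplier", "Name", "Status_City", "City", "Part", "Qty")
ROWS = [
    dict(zip(COLUMNS, values))
    for values in [
        ("S1", "Jones", 10, "Paris", "P3", 257),
        ("S1", "Jones", 10, "Paris", "P1", 500),
        ("S1", "Jones", 10, "Paris", "P4", 125),
        ("S2", "Spiritoso", 12, "London", "P3", None),
        ("S7", "Kohl", 10, "Paris", "P4", 342),
    ]
]


def project(rows, columns):
    """Project onto columns and drop duplicate tuples, keeping first-seen order."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            out.append(values)
    return out


# Attributes that depend on Supplier alone go into their own relation.
suppliers = project(ROWS, ("Supplier", "Name", "City", "Status_City"))
# Qty depends on the full key (Supplier, Part).
shipments = project(ROWS, ("Supplier", "Part", "Qty"))


def join_on_supplier(supplier_rows, shipment_rows):
    """Natural join of the two projections on Supplier."""
    by_supplier = {row[0]: row for row in supplier_rows}
    return [
        {
            "Supplier": sup,
            "Name": by_supplier[sup][1],
            "City": by_supplier[sup][2],
            "Status_City": by_supplier[sup][3],
            "Part": part,
            "Qty": qty,
        }
        for (sup, part, qty) in shipment_rows
    ]


rejoined = join_on_supplier(suppliers, shipments)
```

The supplier attributes are now stored once per supplier instead of once per shipment, and the join reproduces the original five rows.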
Normalization: Example
We took care of the partial dependencies, but what about transitive dependencies?
Supplier  Status_City
S1        10
S2        12
S7        10
Normalization: Example
Supplier  Part  Qty
S1        P1    500
S1        P3    257
S1        P4    125
S2        P3    (null)
S7        P4    342

Supplier  Name       City
S1        Jones      Paris
S2        Spiritoso  London
S7        Kohl       Paris

City    Status_City
Paris   10
London  12
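A sketch of removing the transitive dependency Supplier → City → Status_City. It assumes that City → Status_City is a declared FD, as the (City, Status_City) table suggests. The check on the data is only a sanity test: an instance can refute an FD but cannot establish it.

```python
# Supplier relation after the 2NF step: (supplier, name, city, status_city)
SUPPLIERS = [
    ("S1", "Jones", "Paris", 10),
    ("S2", "Spiritoso", "London", 12),
    ("S7", "Kohl", "Paris", 10),
]


def fd_holds(rows, lhs_idx, rhs_idx):
    """True if equal values at lhs_idx always come with equal values at rhs_idx."""
    seen = {}
    for row in rows:
        lhs = tuple(row[i] for i in lhs_idx)
        rhs = tuple(row[i] for i in rhs_idx)
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True


# City (index 2) should determine Status_City (index 3) in this instance.
city_determines_status = fd_holds(SUPPLIERS, (2,), (3,))

# Split along the transitive dependency; sets remove the duplicate Paris row.
supplier_city = sorted({(s, n, c) for (s, n, c, _status) in SUPPLIERS})
city_status = sorted({(c, status) for (_s, _n, c, status) in SUPPLIERS})
```

The status of Paris is now stored once, however many suppliers are located there.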
Normalization: Examples
Silberschatz et al.:
7.2 (lossless-join decomposition)
hint: use the theorem! A decomposition is a lossless-join decomposition if the joining attribute is a superkey in at least one of the new relations
7.16 (lossless-join decomposition)
7.18 (dependency-preserving decomposition)
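The theorem in the hint translates directly into a test for a decomposition into two relations: compute the closure of the common attributes and see whether it covers one of the two relations. The relation R(A, B, C) and the FD A → B below are hypothetical and are not taken from the textbook exercises.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def is_lossless_binary(r1, r2, fds):
    """True if the attributes shared by r1 and r2 are a superkey of r1 or of r2."""
    r1, r2 = frozenset(r1), frozenset(r2)
    common_plus = closure(r1 & r2, fds)
    return r1 <= common_plus or r2 <= common_plus


# Hypothetical relation R(A, B, C) with the single FD A -> B.
FDS = [(frozenset({"A"}), frozenset({"B"}))]

split_on_a = is_lossless_binary({"A", "B"}, {"A", "C"}, FDS)  # A is a key of (A, B)
split_on_b = is_lossless_binary({"A", "B"}, {"B", "C"}, FDS)  # B determines nothing
```

Here `split_on_a` passes the test because A+ = {A, B} covers the first relation, while `split_on_b` fails it because B+ = {B} covers neither.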
Normalization: Overview
Why do we normalize?
To avoid redundancy (less storage space needed, and data is consistent)
To avoid update/delete anomalies
Normalization: Overview
Boyce-Codd Normal Form (BCNF):
Everything should depend on the key, the whole key, and nothing but the key ("so help me Codd"; joke attributed to C.J. Date)
1NF (all attributes are atomic)
2NF (no partial dependencies)
3NF (no transitive dependencies)