Normalization
Dr. M. Brindha
Assistant Professor
Department of CSE
NIT, Trichy-15
Relational Database Design
• First Normal Form
• Pitfalls in Relational Database Design
• Functional Dependencies
• Boyce-Codd Normal Form
• Third Normal Form
• Multivalued Dependencies and Fourth Normal Form
First Normal Form
• A domain is atomic if its elements are considered to be
indivisible units
• Examples of non-atomic domains:
• Set of names, composite attributes
• Identification numbers like CS101 that can be broken
up into parts
• A relational schema R is in first normal form if the
domains of all attributes of R are atomic
• Non-atomic values complicate storage and encourage
redundant (repeated) storage of data
• E.g. a set of accounts stored with each customer, and a set
of owners stored with each account
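The accounts-per-customer example above can be sketched in a few lines of Python. This is only an illustration: the customer names, account numbers, and the helper `to_first_normal_form` are hypothetical and do not come from the slides. It shows how a set-valued attribute is flattened into one atomic (customer, account) pair per tuple.

```python
# Hypothetical non-1NF data: each customer carries a SET of account numbers.
# Account A-101 is a joint account, so it appears under two customers.
customers_nested = {
    "Jones": {"A-101", "A-217"},
    "Smith": {"A-101", "A-215"},
}

def to_first_normal_form(nested):
    """Return one (customer_name, account_number) tuple per pair.

    Every attribute value in the result is a single string, so the
    domains are atomic in the sense used above.
    """
    return sorted(
        (name, account)
        for name, accounts in nested.items()
        for account in accounts
    )

depositor = to_first_normal_form(customers_nested)
for row in depositor:
    print(row)
```

With the flattened relation, the owners of an account and the accounts of a customer are both found by selecting on a single column, so neither set needs to be stored twice.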
First Normal Form (Contd.)
• Atomicity is actually a property of how the elements of
the domain are used.
• E.g. Strings would normally be considered indivisible
• Suppose that students are given roll numbers which are
strings of the form CS0012 or EE1127
• If the first two characters are extracted to find the
department, the domain of roll numbers is not atomic.
• Doing so is a bad idea: it encodes information in the
application program rather than in the database.
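As a rough sketch of the roll-number point, the snippet below contrasts the two uses with Python's built-in `sqlite3` module. The table name `student`, the column `dept`, and the helper `dept_from_roll` are assumptions made for this illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: the department is stored as its own attribute,
# so roll_no can be treated as an indivisible value.
conn.execute(
    "CREATE TABLE student (roll_no TEXT PRIMARY KEY, dept TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("CS0012", "CS"), ("EE1127", "EE")],
)

def dept_from_roll(roll_no):
    """Discouraged: the meaning of the first two characters lives
    only in application code, which makes roll_no non-atomic."""
    return roll_no[:2]

# Preferred: the database itself answers the question.
rows = conn.execute(
    "SELECT roll_no FROM student WHERE dept = ? ORDER BY roll_no", ("CS",)
).fetchall()
cs_students = [r[0] for r in rows]
print(cs_students)
```

If the roll-number format ever changes, only the second approach keeps working without touching every program that parses the prefix.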
First Normal Form
• Disallows
• composite attributes
• multivalued attributes
• nested relations; attributes whose values for
an individual tuple are non-atomic
• Design Goals:
• Avoid redundant data
• Ensure that relationships among attributes are
represented
• Facilitate the checking of updates for violation of
database integrity constraints.
Goals of Normalization
• Decide whether a particular relation R is in “good” form.
• In the case that a relation R is not in “good” form,
decompose it into a set of relations {R1, R2, ..., Rn} such
that
• each relation is in good form
• the decomposition is a lossless-join decomposition
• Redundancy:
• Data for branch-name, branch-city, assets are repeated for each loan
that a branch makes
• Wastes space
• Complicates updates, introducing the possibility of inconsistent
assets values
• Null values
• Cannot store information about a branch if no loans exist
• Can use null values, but they are difficult to handle.
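The decomposition goal can be illustrated with a small Python sketch. The sample rows, the attribute names, and the helpers `as_row`, `project`, and `natural_join` are all hypothetical. The sketch splits a combined branch-and-loan relation into two relations and checks that their natural join gives back exactly the original tuples, which is what a lossless-join decomposition requires.

```python
def as_row(d):
    """Represent a tuple as a sorted, hashable sequence of (attr, value)."""
    return tuple(sorted(d.items()))

def project(rows, attrs):
    """Projection with duplicate elimination."""
    return {as_row({a: dict(r)[a] for a in attrs}) for r in rows}

def natural_join(left, right):
    """Join rows that agree on all attributes they have in common."""
    result = set()
    for l in left:
        ld = dict(l)
        for r in right:
            rd = dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                result.add(as_row({**ld, **rd}))
    return result

# Hypothetical combined relation: branch data repeats for every loan.
lending = {
    as_row({"branch_name": "Downtown", "branch_city": "Brooklyn",
            "assets": 9000000, "loan_number": "L-17", "amount": 1000}),
    as_row({"branch_name": "Downtown", "branch_city": "Brooklyn",
            "assets": 9000000, "loan_number": "L-14", "amount": 1500}),
    as_row({"branch_name": "Perryridge", "branch_city": "Horseneck",
            "assets": 1700000, "loan_number": "L-15", "amount": 1500}),
}

branch = project(lending, ["branch_name", "branch_city", "assets"])
loan = project(lending, ["loan_number", "branch_name", "amount"])

print(len(branch), "branch tuples,", len(loan), "loan tuples")
print("lossless:", natural_join(branch, loan) == lending)
```

In this sample the shared attribute `branch_name` identifies exactly one branch tuple, so the join cannot pair a loan with the wrong branch; the branch data is now stored once per branch instead of once per loan.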
Redundant Information in Tuples and Update Anomalies
• Information is stored redundantly
• Wastes storage
• Causes update anomalies
• Insertion anomalies
• Deletion anomalies
• Modification anomalies
EXAMPLE OF AN UPDATE ANOMALY
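A minimal Python sketch of the three anomalies, assuming a hypothetical combined relation in which branch data is repeated for each loan (the branch names, asset figures, and loan numbers are invented for illustration):

```python
# Hypothetical combined relation: one dict per loan tuple.
lending = [
    {"branch_name": "Downtown", "assets": 9000000, "loan_number": "L-17"},
    {"branch_name": "Downtown", "assets": 9000000, "loan_number": "L-14"},
    {"branch_name": "Perryridge", "assets": 1700000, "loan_number": "L-15"},
]

# Modification anomaly: updating only one of the repeated copies
# leaves two conflicting asset values for the same branch.
lending[0]["assets"] = 9500000
downtown_assets = {
    r["assets"] for r in lending if r["branch_name"] == "Downtown"
}
print("Downtown assets on record:", sorted(downtown_assets))

# Deletion anomaly: removing Perryridge's only loan also removes
# everything known about the Perryridge branch.
lending = [r for r in lending if r["loan_number"] != "L-15"]
branches_left = {r["branch_name"] for r in lending}
print("Branches still recorded:", sorted(branches_left))

# Insertion anomaly: a branch with no loans can only be stored
# by padding the loan attributes with a null (None) value.
lending.append({"branch_name": "Mianus", "assets": 400000, "loan_number": None})
print("New branch row:", lending[-1])
```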
• GUIDELINE 4:
• The relations should be designed to satisfy the
lossless join condition.
• No spurious tuples should be generated by doing a
natural-join of any relations.
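To see what a spurious tuple looks like, here is a small Python sketch of a decomposition that violates the lossless join condition. The relation `loan_info` and its rows are hypothetical; the two projections share only the `amount` attribute, which does not identify a tuple in either of them.

```python
# Hypothetical relation: (customer_name, loan_number, amount)
loan_info = {
    ("Jones", "L-17", 1000),
    ("Hayes", "L-15", 1500),
    ("Williams", "L-14", 1500),
}

# A poor decomposition: the two projections share only `amount`.
r1 = {(customer, amount) for customer, _, amount in loan_info}
r2 = {(loan, amount) for _, loan, amount in loan_info}

# Natural join of r1 and r2 on the common attribute `amount`.
joined = {
    (customer, loan, a1)
    for customer, a1 in r1
    for loan, a2 in r2
    if a1 == a2
}

# Tuples that the join produces but the original relation never held.
spurious = joined - loan_info
print("original tuples:", len(loan_info))
print("joined tuples:  ", len(joined))
print("spurious tuples:", sorted(spurious))
```

The join returns every original tuple plus extra ones, so it is no longer possible to tell which customer holds which loan; information has been lost even though the result contains more tuples.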
The branch Relation
Account Relation