DBMS 3.1 PDF

The document discusses schema refinement in database management systems. It defines functional dependencies and normal forms as ways to identify and address redundancy in database schemas. Redundancy can cause problems like redundant storage, update anomalies, insertion anomalies, and deletion anomalies. Schema refinement techniques like decomposition aim to break relations into smaller relations to reduce or eliminate redundancy based on identifying functional dependencies.

Uploaded by

Sony Takur

DATABASE MANAGEMENT SYSTEMS

UNIT – III

 SCHEMA REFINEMENT

 NORMAL FORMS
VARDHAMAN COLLEGE OF ENGINEERING
Shamshabad – 501 218, Hyderabad
B.Tech. CSE IV Semester (VCE - R11) T P C
3+1* -- 4
(A1511) DATABASE MANAGEMENT SYSTEMS

UNIT - I
• INTRODUCTION: History of database systems, introduction to database
management systems, database system applications, database systems versus
file systems, view of data, data models, database languages - DDL & DML
commands and examples of basic SQL queries, database users and
administrators, transaction management, database system structure, application
architectures.

• DATABASE DESIGN: Introduction to database design and E-R diagrams,
entities, attributes and entity sets, relationships and relationship sets, additional
features of the E-R model, conceptual design with the E-R model, conceptual
design for large enterprises.

UNIT - II
• THE RELATIONAL MODEL: Introduction to the relational model, integrity
constraints over relations, enforcing integrity constraints, querying relational
data, logical database design: E-R to relational, introduction to views,
destroying/altering tables and views.

• RELATIONAL ALGEBRA AND CALCULUS: Preliminaries, relational algebra
operators, relational calculus - tuple and domain relational calculus, expressive
power of algebra and calculus.

• SQL: Overview, the form of a basic SQL query, union, intersect and except
operators, nested queries, aggregate operators, null values, complex integrity
constraints in SQL, triggers and active databases, designing active databases.

UNIT - III
 SCHEMA REFINEMENT AND NORMAL FORMS: Introduction to schema
refinement, functional dependencies, reasoning about FDs. Normal forms: 1NF,
2NF, 3NF, BCNF, properties of decompositions, normalization, schema
refinement in database design, other kinds of dependencies: 4NF, 5NF, DKNF,
case studies.

UNIT - IV
• TRANSACTIONS MANAGEMENT: Transaction concept, transaction state,
implementation of atomicity and durability, concurrent executions, serializability,
recoverability, implementation of isolation, transaction definition in SQL, testing
for serializability.

• CONCURRENCY CONTROL AND RECOVERY SYSTEM: Concurrency control -
lock based protocols, time-stamp based protocols, validation based protocols,
multiple granularity, and deadlock handling. Recovery system - failure
classification, storage structure, recovery and atomicity, log-based recovery,
shadow paging, recovery with concurrent transactions, buffer management,
failure with loss of non-volatile storage, advanced recovery techniques, remote
backup systems.

UNIT - V
 OVERVIEW OF STORAGE AND INDEXING: Data on external storage, file
organizations and indexing, index data structures, comparison of file
organizations, indexes and performance tuning. Tree structured indexing -
intuition for tree indexes, indexed sequential access method (ISAM), B+ Trees -
a dynamic tree structure.

 IBM DB2 FUNDAMENTALS*: DB2 product family - versions and editions, DB2
database and its objects, DB2 pure XML, backup and recovery, concurrency
and its isolation levels, working with SQL, DB2 programming fundamentals -
UDF, stored procedures.

* This topic is designed in collaboration with IBM India Private Limited.



TEXT BOOKS:
1. Raghu Ramakrishnan, Johannes Gehrke (2007), Database Management
Systems, 3rd edition, Tata McGraw Hill, New Delhi, India.

REFERENCE BOOKS:
1. Ramez Elmasri, Shamkant B. Navathe (1994), Fundamentals of Database Systems,
Pearson Education, India.
2. Abraham Silberschatz, Henry F. Korth, S. Sudarshan (2005), Database System
Concepts, 5th edition, McGraw-Hill, New Delhi, India.
3. Peter Rob, Carlos Coronel (2009), Database Systems Design, Implementation
and Management, 7th edition, India.
UNIT – III

SCHEMA REFINEMENT
Introduction to schema refinement
Functional dependencies
Reasoning about FDs
NORMAL FORMS
1NF, 2NF, 3NF, BCNF
Properties of decompositions, normalization,
schema refinement in database design
Other kinds of dependencies: 4NF, 5NF, DKNF
Case studies
The Evils of Redundancy
• Redundancy is at the root of several problems associated with relational schemas:
  - redundant storage, insert/delete/update anomalies
• Integrity constraints, in particular functional dependencies, can be used to identify schemas with such problems and to suggest refinements.
• Main refinement technique: decomposition (replacing ABCD with, say, AB and BCD, or ACD and ABD).
• Decomposition should be used judiciously:
  - Is there reason to decompose a relation?
  - What problems (if any) does the decomposition cause?
INTRODUCTION TO SCHEMA REFINEMENT
Problems Caused by Redundancy

Storing the same information redundantly, that is, in more than one place within a database, can lead to several problems:
• Redundant storage: Some information is stored repeatedly.
• Update anomalies: If one copy of such repeated data is updated, an inconsistency is created unless all copies are similarly updated.
• Insertion anomalies: It may not be possible to store some information unless some other information is stored as well.
• Deletion anomalies: It may not be possible to delete some information without losing some other information as well.
INTRODUCTION TO SCHEMA REFINEMENT
Problems Caused by Redundancy (cont.)

• Consider a relation obtained by translating a variant of the Hourly_Emps entity set.
Ex: Hourly_Emps(ssn, name, lot, rating, hourly_wages, hours_worked)
• The key for Hourly_Emps is ssn.
• In addition, suppose that the hourly_wages attribute is determined by the rating attribute. That is, for a given rating value, there is only one permissible hourly_wages value.
• This IC is an example of a functional dependency.
• It leads to possible redundancy in the relation Hourly_Emps.
Use of Decomposition
• Intuitively, redundancy arises when a relational schema forces an association between attributes that is not natural.
• Functional dependencies (ICs) can be used to identify such situations and to suggest refinements to the schema.
• The essential idea is that many problems arising from redundancy can be addressed by replacing a relation with a collection of smaller relations.
• Each of the smaller relations contains a subset of the attributes of the original relation.
• We refer to this process as decomposition of the larger relation into the smaller relations.
Use of Decomposition (cont.)
• We can deal with the redundancy in Hourly_Emps by decomposing it into two relations:
  - Hourly_Emps2(ssn, name, lot, rating, hours_worked)
  - Wages(rating, hourly_wages)

Wages:
rating  hourly_wages
8       10
5       7
Use of Decomposition (cont.)
Hourly_Emps2:
ssn          name       lot  rating  hours_worked
123-22-3666  Attishoo   48   8       40
231-31-5368  Smiley     22   8       30
131-24-3650  Smethurst  35   5       30
434-26-3751  Guldu      35   5       32
612-67-4134  Madayan    35   8       40
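The decomposition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relation is held as an in-memory list of tuples in the order (ssn, name, lot, rating, hourly_wages, hours_worked), and the function names `decompose` and `rejoin` are hypothetical. The wage values are the ones shown in the deck's full Hourly_Emps instance.

```python
# In-memory copy of the Hourly_Emps instance used in these slides:
# (ssn, name, lot, rating, hourly_wages, hours_worked)
hourly_emps = [
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
]


def decompose(rows):
    """Project Hourly_Emps onto Hourly_Emps2 and Wages (hypothetical helper)."""
    emps2 = {(s, n, l, r, h) for (s, n, l, r, w, h) in rows}
    wages = {(r, w) for (s, n, l, r, w, h) in rows}
    return emps2, wages


def rejoin(emps2, wages):
    """Natural join of the two projections on the shared attribute rating."""
    return {
        (s, n, l, r, w, h)
        for (s, n, l, r, h) in emps2
        for (r2, w) in wages
        if r == r2
    }


emps2, wages = decompose(hourly_emps)
print(sorted(wages))                              # each (rating, wage) pair stored once
print(rejoin(emps2, wages) == set(hourly_emps))   # the join recovers the original rows
```

Each rating/wage pair is now stored once in Wages, and joining the two projections on rating gives back exactly the original tuples for this instance.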
Problems related to Decomposition
• Unless we are careful, decomposing a relation schema can create more problems than it solves.
• Two important questions must be asked repeatedly:
1. Do we need to decompose a relation?
2. What problems (if any) does a given decomposition cause?
• To help with the first question, several normal forms have been proposed for relations.
• If a relation schema is in one of these normal forms, we know that certain kinds of problems cannot arise.
FUNCTIONAL DEPENDENCIES (FDs)
• A Functional Dependency (FD) X → Y (read as "X determines Y"), where X ⊆ R and Y ⊆ R, holds over relation R if, for every allowable instance r of R:
  - t1 ∈ r, t2 ∈ r, πX(t1) = πX(t2) implies πY(t1) = πY(t2)
  - i.e., given two tuples in r, if the X values agree, then the Y values must also agree. (X and Y are sets of attributes.)
• An FD is a statement about all allowable relations.
  - It must be identified based on the semantics of the application.
  - Given some allowable instance r1 of R, we can check if it violates some FD f, but we cannot tell if f holds over R!
• K is a candidate key for R means that K → R.
  - However, K → R does not require K to be minimal!
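The point that an instance can refute an FD but never prove it can be illustrated with a small check. This is a minimal sketch, assuming tuples are represented as Python dictionaries; the function name `violates_fd` and the sample rows (a cut-down Hourly_Emps using the attribute names ssn, rating, hrly_wages) are illustrative only.

```python
def violates_fd(rows, lhs, rhs):
    """Return True if two tuples agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value.
        if seen.setdefault(x, y) != y:
            return True
    return False


rows = [
    {"ssn": "123-22-3666", "rating": 8, "hrly_wages": 10},
    {"ssn": "231-31-5368", "rating": 8, "hrly_wages": 10},
    {"ssn": "131-24-3650", "rating": 5, "hrly_wages": 7},
]

print(violates_fd(rows, ["rating"], ["hrly_wages"]))  # no violation in this instance

# A hypothetical extra tuple with rating 8 but a different wage breaks the FD.
bad_rows = rows + [{"ssn": "000-00-0000", "rating": 8, "hrly_wages": 12}]
print(violates_fd(bad_rows, ["rating"], ["hrly_wages"]))
```

A `False` result only says this particular instance does not violate the FD; whether the FD holds over the schema still has to come from the semantics of the application.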
FUNCTIONAL DEPENDENCIES (FDs) - Examples
Consider the schema:
Student (studName, rollNo, sex, dept, hostelName, roomNo)

Since rollNo is a key, rollNo → {studName, sex, dept, hostelName, roomNo}
Suppose that each student is given a hostel room exclusively; then
hostelName, roomNo → rollNo
Suppose boys and girls are accommodated in separate hostels; then
hostelName → sex
FDs are additional constraints that can be specified by designers.
Trivial / Non - Trivial FDs
• An FD X → Y where Y ⊆ X
  - called a trivial FD; it always holds
• An FD X → Y where Y ⊈ X
  - non-trivial FD
• An FD X → Y where X ∩ Y = Ø
  - completely non-trivial FD
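The three cases can be told apart with plain set operations. The sketch below is illustrative: `classify_fd` is a hypothetical helper, and each attribute is assumed to be a single character so that a string such as "AB" stands for the set {A, B}.

```python
def classify_fd(lhs, rhs):
    """Classify X -> Y as trivial, non-trivial, or completely non-trivial."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:                 # Y is a subset of X
        return "trivial"
    if lhs.isdisjoint(rhs):        # X and Y share no attribute
        return "completely non-trivial"
    return "non-trivial"           # Y overlaps X but is not contained in it


print(classify_fd("AB", "A"))
print(classify_fd("AB", "BC"))
print(classify_fd("A", "B"))
```

Note that every completely non-trivial FD is also non-trivial; the function reports the most specific label.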
FUNCTIONAL DEPENDENCIES (FDs) cont.
Example: Constraints on Entity Set

• Consider the relation obtained from Hourly_Emps:
  - Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked)
• Notation: We will denote this relation schema by listing the attributes: SNLRWH
  - This is really the set of attributes {S, N, L, R, W, H}.
  - Sometimes, we will refer to all attributes of a relation by using the relation name (e.g., Hourly_Emps for SNLRWH).
• Some FDs on Hourly_Emps:
  - ssn is the key: S → SNLRWH
  - rating determines hrly_wages: R → W
Example (Contd.)

Problems due to R → W:
• Update anomaly: Can we change W in just the 1st tuple of SNLRWH?
• Insertion anomaly: What if we want to insert an employee and don't know the hourly wage for his rating?
• Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5!

Hourly_Emps:
S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40

Hourly_Emps2:
S            N          L   R  H
123-22-3666  Attishoo   48  8  40
231-31-5368  Smiley     22  8  30
131-24-3650  Smethurst  35  5  30
434-26-3751  Guldu      35  5  32
612-67-4134  Madayan    35  8  40

Wages:
R  W
8  10
5  7
Constraints on a Relationship Set
• Suppose that we have entity sets Parts, Suppliers, and Departments, as well as a relationship set Contracts that involves all of them. We refer to the schema for Contracts as CQPSD. A contract with contract id C specifies that a supplier S will supply some quantity Q of a part P to a department D.
• We might have a policy that a department purchases at most one part from any given supplier.
• Thus, if there are several contracts between the same supplier and department, we know that the same part must be involved in all of them. This constraint is an FD, DS → P.
Reasoning about Functional Dependencies (FDs)
• Given some FDs, we can usually infer additional FDs:
  - ssn → did, did → lot implies ssn → lot
• An FD f is implied by a set of FDs F if f holds whenever all FDs in F hold.
  - F+ = closure of F is the set of all FDs that are implied by F.
• Armstrong's Axioms (X, Y, Z are sets of attributes):
  - Reflexivity: If Y ⊆ X, then X → Y
  - Augmentation: If X → Y, then XZ → YZ for any Z
  - Transitivity: If X → Y and Y → Z, then X → Z
• These are sound and complete inference rules for FDs!
Reasoning About FDs (Contd.)
• Couple of additional rules (that follow from AA):
  - Union: If X → Y and X → Z, then X → YZ
  - Decomposition: If X → YZ, then X → Y and X → Z
• Example: Contracts(cid, sid, jid, did, pid, qty, value), and:
  - C is the key: C → CSJDPQV
  - Project purchases each part using a single contract: JP → C
  - Dept purchases at most one part from a supplier: SD → P
• JP → C, C → CSJDPQV imply JP → CSJDPQV
• SD → P implies SDJ → JP
• SDJ → JP, JP → CSJDPQV imply SDJ → CSJDPQV
Reasoning About FDs (Contd.)
• Computing the closure of a set of FDs can be expensive. (Size of closure is exponential in # attrs!)
• Typically, we just want to check if a given FD X → Y is in the closure of a set of FDs F. An efficient check:
  - Compute the attribute closure of X (denoted X+) wrt F:
    - Set of all attributes A such that X → A is in F+
    - There is a linear time algorithm to compute this.
  - Check if Y is in X+
• Does F = {A → B, B → C, CD → E} imply A → E?
  - i.e., is A → E in the closure F+? Equivalently, is E in A+?
Closure of a Set of FDs
• The set of all FDs implied by a given set F of FDs is called the closure of F and is denoted as F+.
• An important question is how we can infer, or compute, the closure of a given set F of FDs.
• The following three rules, called Armstrong's Axioms, can be applied repeatedly to infer all FDs implied by a set F of FDs.
• We use X, Y, and Z to denote sets of attributes over a relation schema R:
Closure of a Set of FDs (or Armstrong’s Inference Rules)
• Reflexive Rule:
F ⊨ {X → Y | Y ⊆ X} for any X. (Trivial FDs)
• Augmentation Rule:
{X → Y} ⊨ {XZ → YZ}, Z ⊆ R. Here XZ denotes X ∪ Z.
• Transitive Rule:
{X → Y, Y → Z} ⊨ {X → Z}
• Armstrong's Axioms are sound in that they generate only FDs in F+ when applied to a set F of FDs.
• They are complete in that repeated application of these rules will generate all FDs in the closure F+.
Closure of a Set of FDs (or Armstrong’s Inference Rules)
• It is convenient to use some additional rules while reasoning about F+:
• Union or Additive Rule:
{X → Y, X → Z} ⊨ {X → YZ}
• Decomposition or Projective Rule:
{X → YZ} ⊨ {X → Y, X → Z}
• Pseudo Transitive Rule:
{X → Y, WY → Z} ⊨ {WX → Z}
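These additional rules add no inference power; each can be derived from the three Armstrong axioms. One possible derivation of each, written out step by step (the particular choice of augmenting sets is just one convenient route):

```latex
\begin{align*}
&\textbf{Union: } \text{given } X \to Y \text{ and } X \to Z \\
&\quad X \to XY   && \text{augment } X \to Y \text{ with } X \\
&\quad XY \to YZ  && \text{augment } X \to Z \text{ with } Y \\
&\quad X \to YZ   && \text{transitivity} \\[4pt]
&\textbf{Decomposition: } \text{given } X \to YZ \\
&\quad YZ \to Y   && \text{reflexivity, since } Y \subseteq YZ \\
&\quad X \to Y    && \text{transitivity (and symmetrically } X \to Z\text{)} \\[4pt]
&\textbf{Pseudo-transitivity: } \text{given } X \to Y \text{ and } WY \to Z \\
&\quad WX \to WY  && \text{augment } X \to Y \text{ with } W \\
&\quad WX \to Z   && \text{transitivity with } WY \to Z
\end{align*}
```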
Attribute Closure
• If we just want to check whether a given dependency, say, X → Y, is in the closure of a set F of FDs, we can do so efficiently without computing F+.
• We first compute the attribute closure X+ with respect to F, which is the set of attributes A such that X → A can be inferred using the Armstrong Axioms.
• The algorithm for computing the attribute closure of a set X of attributes is:
closure = X;
repeat until there is no change: {
    if there is an FD U → V in F such that U ⊆ closure,
    then set closure = closure ∪ V
}
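The loop above translates almost directly into Python. The following is a sketch under a few assumptions: an FD is a pair (lhs, rhs), each attribute is a single character so a string like "CD" denotes {C, D}, and the names `attribute_closure` and `implies` are illustrative. This straightforward version rescans all FDs on every pass, so it is simpler than, and not as fast as, the linear-time algorithm mentioned earlier.

```python
def attribute_closure(attrs, fds):
    """Compute X+ for the attribute set attrs under the FDs in fds."""
    closure = set(attrs)
    changed = True
    while changed:                      # repeat until there is no change
        changed = False
        for lhs, rhs in fds:
            # If U is contained in the closure, add V to it.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """X -> Y is in F+ exactly when Y is contained in X+."""
    return set(rhs) <= attribute_closure(lhs, fds)


# The question posed earlier: does F = {A -> B, B -> C, CD -> E} imply A -> E?
F = [("A", "B"), ("B", "C"), ("CD", "E")]
print(sorted(attribute_closure("A", F)))   # A+ contains A, B and C only
print(implies(F, "A", "E"))                # so A -> E is not implied

# The Contracts example: C -> CSJDPQV, JP -> C, SD -> P.
contracts = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]
print(implies(contracts, "SDJ", "CSJDPQV"))
```

Because D is not in A+, the FD CD → E never fires, which is why A → E is not in F+; adding D to the left side (AD → E) would make it derivable.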
Database Normalization
• The main goal of Database Normalization is to restructure the logical data model of a database to:
  - Eliminate redundancy
  - Organize data efficiently
  - Reduce the potential for data anomalies.

Database Normalization definitions
• Normalization addresses how to take a raw collection of data and break it up into more logical units or tables, in order to reduce the occurrence of redundant data in the database. This process of reducing data redundancy is referred to as Normalization.
• Normalization is a body of rules addressing analysis and conversion of data structures into relations that exhibit more desirable properties of internal consistency, minimal redundancy and maximum stability.
Database Normalization definitions
• Normalization is the process by which attributes are grouped together to form a well-structured relation.
• We focused on the characteristics of a good relation:
  - Analyzing sample relations
  - Identifying design flaws
  - And learning how to eliminate them
This is called normalizing a relation.
• Normalization is a process of decomposing relations to produce smaller, well-structured relations.
• Normalization is a tool to validate and improve a logical design, so that it satisfies certain constraints that avoid unnecessary duplication of data.
Data Anomalies
• Data anomalies are inconsistencies in the data stored in a database as a result of an operation such as update, insertion, and/or deletion.
• Such inconsistencies may arise when a particular record is stored in multiple locations and not all of the copies are updated.
• We can prevent such anomalies by implementing 7 different levels of normalization called Normal Forms (NF).
Brief History/Overview
• Database Normalization was first proposed by Edgar F. Codd.
• Codd defined the first three Normal Forms, which we'll look into, of the 7 known Normal Forms.
• In order to do normalization we must know what the requirements are for each of the three Normal Forms that we'll go over.
• One of the key requirements to remember is that Normal Forms are progressive. That is, in order to have 3rd NF we must have 2nd NF, and in order to have 2nd NF we must have 1st NF.
NORMAL FORMS
• A Normal Form is a state of a relation that results from decomposing that relation for a good design that avoids redundancy.
• The Normal Forms defined in relational database theory represent guidelines for record design.
• The design guidelines are meaningful even if one is not using a relational database system.
• We present the guidelines without referring to the concepts of the relational model in order to emphasize their generality, and also to make them easier to understand.
Steps in Normalization
• Normalization can be accomplished and understood in steps, and each step results in a Normal Form.
• The Normal Forms are ways of measuring the level, or depth, to which a database has been normalized.
• Normal Forms must be implemented in sequential order.
NORMAL FORMS
• First Normal Form (1NF)
  - Included in the definition of a relation
  - Table format; no repeating groups and Primary Key (PK) identified
• Second Normal Form (2NF)
  - 1NF and no partial dependencies
• Third Normal Form (3NF)
  - 2NF and no transitive dependencies
• Boyce-Codd Normal Form (BCNF)
  - Every determinant is a candidate key (special case of 3NF)
Note: 2NF, 3NF and BCNF are defined in terms of functional dependencies.
NORMAL FORMS
• Fourth Normal Form (4NF)
  - Defined using multivalued dependencies
  - 3NF and no independent multivalued dependencies
• Fifth Normal Form (5NF) or Project Join Normal Form (PJNF)
  - Defined using join dependencies
NORMAL FORMS
• The normal forms based on FDs are first normal form (1NF), second normal form (2NF), third normal form (3NF), and Boyce-Codd normal form (BCNF).
• These forms have increasingly restrictive requirements: every relation in BCNF is also in 3NF, every relation in 3NF is also in 2NF, and every relation in 2NF is in 1NF.
• A relation is in first normal form if every field contains only atomic values, that is, not lists or sets.
• This requirement is implicit in our definition of the relational model, although some of the newer database systems are relaxing this requirement.
• 2NF is mainly of historical interest.
• 3NF and BCNF are important from a database design standpoint.
1st Normal Form
The Requirements
• The requirements to satisfy the 1st NF:
  - Each table has a primary key: a minimal set of attributes which can uniquely identify a record.
  - The values in each column of a table are atomic (no multi-valued attributes allowed).
  - There are no repeating groups: two columns do not store similar information in the same table.
• 1st NF definition:
1. A relation is in the First Normal Form if it does not contain any repeating elements or groups.
2. A relation is in the First Normal Form only if all underlying domains contain only atomic values.
1st Normal Form
• The objective of the First Normal Form is to divide the base data into logical units called tables.
• Once each table has been designed, a Primary Key is assigned to most or all tables.
• A Primary Key in a table is one or more columns that make every row of data in a table unique.
• "The First Normal Form deals with the shape of a record type".
• Under the First Normal Form, all occurrences of a record type must contain the same number of fields.
1st Normal Form
• First Normal Form excludes variable repeating fields and groups.
• This is not so much a design guideline as a matter of definition.
• Relational database theory doesn't deal with records having a variable number of fields.

Drawback of First Normal Form:
• The main drawback of First Normal Form is redundancy of data.
1st Normal Form
Example - 1
• ORDER:
1. Order_No
2. Order_Date
3. Customer_No
4. Item_No
5. Item_Name
6. Qty_Ordered
7. Rate_Per_Unit
8. Item_Value
(Fields 4 to 8 repeat as many times as there are items ordered in the ORDER.)
1st Normal Form
Example - 1 (cont.)
• In this example, ORDER contains repeating groups in the item details, so it is not in the First Normal Form.
• To convert a relation into the First Normal Form, "remove all repeating (or multivalued) attributes to another (child) relation".
• We perform this operation on the ORDER relation and arrive at the following two relations:

ORDER
1. Order_No (PK)
2. Order_Date
3. Customer_No

ORDER_ITEM
1. Order_No (PK)
2. Item_No (PK)
3. Item_Name
4. Qty_Ordered
5. Rate_Per_Unit
6. Item_Value
1st Normal Form
Example - 1 (cont.)
ORDER
1. Order_No (PK)
2. Order_Date
3. Customer_No

ORDER_ITEM
1. Order_No (PK)
2. Item_No (PK)
3. Item_Name
4. Qty_Ordered
5. Rate_Per_Unit
6. Item_Value

• If we examine the relations ORDER and ORDER_ITEM, we find that there are no repeating elements or groups in either relation, and can therefore say that both these relations are in the First Normal Form.
• The First Normal Form is also called the Normalized Form (as against fully normalized).
1st Normal Form
Example - 2
Un-normalized Students table:

Student#  AdvID  AdvName  AdvRoom  Class1  Class2
123       123A   James    555      102-8   104-9
124       123B   Smith    467      209-0   102-8

Normalized Students table:

Student#  AdvID  AdvName  AdvRoom  Class#
123       123A   James    555      102-8
123       123A   James    555      104-9
124       123B   Smith    467      209-0
124       123B   Smith    467      102-8
1st Normal Form
Example - 3
A table is considered to be in 1NF if all the fields contain only scalar values (as opposed to lists of values).

Example (Not 1NF)

ISBN           Title         AuName                  AuPhone                                   PubName      PubPhone      Price
0-321-32132-1  Balloon       Sleepy, Snoopy, Grumpy  321-321-1111, 232-234-1234, 665-235-6532  Small House  714-000-0000  $34.00
0-55-123456-9  Main Street   Jones, Smith            123-333-3333, 654-223-3455                Small House  714-000-0000  $22.95
0-123-45678-0  Ulysses       Joyce                   666-666-6666                              Alpha Press  999-999-9999  $34.00
1-22-233700-0  Visual Basic  Roman                   444-444-4444                              Big House    123-456-7890  $25.00

The AuName and AuPhone columns are not scalar.
1NF - Decomposition
1. Place all items that appear in the repeating group in a new table.
2. Designate a primary key for each new table produced.
3. Duplicate in the new table the primary key of the table from which the repeating group was extracted, or vice versa.

Example (1NF)

ISBN           Title         PubName      PubPhone      Price
0-321-32132-1  Balloon       Small House  714-000-0000  $34.00
0-55-123456-9  Main Street   Small House  714-000-0000  $22.95
0-123-45678-0  Ulysses       Alpha Press  999-999-9999  $34.00
1-22-233700-0  Visual Basic  Big House    123-456-7890  $25.00

ISBN           AuName  AuPhone
0-321-32132-1  Sleepy  321-321-1111
0-321-32132-1  Snoopy  232-234-1234
0-321-32132-1  Grumpy  665-235-6532
0-55-123456-9  Jones   123-333-3333
0-55-123456-9  Smith   654-223-3455
0-123-45678-0  Joyce   666-666-6666
1-22-233700-0  Roman   444-444-4444
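The three decomposition steps can be sketched as a small Python function. This is an illustration only: the un-normalized rows are modelled as dictionaries whose AuName and AuPhone fields hold lists, authors and phone numbers are assumed to pair up by position, the publisher and price columns are left out for brevity, and `to_1nf` is a hypothetical name.

```python
# Two of the un-normalized rows from the example, with list-valued author fields.
books = [
    {"ISBN": "0-321-32132-1", "Title": "Balloon",
     "AuName": ["Sleepy", "Snoopy", "Grumpy"],
     "AuPhone": ["321-321-1111", "232-234-1234", "665-235-6532"]},
    {"ISBN": "0-55-123456-9", "Title": "Main Street",
     "AuName": ["Jones", "Smith"],
     "AuPhone": ["123-333-3333", "654-223-3455"]},
]


def to_1nf(rows):
    """Move the repeating author group into its own table keyed by ISBN."""
    book_rows, author_rows = [], []
    for row in rows:
        # Scalar fields stay in the parent table.
        book_rows.append({"ISBN": row["ISBN"], "Title": row["Title"]})
        # Each element of the repeating group becomes one child row that
        # carries a copy of the parent's primary key.
        for name, phone in zip(row["AuName"], row["AuPhone"]):
            author_rows.append({"ISBN": row["ISBN"], "AuName": name, "AuPhone": phone})
    return book_rows, author_rows


book_rows, author_rows = to_1nf(books)
for r in author_rows:
    print(r["ISBN"], r["AuName"], r["AuPhone"])
```

Every field in both result tables now holds a single scalar value, and (ISBN, AuName) can serve as the key of the author table.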
2nd Normal Form
The Requirements
• The requirements to satisfy the 2nd NF:
  - All requirements for 1st NF must be met.
  - Redundant data across multiple rows of a table must be moved to a separate table.
  - The resulting tables must be related to each other by use of a foreign key.
• 2nd NF definition:
  - A relation is in the Second Normal Form if it is in the First Normal Form and all non-key attributes are fully functionally dependent on the primary key.
  - A relation is in 2NF if all the non-key attributes are dependent on all of the key attributes.
2nd Normal Form
• 2nd NF is based on the concept of full functional dependency and removal of partial functional dependencies:
• Full FD:
  - An FD X → Y is a full FD if removal of any attribute from X means that the dependency does not hold any more; i.e., for every attribute A ∈ X, (X – {A}) does not functionally determine Y.
• Partial FD:
  - An FD X → Y is a partial FD if, for some attribute A ∈ X, (X – {A}) still functionally determines Y.
2nd Normal Form (cont.)
• The conversion to Second Normal Form takes place by removing attributes that are not dependent on the full Primary Key.
• A relation schema R is in 2NF if every non-prime attribute is fully functionally dependent on every key of R.
• Prime attribute: An attribute that is part of some key
• Non-prime attribute: An attribute that is not part of any key
2nd Normal Form
Example - 1
ORDER
1. Order_No (PK)
2. Order_Date
3. Customer_No

ORDER_ITEM
1. Order_No (PK)
2. Item_No (PK)
3. Item_Name
4. Qty_Ordered
5. Rate_Per_Unit
6. Item_Value

• If we examine the relation ORDER_ITEM, we find that Item_Name is not fully functionally dependent on the full Primary Key (Order_No + Item_No), as it is functionally dependent on a part of the Primary Key, i.e. Item_No.
• In other words, we do not need to know the Order_No to determine the Item_Name; we can determine it from Item_No alone.
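The partial dependency identified above can also be found mechanically: compute the attribute closure of every proper subset of the key and see whether it reaches a non-prime attribute. The sketch below assumes FDs are (lhs, rhs) pairs of attribute-name tuples; the FD list for ORDER_ITEM is an assumption written down from the discussion, and `closure` and `partial_dependencies` are hypothetical helper names.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) tuples)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, non_prime, fds):
    """List (key part, attribute) pairs where a proper subset of the key
    determines a non-prime attribute, i.e. the 2NF violations."""
    found = []
    for size in range(1, len(key)):
        for part in combinations(sorted(key), size):
            for attr in sorted(set(non_prime) & closure(part, fds)):
                found.append((part, attr))
    return found


# Assumed FDs for ORDER_ITEM, following the discussion above.
fds = [
    (("Order_No", "Item_No"), ("Qty_Ordered", "Rate_Per_Unit", "Item_Value")),
    (("Item_No",), ("Item_Name",)),
]
key = ("Order_No", "Item_No")
non_prime = {"Item_Name", "Qty_Ordered", "Rate_Per_Unit", "Item_Value"}

print(partial_dependencies(key, non_prime, fds))
```

An empty result would mean that no non-prime attribute depends on part of this key; here the single reported pair is the Item_No → Item_Name dependency.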
2nd Normal Form
Example - 1 (cont.)
• The disadvantage of having such a relation is that if the name of an Item changes, it has to be changed in all the ORDER_ITEM rows where it occurs.
• To remove this disadvantage, we split the relation ORDER_ITEM into the following two relations:

ORDER_ITEM
1. Order_No (PK)
2. Item_No (PK)
3. Qty_Ordered
4. Rate_Per_Unit
5. Item_Value

ITEM
1. Item_No (PK)
2. Item_Name

Both these relations are now in the Second Normal Form (since we have assumed that the Rate_Per_Unit cannot be derived from the Item_No).
2nd Normal Form
Example - 2
Students table
Student#  AdvID  AdvName  AdvRoom
123       123A   James    555
124       123B   Smith    467

Registration table
Student#  Class#
123       102-8
123       104-9
124       209-0
124       102-8
Steps to convert a table to its 2nd Normal Form
A table is in 2nd Normal Form if:
• It is in 1st Normal Form
• It includes no partial dependencies (where an attribute is dependent on only a part of a primary key)

The steps to convert a table to its 2nd Normal Form:
• Find and remove fields that depend on only part of the key.
• Group the removed fields in another table.
• Assign the new table a key consisting of the part of the composite key that those fields depend on.
Steps to convert a table to its 2nd Normal Form
Example - 3
EmpProj
1. Project_number (PK)
2. Project_name
3. Employee_number (PK)
4. Employee_name
5. Rate_category
6. Hourly_rate
Going through all the fields reveals the following:
• Project_name is only dependent on Project_number
• Employee_name, Rate_category and Hourly_rate are dependent only on Employee_number


Steps to convert a table to its 2nd Normal Form
Example - 3 (cont..)
EmpProj
1. Project_number (PK)
2. Project_name
3. Employee_number (PK)
4. Employee_name
5. Rate_category
6. Hourly_rate
To convert the table into the Second Normal Form, remove these fields and place
them in a separate table, with the key being that part of the
original key they are dependent on.
This leads to the following 3 tables:

EmpProj
1. Project_number (PK)
2. Employee_number (PK)

Emp
1. Employee_number (PK)
2. Employee_name
3. Rate_category
4. Hourly_rate

Proj
1. Project_number (PK)
2. Project_name
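The conversion steps can be written as a schema-level helper that works on attribute names only, with no data involved. This is a sketch under assumptions: the partial dependencies are supplied by hand as a mapping from a part of the key to the fields that depend on it, and `decompose_2nf` is a hypothetical name rather than a standard routine.

```python
def decompose_2nf(attrs, key, partial_fds):
    """Split a relation so that fields depending on part of the key move to
    their own relation, keyed by that part.

    partial_fds maps a tuple of key attributes to the fields they determine.
    Returns a dict: key of each resulting relation -> its attribute list.
    """
    relations = {}
    remaining = list(attrs)
    for part, dependents in partial_fds.items():
        # New relation: the key part plus everything that depends on it.
        relations[part] = list(part) + list(dependents)
        # Remove the moved fields from the original relation.
        remaining = [a for a in remaining if a not in dependents]
    relations[tuple(key)] = remaining
    return relations


emp_proj = ["Project_number", "Project_name", "Employee_number",
            "Employee_name", "Rate_category", "Hourly_rate"]
key = ("Project_number", "Employee_number")
partial_fds = {
    ("Project_number",): ["Project_name"],
    ("Employee_number",): ["Employee_name", "Rate_category", "Hourly_rate"],
}

for rel_key, rel_attrs in decompose_2nf(emp_proj, key, partial_fds).items():
    print(rel_key, "->", rel_attrs)
```

For EmpProj this yields the same three relations as above: Proj keyed by Project_number, Emp keyed by Employee_number, and the remaining EmpProj holding just the composite key.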
2nd Normal Form Example - 4
1) Book (authorName, title, authorAffiliation, ISBN, publisher, pubYear)
Keys: (authorName, title), ISBN
Not in 2NF as authorName → authorAffiliation (authorAffiliation is
not fully functionally dependent on the first key)
2) Student (rollNo, name, dept, sex, hostelName, roomNo, admitYear)
Keys: rollNo, (hostelName, roomNo)
Not in 2NF as hostelName → sex
Decompose:
student (rollNo, name, dept, hostelName, roomNo, admitYear)
hostelDetail (hostelName, sex)
- These are both in 2NF
3rd Normal Form
The Requirements
• The requirements to satisfy the 3rd NF:
  - All requirements for 2nd NF must be met.
  - Eliminate fields that do not depend on the primary key. That is, any field that is dependent not only on the primary key but also on another field must be moved to another table.
• 3rd NF definition:
  1. A relation is in 3NF if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key.
  2. A relation is in 3NF if it is in 2NF and every non-key attribute is independent of all other non-key attributes.


3rd Normal Form
Example - 1
ORDER_ITEM
1. Order_No (PK)
2. Item_No (PK)
3. Qty_Ordered
4. Rate_Per_Unit
5. Item_Value

• We find that Item_Value can be derived from Qty_Ordered and Rate_Per_Unit (Item_Value = Qty_Ordered * Rate_Per_Unit), and hence each of these three attributes is dependent on the other two attributes.
• We reduce the relation to 3NF by removing attributes that depend on other non-key attributes; here, Item_Value is removed.
3rd Normal Form
Example - 1
ORDER_ITEM
1. Order_No (PK)
2. Item_No (PK)
3. Qty_Ordered
4. Rate_Per_Unit
5. Item_Value

• Our original relation has now been transformed to:

ORDER
1. Order_No (PK)
2. Order_date
3. Customer_No

ORDER_ITEM
1. Order_No (PK)
2. Item_No (PK)
3. Qty_Ordered
4. Rate_Per_Unit

ITEM
1. Item_No (PK)
2. Item_Name
Transitive Dependencies
Transitive dependency:
• An FD X → Y in a relation schema R for which there is a set of attributes Z ⊆ R such that
X → Z and Z → Y and Z is not a subset of any key of R
Ex: student (rollNo, name, dept, hostelName, roomNo, headDept)
Keys: rollNo, (hostelName, roomNo)
rollNo → dept; dept → headDept hold
So, rollNo → headDept is a transitive dependency.
The head of department of dept D is stored redundantly in every tuple where D appears.
The relation is in 2NF but redundancy still exists.
3rd Normal Form
Example - 2
• 3. Relation schema R is in 3NF if it is in 2NF and no non-prime attribute of R is transitively dependent on any key of R.
• student (rollNo, name, dept, hostelName, roomNo, headDept) is not in 3NF
• Decompose:
student (rollNo, name, dept, hostelName, roomNo)
deptInfo (dept, headDept)
Both are in 3NF.
Redundancy in data storage is removed.
Another definition of 3rd Normal Form
• 4. Relation schema R is in 3NF if for any nontrivial FD X → A either (i) X is a superkey or (ii) A is prime.
• Suppose some R violates the above definition
⇒ There is an FD X → A for which both (i) and (ii) are false
⇒ X is not a superkey and A is a non-prime attribute
• Two cases arise:
1) X is contained in a key: A is not fully functionally dependent on this key
   - violation of the 2NF condition
2) X is not contained in a key
   - K → X, X → A is a case of transitive dependency (K is any key of R)
Another definition of 3rd Normal Form
• 5. A relation schema R is in 3NF with respect to a set F of functional dependencies if, for all functional dependencies in F+ of the form X → Y, where X ⊆ R and Y ⊆ R, at least one of the following holds:
(i) X → Y is a trivial functional dependency
(ii) X is a superkey for R
(iii) Each attribute A in Y – X is contained in a candidate key for R
3rd Normal Form
Example - 3
• gradeInfo (rollNo, studName, course, grade)
• Keys: (rollNo, course), (studName, course)
• Suppose the following FDs hold:
1) rollNo, course → grade
2) studName, course → grade
3) rollNo → studName
4) studName → rollNo
For 1 and 2, the lhs is a key.
For 3 and 4, the rhs is prime.
- So gradeInfo is in 3NF
But studName is stored redundantly along with every course being
done by the student
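The "X is a superkey or A is prime" test can be run mechanically over a list of FDs. The sketch below makes several assumptions: FDs are (lhs, rhs) pairs of attribute-name tuples, the set of prime attributes is supplied by hand, only the listed FDs are examined (not every FD in F+), and `closure` and `violates_3nf` are hypothetical names. The FDs for the student relation are written down from the earlier slides.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) tuples)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violates_3nf(all_attrs, fds, prime):
    """Return the (X, A) pairs where X -> A is non-trivial, X is not a
    superkey, and A is not a prime attribute."""
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= set(all_attrs)
        for a in rhs:
            if a in lhs:
                continue            # trivial part of the FD
            if not is_superkey and a not in prime:
                bad.append((lhs, a))
    return bad


# gradeInfo: the four FDs listed above; prime attributes come from the two keys.
grade_attrs = {"rollNo", "studName", "course", "grade"}
grade_fds = [
    (("rollNo", "course"), ("grade",)),
    (("studName", "course"), ("grade",)),
    (("rollNo",), ("studName",)),
    (("studName",), ("rollNo",)),
]
print(violates_3nf(grade_attrs, grade_fds, {"rollNo", "studName", "course"}))

# student with headDept: dept -> headDept breaks 3NF.
student_attrs = {"rollNo", "name", "dept", "hostelName", "roomNo", "headDept"}
student_fds = [
    (("rollNo",), ("name", "dept", "hostelName", "roomNo")),
    (("hostelName", "roomNo"), ("rollNo",)),
    (("dept",), ("headDept",)),
]
print(violates_3nf(student_attrs, student_fds, {"rollNo", "hostelName", "roomNo"}))
```

gradeInfo produces no violations even though studName is repeated for every course, which matches the observation that 3NF does not remove all redundancy.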
3rd Normal Form
Example - 4
Students table:
Student#  AdvID  AdvName  AdvRoom
123       123A   James    555
124       123B   Smith    467

Student table:
Student#  AdvID
123       123A
124       123B

Advisor table:
AdvID  AdvName  AdvRoom
123A   James    555
123B   Smith    467

3rd Normal Form
Example - 4 Cont.
Students table:
Student#  AdvID
123       123A
124       123B

Registration table:
Student#  Class#
123       102-8
123       104-9
124       209-0
124       102-8

Advisor table:
AdvID  AdvName  AdvRoom
123A   James    555
123B   Smith    467
3rd Normal Form
Example - 6
EmpProj
1. Project_number (PK)
2. Project_name
3. Employee_number (PK)
4. Employee_name
5. Rate_category
6. Hourly_rate
To convert the table into the Second Normal Form, remove these fields and place
them in a separate table, with the key being that part of the
original key they are dependent on.
This leads to the following 3 tables in 2NF:

EmpProj
1. Project_number (PK)
2. Employee_number (PK)

Emp
1. Employee_number (PK)
2. Employee_name
3. Rate_category
4. Hourly_rate

Proj
1. Project_number (PK)
2. Project_name
3rd Normal Form
Example - 6
Going through all the fields reveals the following:
• The Emp table is the only one with more than one non-key attribute.
• Employee_name is not dependent on either Rate_category or Hourly_rate.
• Hourly_rate is dependent on Rate_category.
• To convert the table into 3NF, remove Hourly_rate and place it in a separate table, with the attribute it was dependent on (Rate_category) as the key.
• This leads to the following 4 tables:

EmpProj
1. Project_number (PK)
2. Employee_number (PK)

Emp
1. Employee_number (PK)
2. Employee_name
3. Rate_category

Rate
1. Rate_category (PK)
2. Hourly_rate

Proj
1. Project_number (PK)
2. Project_name
