Functional Dependency (DBMS)

The document discusses functional dependencies in relational databases. It defines a functional dependency as two tuples having the same value for a set of attributes (X) also having the same value for another attribute (Y). It provides examples of trivial and non-trivial functional dependencies. It also describes four rules for functional dependencies: attributes with unique values determine any other attributes; attributes with the same values are functionally determined; tuples must agree on dependent attribute values; and violations only occur with different dependent attribute values for the same determinant attributes.

Uploaded by

Sravanti Bagchi
Copyright
© © All Rights Reserved

Functional Dependency-

In any relation, a functional dependency α → β holds if-


Two tuples having same value of attribute α also have same value for attribute β.

A->B

B IS FUNCTIONALLY DETERMINED BY A

FOR SAME VALUES OF A , SAME VALUES OF B ALSO OCCUR

FOR DIFFERENT VALUES OF A ,SAME VALUES OF B MAY OCCUR

BUT FOR SAME VALUES OF A DIFFERENT VALUES OF B NEVER OCCUR

Mathematically,

If α and β are the two sets of attributes in a relational table R where-

α⊆R
β⊆R
Then, a functional dependency exists from α to β if, for every pair of tuples t1 and t2 in R,

If t1[α] = t2[α], then t1[β] = t2[β]

Example-

A    B
A1   B1
A2   B1
A1   B1

A->B EXISTS
DOES B->A EXIST? NO (B1 OCCURS WITH BOTH A1 AND A2)
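
The definition above can be checked mechanically on a small table. The following is a minimal sketch, assuming each tuple is stored as a Python dict; the function name `fd_holds` is hypothetical and chosen only for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}  # maps each lhs value combination to the rhs values first seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Same lhs values with different rhs values violate the dependency.
        if seen.setdefault(key, val) != val:
            return False
    return True


# The three-tuple table from the example above.
rows = [
    {"A": "A1", "B": "B1"},
    {"A": "A2", "B": "B1"},
    {"A": "A1", "B": "B1"},
]

print(fd_holds(rows, ["A"], ["B"]))  # True
print(fd_holds(rows, ["B"], ["A"]))  # False
```

Note that a check like this only shows that a dependency is not violated by the tuples currently stored; whether it holds for the relation in general is a design decision.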

Types Of Functional Dependencies-


There are two types of functional dependencies-

1. Trivial Functional Dependencies-

 A functional dependency X → Y is said to be trivial if and only if Y ⊆ X.


 Thus, if RHS of a functional dependency is a subset of LHS, then it is called as a trivial functional dependency.

Examples-

The examples of trivial functional dependencies are-

 AB → A
 AB → B
 AB → AB

2. Non-Trivial Functional Dependencies-

 A functional dependency X → Y is said to be non-trivial if and only if Y ⊄ X.


 Thus, if there exists at least one attribute in the RHS of a functional dependency that is not a part of LHS, then it is
called as a non-trivial functional dependency.

Examples-

The examples of non-trivial functional dependencies are-

 AB → BC
 AB → CD

Rules for Functional Dependency-

Rule-01:

A functional dependency X → Y will always hold if all the values of X are unique (different) irrespective of the values
of Y.

Example-

Consider the following table-


A B C D E

5 4 3 2 2

8 5 3 2 1

1 9 3 3 5

4 7 3 3 8

The following functional dependencies will always hold since all the values of attribute ‘A’ are unique-

A→B
 A → BC
 A → CD
 A → BCD
 A → DE
 A → BCDE
In general, we can say following functional dependency will always hold-

A → Any combination of attributes A, B, C, D, E

Similar will be the case for attributes B and E.

Rule-02:

A functional dependency X → Y will always hold if all the values of Y are same irrespective of the values of X.

Example-
Consider the following table-

A B C D E
5 4 3 2 2

8 5 3 2 1

1 9 3 3 5

4 7 3 3 8

The following functional dependencies will always hold since all the values of attribute ‘C’ are same-

A→C
 AB → C
 ABDE → C
 DE → C
 AE → C

In general, we can say following functional dependency will always hold true-

Any combination of attributes A, B, C, D, E → C

Combining Rule-01 and Rule-02 we can say-

In general, a functional dependency α → β always holds-


If either all values of α are unique or if all values of β are same or both.

Rule-03:

For a functional dependency X → Y to hold, if two tuples in the table agree on the value of attribute X, then they must
also agree on the value of attribute Y.

Rule-04:

For a functional dependency X → Y, violation will occur only when for two or more same values of X, the
corresponding Y values are different.
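
Rule-03 and Rule-04 can be illustrated by listing the tuple pairs that violate a dependency. This is a sketch under the assumption that the table from Rule-01 is stored as a list of dicts; `violations` is a hypothetical helper name.

```python
from itertools import combinations


def violations(rows, lhs, rhs):
    """Return index pairs of rows that agree on lhs but differ on rhs."""
    bad = []
    for (i, r1), (j, r2) in combinations(enumerate(rows), 2):
        same_lhs = all(r1[a] == r2[a] for a in lhs)
        same_rhs = all(r1[a] == r2[a] for a in rhs)
        if same_lhs and not same_rhs:
            bad.append((i, j))
    return bad


# The table used in Rule-01 and Rule-02.
header = "ABCDE"
data = [(5, 4, 3, 2, 2), (8, 5, 3, 2, 1), (1, 9, 3, 3, 5), (4, 7, 3, 3, 8)]
rows = [dict(zip(header, t)) for t in data]

print(violations(rows, "A", "BCDE"))  # [] because all values of A are unique
print(violations(rows, "DE", "C"))    # [] because all values of C are the same
print(violations(rows, "D", "E"))     # [(0, 1), (2, 3)]
```

The last call shows that D → E does not hold: rows 0 and 1 share D = 2 but differ on E, and rows 2 and 3 share D = 3 but differ on E.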

PROVE UNION RULE:

Given A->B and A->C, we need to prove A->BC

A->B
Augment by A
A->AB …….1

A->C
Augment by B
AB->BC……..2
By applying transitivity between 1 and 2, we get

A->BC
UNION RULE IS PROVED

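PROVE DECOMPOSITION RULE:

Given A->BC, we need to prove A->B and A->C

By reflexivity we get,
B IS A SUBSET OF BC, SO BC->B …….1
C IS A SUBSET OF BC, SO BC->C …….2

By applying transitivity between A->BC and 1, we get A->B
By applying transitivity between A->BC and 2, we get A->C
DECOMPOSITION RULE IS PROVED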

PROVE PSEUDO TRANSITIVITY RULE:

Given A->B and BC->D, we need to prove AC->D

A->B
Augment by C
AC->BC……1
BC->D……2
Applying transitivity between 1 and 2 we get
AC->D

CLOSURE OF ATTRIBUTE:

A→C

AC → D

E → AD

E→H

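Find A+ (CLOSURE OF A)
Result=A
=AC (using A → C)
=ACD (using AC → D)

Therefore, A+ = { A , C , D }

Find E+
Result=E
=EADH (using E → AD and E → H)
=EADHC (using A → C)

Therefore, E+ = { A , C , D , E , H }
E+ contains every attribute that appears in these dependencies, so E is a candidate key.

Find AC+
Result=AC
=ACD (using AC → D)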
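
The closure computation above follows a simple fixed-point loop. A minimal sketch, assuming single-letter attribute names and dependencies written as (left side, right side) string pairs; the function name `closure` is an illustrative choice:

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already in the result, add the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# The dependencies used in the example above.
fds = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]

print(sorted(closure("A", fds)))   # ['A', 'C', 'D']
print(sorted(closure("E", fds)))   # ['A', 'C', 'D', 'E', 'H']
print(sorted(closure("AC", fds)))  # ['A', 'C', 'D']
```

The loop stops as soon as one full pass over the dependencies adds nothing new.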

Question: Consider the relation scheme R = {E, F, G, H, I, J, K, L, M, N} and the set of functional
dependencies {{E, F} -> {G}, {F} -> {I, J}, {E, H} -> {K, L}, {K} -> {M}, {L} -> {N}} on R. What is the candidate key for
R?

A. {E, F}
B. {E, F, H}
C. {E, F, H, K, L}
D. {E}

EF+
Result=EF=EFG=EFGIJ

EFH+
Result=EFH=EFHGIJ=EFHGIJKL=EFHGIJKLMN
E, F and H do not appear on the right side of any dependency, so each of them must be part of every key.
SO, EFH is a candidate key. Option (B) is correct.

F+
Result=F=FIJ
EH+
Result=EH=EHKL=EHKLMN
K+
Result=K=KM
L+
Result=L=LN
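
The answer can be cross-checked by brute force: try attribute sets in increasing size and keep those whose closure is the whole scheme. This is only a sketch for small schemas (it examines up to 2^n subsets); `closure` and `candidate_keys` are hypothetical helper names.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Return all minimal attribute sets whose closure equals attrs."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys


attrs = "EFGHIJKLMN"
fds = [("EF", "G"), ("F", "IJ"), ("EH", "KL"), ("K", "M"), ("L", "N")]

print(["".join(sorted(k)) for k in candidate_keys(attrs, fds)])  # ['EFH']
```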

Equivalence of Two Sets of Functional Dependencies-

 If F and G are the two sets of functional dependencies, then following 3 cases are possible-

Case-01: F covers G (F ⊇ G)

Case-02: G covers F (G ⊇ F)

Case-03: Both F and G cover each other (F = G)

Case-01: Determining Whether F Covers G-

Following steps are followed to determine whether F covers G or not-

Step-01:

 Take the functional dependencies of set G into consideration.


 For each functional dependency X → Y, find the closure of X using the functional dependencies of set G.

Step-02:

 Take the functional dependencies of set G into consideration.


 For each functional dependency X → Y, find the closure of X using the functional dependencies of set F.

Step-03:

 Compare the results of Step-01 and Step-02.
 If the functional dependencies of set F determine all those attributes that were determined by the functional
dependencies of set G, then F covers G.
 Thus, we conclude F covers G (F ⊇ G), otherwise not.

Case-02: Determining Whether G Covers F-

Following steps are followed to determine whether G covers F or not-

Step-01:

 Take the functional dependencies of set F into consideration.


 For each functional dependency X → Y, find the closure of X using the functional dependencies of set F.

Step-02:

 Take the functional dependencies of set F into consideration.


 For each functional dependency X → Y, find the closure of X using the functional dependencies of set G.

Step-03:

 Compare the results of Step-01 and Step-02.
 If the functional dependencies of set G determine all those attributes that were determined by the functional
dependencies of set F, then G covers F.
 Thus, we conclude G covers F (G ⊇ F), otherwise not.

Case-03: Determining Whether Both F and G Cover Each Other-

 If F covers G and G covers F, then both F and G cover each other.


 Thus, if both the above cases hold true, we conclude both F and G cover each other (F = G).

PRACTICE PROBLEM BASED ON EQUIVALENCE OF FUNCTIONAL DEPENDENCIES-

Problem-

A relation R (A , C , D , E , H) is having two functional dependencies sets F and G as shown-


Set F-

A→C

AC → D

E → AD

E→H

Set G-

A → CD

E → AH

Which of the following holds true?

(A) G ⊇ F

(B) F ⊇ G

(C) F = G

(D) All of the above

Solution-

Determining whether F covers G-

Find the closure of the left hand side of each functional dependency in set G.

Step-01:

 (A)+ = { A , C , D } // closure of left side of A → CD using set G
 (E)+ = { A , C , D , E , H } // closure of left side of E → AH using set G

Step-02:

 (A)+ = A = AC = { A , C , D } // closure of left side of A → CD using set F
 (E)+ = E = EADH = { A , C , D , E , H } // closure of left side of E → AH using set F

Step-03:

Comparing the results of Step-01 and Step-02, we find-

 Functional dependencies of set F can determine all the attributes which have been determined by the functional
dependencies of set G.
 Thus, we conclude F covers G i.e. F ⊇ G.

Determining whether G covers F-


Step-01:

 (A)+ = { A , C , D } // closure of left side of A → C using set F
 (AC)+ = { A , C , D } // closure of left side of AC → D using set F
 (E)+ = { A , C , D , E , H } // closure of left side of E → AD and E → H using set F

Step-02:

 (A)+ = { A , C , D } // closure of left side of A → C using set G
 (AC)+ = { A , C , D } // closure of left side of AC → D using set G
 (E)+ = { A , C , D , E , H } // closure of left side of E → AD and E → H using set G

Step-03:

Comparing the results of Step-01 and Step-02, we find-

 Functional dependencies of set G can determine all the attributes which have been determined by the functional
dependencies of set F.
 Thus, we conclude G covers F i.e. G ⊇ F.

Determining whether both F and G cover each other-

 From the first part, we conclude F covers G.
 From the second part, we conclude G covers F.
 Thus, we conclude both F and G cover each other i.e. F = G.
 That is, F and G are equivalent.

Thus, Option (D) is correct.
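
The result can be confirmed with a short script. Note that this sketch uses the common formulation of the test rather than the closure-comparison steps above: F covers G when, for every X → Y in G, Y is contained in the closure of X under F. The names `closure` and `covers` are hypothetical.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """Return True if every dependency in g can be derived from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]

print(covers(F, G))                   # True
print(covers(G, F))                   # True
print(covers(F, G) and covers(G, F))  # True, so F and G are equivalent
```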

Canonical Cover/Minimum cover in DBMS-

In DBMS,

 A canonical cover is a simplified and reduced version of the given set of functional dependencies.
 Since it is a reduced version, it is also called as Irreducible set.

Characteristics-

 Canonical cover is free from all the extraneous functional dependencies.


 The closure of canonical cover is same as that of the given set of functional dependencies.
 Canonical cover is not unique and may be more than one for a given set of functional dependencies.

Need-

 Working with the set containing extraneous functional dependencies increases the computation time.
 Therefore, the given set is reduced by eliminating the useless functional dependencies.
 This reduces the computation time and working with the irreducible set becomes easier.

PRACTICE PROBLEM BASED ON FINDING CANONICAL COVER-

Find canonical cover-

A->BC
B->C
A->C
AB->C

Apply union rule on A->BC and A->C, we get A->BC. So,

F={A->BC, B->C, AB->C}

Checking the RHS of A->BC

(X ->Y)

{F-(X->Y)}U {X->(Y-t)} t=testing attribute (For Checking in RHS)

If t is in X+ computed under this new set, then t is extraneous.

Let t=B

{B->C,AB->C}U {A->(BC-B)}
{B->C,AB->C}U {A->C}
{B->C,AB->C,A->C}

X+
A+=A=AC
As t=B is not in A+, B is not extraneous.

Again let t=C

{B->C,AB->C}U{A->(BC-C)}
{B->C,AB->C}U{A->B}
{B->C,AB->C,A->B}

X+

A+=A=AB=ABC
As t=C is in A+, C is extraneous. So, A->BC reduces to A->B and

F={A->B, B->C, AB->C}

Checking the LHS of AB->C

(X ->Y)

{F-(X->Y)}U {(X-t)->Y} t=testing attribute (For Checking in LHS)

If Y is in (X-t)+ computed under F, then t is extraneous.

Let t=A
{A->B,B->C}U{(AB-A)->C}
{A->B,B->C}U{B->C}
{A->B,B->C}

(X-t)+=(AB-A)+=B+=B=BC

As Y=C is in (X-t)+, t=A is extraneous.

The canonical cover is-

Fc ={A->B,B->C}

Problem-

The following functional dependencies hold true for the relational scheme R ( W , X , Y , Z ) –

X→W

WZ → XY

Y → WXZ

Write the irreducible equivalent for this set of functional dependencies.

Solution-

Step-01:

Write all the functional dependencies such that each contains exactly one attribute on its right side-

X→W

WZ → X

WZ → Y

Y→W

Y→X

Y→Z

Step-02:

Check the essentiality of each functional dependency one by one.

For X → W:

 Considering X → W, (X)+ = { X , W }
 Ignoring X → W, (X)+ = { X }

Now,

 Clearly, the two results are different.
 Thus, we conclude that X → W is essential and can not be eliminated.

For WZ → X:

 Considering WZ → X, (WZ)+ = { W , X , Y , Z }
 Ignoring WZ → X, (WZ)+ = { W , X , Y , Z }

Now,

 Clearly, the two results are same.


 Thus, we conclude that WZ → X is non-essential and can be eliminated.

Eliminating WZ → X, our set of functional dependencies reduces to-

X→W

WZ → Y

Y→W

Y→X

Y→Z

Now, we will consider this reduced set in further checks.

For WZ → Y:

 Considering WZ → Y, (WZ)+ = { W , X , Y , Z }
 Ignoring WZ → Y, (WZ)+ = { W , Z }

Now,

 Clearly, the two results are different.


 Thus, we conclude that WZ → Y is essential and can not be eliminated.

For Y → W:

 Considering Y → W, (Y)+ = { W , X , Y , Z }
 Ignoring Y → W, (Y)+ = { W , X , Y , Z }

Now,
 Clearly, the two results are same.
 Thus, we conclude that Y → W is non-essential and can be eliminated.

Eliminating Y → W, our set of functional dependencies reduces to-

X→W

WZ → Y

Y→X

Y→Z

For Y → X:

 Considering Y → X, (Y)+ = { W , X , Y , Z }
 Ignoring Y → X, (Y)+ = { Y , Z }

Now,

 Clearly, the two results are different.


 Thus, we conclude that Y → X is essential and can not be eliminated.

For Y → Z:

 Considering Y → Z, (Y)+ = { W , X , Y , Z }
 Ignoring Y → Z, (Y)+ = { W , X , Y }

Now,

 Clearly, the two results are different.


 Thus, we conclude that Y → Z is essential and can not be eliminated.

From here, our essential functional dependencies are-

X→W

WZ → Y

Y→X

Y→Z

Step-03:

 Consider the functional dependencies having more than one attribute on their left side.
 Check if their left side can be reduced.
In our set,

 Only WZ → Y contains more than one attribute on its left side.


 Considering WZ → Y, (WZ)+ = { W , X , Y , Z }

Now,

 Consider all the possible subsets of WZ.


 Check if the closure result of any subset matches to the closure result of WZ.
(W)+ = { W }

(Z)+ = { Z }

Clearly,

 None of the subsets has the same closure as the entire left side.
 Thus, we conclude that we can not write WZ → Y as W → Y or Z → Y.
 Thus, the set of functional dependencies obtained in Step-02 is the canonical cover.

Finally, the canonical cover is-

X→W

WZ → Y

Y→X

Y→Z
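
The three steps used in this problem (split the right sides, drop non-essential dependencies, reduce the left sides) can be sketched in code. This is one possible ordering of the steps, not the only valid one, and because a canonical cover is not unique the output depends on the order in which dependencies are examined. The names `closure` and `canonical_cover` are hypothetical.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def canonical_cover(fds):
    # Split every right side into single attributes.
    work = []
    for lhs, rhs in fds:
        for attr in rhs:
            if (frozenset(lhs), attr) not in work:
                work.append((frozenset(lhs), attr))

    # Remove extraneous attributes from left sides.
    for i in range(len(work)):
        lhs, attr = work[i]
        for t in sorted(lhs):
            smaller = lhs - {t}
            if smaller and attr in closure(smaller, work):
                lhs = smaller
                work[i] = (lhs, attr)

    # Remove dependencies that follow from the remaining ones.
    result = list(dict.fromkeys(work))  # drop duplicates, keep order
    for fd in list(result):
        rest = [other for other in result if other != fd]
        if fd[1] in closure(fd[0], rest):
            result = rest
    return result


def show(fds):
    return ["".join(sorted(lhs)) + "->" + rhs for lhs, rhs in fds]


print(show(canonical_cover([("A", "BC"), ("B", "C"), ("A", "C"), ("AB", "C")])))
# ['A->B', 'B->C']
print(show(canonical_cover([("X", "W"), ("WZ", "XY"), ("Y", "WXZ")])))
# ['X->W', 'WZ->Y', 'Y->X', 'Y->Z']
```

Both outputs agree with the canonical covers derived by hand in the two problems above.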


Decomposition of a Relation-

The process of breaking up or dividing a single relation into two or more sub relations is called as decomposition of a
relation.

Properties of Decomposition-

The following two properties must be followed when decomposing a given relation-

1. Lossless decomposition-

Lossless decomposition ensures-

 No information is lost from the original relation during decomposition.


 When the sub relations are joined back, the same relation is obtained that was decomposed.
Every decomposition must always be lossless.

2. Dependency Preservation-

Dependency preservation ensures-

 None of the functional dependencies that holds on the original relation are lost.
 The sub relations still hold or satisfy the functional dependencies of the original relation.

Types of Decomposition-

Decomposition of a relation can be completed in the following two ways-

1. Lossless Join Decomposition-

 Consider there is a relation R which is decomposed into sub relations R1 , R2 , …. , Rn.


 This decomposition is called lossless join decomposition when the join of the sub relations results in the same
relation R that was decomposed.
 For lossless join decomposition, we always have-

R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R

where ⋈ is a natural join operator

Example-

Consider the following relation R( A , B , C )-

A B C

1 2 1

2 5 3

3 3 3
R( A , B , C )

Consider this relation is decomposed into two sub relations R1( A , B ) and R2( B , C )-

The two sub relations are-

A B

1 2

2 5

3 3

R1 ( A , B )

B C

2 1

5 3

3 3

R2 ( B , C )

Now, let us check whether this decomposition is lossless or not.


For lossless decomposition, we must have-

R1 ⋈ R 2 = R

Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 , we get-

A B C

1 2 1

2 5 3

3 3 3

This relation is same as the original relation R.

Thus, we conclude that the above decomposition is lossless join decomposition.

NOTE-

 Lossless join decomposition is also known as non-additive join decomposition.


 This is because the resultant relation after joining the sub relations is same as the decomposed relation.
 No extraneous tuples appear after joining of the sub-relations.

2. Lossy Join Decomposition-

 Consider there is a relation R which is decomposed into sub relations R1 , R2 , …. , Rn.


 This decomposition is called lossy join decomposition when the join of the sub relations does not result in the same
relation R that was decomposed.
 The natural join of the sub relations is always found to have some extraneous tuples.
 For lossy join decomposition, we always have-

R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn ⊃ R

where ⋈ is a natural join operator

Example-
Consider the following relation R( A , B , C )-

A B C

1 2 1

2 5 3

3 3 3

R( A , B , C )

Consider this relation is decomposed into two sub relations as R1( A , C ) and R2( B , C )-

The two sub relations are-

A C

1 1

2 3

3 3

R1 ( A , C )

B C

2 1

5 3

3 3

R2 ( B , C )

Now, let us check whether this decomposition is lossy or not.

For lossy decomposition, we must have-

R1 ⋈ R 2 ⊃ R

Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 we get-

A B C

1 2 1

2 5 3

2 3 3

3 5 3

3 3 3

This relation is not same as the original relation R and contains some extraneous tuples.

Clearly, R1 ⋈ R2 ⊃ R.

Thus, we conclude that the above decomposition is lossy join decomposition.
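
Both examples can be reproduced by projecting R onto the sub relations and joining them back. A minimal sketch, assuming relations are lists of dicts; `project`, `natural_join` and `same` are hypothetical helper names.

```python
def project(rel, attrs):
    """Project rel onto attrs, dropping duplicate tuples."""
    out = []
    for t in rel:
        row = {a: t[a] for a in attrs}
        if row not in out:
            out.append(row)
    return out


def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all common attributes."""
    joined = []
    for t1 in r1:
        for t2 in r2:
            common = t1.keys() & t2.keys()
            if all(t1[a] == t2[a] for a in common):
                row = {**t1, **t2}
                if row not in joined:
                    joined.append(row)
    return joined


def same(rel1, rel2):
    """Compare two relations as sets of tuples."""
    def as_set(rel):
        return {tuple(sorted(t.items())) for t in rel}
    return as_set(rel1) == as_set(rel2)


R = [dict(zip("ABC", t)) for t in [(1, 2, 1), (2, 5, 3), (3, 3, 3)]]

lossless = natural_join(project(R, "AB"), project(R, "BC"))
lossy = natural_join(project(R, "AC"), project(R, "BC"))

print(same(lossless, R))  # True
print(same(lossy, R))     # False
print([(t["A"], t["B"], t["C"]) for t in lossy if t not in R])
# [(2, 3, 3), (3, 5, 3)]
```

The last line lists the two extraneous tuples that appear in the lossy join.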


NOTE-

 Lossy join decomposition is also known as careless decomposition.


 This is because extraneous tuples get introduced in the natural join of the sub-relations.
 Extraneous tuples make the identification of the original tuples difficult.

Determine Decomposition Is Lossless Or Lossy


Consider a relation R is decomposed into two sub relations R1 and R2.

Then,

 If all the following conditions satisfy, then the decomposition is lossless.


 If any of these conditions fail, then the decomposition is lossy.

Condition-01:

Union of both the sub relations must contain all the attributes that are present in the original relation R.

Thus,

R1 ∪ R2 = R

Condition-02:

 Intersection of both the sub relations must not be null.


 In other words, there must be some common attribute which is present in both the sub relations.
Thus,

R1 ∩ R2 ≠ ∅

Condition-03:

Intersection of both the sub relations must be a super key of either R1 or R2 or both.

Thus,

R1 ∩ R2 = Super key of R1 or R2

PRACTICE PROBLEMS BASED ON DETERMINING WHETHER DECOMPOSITION IS LOSSLESS OR LOSSY-


Problem-01:

Consider a relation schema R ( A , B , C , D ) with the functional dependencies A → B and C → D. Determine
whether the decomposition of R into R1 ( A , B ) and R2 ( C , D ) is lossless or lossy.

Solution-

To determine whether the decomposition is lossless or lossy,

 We will check all the conditions one by one.


 If any of the conditions fail, then the decomposition is lossy otherwise lossless.

Condition-01:

According to condition-01, union of both the sub relations must contain all the attributes of relation R.

So, we have-

R1 ( A , B ) ∪ R2 ( C , D )

=R(A,B,C,D)

Clearly, union of the sub relations contain all the attributes of relation R.

Thus, condition-01 satisfies.

Condition-02:

According to condition-02, intersection of both the sub relations must not be null.

So, we have-

R1 ( A , B ) ∩ R2 ( C , D )

Clearly, intersection of the sub relations is null.

So, condition-02 fails.

Thus, we conclude that the decomposition is lossy.

Problem-02:

Consider a relation schema R ( A , B , C , D ) with the following functional dependencies-

A→B

B→C

C→D
D→B

Determine whether the decomposition of R into R1 ( A , B ) , R2 ( B , C ) and R3 ( B , D ) is lossless or lossy.

Solution-

Strategy to Solve

When a given relation is decomposed into more than two sub relations, then-

 Consider any one possible way in which the relation might have been decomposed into those sub relations.
 First, divide the given relation into two sub relations.
 Then, divide the sub relations according to the sub relations given in the question.

As a rule of thumb, treat every decomposition as a sequence of steps, each of which splits one relation into two sub relations.

Consider that the original relation R was first decomposed into R'(A, B, C) and R3(B, D), and then R'(A, B, C) was decomposed into R1(A, B) and R2(B, C).

Decomposition of R(A, B, C, D) into R'(A, B, C) and R3(B, D)-

To determine whether the decomposition is lossless or lossy,

 We will check all the conditions one by one.


 If any of the conditions fail, then the decomposition is lossy otherwise lossless.

Condition-01:

According to condition-01, union of both the sub relations must contain all the attributes of relation R.

So, we have-
R‘ ( A , B , C ) ∪ R3 ( B , D )

=R(A,B,C,D)

Clearly, union of the sub relations contain all the attributes of relation R.

Thus, condition-01 satisfies.

Condition-02:

According to condition-02, intersection of both the sub relations must not be null.

So, we have-

R‘ ( A , B , C ) ∩ R3 ( B , D )

=B

Clearly, intersection of the sub relations is not null.

Thus, condition-02 satisfies.

Condition-03:

According to condition-03, intersection of both the sub relations must be the super key of one of the two sub relations
or both.

So, we have-

R‘ ( A , B , C ) ∩ R3 ( B , D )

=B

Now, the closure of attribute B is-

B+ = { B , C , D }

Now, we see-

 Attribute ‘B’ can not determine attribute ‘A’ of sub relation R’.
 Thus, it is not a super key of the sub relation R’.
 Attribute ‘B’ can determine all the attributes of sub relation R3.
 Thus, it is a super key of the sub relation R3.

Clearly, intersection of the sub relations is a super key of one of the sub relations.

So, condition-03 satisfies.

Thus, we conclude that the decomposition is lossless.

Decomposition of R'(A, B, C) into R1(A, B) and R2(B, C)-

To determine whether the decomposition is lossless or lossy,


 We will check all the conditions one by one.
 If any of the conditions fail, then the decomposition is lossy otherwise lossless.

Condition-01:

According to condition-01, union of both the sub relations must contain all the attributes of relation R’.

So, we have-

R1 ( A , B ) ∪ R2 ( B , C )

= R’ ( A , B , C )

Clearly, union of the sub relations contain all the attributes of relation R’.

Thus, condition-01 satisfies.

Condition-02:

According to condition-02, intersection of both the sub relations must not be null.

So, we have-

R1 ( A , B ) ∩ R2 ( B , C )

=B

Clearly, intersection of the sub relations is not null.

Thus, condition-02 satisfies.

Condition-03:

According to condition-03, intersection of both the sub relations must be the super key of one of the two sub relations
or both.

So, we have-

R1 ( A , B ) ∩ R2 ( B , C )

=B

Now, the closure of attribute B is-

B+ = { B , C , D }

Now, we see-

 Attribute ‘B’ can not determine attribute ‘A’ of sub relation R1.
 Thus, it is not a super key of the sub relation R1.
 Attribute ‘B’ can determine all the attributes of sub relation R2.
 Thus, it is a super key of the sub relation R2.
Clearly, intersection of the sub relations is a super key of one of the sub relations.

So, condition-03 satisfies.

Thus, we conclude that the decomposition is lossless.

Conclusion-
Overall decomposition of relation R into sub relations R1, R2 and R3 is lossless.
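
The three conditions and both practice problems can be checked with a short function. This sketch covers only the two-way split; for more than two sub relations it is applied once per step, as in Problem-02. The names `closure` and `is_lossless` are hypothetical, and the closure is computed under the dependencies given for R.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r, r1, r2, fds):
    """Apply Condition-01 to Condition-03 to a split of r into r1 and r2."""
    r, r1, r2 = set(r), set(r1), set(r2)
    if r1 | r2 != r:          # Condition-01: union must give back all attributes
        return False
    common = r1 & r2
    if not common:            # Condition-02: intersection must not be empty
        return False
    determined = closure(common, fds)
    # Condition-03: the common attributes must be a super key of r1 or r2.
    return r1 <= determined or r2 <= determined


# Problem-01
print(is_lossless("ABCD", "AB", "CD", [("A", "B"), ("C", "D")]))  # False

# Problem-02, one split at a time
fds = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")]
print(is_lossless("ABCD", "ABC", "BD", fds))  # True
print(is_lossless("ABC", "AB", "BC", fds))    # True
```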

Normalization in DBMS-

In DBMS, database normalization is a process of making the database consistent by-

 Reducing the redundancies
 Minimizing the insert, update and delete anomalies

Normalization is done through normal forms.

Normal Forms-

The standard normal forms used are-

1. First Normal Form (1NF)


2. Second Normal Form (2NF)
3. Third Normal Form (3NF)
4. Boyce-Codd Normal Form (BCNF)
There exist several other normal forms beyond BCNF, but generally we normalize up to BCNF only.

First Normal Form-

A given relation is called in First Normal Form (1NF) if each cell of the table contains only an atomic value.

OR

A given relation is called in First Normal Form (1NF) if the attribute of every tuple is either single valued or a null
value.

Domain of all attributes should be atomic.

Example-

The following relation is not in 1NF-

Student_id    Name      Subjects
100           Akshay    Computer Networks, Designing
101           Aman      Database Management System
102           Anjali    Automata, Compiler Design

Relation is not in 1NF

However,

 This relation can be brought into 1NF.


 This can be done by rewriting the relation such that each cell of the table contains only one value.

Student_id    Name      Subjects
100           Akshay    Computer Networks
100           Akshay    Designing
101           Aman      Database Management System
102           Anjali    Automata
102           Anjali    Compiler Design

Relation is in 1NF

This relation is in First Normal Form (1NF).

NOTE-

 By default, every relation is in 1NF.


 This is because formal definition of a relation states that value of all the attributes must be atomic.

Second Normal Form-

A given relation is called in Second Normal Form (2NF) if and only if-

1. Relation already exists in 1NF.


2. No partial dependency exists in the relation, that is, all non-key attributes are fully functionally dependent on
the candidate key.

Partial Dependency

A partial dependency is a dependency where some (but not all) attributes of a candidate key determine non-prime attribute(s).

OR

A partial dependency is a dependency where a portion of a candidate key (an incomplete candidate key) determines non-prime
attribute(s).

In other words,

A → B is called a partial dependency if and only if-

1. A is a proper subset of some candidate key
2. B is a non-prime attribute.

If any one condition fails, then it will not be a partial dependency.

NOTE-
 To avoid partial dependency, incomplete candidate key must not determine any non-prime attribute.
 However, incomplete candidate key can determine prime attributes.

Example-

R(A,B,C,D)

Let Candidate key=AB

Non key attributes are C and D.

AB->C (2NF)

C->D (2NF)

B->D (PARTIALLY DEPENDS ON CANDIDATE KEY,SO NOT IN 2NF)

Thus, we conclude that the given relation is not in 2NF.
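
The two conditions for a partial dependency can be tested directly. The sketch below assumes the candidate keys are already known and only inspects the dependencies as given (it does not derive further dependencies from them); `partial_dependencies` is a hypothetical name.

```python
def partial_dependencies(fds, candidate_keys):
    """Return the given dependencies whose left side is a proper subset of a
    candidate key and whose right side contains a non-prime attribute."""
    prime = set().union(*map(set, candidate_keys))
    found = []
    for lhs, rhs in fds:
        part_of_key = any(set(lhs) < set(key) for key in candidate_keys)
        non_prime_rhs = set(rhs) - prime
        if part_of_key and non_prime_rhs:
            found.append((lhs, rhs))
    return found


# The example above: R(A, B, C, D) with candidate key AB.
fds = [("AB", "C"), ("C", "D"), ("B", "D")]
print(partial_dependencies(fds, ["AB"]))  # [('B', 'D')]
```

An empty result would mean that none of the listed dependencies is partial.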

Third Normal Form-

A given relation is called in Third Normal Form (3NF) if and only if-

1. Relation already exists in 2NF.


2. No transitive dependency exists for non-prime attributes.

A → B is in 3NF if and only if-

1. A is a super key or
2. B is a prime attribute.
(Prime attributes are components of key)

NOTE-

 Transitive dependency must not exist for non-prime attributes.


 However, transitive dependency can exist for prime attributes.

OR

A relation is called in Third Normal Form (3NF) if and only if-

At least one of the following conditions holds for each non-trivial functional dependency A → B-

1. A is a super key
2. B is a prime attribute

Example-

Consider a relation- R ( A , B , C , D , E ) with functional dependencies-

A → BC (3NF)

CD → E (3NF)

B → D (3NF)

E → A (3NF)

The candidate keys for this relation are-

A , E , CD , BC

From here,

 Prime attributes = { A , B , C , D , E }
 There are no non-prime attributes

Now,

 It is clear that there are no non-prime attributes in the relation.


 In other words, all the attributes of relation are prime attributes.
 Thus, all the attributes on RHS of each functional dependency are prime attributes.

Thus, we conclude that the given relation is in 3NF.
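
The 3NF conditions can be checked on this example with a short script. This is a sketch that finds candidate keys by brute force (suitable only for small schemas) and then tests each given dependency; `closure`, `candidate_keys` and `is_3nf` are hypothetical names.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys


def is_3nf(attrs, fds):
    prime = set().union(*candidate_keys(attrs, fds))
    for lhs, rhs in fds:
        extra = set(rhs) - set(lhs)          # non-trivial part of the right side
        if not extra:
            continue
        if closure(lhs, fds) == set(attrs):  # left side is a super key
            continue
        if not extra <= prime:               # a non-prime attribute is determined
            return False
    return True


attrs = "ABCDE"
fds = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
print(["".join(sorted(k)) for k in candidate_keys(attrs, fds)])  # ['A', 'E', 'BC', 'CD']
print(is_3nf(attrs, fds))                                        # True
```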

Boyce-Codd Normal Form-

A given relation is called in BCNF if and only if-

1. Relation already exists in 3NF.
2. For each non-trivial functional dependency A → B, A is a super key of the relation.

Example-

Consider a relation- R ( A , B , C ) with the functional dependencies-

A → B (BCNF)
B → C (BCNF)
C → A (BCNF)

The possible candidate keys for this relation are-

A , B , C

Now, we can observe that LHS of each given functional dependency is a candidate key.

Thus, we conclude that the given relation is in BCNF.
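
The BCNF condition reduces to one closure computation per dependency. A minimal sketch; `closure` and `bcnf_violations` are hypothetical names, and the second call uses a made-up dependency set purely as a counterexample.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Return the non-trivial dependencies whose left side is not a super key."""
    bad = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency
        if closure(lhs, fds) != set(attrs):
            bad.append((lhs, rhs))
    return bad


# The example above: every left side is a candidate key.
print(bcnf_violations("ABC", [("A", "B"), ("B", "C"), ("C", "A")]))  # []

# Hypothetical counterexample: B is not a super key.
print(bcnf_violations("ABC", [("A", "B"), ("B", "C")]))  # [('B', 'C')]
```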

Normal Forms in DBMS-

We have discussed-

 Database normalization is a process of making the database consistent.
 Normalization is done through normal forms.
 The standard normal forms generally used are 1NF, 2NF, 3NF and BCNF.

Here, we will discuss some important points about normal forms.

Point-01:

Remember the following diagram (nested circles with BCNF innermost, then 3NF, 2NF and 1NF outermost), which implies-

 A relation in BCNF will surely be in all other normal forms.


 A relation in 3NF will surely be in 2NF and 1NF.
 A relation in 2NF will surely be in 1NF.
Point-02:

The above diagram also implies-

 BCNF is stricter than 3NF.


 3NF is stricter than 2NF.
 2NF is stricter than 1NF.

Point-03:

While determining the normal form of any given relation,

 Start checking from BCNF.


 This is because if it is found to be in BCNF, then it will surely be in all other normal forms.
 If the relation is not in BCNF, then start moving towards the outer circles and check for other normal forms in the
order they appear.

Point-04:

 In a relational database, a relation is always in First Normal Form (1NF) at least.

Point-05:
 Singleton keys are those that consist of only a single attribute.
 If all the candidate keys of a relation are singleton candidate keys, then it will always be in 2NF at least.
 This is because there will be no chances of existing any partial dependency.
 The candidate keys will either fully appear or fully disappear from the dependencies.
 Thus, an incomplete candidate key will never determine a non-prime attribute.


Point-06:

 If all the attributes of a relation are prime attributes, then it will always be in 2NF at least.
 This is because there will be no chances of existing any partial dependency.
 Since there are no non-prime attributes, there will be no Functional Dependency which determines a non-prime
attribute.

Point-07:

 If all the attributes of a relation are prime attributes, then it will always be in 3NF at least.
 This is because there will be no chances of existing any transitive dependency for non-prime attributes.

Point-08:

 Third Normal Form (3NF) is considered adequate for normal relational database design.

Point-09:

 Every binary relation (a relation with only two attributes) is always in BCNF.

Point-10:

 BCNF is free from redundancies arising out of functional dependencies (zero redundancy).

Point-11:

 A relation with only trivial functional dependencies is always in BCNF.


 In other words, a relation with no non-trivial functional dependencies is always in BCNF.

Point-12:
 BCNF decomposition is always lossless but not always dependency preserving.

Point-13:

 Sometimes, going for BCNF may not preserve functional dependencies.


 So, go for BCNF only if the lost functional dependencies are not required else normalize till 3NF only.

Point-14:

 There exist many more normal forms even after BCNF like 4NF and more.
 But in the real world database systems, it is generally not required to go beyond BCNF.

Point-15:

 Lossy decomposition is not allowed in 2NF, 3NF and BCNF.


 So, if the decomposition of a relation has been done in such a way that it is lossy, then the decomposition will never
be in 2NF, 3NF and BCNF.

Point-16:

 Unlike BCNF, Lossless and dependency preserving decomposition into 3NF and 2NF is always possible.

Point-17:

 A prime attribute can be transitively dependent on a key in a 3NF relation.


 A prime attribute can not be transitively dependent on a key in a BCNF relation.

Point-18:

 If a relation consists of only singleton candidate keys and it is in 3NF, then it must also be in BCNF.

Point-19:

 If a relation consists of only one candidate key and it is in 3NF, then the relation must also be in BCNF.

Decomposition Algorithms
In the previous section, we discussed decomposition and its types with the help of small examples. In
practice, a database schema is usually too large to decompose by hand. Thus, we need algorithms that can
generate appropriate decompositions.
Here, we will look at the decomposition algorithms that use functional dependencies for two
different normal forms, which are:

o Decomposition to BCNF
o Decomposition to 3NF

Decomposition using functional dependencies aims at dependency preservation and lossless decomposition.
Let's discuss this in detail.

Decomposition to BCNF
Before applying the BCNF decomposition algorithm to a given relation, we must test whether the relation is already in Boyce-Codd Normal Form. If the test shows that the relation is not in BCNF, we decompose it further to create relations that are in BCNF.
The following cases simplify testing whether a given relation schema R satisfies the BCNF rule:
Case 1: To check whether a nontrivial dependency α -> β violates the BCNF rule, compute α+, i.e., the attribute closure of α, and verify that α+ includes all the attributes of the given relation R. In other words, α must be a superkey of R.
Case 2: To check whether the given relation R is in BCNF, it is not necessary to test all the dependencies in F+. It suffices to check only the dependencies in the provided dependency set F, because if no dependency in F causes a violation of BCNF, then no dependency in F+ causes one either.

Note: Case 2 does not work once the relation has been decomposed. When testing a decomposed relation Ri, it is not sufficient to check only the dependencies in F for violations of BCNF; the dependencies in F+ that hold on Ri must be considered as well.
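The two cases above can be sketched in Python as follows. This is only a minimal illustration, not part of the original notes: the helper names `fd`, `closure` and `bcnf_violations`, and the sample relation R(A, B, C) with F = {A -> B, B -> C}, are assumptions chosen for the example. It applies Case 2, so it is meant for the original relation R and not for a relation obtained by decomposition.

```python
from typing import FrozenSet, Iterable, List, Tuple

# (alpha, beta) stands for the functional dependency alpha -> beta
FD = Tuple[FrozenSet[str], FrozenSet[str]]


def fd(lhs: str, rhs: str) -> FD:
    """Build an FD from two strings of single-letter attribute names."""
    return frozenset(lhs), frozenset(rhs)


def closure(attrs: Iterable[str], fds: List[FD]) -> FrozenSet[str]:
    """Return the attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already covered, the right side follows
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_violations(relation: Iterable[str], fds: List[FD]) -> List[FD]:
    """Return the nontrivial FDs in fds whose left side is not a superkey."""
    r = frozenset(relation)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not r <= closure(lhs, fds)
    ]


# Hypothetical example: R(A, B, C) with F = {A -> B, B -> C}
F = [fd("A", "B"), fd("B", "C")]
violations = bcnf_violations("ABC", F)
print(violations)  # only B -> C is reported, because B+ = {B, C}
```

Here A+ contains every attribute of R, so A is a superkey and A -> B is harmless, while B -> C violates BCNF.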

BCNF Decomposition Algorithm


This algorithm is used when the given relation R is not in BCNF and therefore has to be decomposed into several relations R1, R2, …, Rn. To test whether a decomposed relation Ri is in BCNF:
For every subset α of attributes in the relation Ri, we need to check that α+ (the attribute closure of α under F) either includes all the attributes of the relation Ri or includes no attribute of Ri - α.
result = {R};
done = false;
compute F+;
while (not done) do
    if (there is a schema Ri in result that is not in BCNF)
    then begin
        let α -> β be a nontrivial functional dependency that holds
        on Ri such that α -> Ri is not in F+, and α ∩ β = ∅;
        result = (result - Ri) ∪ (Ri - β) ∪ (α, β);
    end
    else done = true;
Note: If some set of attributes α in Ri violates the condition specified above, then consider the functional dependency α -> (α+ - α) ∩ Ri. This dependency can be shown to be present in F+, and it can be used to decompose Ri.
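As a rough illustration, the loop can be written in Python as below. This is a sketch under stated assumptions rather than a definitive implementation: the function names are hypothetical, attributes are single letters, and each Ri is tested by enumerating its attribute subsets and computing closures under F (which is why the running time is exponential). When a violating α is found, β is taken to be (α+ - α) ∩ Ri.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def find_violation(ri, fds):
    """Return (alpha, beta) that violates BCNF on ri, or None if ri is in BCNF."""
    for size in range(1, len(ri)):
        for combo in combinations(sorted(ri), size):
            alpha = frozenset(combo)
            alpha_plus = closure(alpha, fds)
            beta = (alpha_plus - alpha) & ri
            # alpha determines something in ri but is not a superkey of ri
            if beta and not ri <= alpha_plus:
                return alpha, beta
    return None


def bcnf_decompose(relation, fds):
    """Split relation until every schema in the result is in BCNF."""
    result = [frozenset(relation)]
    done = False
    while not done:
        done = True
        for ri in result:
            violation = find_violation(ri, fds)
            if violation is not None:
                alpha, beta = violation
                result.remove(ri)
                # Replace ri by (ri - beta) and (alpha, beta)
                for part in (ri - beta, alpha | beta):
                    if part not in result:
                        result.append(part)
                done = False
                break  # result changed, so restart the scan
    return result


# Hypothetical example: R(A, B, C) with F = {A -> B, B -> C}
F = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]
print(bcnf_decompose("ABC", F))  # two schemas: {A, B} and {B, C}
```

The order in which violations are found depends on how the subsets are enumerated, so a different enumeration may produce a different (but still lossless) BCNF decomposition.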

This algorithm decomposes the given relation R into several component relations. It uses the dependencies that demonstrate a violation of BCNF to perform the decomposition. Thus, the algorithm not only generates relations that are in BCNF but also produces a lossless decomposition. It means that no information is lost while decomposing the given relation R into R1, R2, and so on.
The BCNF decomposition algorithm takes time exponential in the size of the initial relation schema R. A further drawback of this algorithm is that it may decompose the given relation R more than necessary, i.e., over-normalize it. The decomposition algorithms for BCNF and 4NF are similar, except for one difference: the fourth normal form works on multivalued dependencies, whereas BCNF works on functional dependencies. Multivalued dependencies help to remove some forms of repetition of data that cannot be expressed in terms of functional dependencies.

Difference between Multivalued Dependency and Functional Dependency


The difference between the two dependencies is that a functional dependency rules out certain tuples from being in a relation, whereas a multivalued dependency does not. Rather, a multivalued dependency requires other tuples of a certain form to be present in the relation. Because of this difference, a multivalued dependency is also referred to as a tuple-generating dependency, and a functional dependency is referred to as an equality-generating dependency.

Decomposition to 3NF
The decomposition algorithm for 3NF ensures the preservation of dependencies by explicitly building a schema for each dependency in the canonical cover. It also guarantees that at least one schema contains a candidate key of the relation being decomposed, which in turn ensures that the generated decomposition is lossless.

3NF Decomposition Algorithm


let Fc be a canonical cover for F;
i = 0;
for each functional dependency α -> β in Fc
    i = i + 1;
    Ri = αβ;
if none of the schemas Rj, j = 1, 2, …, i holds a candidate key for R
then
    i = i + 1;
    Ri = any candidate key for R;
/* Optionally, remove the redundant relations */
repeat
    if any schema Rj is contained in another schema Rk
    then
        /* Delete Rj */
        Rj = Ri;
        i = i - 1;
until no more Rjs can be deleted
return (R1, R2, . . . , Ri)
Here, R is the given relation, F is the given set of functional dependencies, and Fc is a canonical cover for F. R1, R2, . . . , Ri are the decomposed parts of the given relation R. Thus, this algorithm preserves the dependencies and also generates a lossless decomposition of relation R.
The 3NF algorithm is also known as the 3NF synthesis algorithm. It is called so because it takes a set of dependencies and adds one schema at a time, instead of repeatedly decomposing the initial schema.
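The synthesis steps can be sketched in Python as shown below. This is an illustrative sketch, not a definitive implementation: it assumes the canonical cover Fc has already been computed and is passed in, and the names `synthesize_3nf` and `candidate_key` as well as the sample relation R(A, B, C, D) are hypothetical. A schema is treated as holding a candidate key exactly when its attribute closure covers all of R.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def candidate_key(relation, fds):
    """Shrink the full attribute set down to one candidate key."""
    r = frozenset(relation)
    key = set(r)
    for attr in sorted(r):
        # Drop attr if the remaining attributes still determine all of R
        if r <= closure(key - {attr}, fds):
            key.discard(attr)
    return frozenset(key)


def synthesize_3nf(relation, canonical_cover):
    """Build one schema per dependency in Fc, then add a key if needed."""
    r = frozenset(relation)
    schemas = []
    for lhs, rhs in canonical_cover:
        schema = lhs | rhs
        if schema not in schemas:
            schemas.append(schema)
    # Add a candidate key if no schema is already a superkey of R
    if not any(r <= closure(s, canonical_cover) for s in schemas):
        schemas.append(candidate_key(r, canonical_cover))
    # Remove every schema that is strictly contained in another schema
    return [s for s in schemas if not any(s < other for other in schemas)]


# Hypothetical example: R(A, B, C, D) with Fc = {A -> B, C -> D}
Fc = [(frozenset("A"), frozenset("B")), (frozenset("C"), frozenset("D"))]
print(synthesize_3nf("ABCD", Fc))  # schemas {A, B}, {C, D} and the key {A, C}
```

In this example neither {A, B} nor {C, D} contains a candidate key of R, so the extra schema {A, C} is added to make the decomposition lossless.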

Drawbacks of 3NF Decomposing Algorithm

o The result of the decomposition algorithm is not uniquely defined, because a set of functional dependencies can have more than one canonical cover.
o In some cases, the result of the algorithm depends on the order in which it considers the dependencies in Fc.
o Even if the given relation is already in the third normal form, the algorithm may still decompose it.

A multivalued dependency occurs when there is more than one independent multivalued attribute in a table.
For example: Consider a bike manufacturing company that produces each model in two colors (Black and Red) every year.
...
Multivalued dependency in DBMS.

bike_model   manuf_year   color
M1001        2007         Black
M1001        2007         Red
M2012        2008         Black
M2012        2008         Red


bike_model -> manuf_year

manuf_year ->> color
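Whether a multivalued dependency holds in a particular table instance can be checked mechanically. The Python sketch below is only an illustration: the function name `holds_mvd` is hypothetical, and the rows are copied from the bike table above. It uses the usual definition: X ->> Y holds if, for every pair of tuples that agree on X, the tuple that takes its Y values from the first and its remaining values from the second is also present.

```python
from typing import Dict, Iterable, List

Row = Dict[str, str]


def holds_mvd(rows: List[Row], x: Iterable[str], y: Iterable[str]) -> bool:
    """Check whether the multivalued dependency x ->> y holds in this instance."""
    x, y = list(x), list(y)
    columns = sorted(rows[0])
    present = {tuple(row[c] for c in columns) for row in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[c] == t2[c] for c in x):
                # Required tuple: y-values from t1, all other values from t2
                t3 = {**t2, **{c: t1[c] for c in y}}
                if tuple(t3[c] for c in columns) not in present:
                    return False
    return True


bikes = [
    {"bike_model": "M1001", "manuf_year": "2007", "color": "Black"},
    {"bike_model": "M1001", "manuf_year": "2007", "color": "Red"},
    {"bike_model": "M2012", "manuf_year": "2008", "color": "Black"},
    {"bike_model": "M2012", "manuf_year": "2008", "color": "Red"},
]

print(holds_mvd(bikes, ["manuf_year"], ["color"]))   # True
print(holds_mvd(bikes, ["color"], ["bike_model"]))   # False
```

The second check fails because, for example, the Black rows of M1001 and M2012 would require a row (M1001, 2008, Black), which is not in the table. Such a check only says something about the given rows; a dependency is a constraint on every legal instance of the relation.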

4NF:

Properties – A relation R is in 4NF if and only if the following conditions are satisfied:
1. It should be in the Boyce-Codd Normal Form (BCNF).
2. The table should not have any non-trivial multivalued dependency whose determinant is not a superkey.

A table with such a multivalued dependency violates the normalization standard of Fourth Normal Form (4NF) because it creates unnecessary redundancies and can contribute to inconsistent data.
Fifth Normal Form / Project Join Normal Form (PJNF) (5NF):
A relation R is in 5NF if and only if every join dependency in R is implied by the candidate keys of R. A relation decomposed into two or more relations must have the lossless-join property, which ensures that no spurious or extra tuples are generated when the relations are reunited through a natural join.
Properties – A relation R is in 5NF if and only if it satisfies the following conditions:
1. R should already be in 4NF.
2. It cannot be decomposed further without loss (it has no join dependency that is not implied by its candidate keys).
Example – Consider the above schema, with the rule "if a company makes a product, an agent is an agent for that company, and the agent sells that product, then the agent always sells that product for the company". Under these circumstances, the ACP table is shown as:

Table – ACP

Agent   Company   Product
A1      PQR       Nut
A1      PQR       Bolt
A1      XYZ       Nut
A1      XYZ       Bolt
A2      PQR       Nut

The relation ACP is further decomposed into three relations, R1, R2 and R3, which are shown below:

Table – R1

Agent   Company
A1      PQR
A1      XYZ
A2      PQR

Table – R2

Agent   Product
A1      Nut
A1      Bolt
A2      Nut

Table – R3

Company   Product
PQR       Nut
PQR       Bolt
XYZ       Nut
XYZ       Bolt

The natural join of R1 and R3 over 'Company', followed by the natural join of the result R13 with R2 over 'Agent' and 'Product', gives back the table ACP.

Hence, in this example, the redundancy is eliminated, and the decomposition of ACP into R1, R2 and R3 is a lossless-join decomposition. Therefore, the decomposed relations are in 5NF, as they do not violate the lossless-join property.
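The join described above can be verified with a few lines of Python. This is only a sketch for checking the example by hand: the helper names `project`, `natural_join` and `as_set` are hypothetical, and the rows are copied from the ACP table.

```python
from typing import Dict, List, Sequence, Set, Tuple

Row = Dict[str, str]
COLUMNS = ("Agent", "Company", "Product")


def project(rows: List[Row], cols: Sequence[str]) -> List[Row]:
    """Projection onto cols with duplicate rows removed."""
    seen: List[Row] = []
    for row in rows:
        projected = {c: row[c] for c in cols}
        if projected not in seen:
            seen.append(projected)
    return seen


def natural_join(left: List[Row], right: List[Row]) -> List[Row]:
    """Natural join on all columns the two relations share."""
    if not left or not right:
        return []
    common = set(left[0]) & set(right[0])
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[c] == r[c] for c in common)
    ]


def as_set(rows: List[Row]) -> Set[Tuple[str, ...]]:
    """Order-independent view of a relation over COLUMNS."""
    return {tuple(row[c] for c in COLUMNS) for row in rows}


acp = [
    dict(zip(COLUMNS, values))
    for values in [
        ("A1", "PQR", "Nut"),
        ("A1", "PQR", "Bolt"),
        ("A1", "XYZ", "Nut"),
        ("A1", "XYZ", "Bolt"),
        ("A2", "PQR", "Nut"),
    ]
]

r1 = project(acp, ("Agent", "Company"))
r2 = project(acp, ("Agent", "Product"))
r3 = project(acp, ("Company", "Product"))

r13 = natural_join(r1, r3)        # joined over 'Company'
rejoined = natural_join(r13, r2)  # joined over 'Agent' and 'Product'

print(len(r13))                         # 6: includes the extra tuple (A2, PQR, Bolt)
print(as_set(rejoined) == as_set(acp))  # True
```

The intermediate result R13 contains one tuple that is not in ACP, namely (A2, PQR, Bolt); joining with R2 removes it, so all three relations together reproduce ACP exactly.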
