Module-2 Lecture Notes
Normalization: 1NF, 2NF, 3NF, BCNF, Multi valued dependencies and Fourth Normal
Form.
Relational Model:
– Collection of multiple Relations or Tables
– Tables consist of Rows & Columns
– This is the primary data model for commercial data-processing applications.
Relational Schema:
– Defines column heads in a table
– Also specifies relation name, field name (column) and domain of
each field. For e.g.
Student(S_ID: Number(10), Name: Char(30), Address: Varchar(30))
Relational Instance:
– It is a set of tuples (also called records) in which each tuple has the same
number of fields as the relational schema
– Also can be called a relation or table
Degree of Relation:
Number of fields (attributes) in a relation, e.g. 3 for the Student schema above
Cardinality:
Number of tuples in a relation, e.g. 4 for a table holding four rows
Relational Algebra (a procedural language)-
“What data is needed and how to find those data”
Six basic operators
select
project
union
set difference
Cartesian product
rename
The operators take one or more relations as inputs and give a new
relation as a result.
Select Operation
• Notation: σp(r)
• Defined as:
σp(r) = {t | t ∈ r and p(t)}
Where p is a formula in propositional calculus consisting of
terms connected by: ∧ (and), ∨ (or), ¬ (not)
Each term is one of:
<attribute> op <attribute> or <constant>
where op is one of: =, ≠, >, ≥, <, ≤
• Example of selection:
σ branch-name=“Indore”(account)
Prepared and compiled by
Bhupendra Panchal, Asst. Professor, CSE
Select Operation – Example
• Relation r:
A  B  C   D
α  α  1   7
α  β  5   7
β  β  12  3
β  β  23  10
• σ A=B ∧ D>5 (r):
A  B  C   D
α  α  1   7
β  β  23  10
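Project Operation
• Notation: ΠA1, A2, …, Ak (r), where A1, …, Ak are attribute names and r is a relation name
• Relation r:
A  B   C
α  10  1
α  20  1
β  30  1
β  40  2
• ΠA,C (r), before and after duplicate rows are removed:
A  C        A  C
α  1        α  1
α  1   =    β  1
β  1        β  2
β  2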
Example 1:
Display the loan number for each loan of an amount greater than $1200
Π loan-number (σ amount > 1200 (loan))
Union Operation
• Notation: r ∪ s
• Defined as:
r ∪ s = {t | t ∈ r or t ∈ s}
• For r ∪ s to be valid:
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible
• Relations r, s:
r: A  B        s: A  B
   α  1           α  2
   α  2           β  3
   β  1
• r ∪ s:
A  B
α  1
α  2
β  1
β  3
Banking Example Cont..
Display the names of all customers who have a loan, an account, or both,
from the bank
Set-Intersection Operation
• Notation: r ∩ s
• Defined as:
r ∩ s = {t | t ∈ r and t ∈ s}
• Assume:
• r, s have the same arity
• attributes of r and s are compatible
• Note: r ∩ s = r − (r − s)
• Relations r, s:
r: A  B        s: A  B
   α  1           α  2
   α  2           β  3
   β  1
• r ∩ s:
A  B
α  2
Display the names of all customers who have a loan and an account at bank.
Set-Difference Operation
• Notation: r – s
• Defined as:
r – s = {t | t ∈ r and t ∉ s}
• Relations r, s:
r: A  B        s: A  B
   α  1           α  2
   α  2           β  3
   β  1
• r – s:
A  B
α  1
β  1
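Cartesian-Product Operation
• Notation: r × s
• Defined as:
r × s = {t q | t ∈ r and q ∈ s}
• In SQL, listing two tables in the from clause gives their Cartesian product:
select *
from one, two;
• Relations r, s:
r: A  B        s: C  D   E
   α  1           α  10  a
   β  2           β  10  a
                  β  20  b
                  γ  10  b
• r × s:
A  B  C  D   E
α  1  α  10  a
α  1  β  10  a
α  1  β  20  b
α  1  γ  10  b
β  2  α  10  a
β  2  β  10  a
β  2  β  20  b
β  2  γ  10  b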
Composition of Operations
• Can build expressions using multiple operations
• Example: σ A=C (r × s)
• r × s:
A  B  C  D   E
α  1  α  10  a
α  1  β  10  a
α  1  β  20  b
α  1  γ  10  b
β  2  α  10  a
β  2  β  10  a
β  2  β  20  b
β  2  γ  10  b
• σ A=C (r × s):
A  B  C  D   E
α  1  α  10  a
β  2  β  10  a
β  2  β  20  b
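The operators compose because each one takes relations in and returns a relation. A minimal sketch of selection, projection and Cartesian product, assuming relations are modelled as Python sets of tuples with a separate attribute list; the sample values and the helper names (`cartesian`, `select`, `project`) are illustrative and not part of the notes:

```python
from itertools import product

# A relation = a list of attribute names plus a set of tuples (illustrative data).
r_attrs, r = ["A", "B"], {("x", 1), ("y", 2)}
s_attrs, s = ["C", "D", "E"], {("x", 10, "a"), ("y", 10, "a"),
                               ("y", 20, "b"), ("z", 10, "b")}

def cartesian(attrs1, rel1, attrs2, rel2):
    """r x s: concatenate every tuple of rel1 with every tuple of rel2."""
    return attrs1 + attrs2, {t + q for t, q in product(rel1, rel2)}

def select(attrs, rel, predicate):
    """sigma_p(r): keep the tuples for which predicate(row) is true."""
    return attrs, {t for t in rel if predicate(dict(zip(attrs, t)))}

def project(attrs, rel, keep):
    """pi_keep(r): keep the listed columns; the set drops duplicate rows."""
    idx = [attrs.index(a) for a in keep]
    return keep, {tuple(t[i] for i in idx) for t in rel}

rs_attrs, rs = cartesian(r_attrs, r, s_attrs, s)
_, matched = select(rs_attrs, rs, lambda row: row["A"] == row["C"])
_, d_values = project(s_attrs, s, ["D"])

print(len(rs))           # number of tuples in r x s
print(sorted(matched))   # tuples of r x s with A = C
print(sorted(d_values))  # distinct D values of s
```

Using sets means duplicate elimination after projection comes for free, which mirrors the set semantics of relational algebra.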
Banking Example Cont..
customer (customer-name, customer-street, customer-city)
account (account-number, branch-name, balance)
loan (loan-number, branch-name, amount)
depositor (customer-name, account-number)
borrower (customer-name, loan-number)
Display the names of all customers who have a loan at ‘M.P.Nagar’ branch.
Π customer-name (σ branch-name=“M.P.Nagar” ∧ borrower.loan-number = loan.loan-number (borrower × loan))
Display the names of all customers who have a loan at the ‘New Market’
branch but do not have an account at any branch of the bank.
Q. Retrieve the name and ssn of those employees working on project no ‘P101’.
SQL> select emp1.ssn, emp1.name from emp1, works1 where emp1.ssn = works1.ssn
and project# = 'P101';
Q. Find the project name of employees whose salary is greater than 50000.
SQL> select project_name from sal1,works1,proj1 where salary>50000 and
proj1.project# =works1.project# and works1.ssn=sal1.ssn;
Find name & street address of those emp who works for ‘TCS’
SQL> select emp.ename, emp.street from emp, works where compname = 'TCS'
and emp.ename = works.ename;
Find name of those emp who live in the same street and city as the
company for which they work
Extended Relational-Algebra-Operations
• Rename Operation
• Division Operation
• Aggregate Functions
• Generalized Projection
Division Operation – Example
• Relations r, s:
r: A  B        s: B
   α  1           1
   α  2           2
   α  3
   β  1
   γ  1
   δ  1
   δ  3
   δ  4
   ε  6
   ε  1
   β  2
• r ÷ s:
A
α
β
Another Division Example
Relations r, s:
r: A  B  C  D  E        s: D  E
   α  a  α  a  1           a  1
   α  a  γ  a  1           b  1
   α  a  γ  b  1
   β  a  γ  a  1
   β  a  γ  b  3
   γ  a  γ  a  1
   γ  a  γ  b  1
   γ  a  β  b  1
r ÷ s:
A  B  C
α  a  γ
γ  a  γ
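Aggregate Operation – Example
• Relation r:
A  B  C
α  α  7
α  β  7
β  β  3
β  β  10
• g sum(C) (r):
sum-C
27
• Grouping account by branch-name and summing balance, i.e. branch-name g sum(balance) (account):
branch-name  balance
Perryridge   1300
Brighton     1500
Redwood      700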
Aggregate Functions (Cont.)
Relational Calculus has 2 types:
i. Tuple Relational Calculus
ii. Domain Relational Calculus
Tuple Relational Calculus
{t | P (t) }
emp
fname   ssn  dno  salary
Ajay    1    D5   30000
Bhanu   3    D5   40000
Ram     9    D1   35000
Namit   6    D4   30000
Jayesh  2    D3   45000
dept
dname     dno  mgrssn
Research  D5   3
Admin     D4   6
HR        D3   -
Tuple Relational Calculus Example:
emp(ename, ssn, dno, salary)
dept(dname, dno, mgrssn)
5. Find name of employees along with dept name for which they work.
{t | ∃ e ∈ emp ∃ d ∈ dept (t[ename] = e[ename] ∧ t[dname] = d[dname] ∧
e[dno] = d[dno])}
Domain Relational Calculus
1. Find first name, last name of all employees using domain relational
calculus.
{<p, q> | ∃ r, s, t (<p, q, r, s, t> ∈ emp)}
5. Find all details of employees along with dept name for which they
work.
{<p, q, r, s, t, l> | ∃ m, n (<p, q, r, s, t> ∈ emp ∧ <l, m, n> ∈ dept ∧ s = m)}
Banking Example
• customer (customer-name, customer-street, customer-city)
• account (account-number, branch-name, balance)
• loan (loan-number, branch-name, amount)
• depositor (customer-name, account-number)
• borrower (customer-name, loan-number)
Find names of all customers having a loan, an account or both at the bank.
{t | ∃ b ∈ borrower (t[customer-name] = b[customer-name])
∨ ∃ d ∈ depositor (t[customer-name] = d[customer-name])}
Find names of all customers who have a loan and an account at the bank
{t | ∃ b ∈ borrower (t[customer-name] = b[customer-name])
∧ ∃ d ∈ depositor (t[customer-name] = d[customer-name])}
Find the loan number for each loan of an amount greater than $1200
Find the loan-number, branch-name and amount for loans of over $1200
DRC- {<c> | ∃ ln, l, b, a (<c, ln> ∈ borrower ∧ <l, b, a> ∈ loan ∧ b = “Perryridge” ∧ ln = l)}
Find the names of all customers who have a loan of over $1200
c. ΠA,F (σC = D(r × s)) {t | ∃ r ∈ R ∃ s ∈ S (t[A] = r[A] ∧ t[F] = s[F] ∧ r[C] = s[D])}
Types of KEYs in DBMS?
There are different types of keys in DBMS, and each key has
its own function:
– Super Key
– Candidate Key
– Primary Key
– Foreign Key
Super Key-
• A super key is a set of one or more attributes that uniquely identifies rows
in a table.
• A super key may have additional attributes that are not needed for
unique identification.
SID  ROLL NO  NAME   ADDRESS  PHONE NO
S1   101      AJAY   BHOPAL   12345
S2   102      AMAN   UJJAIN   45678
S6   106      AMIT   BHOPAL
S8   108      AMIT   INDORE   78910
S9   109      ANKIT  UJJAIN   10234
ATTRIBUTE            SUPER KEY
SID                  YES
ROLL NO              YES
NAME                 NO
ADDRESS              NO
PHONE NO             NO
SID, ROLL NO         YES
NAME, ADDRESS        NO
ADDRESS, PHONE NO    NO
ROLL NO, ADDRESS     YES
SID, NAME, PHONE NO  YES
A relation may have ‘N’ number of super keys.
Candidate Key-
• A minimal super key is called a Candidate Key.
• A Super key whose proper subset is not a super key is known as
Candidate key.
SID ROLL NO NAME ADDRESS PHONE NO
S1 101 AJAY BHOPAL 12345
S2 102 AMAN UJJAIN 45678
S6 106 AMIT BHOPAL
S8 108 AMIT INDORE 78910
S9 109 ANKIT UJJAIN 10234
Foreign Key-
Department (Master Table)
Dept_Id  Dept_Name
D01      Admin
D02      Finance
D03      HR
Employee (Slave Table); D_Id is a foreign key referencing Dept_Id of Department
EmpId  Emp_Name  D_Id
1622   Aman      D03
1625   Ankit     D01
1631   Ankush    D03
1637   Ayush     D02
1639   Ajay      D04   Error: Parent key not found
1622   Aman      D01
1601   Arun      -
1613   Aarav     D07   Error: Parent key not found
1625   Ankit     D02
Integrity Constraints:
• Integrity constraints are a set of rules. They are used to maintain
the quality of information.
• Integrity constraints ensure that the data insertion, updating, and
other processes have to be performed in such a way that data
integrity is not affected.
• Thus, integrity constraint is used to guard against accidental
damage to the database.
1. Domain Constraints:
(Figure: a row whose AGE value is not an integer is not allowed, because AGE is an
integer attribute.)
Now, there are three constraints which we can study under domain
constraint-
Not Null constraint
Default constraint
Check Clause constraint
Domain Constraints Cont..:
a. Not Null Constraint-
create table Student (Student_id varchar (5) ,name varchar (20) not null,
depart_name varchar (20));
Domain Constraints Cont..:
b. Default Value Constraint-
• Using default value constraint, we are able to set a default value for
an attribute.
• If we don’t specify a value for an attribute on which a default
constraint is defined, it holds the specified default value.
For example:
create table instructor (instructor_id varchar (5),
name varchar (20) not null,
depart_name varchar (5),
salary numeric (8,2) default 0);
c. Check Clause Constraint-
• The check clause constraint ensures that when a new tuple is inserted
into a relation, it satisfies the predicate specified in the check clause.
• According to the SQL standard, the predicate that is placed inside the
check clause can be a subquery, although many widely used database systems
do not support subqueries in check clauses.
2. Key Constraints:
Primary key constraints:
• A primary key always contains Unique & Not Null value in a relation.
4. Referential Integrity Constraints:
• This is the concept of foreign key.
• A referential integrity constraint is specified between two tables.
• In the Referential integrity constraints, if a foreign key in relation R1
refers to the Primary Key of relation R2, then every value of the
Foreign Key in R1 must be null or be available in R2.
“If there are two relations R1 & R2 having primary keys K1 & K2
respectively, and a set of attributes α in R2 references the primary key K1 of R1,
then for every tuple t2 in R2 there must be a tuple t1 in R1 such that:
t2[α] = t1[K1]
This concept is called Referential Integrity”
R1 (K1, …): tuple t1 with K1 = ABC
R2 (K2, α, β): tuple t2 with α = ABC
Functional Dependency:
• A functional dependency in DBMS, as the name suggests, is a relationship
between attributes of a table in which one attribute (or set of attributes)
determines another.
• Introduced by E. F. Codd, it helps in preventing data redundancy and in
identifying bad designs.
Suppose we have a Department table with two attributes − DeptId and DeptName.
• Here, DeptId uniquely identifies the DeptName attribute: once the DeptId is
known, the department name is determined. This is written
DeptId -> DeptName.
DeptId DeptName
001 Finance
002 Marketing
003 HR
• Union Rule :
if α -> β and α -> γ holds then α -> β γ will also hold.
• Decomposition Rule :
if α -> β γ holds then α -> β and α -> γ will also hold.
Types of Functional Dependency:
• Trivial functional dependency:
A ->B is trivial functional dependency if B is a subset of A.
The following dependencies are trivial: A->A, AC->A
For example:
Consider a table with columns Student_id and Student_Name.
{Student_Id, Student_Name} -> Student_Id is trivial FD.
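• Non-trivial functional dependency:
A -> B is a non-trivial functional dependency if B is not a subset of A.
For example: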
An employee table with attributes: emp_id, emp_name, emp_address.
The following functional dependencies are non-trivial:
emp_id -> emp_name (emp_name is not a subset of emp_id)
emp_id -> emp_address (emp_address is not a subset of emp_id)
Closure of Functional Dependency:
• The Closure of Functional Dependency means the complete set of all possible
attributes that can be functionally derived from given functional dependency
using the inference rules known as Armstrong’s Rules.
There are three steps to calculate closure of functional dependency. These are:
• Step-1 : Add the attributes which are present on Left Hand Side in the original
functional dependency.
• Step-2 : Now, add the attributes present on the Right Hand Side of the functional
dependency.
• Step-3 : With the help of attributes present on Right Hand Side, check the other
attributes that can be derived from the other given functional dependencies. Repeat this
process until all the possible attributes which can be derived are added in the closure.
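The three steps above can be sketched as a short fixed-point loop. This is a minimal illustration, assuming each functional dependency is stored as a pair of Python sets (left side, right side); the function name `closure` and the sample dependencies on a student-details style table are assumptions made for the example:

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a set of attribute names.
    """
    result = set(attrs)          # Step 1: start with the given attributes
    changed = True
    while changed:               # Step 3: repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:
            # Step 2: if the whole left side is covered, add the right side
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [({"Roll_No"}, {"Name", "Marks"}),
       ({"Name"}, {"Marks", "Location"})]

print(sorted(closure({"Roll_No"}, fds)))
print(sorted(closure({"Marks"}, fds)))
```

The loop stops as soon as a full pass adds no attribute, so it always terminates: the result can only grow, and it is bounded by the set of all attributes.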
Closure of Functional Dependency: Example
Example-1 : Consider the table student_details having (Roll_No, Name,Marks,
Location) as the attributes and having two functional dependencies.
FD1 : Roll_No -> Name, Marks
FD2 : Name -> Marks, Location
{Roll_No}+ = {Roll_No, Name, Marks, Location}
{Name}+ = {Name, Marks, Location}
{Marks}+ = {Marks}
{Location}+ = {Location}
Canonical/Minimal Cover of Functional Dependency:
• There are three steps to calculate the canonical cover for a relational schema
having set of functional dependencies.
Canonical/Minimal Cover of Functional Dependency: Example
Solution:
In above dependencies, FD3 (i.e. A->C) is redundant because it can be derived from FD1 &
FD2 using transitivity rule.
Example 2: Consider a relation R(A,B,C,D) having some attributes and below are
mentioned functional dependencies.
• FD1 : B -> A
• FD2 : AD -> C
• FD3 : C -> ABD
Step-1 : Decompose the functional dependencies using Decomposition rule
i.e. single attribute on right hand side.
FD1 : B -> A
FD2 : AD -> C
FD3 : C -> A
FD4 : C -> B
FD5 : C -> D
Step-2 : Remove extraneous attributes from LHS of functional dependencies by
calculating the closure of FD’s having two or more attributes on LHS.
Here, only one FD has two or more attributes on LHS i.e. AD -> C.
{A}+ = {A}
{D}+ = {D}
In this case, attribute “A” can only determine “A” and “D” can only determine “D”.
Hence, no extraneous attributes are present and the FD will remain the same and
will not be removed.
Canonical/Minimal Cover of Functional Dependency: Example
FD1 : B -> A
FD2 : AD -> C
FD3 : C -> A
FD4 : C -> B
FD5 : C -> D
Example-5 Find the minimal cover of the set of functional dependencies given;
{A → C, AB → C, C → DI, CD → I, EC → AB, EI → C}
(i) A+ = ACDI
(ii) B+ = B
(iii) C+ = CDI
(iv) D+ = D
(v) E+ = E
(vi) I+ = I
From (i), the closure of A includes the attribute C. So, B is extraneous in AB → C, and B can be removed.
From (iii), the closure of C includes the attribute I. So, D is extraneous in CD → I, and D can be removed.
No more extraneous attributes are found. Hence, we write F1 as F2 after removing extraneous attributes
from F1 as follows;
F2 = {A → C, C → D, C → I, EC → A, EC → B, EI → C}
Canonical/Minimal Cover of Functional Dependency: Example
Hence, set of functional dependencies F2 is the minimal cover for the set F.
Fc = { A → C,
C → D,
C → I,
EC → A,
EC → B,
EI → C
}
OR
Fc= { A →C, C →DI, EC →AB, EI →C}
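The procedure used in these examples (split right-hand sides, drop extraneous left-hand attributes, drop redundant dependencies) can be sketched as follows. This is an illustrative sketch, not a definitive implementation: the names `closure` and `minimal_cover` are assumptions, and for other inputs the result can depend on the order in which dependencies are examined, since a set of dependencies may have more than one minimal cover.

```python
def closure(attrs, fds):
    """Attribute closure; `fds` is a list of (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def as_sets(pairs):
    """Convert (frozenset lhs, single attribute) pairs to (set, set) pairs."""
    return [(set(lhs), {a}) for lhs, a in pairs]

def minimal_cover(fds):
    # Step 1: one attribute on every right-hand side.
    work = [(frozenset(lhs), a) for lhs, rhs in fds for a in sorted(rhs)]
    # Step 2: remove extraneous attributes from left-hand sides.
    reduced = []
    for lhs, a in work:
        lhs = set(lhs)
        for x in sorted(lhs):
            smaller = lhs - {x}
            if smaller and a in closure(smaller, as_sets(work)):
                lhs = smaller
        if (frozenset(lhs), a) not in reduced:
            reduced.append((frozenset(lhs), a))
    # Step 3: remove dependencies implied by the remaining ones.
    result = list(reduced)
    for fd in reduced:
        rest = [f for f in result if f != fd]
        if fd[1] in closure(fd[0], as_sets(rest)):
            result = rest
    return result

# The set F of Example-5.
F = [(set("A"), set("C")), (set("AB"), set("C")), (set("C"), set("DI")),
     (set("CD"), set("I")), (set("EC"), set("AB")), (set("EI"), set("C"))]

for lhs, a in minimal_cover(F):
    print("".join(sorted(lhs)), "->", a)
```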
Closure of Functional Dependency: Calculating Candidate Key
• {A}+ = {A, B, C}
• {B}+ = {B, C}
• {C}+ = {C}
Clearly, “A” is the candidate key as, its closure contains all the attributes
present in the relation “R”.
Closure of Functional Dependency: Calculating Candidate Key
• {A}+ = {A, B, C}
• {B}+ = {B}
• {C}+ = {C, B}
• {D}+ = {E, D}
• {E}+ = {E, D}
In this case, a single attribute is unable to determine all the attribute on its own.
Here, we need to combine two or more attributes to determine the candidate keys.
• {A, D}+ = {A, B, C, D, E}
• {A, E}+ = {A, B, C, D, E}
Hence, "AD" and "AE" are the two possible keys of the given relation “R”.
Closure of Functional Dependency: Calculating Candidate Key
{E, F}+ = {E, F, G, I, J}
{E, F, H}+ = {E, F, G, H, I, J, K, L, M, N}
{E, F, H, K, L}+ = {E, F, G, H, I, J, K, L, M, N}
{E}+ = {E}
{E, F, H}+ and {E, F, H, K, L}+ both give the set of all attributes, but EFH is minimal. So it is the
candidate key, and the correct option is (B).
Closure of Functional Dependency: Calculating Candidate Key
GATE Question 2:
In a schema with attributes A, B, C, D and E following set of functional dependencies
are given
{A -> B, A -> C, CD -> E, B -> D, E -> A}
Which of the following functional dependencies is NOT implied by the above set?
A. CD -> AC
B. BD -> CD
C. BC -> CD
D. AC -> BC
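A dependency X -> Y is implied by a set of dependencies exactly when Y is contained in the closure of X. The sketch below applies that test to the four options of this question; the helper names `closure` and `implied` are illustrative assumptions, and the printed result is computed rather than taken from the notes.

```python
def closure(attrs, fds):
    """Attribute closure; `fds` is a list of (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def implied(lhs, rhs, fds):
    """True if lhs -> rhs follows from fds (rhs lies inside the closure of lhs)."""
    return set(rhs) <= closure(lhs, fds)

# {A -> B, A -> C, CD -> E, B -> D, E -> A}
fds = [(set("A"), set("B")), (set("A"), set("C")), (set("CD"), set("E")),
       (set("B"), set("D")), (set("E"), set("A"))]

options = {"A": ("CD", "AC"), "B": ("BD", "CD"),
           "C": ("BC", "CD"), "D": ("AC", "BC")}

not_implied = [name for name, (lhs, rhs) in options.items()
               if not implied(lhs, rhs, fds)]
print(not_implied)
```

Here {B, D}+ = {B, D}, which does not contain C, so BD -> CD is the only option that is not implied.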
GATE Question 3:
Consider a relation scheme R = (A, B, C, D, E, H) on which the following functional
dependencies hold: {A–>B, BC–> D, E–>C, D–>A}. What are the candidate keys of R?
(a) AE, BE
(b) AE, BE, DE
(c) AEH, BEH, BCH
(d) AEH, BEH, DEH
Answer: (AE)+ = {ABECD} which is not set of all attributes. So AE is not a candidate
key. Hence option A and B are wrong.
(AEH)+ = {ABCDEH}
(BEH)+ = {BEHCDA}
(BCH)+ = {BCHDA}
which is not set of all attributes. So BCH is not a candidate key. Hence option C is
wrong.
So correct answer is D.
Determination of candidate keys in a relation:
Example 1: Consider a relation schema R = (A, B, C, D, E, F, G, H) on which the
following functional dependencies hold: {AB–>C, A-> DE, B–>F, F–>GH}. Find how
many candidate keys are possible in R?
R = (A, B, C, D, E, F, G, H)
Step-2: Now find the attributes that don’t have any incoming edge i.e. A & B.
It means, no other attributes can find A & B. So these are essential attributes and definitely
will be part of all possible candidate keys.
Step-4: Since {AB}+ contains all the attributes of R, it acts as a candidate key.
If the set of essential attributes (i.e. AB) is itself a candidate key, then no other combination
needs to be checked for candidate keys.
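The idea of starting from the essential attributes and adding others only when needed can be sketched as below. This is a brute-force illustration suitable for small schemas; the function names are assumptions, and "essential" here means attributes that never appear on any right-hand side.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure; `fds` is a list of (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    attributes = set(attributes)
    # Attributes with no incoming edge must be part of every candidate key.
    determined = set().union(*(rhs for _, rhs in fds))
    essential = attributes - determined
    optional = sorted(attributes - essential)
    keys = []
    # Try the essential set alone first, then add 1, 2, ... optional attributes.
    for size in range(len(optional) + 1):
        for extra in combinations(optional, size):
            candidate = essential | set(extra)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys

# Example 1: {AB -> C, A -> DE, B -> F, F -> GH} on R(A..H)
fds = [(set("AB"), set("C")), (set("A"), set("DE")),
       (set("B"), set("F")), (set("F"), set("GH"))]
print(candidate_keys("ABCDEFGH", fds))
```

Because candidates are generated in order of increasing size, any set that reaches the full closure without containing an earlier key is guaranteed to be minimal.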
Determination of candidate keys in a relation:
Example 2: Consider a relation schema R = (A, B, C, D, E ) on which the following
functional dependencies hold: {CB–>ADE, D-> B}. Find no. of candidate keys in R?
R = (A, B, C, D, E)
Step-2: Now find the attributes that don’t have any incoming edge i.e. C.
R = (A, B, C, D, E)
Step-2: Now find the attributes that don’t have any incoming edge i.e. B.
R = (W, X, Y, Z)
Step-2: All attributes are having incoming edge. So check all combinations.
i) AB -> C, DC -> AE, E -> F               Keys- {ABD, BCD}
ii) AB -> C, BD -> EF, AD -> GH, A -> I    Keys- {ABD}
iii) AB -> CD, D -> A, BC -> DE            Keys- {AB, BD, BC}
What is Decomposition?
1. Lossless Decomposition
2. Dependency Preservation
• Decomposition must be lossless. It means that the information should not get
lost from the relation that is decomposed.
• It gives a guarantee that the join will result in the same relation as it was
decomposed.
• Consider there is a relation R which is decomposed into sub relations R1 , R2 ,
…. , Rn.
• This decomposition is called lossless join decomposition when the join of the
sub relations results in the same relation R that was decomposed.
• For lossless join decomposition, we always have-
R1 ⋈ R2 ⋈ R3 ⋈ … ⋈ Rn = R
This relation is not same as the original relation R and contains some extraneous
tuples. Clearly, R1 ⋈ R2 ⊃ R.
Thus, we conclude that the above decomposition is lossy join decomposition.
Lossless Join Decomposition Example 2
Lossless Join Decomposition Example 2 Cont.…
• Now, you won’t be able to join the above tables, since Emp_ID isn’t part of the
DeptDetails relation.
3. Common attribute must be a key for at least one relation (R1 or R2)
Att(R1) ∩ Att(R2) -> Super Key of R1 or R2
R1 ∩ R2 → R1
OR
R1 ∩ R2 → R2
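For a decomposition into two relations, the conditions (attributes covered, non-empty intersection, intersection is a super key of one side) can be checked mechanically. A minimal sketch, assuming attribute sets are given as strings of single-letter names; `is_lossless` is a hypothetical helper name:

```python
def closure(attrs, fds):
    """Attribute closure; `fds` is a list of (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless(r, r1, r2, fds):
    """Binary lossless-join test for R decomposed into R1 and R2."""
    r, r1, r2 = set(r), set(r1), set(r2)
    if r1 | r2 != r:           # condition 1: no attribute is lost
        return False
    common = r1 & r2
    if not common:             # condition 2: there is a common attribute
        return False
    reach = closure(common, fds)
    # condition 3: the common attributes determine all of R1 or all of R2
    return r1 <= reach or r2 <= reach

a_to_b = [(set("A"), set("B"))]
print(is_lossless("ABC", "AB", "BC", a_to_b))   # common attribute B
print(is_lossless("ABC", "AB", "AC", a_to_b))   # common attribute A
```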
Check for Lossless Join Decomposition Example-1
Solution:
To determine whether the decomposition is lossless or lossy, we will check all the conditions
one by one. If any of the conditions fail, then the decomposition is lossy otherwise lossless.
Condition-01: (R1 U R2 = R)
R1 ( A , B ) ∪ R2 ( C , D ) = R ( A , B , C , D )
Clearly, union of the sub relations contain all the attributes of relation R. Thus, condition-01
satisfies.
Condition-02: (R1 ∩ R2 ≠ Φ)
R1 ( A , B ) ∩ R2 ( C , D ) = Φ
Clearly, the intersection of the sub relations is empty. So, condition-02 fails.
Thus, we conclude that the decomposition is lossy.
Solution:
When a given relation is decomposed into more than two sub relations, then-
• First, divide the given relation into two sub relations.
• Then, divide the sub relations according to the sub relations given in the question.
To determine whether the decomposition is lossless or lossy, we will check all the
conditions one by one.
If any of the conditions fail, then the decomposition is lossy otherwise lossless.
Check for Lossless Join Decomposition Example-2 Cont...
Condition-01: (R1 U R2 = R) R' ( A , B , C ) ∪ R3 ( B , D ) = R ( A , B , C , D )
Clearly, union of the sub relations contains all the attributes of relation R. Thus, condition-01
satisfies.
Condition-02: (R1 ∩ R2 ≠ Φ) R' ( A , B , C ) ∩ R3 ( B , D ) = B
Clearly, intersection of the sub relations is not empty. Thus, condition-02 satisfies.
Clearly, intersection of the sub relations is a super key of one of the sub relations. So,
condition-03 satisfies.
Thus, we conclude that the decomposition R' & R3 is lossless.
Check for Lossless Join Decomposition Example-2 Cont...
Condition-01: (R1 U R2 = R) R1 ( A , B ) ∪ R2 ( B , C ) = R' ( A , B , C )
Thus, condition-01 satisfies.
Condition-02: (R1 ∩ R2 ≠ Φ) R1 ( A , B ) ∩ R2 ( B , C ) = B
Thus, condition-02 satisfies.
Example-4 Check for the given relations, whether they are lossless or not-
R(A,B,C,D,E)
FD = {A->BC, CD->E, B->D, E->A}
R1(A,B,C) & R2 (A,D,E)
Example-4 Check for the given relations, whether they are lossless or not-
i)
R(A,B,C)
FD = {A->B}
R1(A,B) & R2 (B,C)
Lossy Join Decomposition
Since, R1 ∩ R2=B (B is not key in either R1 or R2)
ii)
R(A,B,C)
FD = {A->B}
R1(A,B) & R2 (A,C)
Lossless Join Decomposition
Since, R1 ∩ R2=A (A is key in R1)
iii)
R(A,B,C,D)
FD = {A->B, A->C, C->D}
R1(A,B,C) & R2 (C,D)
Lossless Join Decomposition
Since, R1 ∩ R2=C (C is key in R2)
Dependency Preservation:
(F1 U F2 U F3 U … U Fn)+ = F+
where, F1, F2, F3,… Fn -set of Functional dependencies of relations R1, R2, R3, …,
Rn respectively.
If the closure (F1 U F2 U F3 U … U Fn)+ is equal to the closure F+ of the functional
dependencies of the main relation R (before decomposition), then we
say the decomposition is dependency preserving.
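The condition above can be tested without listing F1, …, Fn explicitly. The sketch below uses a common textbook technique, named plainly here because the notes do not spell it out: for each dependency X -> Y in F, grow X by repeatedly taking closures restricted to each sub relation, and check that Y is reached. The function names are illustrative assumptions.

```python
def closure(attrs, fds):
    """Attribute closure; `fds` is a list of (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def preserves_dependencies(fds, parts):
    """True if every FD in `fds` is enforceable using only the sub relations."""
    for lhs, rhs in fds:
        result = set(lhs)
        changed = True
        while changed:
            changed = False
            for part in parts:
                part = set(part)
                # What the attributes visible in this sub relation determine,
                # restricted again to this sub relation.
                gained = closure(result & part, fds) & part
                if not gained <= result:
                    result |= gained
                    changed = True
        if not set(rhs) <= result:
            return False
    return True

chain = [(set("A"), set("B")), (set("B"), set("C")), (set("C"), set("D"))]
print(preserves_dependencies(chain, ["ABC", "CD"]))
```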
Dependency Preservation Example 1:
Example 1:
Assume R(A, B, C, D) with FDs A→B, B→C, C→D.
Let us decompose R into R1 and R2 as follows;
R1(A, B, C)
R2(C, D)
Determine whether the decomposition is dependency preserving or not.
Solution:
The FDs A→B and B→C hold in R1.
The FD C→D holds in R2.
Since all the functional dependencies of R hold in one of the sub relations, this decomposition is
dependency preserving.
Dependency Preservation Example 2:
Example 2:
Let, R (X, Y, Z ) is decomposed into R1 (X, Y) & R2 (Y, Z)
& given set of FDs= {X->Y, Y->Z, Z->X}
Check whether decomposition is
i) Lossless or Lossy
ii) dependency preserving or not
Solution:
i) Check for Lossless decomposition:
So, we have
FD1= {X->Y, Y->X} & FD2= {Y->Z, Z->Y}
Example 3:
Consider a relation R (P, Q, R, S) with a set of Functional Dependency
FD = {PQ→R, R→S, S→P}
Relation R is decomposed into R1 (P, Q, R) and R2(R, S).
Find whether the decomposition is dependency preserving or not.
Solution:
To solve this problem, we need to find the closure of Functional Dependencies FD1 and FD2
of the relations R1 (P, Q, R) and R2(R, S).
1) To find the closure of FD1, we have to consider all combinations of (P, Q, R). i.e., we
need to find out the closure of P, Q, R, PQ, QR, and RP.
closure (P) = {P} // Trivial
closure (Q) = {Q} // Trivial
closure (R) = {R, P, S}
= {R, P} //but S can't be in closure as S is not present in R1.
= {P} // Removing R from right side as it is trivial attribute
So FD is: R-> P
Dependency Preservation Example 3 Cont.…:
• Normalization divides a larger table into smaller tables and links them
using relationships.
• Let’s discuss about anomalies first then we will discuss normal forms with
examples.
Anomalies in DBMS
• There are three types of anomalies that occur when the database is not
normalized. These are – Insertion, update and deletion anomaly.
The above table is not normalized. We will see the problems that we face when a
table is not normalized.
Anomalies in DBMS Cont..
i. Update Anomaly:
• In the above table we have two rows for employee Raman as he belongs to two
departments of the company.
• If we want to update the address of Raman then we have to update the same in
two rows or the data will become inconsistent.
Anomalies in DBMS Cont..
ii. Insert Anomaly:
• Suppose a new employee joins the company, who is under training and
currently not assigned to any department.
• Then we would not be able to insert the data into the table if Emp_dept field
doesn’t allow Nulls.
Anomalies in DBMS Cont..
For a table to be in the First Normal Form, it should follow the following
rules:
• It should only have single (atomic) valued attributes/columns.
• Values stored in a column should be of the same domain
1st Normal Form Example:
Suppose a company wants to store the names and contact details of its employees.
It creates a table that looks like this:
Emp_Id  Emp_Name  Emp_Address  Emp_Mobile
101     Harsh     New Delhi    8912312390
102     Jay       Kanpur       8812121212, 9900012222
103     Ravi      Chennai      7778881212
104     Lokesh    Bangalore    9990000123, 8123450987
• Two employees (Jay & Lokesh) have two mobile numbers each, so the
company stored both numbers in the same field, as you can see in the table above.
• This table is not in 1NF, as the rule says “each attribute of a table must have
atomic (single) values”; the Emp_Mobile values for employees Jay & Lokesh
violate that rule.
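One way to bring such a table into 1NF is to emit one row per (employee, mobile number) pair. A minimal sketch, assuming the rows are held as Python tuples with the mobile numbers in a list; the helper name `to_1nf` is hypothetical:

```python
# Each row: (Emp_Id, Emp_Name, Emp_Address, [Emp_Mobile, ...])
employees = [
    (101, "Harsh", "New Delhi", ["8912312390"]),
    (102, "Jay", "Kanpur", ["8812121212", "9900012222"]),
    (103, "Ravi", "Chennai", ["7778881212"]),
    (104, "Lokesh", "Bangalore", ["9990000123", "8123450987"]),
]

def to_1nf(rows):
    """Repeat the single-valued columns once for every mobile number."""
    return [(emp_id, name, address, mobile)
            for emp_id, name, address, mobiles in rows
            for mobile in mobiles]

for row in to_1nf(employees):
    print(row)
```

After the conversion every field holds a single value, at the cost of repeating the name and address for employees with more than one number.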
1st Normal Form Example Cont..:
emp_contact
2nd Normal Form:
Or
Partial Dependency-
• Partial Dependency occurs when a nonprime attribute is functionally
dependent on part of a candidate key.
TEACHER_DETAIL
TEACHER_ID  TEACHER_AGE
25          30
47          35
83          38
97          35
TEACHER_SUBJECT
TEACHER_ID  SUBJECT
25          Chemistry
25          Biology
47          English
83          Math
83          Chemistry
97          English
2nd Normal Form Example 1
Example-1:
Let a relation R (A,B,C,D,E) has following functional dependencies-
{ AB->C, B->D, A->E}. Normalize it up to 2NF.
Solution:
Check for 1NF:
• Relation is already in 1 Normal Form, since no records are given in tabular format so
assume that relation R (A,B,C,D,E) is having all atomic values.
Now from the definition of 2NF, we have to check that every non-prime attribute (i.e. C, D,
E) is fully functionally dependent on the key of R. Here the candidate key is AB, since
{AB}+ = {A, B, C, D, E}.
• Thus, D & E violate the definition of 2NF, since D depends only on B and E depends only
on A (partial dependencies).
• In order to remove partial dependencies, we have to decompose relation R
• For the decompositions, we’ll remove the attributes from R that are not fully functional
dependent, i.e.
R1 (A, B, C)
• Put D & E in another relations, with the attribute upon which they hold partial
dependencies, i.e.
R2 (B, D) //for B->D
R3 (A, E) //for A->E
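The partial dependencies found above can also be located by computing the closure of every proper subset of the key. A hedged sketch, assuming the relation has a single known candidate key that is passed in explicitly; `partial_dependencies` is an illustrative name:

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure; `fds` is a list of (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def partial_dependencies(attributes, key, fds):
    """Map each non-prime attribute to a proper subset of `key` that determines it.

    Assumes `key` is the only candidate key of the relation.
    """
    key = set(key)
    non_prime = set(attributes) - key
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            for attr in closure(subset, fds) & non_prime:
                found.setdefault(attr, set(subset))
    return found

# R(A, B, C, D, E) with {AB -> C, B -> D, A -> E}; candidate key AB
fds = [(set("AB"), set("C")), (set("B"), set("D")), (set("A"), set("E"))]
print(partial_dependencies("ABCDE", "AB", fds))
```

An empty result means no non-prime attribute depends on part of the key, i.e. the relation satisfies the 2NF condition with respect to that key.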
Example-2:
Let a relation R (A, B, C, D, E, F) has following functional dependencies-
{ A->BCDEF, BC->A } Check whether it is in 2NF or not.
Solution:
Check for 1NF:
• Relation is already in 1 Normal Form, since no records are given in tabular format so
assume that relation R (A,B,C,D,E,F) is having all atomic values.
Now from the definition of 2NF, we have to check that every non prime attributes (i.e. D, E,
F) should be fully functional dependent on key of R
Example-3:
Let a relation R (A, B, C, D, E, F) has following functional dependencies-
{ A->BCDEF, BC->A, B->F, C->E } Check whether it is in 2NF or not.
Solution:
Check for 1NF:
• Relation is already in 1 Normal Form, since no records are given in tabular format so
assume that relation R (A,B,C,D,E,F) is having all atomic values.
Now from the definition of 2NF, we have to check that every non prime attributes (i.e. D, E,
F) should be fully functional dependent on key of R
Transitive dependency-
if α -> β and β ->γ hold,
then α -> γ will also hold
3rd Normal Form:
That's why we need to move the EMP_CITY and EMP_STATE to the new <EMPLOYEE_ZIP>
table, with EMP_ZIP as a Primary key.
Example-1:
Normalize a relation R (A, B, C, D, E) up to 3NF, when following functional
dependencies are given { A->BCDE, C->D, B->E }
Solution:
Check for 1NF:
• Relation is already in 1 Normal Form, since no records are given in tabular format so
assume that relation R (A,B,C,D,E) is having all atomic values.
Since, there is single attribute in candidate key i.e. A, so all the dependencies will be fully
functional dependent. No need to check.
So, given relation R is in 2 NF already
3rd Normal Form Example 1
• A->C
C is directly dependent on A, thus there is no transitivity. // So it is in 3NF
• A->D
D is Transitive dependent on C { A->C & C->D} //So not in 3NF
• A->E
{Transitive dependent: A->B & B->E} //So not in 3NF
Example-2:
Normalize a relation R (A, B, C, D, E, F) up to 3NF, when following functional
dependencies are given { AB->C, B->E, D->F, AB->D }
Solution:
Check for 1NF:
• Relation is already in 1 Normal Form, since no records are given in tabular format so
assume that relation R (A,B,C,D,E,F) is having all atomic values.
Example: Find the highest normal form of a relation R(A, B, C, D, E) with FD set as: { BC-
>D, AC->BE, B->E }
Explanation:
Step-1: Candidate Key: (AC)+ ={A, C, B, E, D}
Step-3:
• The relation R is in 1st normal form as a relational DBMS does not allow multi-valued
or composite attribute.
• The relation is in 2nd normal form because BC->D is in 2nd normal form (BC is not a
proper subset of candidate key AC) and AC->BE is in 2nd normal form (AC is candidate
key) and B->E is in 2nd normal form (B is not a proper subset of candidate key AC).
• The relation is not in 3rd normal form because in BC->D (neither BC is a super key nor
D is a prime attribute) and in B->E (neither B is a super key nor E is a prime attribute),
but to satisfy 3rd normal form, either the LHS of an FD should be a super key or the RHS
should be a prime attribute.
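The 3NF rule stated above (left side is a super key, or right side is prime) can be applied to each dependency in turn. A minimal sketch, assuming the complete list of candidate keys is already known and supplied by the caller; `violations_3nf` is a hypothetical helper name:

```python
def violations_3nf(fds, candidate_keys):
    """Return the FDs that break 3NF, given all candidate keys of the relation."""
    keys = [set(k) for k in candidate_keys]
    prime = set().union(*keys)          # attributes that belong to some key
    bad = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        # A left side is a super key if it contains some candidate key.
        is_super_key = any(key <= lhs for key in keys)
        non_prime_rhs = rhs - lhs - prime
        if not is_super_key and non_prime_rhs:
            bad.append(("".join(sorted(lhs)), "".join(sorted(rhs))))
    return bad

# R(A, B, C, D, E) with {BC -> D, AC -> BE, B -> E}; candidate key AC
fds = [("BC", "D"), ("AC", "BE"), ("B", "E")]
print(violations_3nf(fds, ["AC"]))
```

Dropping the "right side is prime" escape (the `- prime` term) turns the same loop into a BCNF check, which is why BCNF is the stricter form.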
• After the application of the 2NF and 3NF, some dependencies can still exist
that will cause redundancy to be present in relations.
• Although 3NF is an adequate normal form for relational databases, this
(3NF) normal form may still not remove 100% redundancy because of an X → Y
functional dependency in which X is not a candidate key of the given relation.
• This weakness in 3NF, resulted in the presentation of a stronger normal form
called Boyce–Codd Normal Form (Codd, 1974).
• Boyce and Codd Normal Form is a higher version of the Third Normal form. It
is stricter than 3NF. This form deals with certain type of anomaly that is not
handled by 3NF.
Example: Let's assume there is a company where employees work in more than one
department.
EMPLOYEE
EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO
The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone are keys.
To convert the given table into BCNF, we decompose it.
Boyce and Codd Normal Form (BCNF)-Example Cont.….
EMPLOYEE
EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO
Candidate keys:
For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}
Now, this is in BCNF because the left side of both the functional
dependencies is a key.
Boyce and Codd Normal Form (BCNF)-Example 1
Solution:
Given, FDs= {A->BC, B->A}
Solution:
Given, FDs= {A-> BCD, BC -> AD, D->B}
Solution:
Given, FDs= {AB-> CD, D->B}
Solution:
Given, FDs= {AB-> C, C->B }
OR
If all above three are true then a table is having multivalued dependency.
since there is no relation between course and hobby, so it may create two more
additional rows which may be a bad design-
Lets consider a relation Employee ( Ename, Pname , Dname ) which has following non
trivial MVD’s:
Ename ->-> Pname
Ename ->-> Dname
Ename  Pname  Dname
Jay    P1     D1
Jay    P1     D2
Jay    P2     D1
Jay    P2     D2
Solution:
Ename is not the super key of the relation. Hence the relation is not in 4NF.
If the relation is decomposed into two relations-
R1 (Ename Pname ) , Ename -> -> Pname R2 (Ename Dname) , Ename -> -> Dname
Ename Pname Ename Dname
Jay P1 Jay D1
Jay P2 Jay D2
Now in R1 & R2, Ename is again not the super key, but as Ename and Pname together make
the relation, and similarly Ename and Dname together make the relation, so these are the
trivial Multivalued Dependencies and hence these two relations are in 4NF.
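Whether a multivalued dependency X ->-> Y holds in a given set of rows can be tested directly from its definition: for any two rows that agree on X, the row that takes its Y values from the first and everything else from the second must also be present. A small sketch under that definition, with rows as tuples and columns given by position; `satisfies_mvd` is an illustrative name:

```python
def satisfies_mvd(rows, x_idx, y_idx):
    """Check X ->-> Y on `rows`; x_idx and y_idx are lists of column positions."""
    rows = set(rows)
    for t1 in rows:
        for t2 in rows:
            if all(t1[i] == t2[i] for i in x_idx):
                # Y values from t1, all other columns from t2.
                swapped = tuple(t1[i] if i in y_idx else t2[i]
                                for i in range(len(t1)))
                if swapped not in rows:
                    return False
    return True

# Employee(Ename, Pname, Dname)
employee = [("Jay", "P1", "D1"), ("Jay", "P1", "D2"),
            ("Jay", "P2", "D1"), ("Jay", "P2", "D2")]

print(satisfies_mvd(employee, [0], [1]))   # Ename ->-> Pname
print(satisfies_mvd(employee, [0], [2]))   # Ename ->-> Dname
```

The four rows are exactly every Pname paired with every Dname for Jay, which is why both dependencies hold and why the table can be split into the two smaller relations without losing information.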
4th Normal Form Example 2-
5NF is satisfied when all the tables are broken into as many tables as possible in
order to avoid redundancy.