chap 5 dbms
Diagram-2
The relation dept_employees causes a serious redundant-storage problem when several employees (say 5 to 10)
work in each department. For n employees working in the same department, the values of dept_id, dept_name
and dept_location remain the same and are repeated n times.
In diagram (1), by contrast, only the dept_id value is repeated n times in the employee relation for the n
employees of a department; all other department information is stored only once, in the department
relation.
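To make the storage difference concrete, the following is a minimal sketch that counts stored attribute values under both designs. The attribute lists follow the names used in these notes; the figures of 4 departments and 10 employees per department are assumptions chosen only for illustration.

```python
# Attribute names follow the notes; the sizes used below are hypothetical.
DEPT_ATTRS = ["dept_id", "dept_name", "dept_location"]
EMP_ATTRS = ["emp_no", "name", "dob", "salary"]


def stored_values_combined(n_departments, employees_per_dept):
    """Values stored when everything lives in one dept_employees relation."""
    rows = n_departments * employees_per_dept
    return rows * (len(EMP_ATTRS) + len(DEPT_ATTRS))


def stored_values_separate(n_departments, employees_per_dept):
    """Values stored with separate employee and department relations."""
    emp_rows = n_departments * employees_per_dept
    # Each employee row holds its own attributes plus the dept_id foreign key.
    employee_values = emp_rows * (len(EMP_ATTRS) + 1)
    department_values = n_departments * len(DEPT_ATTRS)
    return employee_values + department_values


combined = stored_values_combined(n_departments=4, employees_per_dept=10)
separate = stored_values_separate(n_departments=4, employees_per_dept=10)
print(combined, separate)  # 280 212
```

The gap grows with the number of employees per department, because the combined design repeats every department attribute in every employee row.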
Another serious problem that arises when working with the dept_employees relation is that of update anomalies.
Update anomalies :-
Update anomalies can be classified as:
1) Insertion anomalies
2) Deletion anomalies
3) Modification anomalies
Suppose a new department comes into existence but its employees are yet to be appointed. Only dept_id,
dept_name, etc. are known, so there are no values to fill in for attributes such as emp_no, name, dob and
salary. Assuming emp_no is the primary key, this situation is a serious insertion anomaly, since the
primary key of a relation cannot be null.
But if we use separate relations employee and department as shown in diagram (1), the department
details can be entered in the department relation irrespective of employee details.
Similarly, it is difficult to enter the details of new employees in the dept_employees relation when a
department has not yet been allocated to them.
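The insertion anomaly can be reproduced with a small sketch. It uses Python's built-in sqlite3 module purely as a convenient stand-in for a DBMS; the column types and the sample values "D10" and "Research" are assumptions made for illustration. NOT NULL is written out explicitly because SQLite does not enforce it on non-integer primary keys by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single combined relation; emp_no is assumed to be the primary key.
conn.execute(
    """
    CREATE TABLE dept_employees (
        emp_no    TEXT NOT NULL PRIMARY KEY,
        name      TEXT,
        dob       TEXT,
        salary    REAL,
        dept_id   TEXT,
        dept_name TEXT
    )
    """
)

# A new department with no employees yet: there is no emp_no to supply.
try:
    conn.execute(
        "INSERT INTO dept_employees (emp_no, dept_id, dept_name) VALUES (?, ?, ?)",
        (None, "D10", "Research"),
    )
    combined_insert_ok = True
except sqlite3.IntegrityError:
    # The null primary key is rejected, so the department cannot be recorded.
    combined_insert_ok = False

# With a separate department relation the same fact can be stored directly.
conn.execute(
    "CREATE TABLE department (dept_id TEXT NOT NULL PRIMARY KEY, dept_name TEXT)"
)
conn.execute("INSERT INTO department VALUES (?, ?)", ("D10", "Research"))
department_rows = conn.execute("SELECT COUNT(*) FROM department").fetchone()[0]

print(combined_insert_ok, department_rows)  # False 1
```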
Guideline 2: Design the base table such that it avoids insertion, deletion and modification anomalies.
Attributes whose values are unknown (null) in many rows should be avoided, since null values lead to
wastage of storage space.
Guideline 3:- Design the table such that, as far as possible, its attributes take actual values rather than
null values. If null values must be used for a particular attribute, it should be in exceptional cases only.
When joining tables, if we do not join on attributes that represent either a primary key or a foreign key,
the relation resulting from such an inappropriate grouping of attributes gives rise to a large number of fake
(spurious) tuples. Such tuples mostly represent wrong information, so the end user cannot draw useful
conclusions by running queries on such relations.
Guideline 4:- Design tables such that they can be joined on attributes that are primary keys or foreign keys.
If joined on other attributes, the result may contain spurious tuples.
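A small sketch of how spurious tuples arise. All relation names, attribute names and rows below are hypothetical: employee carries dept_id as a foreign key, while dept_site shares only the non-key attribute location with employee.

```python
def natural_join(r, s, on):
    """Join two lists of dicts on the attribute names listed in `on`."""
    joined = []
    for a in r:
        for b in s:
            if all(a[k] == b[k] for k in on):
                joined.append({**a, **b})
    return joined


# Hypothetical data: two employees in different departments share a location.
employee = [
    {"emp_no": 1, "name": "Asha", "dept_id": "D1", "location": "Mysore"},
    {"emp_no": 2, "name": "Ravi", "dept_id": "D2", "location": "Mysore"},
]
department = [
    {"dept_id": "D1", "dept_name": "Accounts"},
    {"dept_id": "D2", "dept_name": "Sales"},
]
dept_site = [
    {"dept_name": "Accounts", "location": "Mysore"},
    {"dept_name": "Sales", "location": "Mysore"},
]

# Join on the foreign key / primary key pair: one row per employee.
good = natural_join(employee, department, on=["dept_id"])

# Join on location, which is neither a primary key nor a foreign key.
bad = natural_join(employee, dept_site, on=["location"])

true_pairs = {(t["emp_no"], t["dept_name"]) for t in good}
spurious = [t for t in bad if (t["emp_no"], t["dept_name"]) not in true_pairs]

print(len(good), len(bad), len(spurious))  # 2 4 2
```

The two extra rows pair each employee with a department they do not belong to, which is exactly the wrong information Guideline 4 warns about.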
Functional Dependencies :-
A functional dependency (FD) is a constraint between two sets of attributes from the given database.
A functional dependency is denoted as X → Y
i.e. FD : X → Y
This notation is read as “Y is functionally dependent on X”
The set of attributes X is termed the left-hand side (LHS) of the FD.
X is sometimes referred to as the determinant.
The set of attributes Y is termed the right-hand side (RHS) of the FD.
Y is sometimes referred to as the dependent.
The meaning of this constraint is that whenever any two tuples t1 and t2 of a relation R agree on their X
values, i.e. t1[X] = t2[X], they must also agree on their Y values, i.e. t1[Y] = t2[Y].
In other words, a functional dependency is a property of the meaning, or semantics, of the attributes:
whenever the semantics of two sets of attributes in relation R specify that an FD should hold, the FD is
stated as a constraint.
Relation states r of R that satisfy the functional dependency constraints are termed legal relation states
or legal extensions of R.
Thus FDs help database designers understand and interpret how sets of attributes relate to one another,
so that the dependency can be maintained on all relation instances r of R.
Examples of functional dependencies:
1. emp_id → emp_name
2. reg_no → std_name
Note that in each of the above, the LHS uniquely determines the RHS.
- For instance, the value of emp_id is distinct for each employee and hence uniquely identifies that
employee’s name.
- Similarly, in example 2 the student’s reg_no uniquely determines the student’s name.
Alternatively, it is said that the student name or employee name is functionally determined by the
respective reg_no or emp_id.
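The definition t1[X] = t2[X] implies t1[Y] = t2[Y] can be checked mechanically on a given relation instance. The following is a minimal sketch; the function name fd_holds and the sample rows are assumptions made for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in this relation instance.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    """
    seen = {}
    for t in rows:
        x = tuple(t[a] for a in lhs)
        y = tuple(t[a] for a in rhs)
        # Two tuples that agree on X must also agree on Y.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical employee instance; two employees happen to share a name.
employees = [
    {"emp_id": 101, "emp_name": "Kiran"},
    {"emp_id": 102, "emp_name": "Meena"},
    {"emp_id": 103, "emp_name": "Kiran"},
]

print(fd_holds(employees, ["emp_id"], ["emp_name"]))  # True
print(fd_holds(employees, ["emp_name"], ["emp_id"]))  # False
```

Such a check can only show that one instance violates an FD. Since an FD is a statement about the semantics of the attributes, a single instance that happens to satisfy it does not prove that the FD holds for all legal states.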
The following six rules are well-known inference rules for functional dependencies:
IR1 (Reflexive rule) : If X ⊇ Y, then X → Y.
IR2 (Augmentation rule) : {X → Y} ⊨ XZ → YZ.
IR3 (Transitive rule) : {X → Y, Y → Z} ⊨ X → Z.
IR4 (Decomposition or projective rule) : {X → YZ} ⊨ X → Y.
IR5 (Union or additive rule) : {X → Y, X → Z} ⊨ X → YZ.
IR6 (Pseudotransitive rule) : {X → Y, WY → Z} ⊨ WX → Z.
Proof of IR1 :-
Suppose that X ⊇ Y and that two tuples t1 and t2 exist in some relation instance r of R such that
t1[X] = t2[X]. Then t1[Y] = t2[Y], because X ⊇ Y; hence X → Y must hold in r.
Proof of IR3 :-
Assume that (1) X → Y and (2) Y → Z both hold in a relation r. Then for any two tuples t1 and t2 in r such
that t1[X] = t2[X], we must have (3) t1[Y] = t2[Y], from assumption (1); hence we must also have
(4) t1[Z] = t2[Z], from (3) and assumption (2); hence X → Z must hold in r.
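One way to apply the inference rules mechanically is to compute the set of attributes determined by a given set (often called its attribute closure). This technique is not described in these notes; it is a standard approach brought in here as a sketch, and the function names closure and implies are assumptions.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds`.

    fds is a list of (lhs, rhs) pairs, each a set of attribute names.
    """
    result = set(attrs)  # IR1: a set determines each of its own subsets
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If result already covers lhs, transitivity (IR3) adds rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from fds."""
    return set(rhs) <= closure(lhs, fds)


# The premises of IR6: X -> Y and WY -> Z.
fds = [({"X"}, {"Y"}), ({"W", "Y"}, {"Z"})]

print(implies(fds, {"W", "X"}, {"Z"}))  # True, the conclusion of IR6
print(implies(fds, {"X"}, {"Z"}))       # False, W is needed as well
```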
We assume that each department can have a number of locations. The relation is not in 1NF because
dlocations is not a single-valued attribute.
There are 3 main techniques to achieve 1NF for such a relation.
1) Remove the attribute dlocations that violates 1NF and place it in a separate relation dept_locations along
with the primary key dnumber of department, as shown in the figure.
Department
Dname Dnumber Dmgrssn
Dept_locations
Dnumber Dlocation
The primary key of dept_locations is the combination {dnumber, dlocation}; a distinct tuple exists in
dept_locations for each location of a department. This decomposes the non-1NF relation into two 1NF relations.
2) Expand the key so that there is a separate tuple in the original department relation for each location of
a department, as shown in the diagram.
In this case the primary key becomes {dnumber, dlocation}. But this solution has the disadvantage of
introducing redundancy in the relation.
3) If a maximum number of values is known for the attribute, e.g. if it is known that at most 3 locations can
exist for a department, replace the dlocations attribute by 3 atomic attributes dloc1, dloc2, dloc3.
Dname Dnumber Dmgrssn Dloc1 Dloc2 Dloc3
This solution has the disadvantage of introducing null values if departments have fewer than 3 locations.
Of the 3 solutions, the first is generally considered the best because it does not suffer from redundancy.
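The first technique can be sketched as a small transformation. The sample rows, the manager identifiers and the function name to_1nf are hypothetical; only the attribute names come from the notes.

```python
# Hypothetical non-1NF rows: dlocations holds several values per department.
department_non_1nf = [
    {"dname": "Research", "dnumber": 5, "dmgrssn": "M05",
     "dlocations": ["Bellaire", "Sugarland", "Houston"]},
    {"dname": "Headquarters", "dnumber": 1, "dmgrssn": "M01",
     "dlocations": ["Houston"]},
]


def to_1nf(rows):
    """Move the multivalued attribute into its own relation (technique 1)."""
    department = []
    dept_locations = []
    for row in rows:
        # department keeps only the single-valued attributes.
        department.append({k: v for k, v in row.items() if k != "dlocations"})
        for loc in row["dlocations"]:
            # The primary key of dept_locations is (dnumber, dlocation).
            dept_locations.append({"dnumber": row["dnumber"], "dlocation": loc})
    return department, dept_locations


department, dept_locations = to_1nf(department_non_1nf)
print(len(department), len(dept_locations))  # 2 4
```

Each department appears once in department, and no cell holds more than one value, so neither redundancy nor null values are introduced.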
Employee_dept
Emp_dept
Third normal form :
A relation R is said to be in 3NF if and only if it is in 2NF and no non-key field is determined by another
non-key field.
In other words, a non-key field cannot depend upon another non-key field in a given relation R.
3NF is based on eliminating transitive dependencies.
For example, consider the following relation.
HOSPITAL
Ward_no  Ward_name    Ward_capacity  Unit in charge
36       Mortuary     100            Dr. Vijaya
41       Casualty     10             Dr. Rinay
82       Labour       150            Dr. Shantala
90       Paediatrics  50             Dr. Sanjeev
In this example, each ward in the hospital is headed only by the respective specialist doctor.
It is observed that although ward_no is the primary key and ward_name is fully dependent on ward_no,
the non-key fields ward_name and unit in charge are dependent on each other,
i.e. ward_no → ward_name
ward_name → unit in charge
Both unit in charge and ward_name are non-key fields, and one depends on the other. Unit in charge
therefore depends on the key ward_no only transitively, through ward_name.
Hence this relation is not in 3NF.
To normalize this relation into 3NF, group ward_no, ward_name and ward_capacity into the HOSPITAL
relation, and ward_name and unit in charge into the SPECIALISTS relation.
Hospital
Ward_no Ward_name Ward_capacity
Specialists
Ward_name Unit in charge
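The decomposition can be sketched as two projections, followed by a join on ward_name to confirm that no information is lost. The helper name project and the identifier spellings (for example unit_in_charge) are assumptions; the rows are a subset of the HOSPITAL table with ward names spelled out.

```python
def project(rows, attrs):
    """Projection with duplicate elimination, keeping first-seen order."""
    seen = set()
    out = []
    for t in rows:
        key = tuple(t[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


hospital_original = [
    {"ward_no": 36, "ward_name": "Mortuary", "ward_capacity": 100,
     "unit_in_charge": "Dr. Vijaya"},
    {"ward_no": 82, "ward_name": "Labour", "ward_capacity": 150,
     "unit_in_charge": "Dr. Shantala"},
    {"ward_no": 90, "ward_name": "Paediatrics", "ward_capacity": 50,
     "unit_in_charge": "Dr. Sanjeev"},
]

# The two 3NF relations.
hospital = project(hospital_original, ["ward_no", "ward_name", "ward_capacity"])
specialists = project(hospital_original, ["ward_name", "unit_in_charge"])

# Joining back on ward_name should reproduce the original rows.
rejoined = [
    {**h, **s}
    for h in hospital
    for s in specialists
    if h["ward_name"] == s["ward_name"]
]

print(rejoined == hospital_original)  # True
```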
Boyce-Codd normal form (BCNF):
The BCNF definition makes no explicit reference to 1NF and 2NF, nor to the concepts of full functional
dependency and transitive dependency.
“A relation R is said to be in Boyce-Codd normal form (BCNF) if and only if every determinant
(i.e. LHS of an FD) is a candidate key.”
For example, consider the STUDENT relation.
Student
Reg_no Name Class Telephone
The telephone number is as good as a contact id that distinguishes each and every student. Let us assume
that the reg_no, name and telephone attributes are each distinct (unique) for every student.
Now the following FDs can be inferred for the STUDENT relation.
FD1: reg_no → name
FD2: reg_no → class
FD3: reg_no → telephone
FD4: name → class
FD5: name → telephone
FD6: telephone → name
FD7: telephone → class
Under the uniqueness assumption, name and telephone also determine reg_no, so reg_no, name and telephone
are each candidate keys. The STUDENT relation is therefore in BCNF, since every FD has a candidate key as
its determinant (LHS).
BCNF was proposed as a simpler form of 3NF, but it was found to be stricter than 3NF: every relation in
BCNF is also in 3NF, but the converse is not true.
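The BCNF condition can be tested with a short sketch. It checks that the LHS of every nontrivial FD determines all attributes of the relation (that is, that it is a superkey, which is how the condition is usually stated formally). The function names are assumptions, and the two FDs name → reg_no and telephone → reg_no are added explicitly because they follow from the uniqueness assumption rather than from FD1 to FD7. The HOSPITAL relation from the 3NF section is included for contrast.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_bcnf(all_attrs, fds):
    """True if the LHS of every nontrivial FD determines all attributes."""
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial FD, always allowed
        if closure(lhs, fds) != set(all_attrs):
            return False
    return True


student_attrs = {"reg_no", "name", "class", "telephone"}
student_fds = [
    ({"reg_no"}, {"name", "class", "telephone"}),
    ({"name"}, {"class", "telephone"}),
    ({"telephone"}, {"name", "class"}),
    # Assumed: these follow from name and telephone being unique.
    ({"name"}, {"reg_no"}),
    ({"telephone"}, {"reg_no"}),
]

hospital_attrs = {"ward_no", "ward_name", "ward_capacity", "unit_in_charge"}
hospital_fds = [
    ({"ward_no"}, {"ward_name", "ward_capacity"}),
    ({"ward_name"}, {"unit_in_charge"}),
]

print(is_bcnf(student_attrs, student_fds))    # True
print(is_bcnf(hospital_attrs, hospital_fds))  # False
```

HOSPITAL fails because ward_name is a determinant but does not determine ward_no or ward_capacity.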