DBMS Module 3
Anomalies in a Database
Anomalies are problems caused by bad database design. They typically arise in
relations that are generated directly from user views.
Types:
1. Insertion Anomaly
2. Updation Anomaly
3. Deletion Anomaly
To understand these anomalies let us take an example of a Student table.
Insertion Anomaly
Suppose that, for a new admission, a student has not yet opted for a branch.
The student's data cannot be inserted unless we set the branch information to
NULL. Also, if we insert data for 100 students of the same branch, the branch
information is repeated for all 100 students. These scenarios are Insertion
anomalies.
Updation Anomaly
What if Mr. X leaves the college, or is no longer the HOD of the computer
science department? In that case all the affected student records have to be
updated, and if we miss any record by mistake, the data becomes inconsistent.
This is an Updation anomaly.
Deletion Anomaly
In our Student table, two different kinds of information are kept together:
Student information and Branch information. Hence, if student records are
deleted at the end of the academic year, we also lose the branch information.
This is a Deletion anomaly.
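The three anomalies can be reproduced in a few lines. The sketch below is only an illustration, written in Python with the standard sqlite3 module. The column names (s_id, name, branch, hod) and the sample rows are assumptions, since these notes do not list the columns of the Student table.

```python
import sqlite3

# Hypothetical single-table design: student and branch facts share one row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (s_id INTEGER PRIMARY KEY, name TEXT, branch TEXT, hod TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Asha", "CSE", "Mr. X"), (2, "Ravi", "CSE", "Mr. X"), (3, "Meena", "CSE", "Mr. X")],
)

# Insertion anomaly: a student without a branch forces NULL branch data.
conn.execute("INSERT INTO student VALUES (4, 'Kiran', NULL, NULL)")

# Updation anomaly: the HOD changes, but the update misses one record.
conn.execute("UPDATE student SET hod = 'Mr. Y' WHERE branch = 'CSE' AND s_id <> 3")
hods_for_cse = {
    row[0] for row in conn.execute("SELECT DISTINCT hod FROM student WHERE branch = 'CSE'")
}

# Deletion anomaly: deleting the students also deletes every trace of the branch.
conn.execute("DELETE FROM student WHERE branch = 'CSE'")
branches_left = [
    row[0]
    for row in conn.execute("SELECT DISTINCT branch FROM student WHERE branch IS NOT NULL")
]
```

After the partial update, hods_for_cse holds two different HOD names for the same branch, which is the inconsistency described above. After the delete, branches_left is empty, so the fact that a CSE branch exists is gone.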
Cause of Anomalies
Anomalies are primarily caused by:
• Data redundancy: the same field is replicated multiple times, other than as
a foreign key
• Functional dependencies
Fixing Anomalies
Anomalies can be corrected by:
• Decomposition
• Normalization
• Functional Dependencies
Normalization theory
Database normalization is a schema design technique in which an existing
schema is modified to minimize redundancy and undesirable dependency of data.
It is the process of organizing the data in the database: normalization splits
a large table into smaller tables and defines relationships between them, which
makes the organization of the data clearer.
Normalization minimizes redundancy in a relation or set of relations. It also
eliminates undesirable characteristics such as Insertion, Update and Deletion
anomalies. A normal form is a set of conditions that a table must satisfy; each
normal form reduces redundancy in the database table further.
In the Student_Project relation, the prime (key) attributes are Stu_ID and
Proj_ID. According to the rule of Second Normal Form, the non-key attributes,
i.e. Stu_Name and Proj_Name, must depend on both key attributes together and
not on either of them individually. But we find that Stu_Name can be identified
by Stu_ID alone and Proj_Name can be identified by Proj_ID alone. This is
called partial dependency, which is not allowed in Second Normal Form.
Student
Stu_ID Stu_Name Proj_ID
Project
Proj_ID Proj_Name
We broke the relation in two, as shown above, so that no partial dependency
remains.
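A partial dependency can be spotted by testing functional dependencies against sample rows. The following is a minimal sketch in Python. The helper fd_holds and the sample data are hypothetical and not part of the original notes. Note that a check on sample rows can only disprove a dependency. It cannot prove that the dependency holds for all possible data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if each combination of lhs values maps to one rhs value in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample rows for Student_Project(Stu_ID, Proj_ID, Stu_Name, Proj_Name).
student_project = [
    {"Stu_ID": 1, "Proj_ID": 10, "Stu_Name": "Asha", "Proj_Name": "Compiler"},
    {"Stu_ID": 1, "Proj_ID": 20, "Stu_Name": "Asha", "Proj_Name": "Database"},
    {"Stu_ID": 2, "Proj_ID": 10, "Stu_Name": "Ravi", "Proj_Name": "Compiler"},
]

# Partial dependencies: each non-key attribute depends on only part of the key.
partial_on_stu = fd_holds(student_project, ["Stu_ID"], ["Stu_Name"])
partial_on_proj = fd_holds(student_project, ["Proj_ID"], ["Proj_Name"])
# In this sample, Stu_ID alone does not determine Proj_ID, so the key is composite.
stu_determines_proj = fd_holds(student_project, ["Stu_ID"], ["Proj_ID"])
```

Here both partial_on_stu and partial_on_proj are True while stu_determines_proj is False, which is the situation Second Normal Form forbids.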
Student_id  P_id
101         1
101         2
And the professor table:
P_id  professor  subject
1     P.java     Java
2     P.cpp      C++
Multivalued dependencies
A table is said to have a multi-valued dependency if all of the following
conditions are true:
1. For a dependency A →→ B, a single value of A is associated with multiple
values of B.
2. The table has at least 3 columns.
3. For a relation R(A,B,C), if there is a multi-valued dependency between A and
B, then B and C are independent of each other.
If all these conditions are true for a relation (table), it is said to have a
multi-valued dependency.
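The third condition can be stated precisely: for every value of A, the (B, C) pairs stored with it must be exactly the cross product of its B values and its C values. The Python sketch below checks this on a small relation. The function name mvd_holds and the sample values are assumptions made for illustration only.

```python
from itertools import product

def mvd_holds(rows, a, b, c):
    """Check A ->> B on a 3-column relation given as a list of dicts.

    For every value of A, the (B, C) pairs present must be exactly the
    cross product of the B values and the C values seen with that A.
    """
    groups = {}
    for row in rows:
        groups.setdefault(row[a], set()).add((row[b], row[c]))
    for pairs in groups.values():
        b_values = {p[0] for p in pairs}
        c_values = {p[1] for p in pairs}
        if pairs != set(product(b_values, c_values)):
            return False
    return True

# Hypothetical relation R(A, B, C): every B value is paired with every C value.
complete = [
    {"A": 1, "B": "x", "C": "p"},
    {"A": 1, "B": "x", "C": "q"},
    {"A": 1, "B": "y", "C": "p"},
    {"A": 1, "B": "y", "C": "q"},
]
incomplete = complete[:3]  # the pair ("y", "q") is missing

complete_ok = mvd_holds(complete, "A", "B", "C")
incomplete_ok = mvd_holds(incomplete, "A", "B", "C")
```

The check passes for complete and fails for incomplete, because B and C are independent only when every combination is present.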
Example
Below we have a college enrolment table with columns s_id, course and hobby.
s_id  course   hobby
1     Science  Cricket
1     Maths    Hockey
2     C#       Cricket
2     PHP      Hockey
As you can see in the table above, student with s_id 1 has opted for two
courses, Science and Maths, and has two hobbies, Cricket and Hockey.
What problem can this lead to?
Because the student has two hobbies, each hobby has to be specified along with
both courses. So the two records for the student with s_id 1 give rise to two
more records:
s_id  course   hobby
1     Science  Hockey
1     Maths    Cricket
In this table there is no relationship between the columns course and hobby.
They are independent of each other.
So there is a multi-valued dependency, which leads to unnecessary repetition of
data and to other anomalies as well.
How to satisfy 4th Normal Form?
To make the above relation satisfy the 4th normal form, we can decompose the
table into 2 tables.
CourseOpted table
s_id  course
1     Science
1     Maths
2     C#
2     PHP
And the Hobbies table
s_id  hobby
1     Cricket
1     Hockey
2     Cricket
2     Hockey
Now these two relations satisfy the fourth normal form.
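To see that nothing is lost by this decomposition, the two tables can be joined back on s_id. The following Python sketch uses plain tuples for the rows of the two tables above. The variable names are assumptions chosen for the example.

```python
# Rows of the two decomposed tables, as (s_id, value) tuples.
course_opted = [(1, "Science"), (1, "Maths"), (2, "C#"), (2, "PHP")]
hobbies = [(1, "Cricket"), (1, "Hockey"), (2, "Cricket"), (2, "Hockey")]

# Natural join on s_id: pair every course with every hobby of the same student.
rejoined = sorted(
    (s_id, course, hobby)
    for s_id, course in course_opted
    for h_id, hobby in hobbies
    if s_id == h_id
)
rows_for_student_1 = [row for row in rejoined if row[0] == 1]
```

The join yields four rows for the student with s_id 1, one for each course and hobby combination. The decomposed tables store each course and each hobby only once per student, and the combinations are derived when needed.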
Join Dependencies
If a table can be recreated by joining multiple tables, each of which has a
subset of the attributes of the table, then the table is said to have a join
dependency. It is a generalization of multivalued dependency.
Join dependency is related to 5NF: a relation is in 5NF only if it is already
in 4NF and cannot be decomposed further without losing information.
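A join dependency can be tested on sample data by projecting the table onto each subset of attributes and joining the projections back together. The sketch below is one possible way to do this in Python. The helpers project, natural_join and satisfies_jd are hypothetical names, and the rows reuse the enrolment example from the previous section.

```python
def project(rows, attrs):
    """Project a list of dict rows onto attrs, removing duplicates."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}

def natural_join(left, right):
    """Natural join of two sets of rows, each row a tuple of (attr, value) pairs."""
    result = set()
    for lt in left:
        for rt in right:
            merged = dict(lt)
            # Rows match when every shared attribute has the same value.
            if all(merged.get(a, v) == v for a, v in rt):
                merged.update(rt)
                result.add(tuple(sorted(merged.items())))
    return result

def satisfies_jd(rows, components):
    """True if joining the projections onto each component recreates rows."""
    joined = project(rows, components[0])
    for attrs in components[1:]:
        joined = natural_join(joined, project(rows, attrs))
    original = {tuple(sorted(row.items())) for row in rows}
    return {tuple(sorted(t)) for t in joined} == original

components = [["s_id", "course"], ["s_id", "hobby"]]
all_combinations = [
    {"s_id": 1, "course": "Science", "hobby": "Cricket"},
    {"s_id": 1, "course": "Science", "hobby": "Hockey"},
    {"s_id": 1, "course": "Maths", "hobby": "Cricket"},
    {"s_id": 1, "course": "Maths", "hobby": "Hockey"},
]
two_rows_only = [all_combinations[0], all_combinations[3]]

jd_with_all = satisfies_jd(all_combinations, components)
jd_with_two = satisfies_jd(two_rows_only, components)
```

With all four combinations present the join recreates the table exactly. With only two rows the join produces extra rows, so the table does not satisfy that join dependency. With two components this is the same test as a multivalued dependency, which is why a join dependency is its generalization.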