Dbms Module 3

Uploaded by

csmic.2022
Dbms 3 - Lecture notes of Fourth semester dbms module 3

Database management system (University of Calicut)

DATABASE MANAGEMENT SYSTEM AND RDBMS


Module III
Relational Database Design
• A relational database stores its data in two-dimensional tables (relations).
• A table is made up of rows and columns. A row is also called a record or
tuple; a column is also called a field or attribute.
• A database table is similar to a spreadsheet. However, the relationships
that can be created among the tables enable a relational database to
store large amounts of data efficiently and to retrieve selected data
effectively.
• A language called SQL (Structured Query Language) was developed to
work with relational databases.
Database Design Objective
A well-designed database shall:
• Eliminate Data Redundancy: the same piece of data shall not be stored in
more than one place. Duplicate data not only wastes storage space but also
easily leads to inconsistencies.
• Ensure Data Integrity and Accuracy

Anomalies in a Database
Anomalies are problems caused by bad database design. They typically arise in
relations that are generated directly from user views.
Types:
1. Insertion Anomaly
2. Update Anomaly
3. Deletion Anomaly
To understand these anomalies let us take an example of a Student table.

rollno  name  branch  hod    office_tel
401     Anu   Maths   Mr. X  53337
402     Anju  Maths   Mr. X  53337
403     Raju  Maths   Mr. X  53337
404     Devi  Maths   Mr. X  53337

Insertion Anomaly
Suppose a new student is admitted. Until the student opts for a branch, the
student's data cannot be inserted, or else we will have to set the branch
information as NULL. Also, if we have to insert data of 100 students of the same
branch, then the branch information will be repeated for all those 100
students. These scenarios are Insertion anomalies.
Update Anomaly
What if Mr. X leaves the college, or is no longer the HOD of the Maths
department? In that case all the student records will have to be updated, and if
by mistake we miss any record, it will lead to data inconsistency. This is the
Update anomaly.
Deletion Anomaly
In our Student table, two different kinds of information are kept together:
Student information and Branch information. Hence, at the end of the academic
year, if student records are deleted, we will also lose the branch information.
This is the Deletion anomaly.
Cause of Anomalies
Anomalies are primarily caused by:
• Data redundancy: the same field is replicated in multiple places, other than
as a foreign key
• Undesirable functional dependencies, such as partial and transitive ones
Fixing Anomalies
Anomalies can be corrected by
• Decomposition
• Normalization

Functional Dependencies

A functional dependency exists when one attribute (or set of attributes) of a
table uniquely identifies another attribute of the same table. If column A of a
table uniquely identifies column B of the same table, then this can be
represented as A → B (attribute B is functionally dependent on attribute A).
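
The definition above can be checked against sample data. The following is a minimal sketch, assuming rows are held as Python dictionaries; the function name fd_holds is hypothetical, and the rows are taken from the Student table used earlier in these notes. Note that sample data can only disprove a dependency; whether a dependency really holds is a matter of the design, not of a few rows.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of column names.
    """
    seen = {}  # maps each lhs value combination to the rhs values seen with it
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False  # same lhs values, but different rhs values
    return True


students = [
    {"rollno": 401, "name": "Anu", "branch": "Maths", "hod": "Mr. X"},
    {"rollno": 402, "name": "Anju", "branch": "Maths", "hod": "Mr. X"},
    {"rollno": 403, "name": "Raju", "branch": "Maths", "hod": "Mr. X"},
    {"rollno": 404, "name": "Devi", "branch": "Maths", "hod": "Mr. X"},
]

print(fd_holds(students, ["rollno"], ["name"]))  # rollno -> name
print(fd_holds(students, ["branch"], ["hod"]))   # branch -> hod
print(fd_holds(students, ["branch"], ["name"]))  # branch -> name
```

Here rollno → name and branch → hod hold in the sample rows, while branch → name does not, because one branch has several students.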
Types of Functional Dependencies
• Trivial functional dependency
A → B is a trivial functional dependency if B is a subset of A.
Dependencies such as A → A and B → B are also trivial.
• Non-trivial functional dependency
A → B is a non-trivial functional dependency if B is not a subset of A.
When the intersection of A and B is empty, A → B is called completely
non-trivial.
• Multivalued dependency
• Transitive dependency
When an indirect relationship causes a functional dependency, it is called a
transitive dependency.
If P → Q and Q → R hold, then P → R is a transitive dependency.
To achieve 3NF, eliminate transitive dependencies.
• Full and partial functional dependency
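
The trivial and non-trivial cases in the list above depend only on how the two attribute sets overlap, which can be sketched with Python sets. The function name classify_fd is an assumption made for illustration.

```python
def classify_fd(lhs, rhs):
    """Classify the dependency lhs -> rhs by comparing the two attribute sets."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return "trivial"                 # B is a subset of A
    if lhs.isdisjoint(rhs):
        return "completely non-trivial"  # A and B share no attribute
    return "non-trivial"                 # B is not a subset of A, but they overlap


print(classify_fd({"A", "B"}, {"A"}))
print(classify_fd({"A"}, {"B"}))
print(classify_fd({"A", "B"}, {"B", "C"}))
```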

Normalization theory
Database normalization is a schema design technique by which an existing
schema is modified to minimize redundancy and dependency of data.
Normalization splits a large table into smaller tables and defines
relationships between them, which increases clarity in organizing the data.
It is also used to eliminate undesirable characteristics such as Insertion,
Update and Deletion Anomalies. The normal forms are used to reduce
redundancy in database tables.

The normal forms covered here are:
• First normal form (1NF)
• Second normal form (2NF)
• Third normal form (3NF)
• Boyce-Codd normal form (BCNF)
• Fourth normal form (4NF)
• Fifth normal form or project-join normal form (5NF or PJNF)

First Normal Form(1NF)


For a table to be in the First Normal Form, it should follow these 4 rules:
1. It should only have single (atomic) valued attributes/columns.
2. Values stored in a column should be of the same domain.
3. All the columns in a table should have unique names.
4. The order in which data is stored does not matter.
Rule 1: Single Valued Attributes
Each column of your table should be single valued, which means it should
not contain multiple values. For example, a phone number column that holds two
numbers in one cell violates this rule.
Rule 2: Attribute Domain should not change
This is more of a "Common Sense" rule. In each column the values stored must
be of the same kind or type.
Rule 3: Unique name for Attributes/Columns
This rule expects that each column in a table should have a unique name. This
is to avoid confusion at the time of retrieving data or performing any other
operation on the stored data. If two or more columns have the same name, then
the DBMS will be left confused.
Rule 4: Order doesn't matter
This rule says that the order in which you store the data in your table doesn't
matter.
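
Rule 1 is the one most often broken in practice. As a rough sketch, a column holding several values in one cell can be brought into 1NF by emitting one row per value. The function name to_1nf, the phone column and the phone numbers below are hypothetical and used only for illustration.

```python
def to_1nf(rows, column, sep=","):
    """Split a multi-valued column so that each row holds one atomic value."""
    flat = []
    for row in rows:
        for value in row[column].split(sep):
            # copy the row, replacing the multi-valued cell by a single value
            flat.append({**row, column: value.strip()})
    return flat


raw = [
    {"rollno": 401, "name": "Anu", "phone": "53337, 53338"},  # two values in one cell
    {"rollno": 402, "name": "Anju", "phone": "53339"},
]

for row in to_1nf(raw, "phone"):
    print(row)
```

After the split, every cell is atomic, at the cost of repeating rollno and name; the later normal forms deal with that repetition.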

Second Normal Form(2NF)


For a table to be in the Second Normal Form,
1. It should be in the First Normal form.
2. And, it should not have Partial Dependency.
A relation has no partial dependency when no non-prime attribute (an attribute
that is not part of any candidate key) is dependent on a proper subset of any
candidate key of the table.
Partial Dependency
In second normal form, every non-prime attribute should be fully functionally
dependent on the whole candidate key. That is, if X → A holds for a candidate
key X and a non-prime attribute A, then there should not be any proper subset
Y of X for which Y → A also holds.
Student_Project
Stu_ID  Proj_ID  Stu_Name  Proj_Name

In the Student_Project relation, the prime key attributes are Stu_ID and
Proj_ID. According to the rule, the non-key attributes, i.e. Stu_Name and
Proj_Name, must be dependent upon both and not on any of the prime key
attributes individually. But we find that Stu_Name can be identified by Stu_ID
and Proj_Name can be identified by Proj_ID independently. This is called partial
dependency, which is not allowed in Second Normal Form.
Student
Stu_ID  Stu_Name  Proj_ID
Project
Proj_ID  Proj_Name

We broke the relation in two as shown above, so there exists no partial
dependency. Keeping Proj_ID in Student assumes that each student works on a
single project; if a student can work on several projects, the Stu_ID and
Proj_ID pairs belong in a separate linking relation.
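
The check for partial dependency can be sketched in a few lines. This is a simplified illustration that assumes the relation has a single candidate key; the function name partial_dependencies is hypothetical, and the dependencies are those of the Student_Project example.

```python
def partial_dependencies(key, fds):
    """Return the dependencies whose left side is a proper subset of the key.

    key is a set of attribute names; fds is a list of (lhs, rhs) pairs of sets.
    Attributes on the right side that belong to the key are ignored.
    """
    found = []
    for lhs, rhs in fds:
        non_prime = set(rhs) - key
        if set(lhs) < key and non_prime:  # "<" means proper subset
            found.append((set(lhs), non_prime))
    return found


key = {"Stu_ID", "Proj_ID"}
fds = [
    ({"Stu_ID"}, {"Stu_Name"}),
    ({"Proj_ID"}, {"Proj_Name"}),
]

for lhs, rhs in partial_dependencies(key, fds):
    print(sorted(lhs), "->", sorted(rhs))
```

Both dependencies are reported, which is why Student_Project is not in 2NF.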

Third Normal Form (3NF)

A table is said to be in the Third Normal Form when,
1. It is in the Second Normal form.
2. And, it doesn't have Transitive Dependency.
3. For any non-trivial functional dependency X → A, either X is a superkey or
A is a prime attribute.
Transitive Dependency
Consider a Student_detail relation whose attributes include Stu_ID, Zip and
City, where Stu_ID is the key and the only prime attribute. City can be
identified by Stu_ID as well as by Zip itself. Zip is not a superkey, nor is
City a prime attribute. Additionally, Stu_ID → Zip → City, so there exists a
transitive dependency.
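
Condition 3 can be tested mechanically once the functional dependencies are written down. The sketch below uses the standard attribute-closure computation to decide whether a left side is a superkey; the function names closure and violates_3nf are assumptions, the relation is reduced to the three attributes named above, and the set of prime attributes is supplied by hand.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def violates_3nf(all_attrs, prime_attrs, fds):
    """Return pairs (X, A) where X -> A holds, X is no superkey and A is not prime."""
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == all_attrs
        for attr in sorted(rhs - lhs):  # skip the trivial part of the dependency
            if not is_superkey and attr not in prime_attrs:
                bad.append((lhs, attr))
    return bad


attrs = {"Stu_ID", "Zip", "City"}
fds = [
    (frozenset({"Stu_ID"}), frozenset({"Zip"})),
    (frozenset({"Zip"}), frozenset({"City"})),
]

print(violates_3nf(attrs, {"Stu_ID"}, fds))
```

Only Zip → City is reported: Stu_ID determines every attribute and is therefore a superkey, while Zip is not.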
Fully Functional Dependency
An attribute is fully functionally dependent on a set of attributes if it is
functionally dependent on that set and not on any of its proper subsets.
Partial Functional Dependency
Partial dependency occurs when a non-prime attribute is functionally
dependent on part of a candidate key. The 2nd Normal Form (2NF) eliminates
partial dependency.

Relations with more than one Candidate key


Good and bad Dependencies
• A good functional dependency is a natural (or trivial) functional
dependency, or one whose left-hand side is a key.
• Other functional dependencies, such as partial and transitive ones, are bad
functional dependencies because they cause redundancy.
Boyce-Codd Normal Form
Boyce-Codd Normal Form (BCNF) is a stricter version of the Third Normal Form.
It deals with a certain type of anomaly that is not handled by 3NF. A 3NF table
which does not have multiple overlapping candidate keys is guaranteed to be in
BCNF.
For a table to be in BCNF, the following conditions must be satisfied:

• R must be in 3rd Normal Form
• And, for each non-trivial functional dependency (X → Y), X should be a
super key.
For example, consider a college enrolment table with columns student_id,
subject and professor. In this table:
• One student can enrol for multiple subjects. For example, the student with
student_id 101 has opted for the subjects Java and C++.
• For each subject, a professor is assigned to the student.
• And, there can be multiple professors teaching one subject, as is the case
for Java.
What do you think should be the Primary Key?
Well, student_id and subject together form the primary key, because using
student_id and subject we can find all the columns of the table.
One more important point to note here is that one professor teaches only one
subject, but one subject may have two different professors. Hence, there is a
dependency between subject and professor, where subject depends on
the professor name (professor → subject).
This table satisfies the 1st Normal Form because all the values are atomic,
column names are unique and all the values stored in a particular column are
of the same domain.
This table also satisfies the 2nd Normal Form, as there is no Partial
Dependency. And, there is no Transitive Dependency, hence the table also
satisfies the 3rd Normal Form. But this table is not in Boyce-Codd Normal
Form, because in the dependency professor → subject, professor is not a super
key. To make this relation (table) satisfy BCNF, we will decompose this table
into two tables, student table and professor table.
Below we have the structure for both the tables.
Student Table

student_id  p_id
101         1
101         2

And, Professor Table

p_id  professor  subject
1     P.java     Java
2     P.cpp      C++
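
To see that nothing is lost by this decomposition, the two tables can be joined back on p_id. The sketch below holds the rows shown above as Python tuples; the function names enrolments and professor_determines_subject are hypothetical.

```python
student = [(101, 1), (101, 2)]  # (student_id, p_id)
professor = [(1, "P.java", "Java"), (2, "P.cpp", "C++")]  # (p_id, professor, subject)


def enrolments(student_rows, professor_rows):
    """Join the two tables on p_id to rebuild (student_id, subject, professor)."""
    by_id = {p_id: (name, subject) for p_id, name, subject in professor_rows}
    return [(s_id, by_id[p_id][1], by_id[p_id][0]) for s_id, p_id in student_rows]


def professor_determines_subject(professor_rows):
    """Check that every professor name is stored with exactly one subject."""
    subject_of = {}
    for _, name, subject in professor_rows:
        if subject_of.setdefault(name, subject) != subject:
            return False
    return True


print(enrolments(student, professor))
print(professor_determines_subject(professor))
```

In the professor table, p_id is the key and each professor's subject is stored once, so the dependency professor → subject no longer causes repetition in the enrolment data.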

Multivalued dependencies
A table is said to have a multi-valued dependency if the following conditions
are true:
1. For a dependency A →→ B, a single value of A is associated with multiple
values of B.
2. The table has at least 3 columns.
3. For a relation R(A, B, C), if there is a multi-valued dependency between
A and B, then B and C are independent of each other.
If all these conditions are true for a relation (table), it is said to have a
multi-valued dependency.
For example, below we have a college enrolment table with columns s_id, course
and hobby.

s_id  course   hobby
1     Science  Cricket
1     Maths    Hockey
2     C#       Cricket
2     PHP      Hockey

As you can see in the table above, the student with s_id 1 has opted for two
courses, Science and Maths, and has two hobbies, Cricket and Hockey.
What problem can this lead to?
The two records for the student with s_id 1 will give rise to two more records,
as shown below, because one student has two hobbies, and so each hobby should
be specified along with both courses.

s_id  course   hobby
1     Science  Cricket
1     Maths    Hockey
1     Science  Hockey
1     Maths    Cricket

In the table above, there is no relationship between the columns course
and hobby. They are independent of each other.
So there is a multi-valued dependency, which leads to unnecessary repetition of
data and other anomalies as well.
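
The requirement that every course appears with every hobby of the same student can be written as a small test. This is a sketch of the usual tuple-swapping definition of a multi-valued dependency for a three-column relation; the function name mvd_holds is an assumption, and the rows are those of the s_id 1 example above.

```python
def mvd_holds(rows):
    """Check the multi-valued dependency A ->> B in a relation R(A, B, C).

    rows is a set of (a, b, c) tuples. The dependency holds when, for any two
    rows sharing the same a, the row that combines b of the first with c of
    the second is also present.
    """
    return all(
        (a1, b1, c2) in rows
        for (a1, b1, _c1) in rows
        for (a2, _b2, c2) in rows
        if a1 == a2
    )


two_rows = {(1, "Science", "Cricket"), (1, "Maths", "Hockey")}
four_rows = two_rows | {(1, "Science", "Hockey"), (1, "Maths", "Cricket")}

print(mvd_holds(two_rows))   # the combinations are incomplete
print(mvd_holds(four_rows))  # every course is paired with every hobby
```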

Fourth Normal form


A table is said to be in the Fourth Normal Form when,
1. It is in the Boyce-Codd Normal Form.
2. And, it doesn't have Multi-Valued Dependency.
For example, consider again the college enrolment table with columns s_id,
course and hobby from the Multivalued dependencies section. Because course and
hobby are independent of each other, each course of a student has to be stored
with each of that student's hobbies. The table therefore has a multi-valued
dependency, which leads to unnecessary repetition of data and other anomalies.
How to satisfy 4th Normal Form?
To make the above relation satisfy the 4th normal form, we can decompose the
table into 2 tables.
CourseOpted Table

s_id  course
1     Science
1     Maths
2     C#
2     PHP

And, Hobbies Table

s_id  hobby
1     Cricket
1     Hockey
2     Cricket
2     Hockey

Now each of these relations satisfies the fourth normal form.
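
The decomposition can be reproduced with set comprehensions. In this sketch the enrolment rows are the fully expanded table; the source spells out the expansion only for s_id 1, so the extra rows for s_id 2 are an assumption made by analogy.

```python
enrolment = {
    (1, "Science", "Cricket"), (1, "Maths", "Hockey"),
    (1, "Science", "Hockey"), (1, "Maths", "Cricket"),
    (2, "C#", "Cricket"), (2, "PHP", "Hockey"),
    (2, "C#", "Hockey"), (2, "PHP", "Cricket"),  # assumed expansion for s_id 2
}

# Project onto (s_id, course) and (s_id, hobby); sets drop the duplicates.
course_opted = {(s, c) for s, c, _ in enrolment}
hobbies = {(s, h) for s, _, h in enrolment}

# A natural join on s_id rebuilds the original eight rows.
rejoined = {(s, c, h) for s, c in course_opted for s2, h in hobbies if s == s2}

print(len(enrolment), len(course_opted), len(hobbies))
print(rejoined == enrolment)
```

The two smaller tables hold four rows each instead of eight, and joining them gives back exactly the original rows.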

Join Dependencies
If a table can be recreated by joining multiple tables, each of which has
a subset of the attributes of the table, then the table has a join dependency.
It is a generalization of multivalued dependency.
Join dependency is related to 5NF: a relation is in 5NF only if it is
already in 4NF and it cannot be decomposed further without loss.

Fifth Normal Form


A relation R is in 5NF if and only if every join dependency in R is implied by
the candidate keys of R. A relation decomposed into smaller relations must have
the lossless join property, which ensures that no spurious or extra tuples are
generated when the relations are reunited through a natural join. A relation is
in Fifth Normal Form (5NF) if it is in 4NF and cannot be further decomposed
into smaller tables without loss.

Properties: A relation R is in 5NF if and only if it satisfies the following
conditions:
1. R should already be in 4NF.
2. It cannot be further decomposed without loss (join dependency).
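
Whether a particular decomposition is lossless for a given set of rows can be tested by projecting and joining back. The following is a rough sketch with hypothetical function names (project, natural_join, is_lossless); it tests sample rows only, reusing the enrolment rows from the Fourth Normal Form example, and is not a proof about the schema.

```python
def project(rows, attrs):
    """Project dict rows onto attrs; each result row is a sorted tuple of pairs."""
    return {tuple(sorted((a, row[a]) for a in attrs)) for row in rows}


def natural_join(left, right):
    """Natural join of two sets of rows given as tuples of (attr, value) pairs."""
    joined = set()
    for l in left:
        for r in right:
            merged = dict(l)
            # rows combine only if they agree on every shared attribute
            if all(merged.get(a, v) == v for a, v in r):
                merged.update(r)
                joined.add(tuple(sorted(merged.items())))
    return joined


def is_lossless(rows, *attr_sets):
    """True if joining the projections gives back exactly the original rows."""
    original = {tuple(sorted(row.items())) for row in rows}
    result = project(rows, attr_sets[0])
    for attrs in attr_sets[1:]:
        result = natural_join(result, project(rows, attrs))
    return result == original


full = [
    {"s_id": 1, "course": "Science", "hobby": "Cricket"},
    {"s_id": 1, "course": "Maths", "hobby": "Hockey"},
    {"s_id": 1, "course": "Science", "hobby": "Hockey"},
    {"s_id": 1, "course": "Maths", "hobby": "Cricket"},
]

print(is_lossless(full, ["s_id", "course"], ["s_id", "hobby"]))
print(is_lossless(full[:2], ["s_id", "course"], ["s_id", "hobby"]))
```

With all four rows the join reproduces the table. With only the first two rows, the join generates two spurious tuples, so that decomposition would not be lossless for that data.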
