
DBMS PPT (1).pptx

A Database Management System (DBMS) is software that organizes and manages data, allowing users to create, modify, and query databases while ensuring data integrity and security. DBMS can be classified into Relational and Non-Relational systems, each with distinct data organization methods. While DBMS offers advantages such as data organization, security, and concurrent access, it also has drawbacks including complexity, performance overhead, and cost.

Uploaded by

Random Videos

Introduction of DBMS (Database Management System)

A Database Management System (DBMS) is a software system that is designed to manage and
organize data in a structured manner. It allows users to create, modify, and query a database, as
well as manage the security and access controls for that database.

Key Features of DBMS


Data modeling: A DBMS provides tools for creating and modifying data models, which define the
structure and relationships of the data in a database.

Data storage and retrieval: A DBMS is responsible for storing and retrieving data from the
database, and can provide various methods for searching and querying the data.

Concurrency control: A DBMS provides mechanisms for controlling concurrent access to the
database, to ensure that multiple users can access the data without conflicting with each other.

Data integrity and security: A DBMS provides tools for enforcing data integrity and
security constraints, such as constraints on the values of data and access controls that
restrict who can access the data.

Backup and recovery: A DBMS provides mechanisms for backing up and recovering the
data in the event of a system failure.

DBMS can be classified into two types: Relational Database Management System
(RDBMS) and Non-Relational Database Management System (NoSQL or Non-SQL)

RDBMS: Data is organized in the form of tables and each table has a set of rows and
columns. The data are related to each other through primary and foreign keys.

NoSQL: Data is organized in the form of key-value pairs, documents, graphs, or
column-based stores. These systems are designed to handle large-scale, high-performance scenarios.
File System Approach

File-based systems were an early attempt to computerize the manual system. This is also called
the traditional approach. It was decentralized: each department stored and controlled its own
data with the help of a data processing specialist. The main role of a data processing
specialist was to create the necessary computer file structures, manage the data within those
structures, and design application programs that create reports based on file data.
Consider an example of a student's file system. The student file will contain information
regarding the student (i.e. roll no, student name, course etc.). Similarly, we have a subject file
that contains information about the subject and the result file which contains the information
regarding the result.
Some fields are duplicated in more than one file, which leads to data redundancy. To
overcome this problem, we need a centralized system, i.e. the DBMS approach.
DBMS:
In the database approach, data is kept as a well-organized collection of items that are
related in a meaningful way. The data can be accessed by different users but is stored only
once in the system. The operations performed by a DBMS include insertion, deletion,
selection, sorting, etc.
Disadvantages of File System:

Redundancy of data: Data is said to be redundant if the same data is copied at many places.
If a student wants to change their Phone number, he or she has to get it updated in various
sections. Similarly, old records must be deleted from all sections representing that student.

Inconsistency of Data: Data is said to be inconsistent if multiple copies of the same data do
not match each other. If the Phone number is different in Accounts Section and Academics
Section, it will be inconsistent. Inconsistency may be because of typing errors or not
updating all copies of the same data.

Difficult Data Access: A user must know the exact location of the file to access data, so the
process is cumbersome and tedious. For example, finding the hostel allotment number of one
student among 10000 unsorted student records is slow and error-prone.
Unauthorized Access: File Systems may lead to unauthorized access to data. If a student gets
access to a file having his marks, he can change it in an unauthorized way.

No Concurrent Access: The access of the same data by multiple users at the same time is
known as concurrency. The file system does not allow concurrency as data can be accessed
by only one user at a time.
No Backup and Recovery: The file system does not incorporate any backup and recovery of
data if a file is lost or corrupted.
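
The redundancy and inconsistency problems listed above can be illustrated with a small sketch. The two dictionaries stand in for files kept by separate sections; the names accounts_file and academics_file, the helper function, and the sample values are assumptions made up for this illustration.

```python
# Two sections keep their own copy of the same student's record,
# as in a file-based system. All names and values are illustrative.
accounts_file = {101: {"name": "Asha", "phone": "555-0100"}}
academics_file = {101: {"name": "Asha", "phone": "555-0100"}}

# The student changes their phone number, but only one section is updated.
accounts_file[101]["phone"] = "555-0199"


def find_inconsistencies(file_a, file_b, field):
    """Return the keys whose `field` differs between the two files."""
    return [key for key in file_a
            if key in file_b and file_a[key][field] != file_b[key][field]]


mismatched = find_inconsistencies(accounts_file, academics_file, "phone")
print(mismatched)
```

Storing the phone number once, as the DBMS approach does, removes the need for this kind of cross-checking.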
Advantages of DBMS

Data organization: A DBMS allows for the organization and storage of data in a structured
manner, making it easy to retrieve and query the data as needed.

Data integrity: A DBMS provides mechanisms for enforcing data integrity constraints,
such as constraints on the values of data and access controls that restrict who can access
the data.

Concurrent access: A DBMS provides mechanisms for controlling concurrent access to the
database, to ensure that multiple users can access the data without conflicting with each
other.

Data security: A DBMS provides tools for managing the security of the data, such as
controlling access to the data and encrypting sensitive data.

Backup and recovery: A DBMS provides mechanisms for backing up and recovering the
data in the event of a system failure.

Data sharing: A DBMS allows multiple users to access and share the same data, which can
be useful in a collaborative work environment.
Disadvantages of DBMS

Complexity: DBMS can be complex to set up and maintain, requiring specialized
knowledge and skills.

Performance overhead: The use of a DBMS can add overhead to the performance of an
application, especially in cases where high levels of concurrency are required.

Scalability: The use of a DBMS can limit the scalability of an application, since it requires
the use of locking and other synchronization mechanisms to ensure data consistency.

Cost: The cost of purchasing, maintaining and upgrading a DBMS can be high, especially
for large or complex systems.

Limited Use Cases: Not all use cases are suitable for a DBMS. Some solutions don’t need
high reliability, consistency or security and may be better served by other types of data
storage.
Basis | File System | DBMS
Structure | The file system is a way of arranging the files in a storage medium within a computer. | DBMS is software for managing the database.
Data Redundancy | Redundant data can be present in a file system. | Redundancy is controlled and kept to a minimum in a DBMS.
Backup and Recovery | It doesn’t provide an inbuilt mechanism for backup and recovery of data if it is lost. | It provides built-in tools for backup and recovery of data if it is lost.
Query processing | There is no efficient query processing in the file system. | Efficient query processing is available in DBMS.
Consistency | There is less data consistency in the file system. | There is more data consistency because of the process of normalization.
Complexity | It is less complex as compared to DBMS. | It has more complexity in handling as compared to the file system.
Security | It provides less security than a DBMS. | DBMS has more security.
Cost | It is less expensive than DBMS. | It has a comparatively higher cost than a file system.
Data Independence | There is no data independence. | In DBMS data independence exists, mainly of two types: 1) Logical Data Independence. 2) Physical Data Independence.
User Access | Only one user can access data at a time. | Multiple users can access data at a time.
Meaning | The user has to write procedures for managing the data. | The users are not required to write procedures.
Sharing | Data is distributed in many files, so it is not easy to share data. | Due to its centralized nature, data sharing is easy.
Data Abstraction | It exposes the details of storage and representation of data. | It hides the internal details of the database.
Integrity Constraints | Integrity constraints are difficult to implement. | Integrity constraints are easy to implement.
Attributes | To access data in a file, the user requires attributes such as file name and file location. | No such attributes are required.
Application of DBMS:

Database management systems are used in many different fields. Following are a
few applications that rely on a DBMS –

Railway Reservation System –

In a railway reservation system, the database stores records of ticket bookings and the
status of train arrivals and departures. If a train is running late, passengers learn of it
through a database update.

Library Management System –

A library holds so many books that it is difficult to keep a record of all of them in a
register or notebook. A database management system (DBMS) is therefore used to maintain
the information related to each book, such as its name, issue date, availability, and author.

Banking –
A database management system is used to store customers' transaction information in the
database.
Education Sector –
Nowadays, examinations are conducted online by many colleges and universities, which
manage all examination data through a database management system (DBMS). In addition,
student registration details, grades, courses, fees, attendance, results, and so on are all
stored in the database.

Credit Card Transactions –

A database management system is used for purchases made on credit cards and for the
generation of monthly statements.

Social Media Sites –

We all use social media websites to connect with friends and to share our views with the
world. Every day, many people sign up for social media accounts on sites such as Pinterest,
Facebook, Twitter, and Google Plus. With a database management system, all user
information is stored in the database, and we are able to connect with others.
Telecommunications –
No telecommunication company can function without a DBMS. A database management
system is essential for these companies to store call details and monthly postpaid bills in
the database.

Finance –
A database management system is used for storing information about sales, holdings, and
purchases of financial instruments such as stocks and bonds.

Online Shopping –
Online shopping has become a major trend. Many people prefer not to spend time visiting a
shop and instead shop from home through online shopping websites (for example, Amazon,
Flipkart, Snapdeal). Products are added and sold with the help of a database management
system (DBMS), and invoices, payments, and purchase information are handled through it
as well.
Human Resource Management –
Large firms and organizations have many employees. They store information about
employees' salaries, taxes, and work with the help of a database management system (DBMS).

Manufacturing –
Manufacturing companies make different types of products and sell them regularly.
A database management system (DBMS) is used to keep information about their products,
such as bills, product purchases, quantities, and supply chain management.

Airline Reservation System –

This system is similar to the railway reservation system. It also uses a database
management system to store records of flight departures, arrivals, and delay status.
Healthcare: DBMS is used in healthcare to manage patient data, medical records, and
billing information.

Data retrieval: DBMS provides a way to retrieve data quickly and easily using search
queries.

Data manipulation: DBMS provides tools to manipulate data, such as sorting, filtering,
and aggregating data.

Security: DBMS provides security features to ensure that only authorized users have
access to the data.

Data backup and recovery: DBMS provides tools to back up data and recover it in case of
system failures or data loss.

Multi-user access: DBMS allows multiple users to access and modify data
simultaneously.

Reporting and analysis: DBMS provides tools to generate reports and analyze data to
gain insights and make informed decisions.
Database Languages

Data Definition Language

Data Manipulation Language

Data Control Language

Transactional Control Language


Data Definition Language
DDL is the short name for Data Definition Language, which deals with database
schemas and descriptions, of how the data should reside in the database.

CREATE: to create a database and its objects (tables, indexes, views, stored procedures,
functions, and triggers)

ALTER: alters the structure of the existing database

DROP: delete objects from the database

TRUNCATE: remove all records from a table, including all space allocated for the
records

COMMENT: add comments to the data dictionary

RENAME: rename an object
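
A few of the DDL commands above can be tried with Python's built-in sqlite3 module. This is only a sketch: the student table and its columns are assumptions for illustration, and SQLite is used because it needs no server. SQLite has no TRUNCATE or COMMENT statement, and it expresses RENAME as a form of ALTER TABLE, so other systems will differ in syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define a new table (illustrative schema).
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# ALTER: change the structure of an existing table.
conn.execute("ALTER TABLE student ADD COLUMN course TEXT")

# RENAME: in SQLite, renaming is written as ALTER TABLE ... RENAME TO.
conn.execute("ALTER TABLE student RENAME TO students")


def table_names(connection):
    """List user tables recorded in SQLite's catalog table."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [row[0] for row in rows]


def column_names(connection, table):
    """List column names; `table` is hard-coded by the caller, not user input."""
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


tables_after_rename = table_names(conn)
columns_after_alter = column_names(conn, "students")

# DROP: remove the table together with its data.
conn.execute("DROP TABLE students")
tables_after_drop = table_names(conn)
```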


Data Manipulation Language
DML is the short name for Data Manipulation Language, which deals with data
manipulation and includes the most common SQL statements such as SELECT, INSERT, UPDATE,
and DELETE. It is used to store, modify, retrieve, delete and update data in a
database.

SELECT: retrieve data from a database

INSERT: insert data into a table

UPDATE: updates existing data within a table

DELETE: delete records from a database table (all records, or only those matching a condition)

MERGE: UPSERT operation (insert or update)

CALL: call a PL/SQL or Java subprogram

EXPLAIN PLAN: interpretation of the data access path

LOCK TABLE: concurrency Control
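
The four core DML statements can be sketched with Python's built-in sqlite3 module. The student table and its rows are assumptions for illustration; MERGE, CALL, EXPLAIN PLAN and LOCK TABLE are left out because their syntax is specific to particular database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, course TEXT)")

# INSERT: add rows (placeholders avoid building SQL from strings).
conn.executemany(
    "INSERT INTO student (roll_no, name, course) VALUES (?, ?, ?)",
    [(1, "Asha", "DBMS"), (2, "Ravi", "Networks"), (3, "Meera", "DBMS")],
)

# UPDATE: change existing rows that match the WHERE clause.
conn.execute("UPDATE student SET course = 'OS' WHERE roll_no = 2")

# DELETE: remove only the rows that match the WHERE clause.
# Without a WHERE clause, DELETE would remove every row in the table.
conn.execute("DELETE FROM student WHERE roll_no = 3")

# SELECT: retrieve data.
remaining = conn.execute(
    "SELECT roll_no, name, course FROM student ORDER BY roll_no").fetchall()
print(remaining)
```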


Data Control Language
DCL is short for Data Control Language, which acts as an access specifier to the
database (basically to grant and revoke permissions to users in the database).

GRANT: grant permissions to the user for running DML (SELECT, INSERT, DELETE,…)
commands on the table

REVOKE: revoke permissions from the user for running DML (SELECT, INSERT, DELETE,…)
commands on the specified table
Transactional Control Language
TCL is short for Transactional Control Language, which manages the transactions
performed on the database. Some of the commands of TCL are:

Roll Back: Used to cancel or undo changes made in the current transaction

Commit: Used to apply or save the changes of the current transaction permanently

Save Point: Used to mark a point within a transaction to which the changes can later be
rolled back
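
The three TCL commands can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions: the account table and the savepoint name are made up, and isolation_level=None is passed so that the transaction is controlled only by the explicit BEGIN, SAVEPOINT, ROLLBACK and COMMIT statements.

```python
import sqlite3

# isolation_level=None leaves transaction control to the explicit statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("SAVEPOINT before_second_insert")
conn.execute("INSERT INTO account VALUES (2, 500)")
# Undo only the work done after the savepoint; the first insert survives.
conn.execute("ROLLBACK TO before_second_insert")
# Make the remaining changes permanent.
conn.execute("COMMIT")

committed = conn.execute(
    "SELECT id, balance FROM account ORDER BY id").fetchall()

# A plain ROLLBACK cancels every change made since BEGIN.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
conn.execute("ROLLBACK")
after_rollback = conn.execute(
    "SELECT id, balance FROM account ORDER BY id").fetchall()
```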
DBMS Architecture: The architecture of a DBMS depends on its
design and can be of the following types:
1)Centralized
2)Decentralized
3)Hierarchical
One tier architecture:
One tier architecture keeps all the layers (Presentation, Business, and Data Access)
in a single software package. Applications that handle all three layers themselves, such
as an MP3 player or MS Office, are one tier applications. The data is stored in the
local system or on a shared drive.
Two-Tier Architecture:
1. Client Application (Client Tier)
2. Database (Data Tier)

Client system handles both Presentation and Application layers and Server system
handles Database layer. It is also known as client server application. The
communication takes place between the Client and the Server. Client system sends the
request to the Server system and the Server system processes the request and sends
back the data to the Client System.
Three-tier architecture: It is divided into three parts:
1. Presentation layer (Client Tier)
2. Application layer (Business Tier)
3. Database layer (Data Tier)
Database Models:
A Database model defines the logical design and structure of a database and
defines how data will be stored, accessed and updated in a database
management system.

The Importance of Data Models


Data models can facilitate interaction among the designer, the applications
programmer, and the end user. A well-developed data model can even foster
improved understanding of the organization for which the database design is
developed.

While the Relational Model is the most widely used database model, there are other
models as well.

Types of Data Model:


Hierarchical Model
Network Model
Entity-relationship Model
Relational Model
Advantages of Data model:
1)The main goal of designing a data model is to make certain that data
objects offered by the functional team are represented accurately.

2)The data model should be detailed enough to be used for building the
physical database.

3)The information in the data model can be used for defining the
relationship between tables, primary and foreign keys, and stored
procedures.

4)A data model helps a business to communicate within and across
organizations.

5)It helps to recognize correct sources of data to populate the model.


Hierarchical Model
This database model organises data into a tree-like structure with
a single root, to which all the other data is linked.

The hierarchy starts from the Root data, and expands like a tree, adding child
nodes to the parent nodes.

In this model, a child node will only have a single parent node.
Network Model

This is an extension of the Hierarchical model. In this model data is
organised more like a graph, and child nodes are allowed to have more than one
parent node.
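
The difference between the two models can be sketched with plain Python dictionaries. The node names (College, Department, Course, Student) and both helper functions are assumptions chosen for illustration; they do not describe any particular hierarchical or network DBMS.

```python
# Hierarchical model: every child records exactly one parent (a tree).
hierarchical_parent = {
    "College": None,            # the single root
    "Department": "College",
    "Course": "Department",
    "Student": "Department",
}

# Network model: a child may record several parents (a graph).
network_parents = {
    "College": [],
    "Department": ["College"],
    "Course": ["Department"],
    "Student": ["Department", "Course"],
}


def path_to_root(parent_of, node):
    """Follow single-parent links from `node` up to the root."""
    path = [node]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return path


def nodes_with_many_parents(parents_of):
    """Return the nodes that could not be stored in a strict hierarchy."""
    return sorted(n for n, ps in parents_of.items() if len(ps) > 1)


student_path = path_to_root(hierarchical_parent, "Student")
shared_children = nodes_with_many_parents(network_parents)
```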
Entity-relationship Model

In this model, relationships are created by dividing the object of interest into an entity
and its characteristics into attributes.

E-R models represent these relationships in pictorial form to
make them easier for different stakeholders to understand.
Relational Model
Data is organised in two-dimensional tables and the relationship is maintained by
storing a common field.

This model was introduced by E. F. Codd in 1970, and since then it has been the
most widely used database model.

The basic structure of data in the relational model is tables. All the information
related to a particular type is stored in rows of that table.

Hence, tables are also known as relations in relational model.


Entity–relationship model (ER model) :

Describes the structure of a database with the help of a diagram, which is known
as Entity Relationship Diagram (ER Diagram). An ER model is a design or blueprint of a
database that can later be implemented as a database.

An ER diagram shows the relationship among entity sets. An entity set is a
group of similar entities and these entities can have attributes.

An Entity may be an object with a physical existence – a particular person, car,
house, or employee – or it may be an object with a conceptual existence – a
company, a job, or a university course.
Components of a ER Diagram
1. Entity
An entity is an object or component of data. An entity is represented as a rectangle in
an ER diagram.
For example: Student and College are two entities, and they have a many-to-one
relationship, as many students study in a single college. We will read more about
relationships later; for now focus on entities.

Weak Entity:
An entity that cannot be uniquely identified by its own attributes and relies on the
relationship with other entity is called weak entity. The weak entity is represented
by a double rectangle. For example – a bank account cannot be uniquely identified
without knowing the bank to which the account belongs, so bank account is a weak
entity.
Attribute(s):
Attributes are the properties which define the entity type. For example, Roll_No,
Name, DOB, Age, Address, Mobile_No are the attributes which defines entity type
Student. In ER diagram, attribute is represented by an oval.

1)Key Attribute –
The attribute which uniquely identifies each entity in the entity set is called the key
attribute. For example, Roll_No will be unique for each student. In an ER diagram, a key
attribute is represented by an oval with the attribute name underlined.
Composite Attribute –
An attribute composed of several other attributes is called a composite attribute. For
example, the Address attribute of the Student entity type consists of Street, City, State, and
Country. In an ER diagram, a composite attribute is represented by an oval connected to the
ovals of its component attributes.
Multivalued Attribute –
An attribute that can hold more than one value for a given entity. For example,
Phone_No (a student can have more than one). In an ER diagram, a multivalued
attribute is represented by a double oval.

Derived Attribute –
An attribute which can be derived from other attributes of the entity type is known
as derived attribute. e.g.; Age (can be derived from DOB). In ER diagram, derived
attribute is represented by dashed oval.
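
The four kinds of attribute can be mirrored in a small Python sketch. The class layout, field names and sample values are assumptions for illustration; the point is only that a composite attribute groups simpler ones, a multivalued attribute holds a list, and a derived attribute is computed rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Address:
    # Composite attribute: made up of several simpler attributes.
    street: str
    city: str
    state: str
    country: str


@dataclass
class Student:
    roll_no: int                 # key attribute
    name: str
    dob: date
    address: Address             # composite attribute
    phone_nos: List[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from dob rather than stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)


s = Student(1, "Asha", date(2004, 6, 15),
            Address("12 Main St", "Pune", "Maharashtra", "India"),
            ["555-0100", "555-0101"])
age_before_birthday = s.age(date(2024, 6, 14))
age_on_birthday = s.age(date(2024, 6, 15))
```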
The complete entity type Student with its attributes can be represented as:
Relationship:

The association among entities is called a relationship. For example, an
employee works_at a department, a student enrolls in a course. Here,
Works_at and Enrolls are called relationships.

Relationship Set
A set of relationships of similar type is called a relationship set. Like entities, a
relationship too can have attributes. These attributes are called descriptive
attributes.

Degree of Relationship
The number of participating entities in a relationship defines the degree of
the relationship.

1)Binary = degree 2
2)Ternary = degree 3
3)n-ary = degree n
Mapping Cardinalities:

Cardinality defines the number of entities in one entity set, which can be
associated with the number of entities of other set via relationship set.

One-to-one − One entity from entity set A can be associated with at most one
entity of entity set B and vice versa.
One-to-many −
One entity from entity set A can be associated with more than one entity of
entity set B; however, an entity from entity set B can be associated with at most
one entity of entity set A.
Many-to-many −
One entity from A can be associated with more than one entity from B
and vice versa.
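
The mapping cardinalities above can be checked on sample data with a short Python sketch. The function name and the sample pairs are assumptions for illustration. Note that it only classifies the pairs it is given, whereas a cardinality in an ER model is a rule about every possible state of the database.

```python
from collections import defaultdict


def classify_cardinality(pairs):
    """Classify a relationship set given as (a, b) pairs between sets A and B."""
    b_per_a = defaultdict(set)
    a_per_b = defaultdict(set)
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)
    # Does some entity of A relate to several entities of B, and vice versa?
    a_has_many = any(len(bs) > 1 for bs in b_per_a.values())
    b_has_many = any(len(as_) > 1 for as_ in a_per_b.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"


heads = [("Dept1", "ProfA"), ("Dept2", "ProfB")]
college_students = [("College1", "S1"), ("College1", "S2")]
enrolls = [("S1", "C1"), ("S1", "C2"), ("S2", "C1")]

print(classify_cardinality(heads))
print(classify_cardinality(college_students))
print(classify_cardinality(enrolls))
```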
Relationship
Relationships are represented by diamond-shaped box. Name of the relationship is written
inside the diamond-box. All the entities (rectangles) participating in a relationship, are
connected to it by a line.

Binary Relationship and Cardinality:


A relationship in which two entities participate is called a binary relationship. Cardinality is
the number of instances of an entity that can be associated with the relationship.

One-to-one − When only one instance of an entity is associated with the relationship, it is
marked as '1:1'. The following image reflects that only one instance of each entity should be
associated with the relationship. It depicts one-to-one relationship.
One-to-many −
When more than one instance of an entity is associated with a relationship, it is marked
as '1:N'. The following image reflects that only one instance of entity on the left and
more than one instance of an entity on the right can be associated with the relationship.
It depicts one-to-many relationship.
Many-to-many −
The following image reflects that more than one instance of an entity on the left and
more than one instance of an entity on the right can be associated with the relationship.
It depicts many-to-many relationship.
Participation Constraints:

Total Participation − Each entity is involved in the relationship. Total participation is
represented by double lines.

Partial participation − Not all entities are involved in the relationship. Partial
participation is represented by single lines.
Keys: Keys play an important role in the relational database.

A key is used to uniquely identify any record or row of data in a table. Keys are also
used to establish and identify relationships between tables.

Types of key:
1) Primary key:

It is the key chosen to identify one and only one instance of an
entity uniquely. An entity can have multiple keys, for example in a PERSON
table. The most suitable of these keys becomes the primary
key.
2. Candidate key:
A candidate key is an attribute or minimal set of attributes which can uniquely
identify a tuple.

The primary key is chosen from the candidate keys. The remaining candidate
keys are as strong as the primary key.

3. Super Key
A super key is a set of attributes which can uniquely identify a tuple. Every candidate
key is a super key, and a super key is a superset of a candidate key.
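
The relationship between super keys and candidate keys can be sketched in Python. The sample rows, column names and both functions are assumptions for illustration. The check runs on sample data only: a real key is a rule about every possible row, so uniqueness in a small sample does not prove that an attribute set is a key.

```python
from itertools import combinations

# Illustrative PERSON-like rows; the column names are assumptions.
rows = [
    {"id": 1, "passport": "P1", "name": "Asha"},
    {"id": 2, "passport": "P2", "name": "Ravi"},
    {"id": 3, "passport": "P3", "name": "Asha"},
]


def is_super_key(rows, attrs):
    """True if no two rows agree on all the attributes in `attrs`."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows, attributes):
    """Minimal super keys: no proper subset is itself a super key."""
    found = []
    # Smaller sets are tried first, so any key already found is minimal.
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in found)
            if is_super_key(rows, attrs) and not contains_smaller_key:
                found.append(attrs)
    return found


keys = candidate_keys(rows, ["id", "passport", "name"])
print(keys)
```

Here ("id", "name") is a super key but not a candidate key, because "id" alone already identifies each row.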
4.Foreign key:
•Foreign keys are the columns of a table that point to the primary key of another table.

•In a company, every employee works in a specific department, and employee and department are two
different entities. So we can't store the information of the department in the employee table. That's
why we link these two tables through the primary key of one table.

•We add the primary key of the DEPARTMENT table, Department_Id as a new attribute in the
EMPLOYEE table.

•Now in the EMPLOYEE table, Department_Id is the foreign key, and both the tables are related.
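
The EMPLOYEE and DEPARTMENT example can be sketched with Python's built-in sqlite3 module. Department_Id comes from the text above; the other column names and the sample rows are assumptions for illustration. SQLite enforces foreign keys only after the PRAGMA shown in the first lines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute(
    "CREATE TABLE DEPARTMENT (Department_Id INTEGER PRIMARY KEY, Dept_Name TEXT)")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        Emp_Id INTEGER PRIMARY KEY,
        Emp_Name TEXT,
        Department_Id INTEGER REFERENCES DEPARTMENT (Department_Id)
    )""")

conn.execute("INSERT INTO DEPARTMENT VALUES (10, 'Sales')")
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Asha', 10)")  # valid reference

# An employee row pointing at a department that does not exist is refused.
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES (2, 'Ravi', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The shared Department_Id column is what relates the two tables.
joined = conn.execute("""
    SELECT e.Emp_Name, d.Dept_Name
    FROM EMPLOYEE e JOIN DEPARTMENT d ON e.Department_Id = d.Department_Id
""").fetchall()
```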
Dr Edgar F. Codd, after his extensive research on the Relational Model of database
systems, came up with twelve rules of his own, which according to him, a database
must obey in order to be regarded as a true relational database.
The foundation rule (often numbered Rule 0) states that a relational database system
must be able to manage its stored data using only its relational capabilities. It acts as
a base for all the other rules.

Rule 1: Information Rule


The data stored in a database, may it be user data or metadata, must be a value of
some table cell. Everything in a database must be stored in a table format.

Rule 2: Guaranteed Access Rule


Every single data element (value) is guaranteed to be accessible logically with a
combination of table-name, primary-key (row value), and attribute-name (column
value). No other means, such as pointers, can be used to access data.

Rule 3: Systematic Treatment of NULL Values


The NULL values in a database must be given a systematic and uniform treatment. This
is a very important rule because a NULL can be interpreted as one of the following − data
is missing, data is not known, or data is not applicable.
Rule 4: Active Online Catalog
The structure description of the entire database must be stored in an online
catalog, known as data dictionary, which can be accessed by authorized users.
Users can use the same query language to access the catalog which they use to
access the database itself.

Rule 5: Comprehensive Data Sub-Language Rule


A database can only be accessed using a language having linear syntax that
supports data definition, data manipulation, and transaction management
operations. This language can be used directly or by means of some application.
If the database allows access to data without any help of this language, then it is
considered as a violation.

Rule 6: View Updating Rule


All the views of a database, which can theoretically be updated, must also be
updatable by the system.

Rule 7: High-Level Insert, Update, and Delete Rule


A database must support high-level insertion, update, and deletion. These operations
must not be limited to a single row; that is, the system must also support union,
intersection and minus operations to yield sets of data records.
Rule 8: Physical Data Independence
The data stored in a database must be independent of the applications that access
the database. Any change in the physical structure of a database must not have
any impact on how the data is being accessed by external applications.

Rule 9: Logical Data Independence


The logical data in a database must be independent of its user’s view
(application). Any change in logical data must not affect the applications using it.
For example, if two tables are merged or one is split into two different tables,
there should be no impact or change on the user application. This is one of the
most difficult rules to apply.

Rule 10: Integrity Independence


A database must be independent of the application that uses it. All its integrity
constraints can be independently modified without the need of any change in the
application. This rule makes a database independent of the front-end application
and its interface.
Rule 11: Distribution Independence
The end-user must not be able to see that the data is distributed over
various locations. Users should always get the impression that the data is
located at one site only. This rule has been regarded as the foundation of
distributed database systems.

Rule 12: Non-Subversion Rule


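This rule states that if the system provides a low-level interface to the data,
that interface must not be able to bypass the integrity rules and constraints
expressed in the relational language (such as SQL).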
Extended Entity-Relationship Model:

1)Specialization

2)Generalization

3)Aggregation
Generalization
Generalization is like a bottom-up approach in which two or more entities of lower level
combine to form a higher level entity if they have some attributes in common.

In generalization, an entity of a higher level can also combine with the entities of the lower
level to form a further higher level entity.

Generalization is more like subclass and superclass system, but the only difference is the
approach. Generalization uses the bottom-up approach.

In generalization, entities are combined to form a more generalized entity, i.e., subclasses are
combined to make a superclass.
Specialization
Specialization is a top-down approach, and it is opposite to Generalization. In
specialization, one higher level entity can be broken down into two lower level
entities.

Specialization is used to identify the subset of an entity set that shares some
distinguishing characteristics.

Normally, the superclass is defined first, the subclasses and their related attributes are
defined next, and the relationship sets are then added.
Aggregation
In aggregation, the relation between two entities is treated as a single entity. In
aggregation, relationship with its corresponding entities is aggregated into a higher level
entity.
Reduction of ER diagram to Table
Each entity type becomes a table.
In the given ER diagram, LECTURE, STUDENT, SUBJECT and COURSE form individual tables.

Every single-valued attribute becomes a column of the table.

In the STUDENT entity, STUDENT_NAME and STUDENT_ID form the columns of the STUDENT
table. Similarly, COURSE_NAME and COURSE_ID form the columns of the COURSE table and so
on.

The key attribute of the entity type is represented by the primary key.

In the given ER diagram, COURSE_ID, STUDENT_ID, SUBJECT_ID, and LECTURE_ID are the
key attributes of their entities.

A multivalued attribute is represented by a separate table.

In the student table, hobby is a multivalued attribute, and it is not possible to represent
multiple values in a single column of the STUDENT table. Hence we create a table
STUD_HOBBY with the columns STUDENT_ID and HOBBY. Using both columns together, we
create a composite key.

A composite attribute is represented by its components.

In the given ER diagram, student address is a composite attribute. It contains CITY, PIN,
DOOR#, STREET, and STATE. In the STUDENT table, each of these components becomes an
individual column.
Derived attributes are not stored in the table.
In the STUDENT table, Age is a derived attribute. It can be calculated at any point
of time as the difference between the current date and the Date of Birth.

Table structure for the given ER diagram is as below:


Develop an entity-relationship (ER) diagram of a Library Management System.

Book: Book id, Price, Title, ISBN no., Author, Publication, Available or not.

Publication: Year, Name, id.

Librarian: id, User name, Password, Name, Address, Phone.

Student: library card no., branch, name, mobile, no. of books issued, date of issue and
return, fine, due.
Functional Dependency
A functional dependency is a relationship between two attributes (or sets of attributes)
of a relation. It typically exists between the primary key and a non-key attribute within a table.

X → Y

The left side of the FD is known as the determinant, and the right side is
known as the dependent.

Assume we have an employee table with attributes: Emp_Id, Emp_Name, Emp_Address.

Here the Emp_Id attribute uniquely identifies the Emp_Name attribute of the
employee table, because if we know the Emp_Id, we can tell the employee name
associated with it.

The functional dependency can be written as:

Emp_Id → Emp_Name

We can say that Emp_Name is functionally dependent on Emp_Id.
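A functional dependency can be checked against sample data: X → Y holds if no two rows agree on X but differ on Y. The sketch below is only an illustration; the helper name `holds` and the sample rows are assumptions, not part of the original slides.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for the employee table
employees = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Address": "Pune"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Delhi"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_Address": "Delhi"},
]

print(holds(employees, ["Emp_Id"], ["Emp_Name"]))  # True
print(holds(employees, ["Emp_Name"], ["Emp_Id"]))  # False: two employees share a name
```

Note that sample data can only disprove a dependency. Whether Emp_Id → Emp_Name holds in general is a design decision about the table.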
Types of Functional dependency

1. Trivial functional dependency

A → B is a trivial functional dependency if B is a subset of A.

Dependencies such as A → A and B → B are also trivial.

2. Non-trivial functional dependency

A → B is a non-trivial functional dependency if B is not a subset of A.

When A ∩ B is empty, A → B is called completely non-trivial.
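These definitions translate directly into set tests. The following is a minimal sketch in which attribute sets are written as strings of single-letter attribute names; the function name `classify_fd` is an assumption made for illustration.

```python
def classify_fd(lhs, rhs):
    """Classify the functional dependency lhs -> rhs by comparing attribute sets."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:          # right side is a subset of the left side
        return "trivial"
    if lhs & rhs:           # some overlap, but rhs is not contained in lhs
        return "non-trivial"
    return "completely non-trivial"


print(classify_fd("AB", "A"))   # trivial
print(classify_fd("AB", "BC"))  # non-trivial
print(classify_fd("A", "B"))    # completely non-trivial
```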
Normalization
1) Normalization is the process of organizing the data in the database.

2) Normalization is used to minimize the redundancy from a relation or set of
relations. It is also used to eliminate undesirable characteristics like Insertion,
Update and Deletion Anomalies.

3) Normalization divides a larger table into smaller tables and links them using
relationships.

4) Normal forms are used to reduce redundancy in database tables.
Types of Normal Forms:

The following four normal forms are covered here:

Normal Form   Description

1NF           A relation is in 1NF if every attribute contains only atomic values.

2NF           A relation is in 2NF if it is in 1NF and all non-key attributes are
              fully functionally dependent on the primary key.

3NF           A relation is in 3NF if it is in 2NF and no transitive dependency
              exists.

BCNF          A relation is in BCNF if, for every functional dependency X → Y,
              X is a super key.
First Normal Form (1NF)
A relation is in 1NF if every attribute contains only atomic values.

It states that an attribute of a table cannot hold multiple values. It must hold only
a single value.

First normal form disallows the multi-valued attribute, composite attribute, and their
combinations.

Example: Relation EMPLOYEE is not in 1NF because of the multi-valued attribute
EMP_PHONE.

Employee table:

EMP_ID   EMP_NAME   EMP_PHONE                 EMP_STATE

14       John       7272826385, 9064738238    UP
20       Harry      8574783832                Bihar
12       Sam        7390372389, 8589830302    Punjab
The decomposition of the EMPLOYEE table into 1NF has been shown below:

EMP_ID EMP_NAME EMP_PHONE EMP_STATE

14 John 7272826385 UP
14 John 9064738238 UP
20 Harry 8574783832 Bihar
12 Sam 7390372389 Punjab
12 Sam 8589830302 Punjab
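The decomposition above can be sketched in a few lines: each multi-valued phone list is flattened into one row per phone number. The representation of a row as a tuple holding a list of phones and the function name `to_1nf` are assumptions made for this illustration.

```python
# EMPLOYEE rows before 1NF: EMP_PHONE holds a list of values
employee = [
    (14, "John", ["7272826385", "9064738238"], "UP"),
    (20, "Harry", ["8574783832"], "Bihar"),
    (12, "Sam", ["7390372389", "8589830302"], "Punjab"),
]


def to_1nf(rows):
    """Emit one row per phone number so every attribute value is atomic."""
    return [
        (emp_id, name, phone, state)
        for emp_id, name, phones, state in rows
        for phone in phones
    ]


for row in to_1nf(employee):
    print(row)
```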
Second Normal Form (2NF)
For 2NF, the relation must be in 1NF.

In the second normal form, all non-key attributes are fully functionally dependent on the
primary key.

Closure of an Attribute Set-

The set of all those attributes which can be functionally determined from an
attribute set is called as a closure of that attribute set.

Closure of attribute set {X} is denoted as {X}+.


Example-

Consider a relation R ( A , B , C , D , E , F , G ) with the functional dependencies-


A → BC
BC → DE
D→F
CF → G

Closure of attribute A-

A+ = { A }
= { A , B , C } ( Using A → BC )
= { A , B , C , D , E } ( Using BC → DE )
= { A , B , C , D , E , F } ( Using D → F )
= { A , B , C , D , E , F , G } ( Using CF → G )
Thus,
A+ = { A , B , C , D , E , F , G }

Closure of attribute D-

D+ = { D }
= { D , F } ( Using D → F )
We can not determine any other attribute using attributes D and F contained in the result set.
Thus,
D+ = { D , F }

Closure of attribute set {B, C}-

{ B , C }+ = { B , C }
={B,C,D,E} ( Using BC → DE )
={B,C,D,E,F} ( Using D → F )
= { B , C , D , E , F , G } ( Using CF → G )
Thus,
{ B , C }+ = { B , C , D , E , F , G }
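The closure computation above is a fixed-point iteration: keep applying any dependency whose left side is already inside the result until nothing new is added. A minimal sketch follows; the function name `closure` and the encoding of dependencies as pairs of strings are assumptions for illustration.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs when lhs is covered and rhs adds something new
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# R(A, B, C, D, E, F, G) with the dependencies from the example
fds = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]

print(sorted(closure("A", fds)))   # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(sorted(closure("D", fds)))   # ['D', 'F']
print(sorted(closure("BC", fds)))  # ['B', 'C', 'D', 'E', 'F', 'G']
```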
Example 1

R = {A, B, C, D}
FD = {A→B, B→C, C→D}

Closure of attribute “A”
According to the reflexive rule, attribute “A” can determine attribute “A” itself.
According to the given FD, attribute “A” can directly determine attribute “B”.
According to the transitive property, attribute “A” can determine “C” through “B”.
According to the transitive property, as attribute “A” already determines “C”, attribute “A” can
determine “D” through “C”.
So, Closure of A = A+ = ABCD
Closure of attribute “B”
According to the reflexive rule, attribute “B” can determine attribute “B” itself.
According to the given FD, attribute “B” can directly determine attribute “C”.
According to the transitive property, attribute “B” can determine “D” through “C”.
Attribute “B” cannot determine attribute “A”.
So, Closure of B = B+ = BCD
Closure of attribute “C”
According to the reflexive rule, attribute “C” can determine attribute “C” itself.
According to the given FD, attribute “C” can directly determine attribute “D”.
Attribute “C” cannot determine attributes “A” and “B”.
So, Closure of C = C+ = CD
Closure of attribute “D”
According to the reflexive rule, attribute “D” can determine attribute “D” itself.
Attribute “D” cannot determine attributes “A”, “B”, and “C”.
So, Closure of D = D+ = D
Conclusion: Only the closure of attribute “A” contains all attributes of the
relation, so attribute “A” can be used as the candidate key.
So, Candidate Key = {A}
Example 2
Suppose R = {A, B, C, D} and FD = {A→B, B→C, C→D, D→A}
Closure of attribute “A”
As Attribute “A” can determine itself.
According to FD, Attribute “A” can directly determine Attribute “B”.
According to the transitive property, Attribute “A” can determine “C” through “B”.
According to the transitive property, As Attribute “A” already determines C, Attribute “A” can determine “D” through
“C”.
So, Closure of A = A+ = ABCD
Closure of attribute “B”
As Attribute “B” can determine itself.
According to FD, Attribute “B” can directly determine Attribute “C”.
According to the transitive property, Attribute “B” can determine “D” through “C”.
According to the transitive property, As Attribute “B” already determines D, Attribute “B” can determine “A” through
“D”.
So, Closure of B = B+ = BCDA
Closure of attribute “C”
As Attribute “C” can determine itself.
According to FD, Attribute “C” can directly determine Attribute “D.”
According to the transitive property, As Attribute “C” already determines D, Attribute “C” can determine “A” through
“D”.
As Attribute “C” already determines A So, Attribute “C” can determine “B” through “A”.
So, Closure of C = C+ = CDAB
Closure of attribute “D”
As Attribute “D” can determine itself.
According to FD, Attribute “D” can directly determine Attribute “A.”
According to transitive property, As Attribute “D” already determines A, Attribute “D” can determine “B” through “A”.
As Attribute “D” already determines B, Attribute “D” can determine “C” through “B”.
So, Closure of D = D+ = DABC
Conclusion: The closure of each of the attributes “A”, “B”, “C”, and “D” contains all attributes of the relation, so
each attribute on its own can be used as a candidate key.
So, Candidate Keys = {A}, {B}, {C}, {D}
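The two walkthroughs above can be reproduced mechanically: an attribute set is a candidate key when its closure is the whole relation and no smaller key is contained in it. The sketch below is a brute-force illustration, suitable only for small relations; the function names are assumptions.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Try attribute subsets from smallest to largest and keep the minimal keys."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys


chain = [("A", "B"), ("B", "C"), ("C", "D")]
cycle = chain + [("D", "A")]

print(candidate_keys("ABCD", chain))  # only {'A'}
print(candidate_keys("ABCD", cycle))  # {'A'}, {'B'}, {'C'} and {'D'}
```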
Example (2NF): Let's assume a school stores the data of teachers and the subjects they teach. In a school, a
teacher can teach more than one subject.

TEACHER table
TEACHER_ID SUBJECT TEACHER_AGE

25 Chemistry 30
25 Biology 30
47 English 35
83 Math 38
83 Computer 38

In the given table, the candidate key is {TEACHER_ID, SUBJECT}. The non-prime attribute TEACHER_AGE is
dependent on TEACHER_ID, which is a proper subset of the candidate key. That's why it violates the rule for 2NF.

To convert the given table into 2NF, we decompose it into two tables:

TEACHER_DETAIL table:

TEACHER_ID TEACHER_AGE

25 30
47 35
83 38

TEACHER_SUBJECT table:

TEACHER_ID SUBJECT

25 Chemistry
25 Biology
47 English
83 Math
83 Computer
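The decomposition can be sketched as two projections of the TEACHER table, followed by a join on TEACHER_ID to confirm that no information was lost. This is only an illustration; the variable names are assumptions.

```python
# TEACHER rows: (TEACHER_ID, SUBJECT, TEACHER_AGE)
teacher = [
    (25, "Chemistry", 30),
    (25, "Biology", 30),
    (47, "English", 35),
    (83, "Math", 38),
    (83, "Computer", 38),
]

# Project onto (TEACHER_ID, TEACHER_AGE); the set removes duplicate rows
teacher_detail = sorted({(tid, age) for tid, _, age in teacher})

# Project onto (TEACHER_ID, SUBJECT)
teacher_subject = [(tid, subject) for tid, subject, _ in teacher]

# Rejoin on TEACHER_ID to confirm the decomposition is lossless
age_of = dict(teacher_detail)
rejoined = [(tid, subject, age_of[tid]) for tid, subject in teacher_subject]

print(teacher_detail)       # [(25, 30), (47, 35), (83, 38)]
print(rejoined == teacher)  # True
```

Each teacher's age is now stored once, so updating it cannot leave two rows disagreeing.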
Third Normal Form (3NF)
•A relation is in 3NF if it is in 2NF and does not contain any transitive dependency.

•3NF is used to reduce data duplication. It is also used to achieve data integrity.

•If a relation is in 2NF and there is no transitive dependency for non-prime attributes, then
the relation is in third normal form.

•A relation is in third normal form if it satisfies at least one of the following conditions for every
non-trivial functional dependency X → Y:

•X is a super key.

•Y is a prime attribute, i.e., each element of Y is part of some candidate key.


Example:
EMPLOYEE_DETAIL table:

EMP_ID EMP_NAME EMP_ZIP EMP_STATE EMP_CITY

222 Harry 201010 UP Noida


333 Stephan 02228 US Boston
444 Lan 60007 US Chicago
555 Katharine 06389 UK Norwich
666 John 462007 MP Bhopal

Super keys in the table above:

{EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}, and so on.

Candidate key: {EMP_ID}

Non-prime attributes: In the given table, all attributes except EMP_ID are non-prime.
Here, EMP_STATE and EMP_CITY depend on EMP_ZIP, and EMP_ZIP depends on
EMP_ID. The non-prime attributes (EMP_STATE, EMP_CITY) are therefore transitively dependent on
the super key (EMP_ID), which violates the rule of third normal form.

That's why we need to move EMP_CITY and EMP_STATE to a new
EMPLOYEE_ZIP table, with EMP_ZIP as the primary key.
EMPLOYEE table:

EMP_ID EMP_NAME EMP_ZIP

222 Harry 201010


333 Stephan 02228
444 Lan 60007
555 Katharine 06389
666 John 462007

EMPLOYEE_ZIP table:

EMP_ZIP EMP_STATE EMP_CITY

201010 UP Noida
02228 US Boston
60007 US Chicago
06389 UK Norwich
462007 MP Bhopal
Boyce Codd normal form (BCNF)
•BCNF is the advanced version of 3NF. It is stricter than 3NF.
•A table is in BCNF if, for every non-trivial functional dependency X → Y, X is a super key of the
table.

•For BCNF, the table should be in 3NF, and for every FD, the LHS is a super key.

Example: Let's assume there is a company where employees work in more than one
department.

EMPLOYEE table:

EMP_ID   EMP_COUNTRY   EMP_DEPT     DEPT_TYPE   EMP_DEPT_NO

264      India         Designing    D394        283
264      India         Testing      D394        300
364      UK            Stores       D283        232
364      UK            Developing   D283        549

In the above table, the functional dependencies are as follows:

EMP_ID → EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate key: {EMP_ID, EMP_DEPT}
The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a super key, yet each is the left side of a functional dependency.
To convert the given table into BCNF, we decompose it into three tables:
EMP_COUNTRY table:

EMP_ID EMP_COUNTRY

264 India
364 UK
EMP_DEPT table:

EMP_DEPT DEPT_TYPE EMP_DEPT_NO

Designing D394 283


Testing D394 300
Stores D283 232
Developing D283 549

EMP_DEPT_MAPPING table:
EMP_ID EMP_DEPT

264 Designing
264 Testing
364 Stores
364 Developing
Functional dependencies:
EMP_ID → EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}

Candidate keys:
For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}

Now, this is in BCNF because the left side of both functional dependencies is
a key.
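The BCNF condition can be tested with attribute closure: a dependency X → Y violates BCNF when it is non-trivial and the closure of X does not cover the whole table. The sketch below applies this to the example; the function names and the encoding of dependencies as pairs of tuples are assumptions for illustration.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial dependencies whose left side is not a super key."""
    attributes = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != attributes
    ]


fds = [
    (("EMP_ID",), ("EMP_COUNTRY",)),
    (("EMP_DEPT",), ("DEPT_TYPE", "EMP_DEPT_NO")),
]
employee = {"EMP_ID", "EMP_COUNTRY", "EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}

# Original table: both dependencies violate BCNF
print(len(bcnf_violations(employee, fds)))  # 2

# Decomposed tables, each checked against the dependencies that apply to it
print(bcnf_violations({"EMP_ID", "EMP_COUNTRY"}, fds[:1]))               # []
print(bcnf_violations({"EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}, fds[1:]))  # []
```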
4: Relational Algebra
Relational algebra is a procedural query language, which takes instances of relations as
input and yields instances of relations as output. It uses operators to perform queries.
An operator can be either unary or binary. They accept relations as their input and
yield relations as their output. Relational algebra is performed recursively on a relation
and intermediate results are also considered relations.

Relational algebra mainly provides the theoretical foundation for relational databases
and SQL.
Relational algebra works on a whole table at once, so we do not have to use loops
to iterate over the rows (tuples) of data one by one. We only specify the
relation from which we need the data, and a single expression operates on the
entire relation to produce the result.
Types of Relational operation
1. Select Operation:
The select operation selects tuples that satisfy a given predicate.
It is denoted by sigma (σ).

Notation: σ p(r)

Where:
σ is the selection operator
r is the relation
p is the selection predicate, a propositional logic formula that may use connectives such as AND,
OR and NOT, and relational operators such as =, ≠, ≥, <, >, ≤.

For example −

σ subject = "database" (Books)

Output − Selects tuples from Books where subject is 'database'.

σ subject = "database" and price = "450" (Books)

Output − Selects tuples from Books where subject is 'database' and price is
450.

σ subject = "database" and price = "450" or year > "2010" (Books)

Output − Selects tuples from Books where subject is 'database' and price
is 450, or those books published after 2010.
Select Operation: Notation: σ p(r)

For example: LOAN Relation


BRANCH_NAME LOAN_NO AMOUNT

Downtown L-17 1000


Redwood L-23 2000
Perryride L-15 1500
Downtown L-14 1500
Mianus L-13 500
Roundhill L-11 900
Perryride L-16 1300

Input:
σ BRANCH_NAME="Perryride" (LOAN)
Output:

BRANCH_NAME LOAN_NO AMOUNT

Perryride L-15 1500


Perryride L-16 1300
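The select operation is a filter over the tuples of a relation. A minimal sketch using the LOAN relation above, where the function name `select` and the tuple layout are assumptions for illustration:

```python
# LOAN rows: (BRANCH_NAME, LOAN_NO, AMOUNT)
loan = [
    ("Downtown", "L-17", 1000),
    ("Redwood", "L-23", 2000),
    ("Perryride", "L-15", 1500),
    ("Downtown", "L-14", 1500),
    ("Mianus", "L-13", 500),
    ("Roundhill", "L-11", 900),
    ("Perryride", "L-16", 1300),
]


def select(relation, predicate):
    """Keep only the tuples for which the predicate is true."""
    return [t for t in relation if predicate(t)]


result = select(loan, lambda t: t[0] == "Perryride")
print(result)  # [('Perryride', 'L-15', 1500), ('Perryride', 'L-16', 1300)]
```

Because the comparison is case-sensitive here, the predicate value must match the stored branch name exactly.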
Project Operation (∏)

Project operation is used to project only a certain set of attributes of a relation. In
simple words, if you want to see only the names of all the students in
the Student table, then you can use the Project operation.
It will only project or show the columns or attributes asked for, and will also remove
duplicate rows from the result.

Syntax: ∏A1, A2...(r)

where A1, A2 etc. are attribute names (column names).

For example,
∏Name, Age(Student)
The above statement will show us only the Name and Age columns for all the
rows of data in the Student table.
2. Project Operation: Notation: ∏ A1, A2, An (r)

Example: CUSTOMER RELATION


NAME STREET CITY

Jones Main Harrison


Smith North Rye
Hays Main Harrison
Curry North Rye
Johnson Alma Brooklyn
Brooks Senator Brooklyn

Input: ∏ NAME, CITY (CUSTOMER)

Output:
NAME CITY

Jones Harrison
Smith Rye
Hays Harrison
Curry Rye
Johnson Brooklyn
Brooks Brooklyn
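Projection keeps the requested columns and drops duplicate rows. The sketch below illustrates this on the CUSTOMER relation; the function name `project` and the dictionary representation of rows are assumptions.

```python
customer = [
    {"NAME": "Jones", "STREET": "Main", "CITY": "Harrison"},
    {"NAME": "Smith", "STREET": "North", "CITY": "Rye"},
    {"NAME": "Hays", "STREET": "Main", "CITY": "Harrison"},
    {"NAME": "Curry", "STREET": "North", "CITY": "Rye"},
    {"NAME": "Johnson", "STREET": "Alma", "CITY": "Brooklyn"},
    {"NAME": "Brooks", "STREET": "Senator", "CITY": "Brooklyn"},
]


def project(relation, attributes):
    """Keep only the named attributes and remove duplicate rows, preserving order."""
    seen = set()
    result = []
    for row in relation:
        t = tuple(row[a] for a in attributes)
        if t not in seen:
            seen.add(t)
            result.append(t)
    return result


print(project(customer, ["NAME", "CITY"]))  # six rows, one per customer
print(project(customer, ["CITY"]))          # [('Harrison',), ('Rye',), ('Brooklyn',)]
```

Projecting onto CITY alone shows the duplicate elimination: six customers collapse to three distinct cities.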
Union Operation:
Suppose there are two relations R and S. The union operation contains all the tuples
that are either in R or S or both in R & S.
It eliminates the duplicate tuples. It is denoted by ∪.

Notation: R ∪ S
A union operation must satisfy the following conditions:
R and S must have the same number of attributes.
Duplicate tuples are eliminated automatically.

DEPOSITOR RELATION
CUSTOMER_NAME ACCOUNT_NO

Johnson A-101
Smith A-121
Mayes A-321
Turner A-176
Johnson A-273
Jones A-472
Lindsay A-284

BORROW RELATION
CUSTOMER_NAME LOAN_NO

Jones L-17
Smith L-23
Hayes L-15
Jackson L-14
Curry L-93
Smith L-11
Williams L-17
Input:
∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR)

Output:
CUSTOMER_NAME

Johnson
Smith
Hayes
Turner
Jones
Lindsay
Jackson
Curry
Williams
Mayes
Set Intersection:
Suppose there are two relations R and S. The set intersection operation contains all tuples
that are in both R & S.
It is denoted by ∩.

Notation: R ∩ S

Example: Using the above DEPOSITOR table and BORROW table

Input: ∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (DEPOSITOR)

Output:
CUSTOMER_NAME

Smith
Jones
Set Difference:
Suppose there are two relations R and S. The set difference operation contains all
tuples that are in R but not in S.
It is denoted by minus (-).

Notation: R - S

Example: Using the above DEPOSITOR table and BORROW table


Input:
∏ CUSTOMER_NAME (BORROW) - ∏ CUSTOMER_NAME (DEPOSITOR)

Output: CUSTOMER_NAME

Jackson
Hayes
Williams
Curry
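Union, intersection and difference map directly onto Python's set operators once both relations are projected onto CUSTOMER_NAME. A short sketch using the DEPOSITOR and BORROW relations, with variable names chosen for illustration:

```python
depositor = [
    ("Johnson", "A-101"), ("Smith", "A-121"), ("Mayes", "A-321"),
    ("Turner", "A-176"), ("Johnson", "A-273"), ("Jones", "A-472"),
    ("Lindsay", "A-284"),
]
borrow = [
    ("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"),
    ("Jackson", "L-14"), ("Curry", "L-93"), ("Smith", "L-11"),
    ("Williams", "L-17"),
]

# Projection onto CUSTOMER_NAME; a set drops duplicates such as the two Smith rows
depositor_names = {name for name, _ in depositor}
borrow_names = {name for name, _ in borrow}

union = borrow_names | depositor_names
intersection = borrow_names & depositor_names
difference = borrow_names - depositor_names

print(len(union))            # 10
print(sorted(intersection))  # ['Jones', 'Smith']
print(sorted(difference))    # ['Curry', 'Hayes', 'Jackson', 'Williams']
```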
Cartesian product
The Cartesian product is used to combine each row in one table with each row in
the other table. It is also known as a cross product.

It is denoted by × (written here as X).

Example:
EMPLOYEE
EMP_ID EMP_NAME EMP_DEPT

1 Smith A
2 Harry C
3 John B

DEPARTMENT

DEPT_NO DEPT_NAME

A Marketing
B Sales
C Legal

Input: EMPLOYEE X DEPARTMENT


Output:

EMP_ID EMP_NAME EMP_DEPT DEPT_NO DEPT_NAME

1 Smith A A Marketing
1 Smith A B Sales
1 Smith A C Legal
2 Harry C A Marketing
2 Harry C B Sales
2 Harry C C Legal
3 John B A Marketing
3 John B B Sales
3 John B C Legal
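The Cartesian product pairs every row of one table with every row of the other, so 3 rows and 3 rows give 9 rows. A minimal sketch with `itertools.product` from the Python standard library, using tuples for rows:

```python
from itertools import product

# EMPLOYEE rows: (EMP_ID, EMP_NAME, EMP_DEPT)
employee = [(1, "Smith", "A"), (2, "Harry", "C"), (3, "John", "B")]
# DEPARTMENT rows: (DEPT_NO, DEPT_NAME)
department = [("A", "Marketing"), ("B", "Sales"), ("C", "Legal")]

# Concatenate each employee tuple with each department tuple
cross = [e + d for e, d in product(employee, department)]

print(len(cross))  # 9
print(cross[0])    # (1, 'Smith', 'A', 'A', 'Marketing')
```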
Join Operations:
A Join operation combines related tuples from different relations, if and only if a given
join condition is satisfied.

It is denoted by ⋈.

Example:

EMPLOYEE

EMP_CODE   EMP_NAME

101        Stephan
102        Jack
103        Harry

SALARY

EMP_CODE   SALARY

101        50000
102        30000
103        25000

Operation: (EMPLOYEE ⋈ SALARY)

Result:

EMP_CODE   EMP_NAME   SALARY

101        Stephan    50000
102        Jack       30000
103        Harry      25000
Types of Join operations:
1. Natural Join:
A natural join is the set of tuples of all combinations in R and S that are equal on their
common attribute names.

It is denoted by ⋈.

Example: Let's use the above EMPLOYEE table and SALARY table:

Input:
∏EMP_NAME, SALARY (EMPLOYEE ⋈ SALARY)

Output:

EMP_NAME SALARY

Stephan 50000
Jack 30000
Harry 25000
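A natural join can be sketched as a nested loop that keeps a pair of rows only when they agree on every attribute name they share. The function name `natural_join` and the dictionary representation of rows are assumptions for this illustration.

```python
employee = [
    {"EMP_CODE": 101, "EMP_NAME": "Stephan"},
    {"EMP_CODE": 102, "EMP_NAME": "Jack"},
    {"EMP_CODE": 103, "EMP_NAME": "Harry"},
]
salary = [
    {"EMP_CODE": 101, "SALARY": 50000},
    {"EMP_CODE": 102, "SALARY": 30000},
    {"EMP_CODE": 103, "SALARY": 25000},
]


def natural_join(r, s):
    """Combine rows of r and s that are equal on all common attribute names."""
    result = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            if all(a[k] == b[k] for k in common):
                result.append({**a, **b})
    return result


joined = natural_join(employee, salary)
# Projection onto EMP_NAME and SALARY, as in the example above
print([(row["EMP_NAME"], row["SALARY"]) for row in joined])
```

If the two relations share no attribute names, every pair passes the test and the result degenerates to the Cartesian product.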
2. Outer Join:
The outer join operation is an extension of the join operation. It is used to deal with
missing information.
Example:
EMPLOYEE

EMP_NAME STREET CITY

Ram Civil line Mumbai


Shyam Park street Kolkata
Ravi M.G. Street Delhi
Hari Nehru nagar Hyderabad

FACT_WORKERS

EMP_NAME BRANCH SALARY

Ram Infosys 10000


Shyam Wipro 20000
Kuber HCL 30000
Hari TCS 50000
Input: (EMPLOYEE ⋈ FACT_WORKERS)

Output:

EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000


Shyam Park street Kolkata Wipro 20000
Hari Nehru nagar Hyderabad TCS 50000

An outer join is basically of three types:


Left outer join
Right outer join
Full outer join
a. Left outer join:
•A left outer join contains all tuples of the natural join of R and S, i.e. the combinations
that are equal on their common attribute names.

•It also keeps the tuples in R that have no matching tuples in S, filling the attributes of S with NULL.

•It is denoted by ⟕.

Example: Using the above EMPLOYEE table and FACT_WORKERS table

Input:
EMPLOYEE ⟕ FACT_WORKERS

Output:

EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000
Shyam Park street Kolkata Wipro 20000
Hari Nehru nagar Hyderabad TCS 50000
Ravi M.G. Street Delhi NULL NULL


b. Right outer join:
•A right outer join contains all tuples of the natural join of R and S, i.e. the combinations
that are equal on their common attribute names.

•It also keeps the tuples in S that have no matching tuples in R, filling the attributes of R with NULL.

•It is denoted by ⟖.

Example: Using the above EMPLOYEE table and FACT_WORKERS table

Input:
EMPLOYEE ⟖ FACT_WORKERS

Output:

EMP_NAME BRANCH SALARY STREET CITY

Ram Infosys 10000 Civil line Mumbai
Shyam Wipro 20000 Park street Kolkata
Hari TCS 50000 Nehru nagar Hyderabad
Kuber HCL 30000 NULL NULL
c. Full outer join:
A full outer join is like a left or right outer join except that it contains all rows from both tables.
In a full outer join, the tuples in R that have no matching tuples in S and the tuples in S that have
no matching tuples in R are also included, with NULL in the missing attributes.

It is denoted by ⟗.

Example: Using the above EMPLOYEE table and FACT_WORKERS table

Input:
EMPLOYEE ⟗ FACT_WORKERS

Output:

EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000


Shyam Park street Kolkata Wipro 20000
Hari Nehru nagar Hyderabad TCS 50000
Ravi M.G. Street Delhi NULL NULL
Kuber NULL NULL HCL 30000
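The three outer joins differ only in which unmatched rows they keep. The sketch below assumes EMP_NAME is unique in both tables so that each table can be stored as a dictionary keyed by name; the function name `outer_join` and its flags are illustrative, and `None` stands in for NULL.

```python
# EMP_NAME -> (STREET, CITY)
employee = {
    "Ram": ("Civil line", "Mumbai"),
    "Shyam": ("Park street", "Kolkata"),
    "Ravi": ("M.G. Street", "Delhi"),
    "Hari": ("Nehru nagar", "Hyderabad"),
}
# EMP_NAME -> (BRANCH, SALARY)
fact_workers = {
    "Ram": ("Infosys", 10000),
    "Shyam": ("Wipro", 20000),
    "Kuber": ("HCL", 30000),
    "Hari": ("TCS", 50000),
}


def outer_join(left, right, keep_left=True, keep_right=True):
    """Join on the key; optionally keep unmatched rows from either side."""
    rows = []
    for name in left:
        if name in right:
            rows.append((name, *left[name], *right[name]))
        elif keep_left:
            rows.append((name, *left[name], None, None))
    if keep_right:
        for name in right:
            if name not in left:
                rows.append((name, None, None, *right[name]))
    return rows


left_join = outer_join(employee, fact_workers, keep_right=False)
right_join = outer_join(employee, fact_workers, keep_left=False)
full_join = outer_join(employee, fact_workers)

print(len(left_join), len(right_join), len(full_join))  # 4 4 5
```

All three results use the column order EMP_NAME, STREET, CITY, BRANCH, SALARY, which differs from the column order shown for the right outer join above.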
3. Equi join:
An equi join is the most common form of inner join. It combines rows based on matched data
as per an equality condition, so it uses only the comparison operator (=).
Example:

CUSTOMER RELATION

CLASS_ID NAME

1 John
2 Harry
3 Jackson

PRODUCT

PRODUCT_ID CITY

1 Delhi
2 Mumbai
3 Noida
Input:
CUSTOMER ⋈ (CUSTOMER.CLASS_ID = PRODUCT.PRODUCT_ID) PRODUCT

Output:

CLASS_ID NAME PRODUCT_ID CITY

1 John 1 Delhi
2 Harry 2 Mumbai
3 Jackson 3 Noida
Find the left outer join, right outer join and full outer join of the following tables.

Table A

Number   Square

2        4
3        9
4        16

Table B

Number   Cube

2        8
3        27
5        125

Left Outer Join (A ⟕ B)

Number   Square   Cube

2        4        8
3        9        27
4        16       –

Right Outer Join (A ⟖ B)

Number   Square   Cube

2        4        8
3        9        27
5        –        125

Full Outer Join (A ⟗ B)

Number   Square   Cube

2        4        8
3        9        27
4        16       –
5        –        125
Relational Calculus:

Relational calculus is a non-procedural query language. In a non-procedural query
language, the user is not concerned with the details of how to obtain the end results.

The relational calculus tells what to do but never explains how to do it.

Types of Relational calculus:


1. Tuple Relational Calculus (TRC)

The tuple relational calculus is used to select tuples from a relation. In TRC, the filtering
variable ranges over the tuples of a relation.

The result of the query can have one or more tuples.

Notation:
{T | P (T)} or {T | Condition (T)}

Where
T is the resulting tuple

P(T) is the condition used to fetch T.


For example:

{ T.name | Author(T) AND T.article = 'database' }

Output: This query selects tuples from the Author relation. It returns the
'name' of every author who has written an article on 'database'.

TRC (tuple relational calculus) expressions can be quantified. In TRC, we can use the Existential (∃) and
Universal (∀) quantifiers.

For example:

{ R| ∃T ∈ Authors(T.article='database' AND R.name=T.name)}

Output: This query will yield the same result as the previous one.
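A TRC expression reads much like a set comprehension: a tuple variable ranges over a relation and a condition filters it. The sketch below mirrors both queries; the sample Author rows are hypothetical and not part of the original slides.

```python
# Hypothetical Author relation
authors = [
    {"name": "Asha", "article": "database"},
    {"name": "Ravi", "article": "networks"},
    {"name": "Meena", "article": "database"},
]

# { T.name | Author(T) AND T.article = 'database' }
result = {t["name"] for t in authors if t["article"] == "database"}

# { R | ∃T ∈ Authors (T.article = 'database' AND R.name = T.name) }
all_names = {t["name"] for t in authors}
result_exists = {
    name
    for name in all_names
    if any(t["article"] == "database" and t["name"] == name for t in authors)
}

print(sorted(result))           # ['Asha', 'Meena']
print(result == result_exists)  # True
```

The built-in `any` plays the role of the existential quantifier; `all` would play the role of the universal one.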
2. Domain Relational Calculus (DRC)

The second form of relational calculus is known as domain relational calculus. In domain relational
calculus, the filtering variables range over the domains of attributes.

Domain relational calculus uses the same operators as tuple calculus. It uses the logical
connectives ∧ (and), ∨ (or) and ¬ (not).

It uses Existential (∃) and Universal Quantifiers (∀) to bind the variable.

Notation:

{ a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)}

Where
a1, a2, ..., an are attributes

P stands for the formula built from these attributes


For example:

{< article, page, subject > | < article, page, subject > ∈ javatpoint ∧ subject = 'database'}

Output: This query will yield the article, page, and subject from the relation
javatpoint, where the subject is 'database'.

Example 2:

{< name, age > | < name, age > ∈ Student ∧ age > 17}

The above query will return the names and ages of the students in the table Student who are older than 17.

Example 3:

{< Fname, Emp_ID > | < Fname, Emp_ID > ∈ Employee ∧ (Salary > 10000 ∨ deptid = 10)}

The result here will be the Fname and Emp_ID values for all the rows in the Employee table where salary is greater than 10000
or department is 10.
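In DRC the variables stand for individual attribute values rather than whole tuples, which corresponds to unpacking each row into named fields. The sketch below mirrors Examples 2 and 3; the sample rows and the column order of the Employee table are hypothetical.

```python
# Hypothetical Student rows: (name, age)
student = [("Asha", 19), ("Ravi", 17), ("Meena", 21)]

# {<name, age> | <name, age> ∈ Student ∧ age > 17}
older = [(name, age) for (name, age) in student if age > 17]

# Hypothetical Employee rows: (Fname, Emp_ID, Salary, deptid)
employee = [
    ("Asha", 1, 12000, 20),
    ("Ravi", 2, 8000, 10),
    ("Meena", 3, 9000, 30),
]

# Fname and Emp_ID where Salary > 10000 or deptid = 10
selected = [
    (fname, emp_id)
    for (fname, emp_id, salary, deptid) in employee
    if salary > 10000 or deptid == 10
]

print(older)     # [('Asha', 19), ('Meena', 21)]
print(selected)  # [('Asha', 1), ('Ravi', 2)]
```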
