Module - 1 DBMS Notes
This course provides fundamental and practical knowledge of database concepts: how to
organize information, and how to store and retrieve it efficiently and flexibly using a
well-structured relational model. Every student will gain experience in creating data
models and designing databases.
Course Objectives
Demonstrate basic database concepts, including the structure and operation of the
relational data model.
Introduce simple and moderately advanced database queries using Structured Query
Language (SQL).
Explain and successfully apply logical database design principles, including E-R
diagrams and database normalization.
Demonstrate the concept of a database transaction and related database facilities,
including concurrency control, data object locking, and locking protocols.
A database table is a collection of rows and columns that is used to organize information
about a single topic or object. Each row within a table corresponds to a single record and
contains several attributes that describe that record.
A Database Table is the most common and simplest form of data storage in a relational
Database.
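As a minimal sketch of the definition above, the following Python snippet uses the
standard sqlite3 module to create a table and store two records. The Students table, its
columns, and the sample rows are illustrative assumptions, not part of the course material.

```python
import sqlite3

# In-memory database (assumption for the demo); nothing is written to disk.
conn = sqlite3.connect(":memory:")

# A table describes one topic (students); each column is an attribute.
conn.execute("CREATE TABLE Students (sid TEXT, sname TEXT, age INTEGER)")

# Each inserted row is one record.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [("S1", "Raghu", 25), ("S2", "Sravya", 22)],
)

rows = conn.execute("SELECT sid, sname, age FROM Students ORDER BY sid").fetchall()
print(rows)
```

Each tuple in rows is one record, and each position in a tuple is one attribute value.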
History of DBMS:
Early 1960s: Charles Bachman at General Electric designed the first general-purpose
DBMS, the Integrated Data Store, which formed the basis of the network data model.
Late 1960s: IBM developed the Information Management System (IMS), which formed the
basis of an alternative data representation framework, the hierarchical data model.
1970: Edgar Codd, at IBM’s San Jose Research Laboratory, proposed a new data
representation framework, the relational data model.
1980s: The relational data model became the most widely used model. The SQL query
language for relational databases was developed as part of IBM’s System R project.
Late 1980s: SQL was standardized. A later revision of the standard is SQL:1999.
James Gray won the Turing Award for his contributions to database transaction
management.
9. File system: security is weaker, and backup and restore are problematic.
DBMS: security is stronger, and backup and restore can be done.
Advantages of DBMS
1. Data Independence:
Application programs should not, ideally, be exposed to details of data representation and
storage. The DBMS provides an abstract view of the data that hides such details (a
separation of data and metadata).
2. Efficient Data Access:
A DBMS uses a variety of techniques to store and retrieve data efficiently, and the use of
standardized SQL commands makes accessing the data an easier task.
3. Data Integrity and Security:
If data is always accessed through the DBMS, the DBMS can enforce integrity constraints
on the data stored in tables; these are declared when the schema is created. It can also
enforce access controls that govern what data is visible to different classes of users.
Ex: bank (minimum amount in an account), student result (pass mark), faculty (salary)
4. Data Administration:
When several users share the data, centralizing the administration of the data offers
significant improvements. The administrator is responsible for organizing the data
representation to minimize redundancy.
5. Concurrent Access and Crash Recovery:
A DBMS schedules concurrent accesses to the data in such a manner that users can think of
the data as being accessed by only one user at a time. It also protects users from the
effects of system failures.
6. Reduced Application Development Time:
The DBMS supports important functions that are common to many applications accessing
data in the DBMS. This, in conjunction with the high-level interface to the data, facilitates
quick application development.
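The integrity constraints mentioned under Data Integrity and Security can be sketched with
a CHECK constraint. The snippet below is an illustration under stated assumptions: the
Accounts table and the minimum balance of 500 are made up for the bank example, and SQLite
(through Python's standard sqlite3 module) stands in for a full DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The constraint is declared once, when the schema is created.
# The minimum balance of 500 is an assumed business rule.
conn.execute(
    "CREATE TABLE Accounts ("
    " acct_no INTEGER PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 500)"
    ")"
)
conn.execute("INSERT INTO Accounts VALUES (1, 1200.0)")  # satisfies the rule

try:
    conn.execute("INSERT INTO Accounts VALUES (2, 100.0)")  # violates the rule
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the DBMS refused the row; no application code checked it

count = conn.execute("SELECT COUNT(*) FROM Accounts").fetchone()[0]
print(rejected, count)
```

Because the rule lives in the schema, every application that inserts into Accounts is
subject to it without repeating the check.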
Information about the three levels of schemas is stored in the system catalog.
Note: the collection of files corresponding to user tables and indexes represents the data.
The data in a DBMS is described at 3 levels of abstraction.
The database description consists of a schema at each of these 3 levels of abstraction:
1) Conceptual
2) Physical
3) External
Fig: Levels of Schema
External Schema:
The external schema, or view level, allows data access to be authorized at the level of
one user or a group of users.
The uppermost level is called the view level. It describes only part of the database:
a variety of information is stored in the database, but a user wants access to only
some of it.
The external schema is designed to match the user's requirements.
This level simplifies the user's interaction with the system and hides the complexity
that arises in the conceptual schema.
The external schema consists of a collection of one or more views and relations from
the conceptual schema.
A view looks like a table to the user, but its records are not stored explicitly; they
are computed as needed from the stored relations.
Eg: Courseinfo(cid: string, fname: string, enrollment: integer)
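A view of this kind can be sketched in SQL. The example below is a hedged illustration:
the Faculty, Teaches, and Enrolled tables, their columns, and the sample rows are
assumptions chosen so that a course-information view keyed on a course identifier (cid,
an assumed column name) can be computed from them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stored relations (assumed conceptual schema for the demo).
CREATE TABLE Faculty  (fid TEXT PRIMARY KEY, fname TEXT);
CREATE TABLE Teaches  (fid TEXT, cid TEXT);
CREATE TABLE Enrolled (sid TEXT, cid TEXT);

INSERT INTO Faculty  VALUES ('F1', 'Rao');
INSERT INTO Teaches  VALUES ('F1', 'CS101');
INSERT INTO Enrolled VALUES ('S1', 'CS101'), ('S2', 'CS101');

-- The view stores no rows of its own; it is computed on demand.
CREATE VIEW Courseinfo AS
SELECT t.cid AS cid, f.fname AS fname, COUNT(e.sid) AS enrollment
FROM Teaches t
JOIN Faculty f ON f.fid = t.fid
LEFT JOIN Enrolled e ON e.cid = t.cid
GROUP BY t.cid, f.fname;
""")

info = conn.execute("SELECT cid, fname, enrollment FROM Courseinfo").fetchall()
print(info)
```

The user queries Courseinfo as if it were a table, while the enrollment count is derived
from Enrolled each time the view is read.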
Conceptual Schema:
The conceptual schema, also called the logical schema, describes the stored data in terms
of the data model.
In a relational DBMS, the conceptual schema describes all relations that are stored in the
database.
Arriving at a good conceptual schema, that is, choosing the relations and the fields for
each relation/table, is called conceptual database design.
Physical Schema:
A physical data model is a representation of a data design as implemented, or
intended to be implemented, in a DBMS.
It typically derives from a logical data model, though it may be reverse-engineered
from a given database implementation.
It specifies additional storage details, i.e., how the relations in the conceptual schema
are stored on secondary storage devices such as disks and tapes.
Eg: Store all relations as unsorted files of records. (A file in a DBMS is either a
collection of records or a collection of pages, rather than a collection of characters
as in an operating system.)
Eg: Create indexes on the first column of the Students, Faculty, and Courses relations,
the sal column of Faculty, and the capacity column of Rooms.
Decisions about the physical schema are based on an understanding of how the data is
accessed. The process of arriving at a good physical schema is called physical
database design.
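The index choices above can be sketched as follows. This is an illustration only: the
column lists of Faculty and Rooms and the index names are assumptions, and SQLite is used
because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual schema (column lists are assumed for the demo).
CREATE TABLE Faculty (fid TEXT, fname TEXT, sal REAL);
CREATE TABLE Rooms   (rno INTEGER, address TEXT, capacity INTEGER);

-- Physical design choices: the tables and the queries stay the same.
CREATE INDEX idx_faculty_fid    ON Faculty (fid);
CREATE INDEX idx_faculty_sal    ON Faculty (sal);
CREATE INDEX idx_rooms_capacity ON Rooms (capacity);
""")

# SQLite records every index in its catalog table, sqlite_master.
index_names = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name"
    )
]
print(index_names)
```

Adding or dropping an index changes how the data is accessed, not what a query returns,
which is why it belongs to the physical schema.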
Data Independence
Application programs are insulated from changes in the way the data is structured
and stored.
Data independence is achieved through the three levels of abstraction.
Relations in the external schema are generated on demand from the relations in the
conceptual schema.
QUERIES:
A query is a request for data or information from a database table or a combination of
tables. The result may be returned as rows by Structured Query Language (SQL), or
presented as pictorials, graphs, or more complex results.
Eg: Consider questions such as the following about a sample university database:
1) What is the name of the student with student ID 1234567?
2) What is the average salary of professors who teach course CS5647?
3) How many students are enrolled in CS5647?
Such questions involving the data stored in a DBMS are called queries.
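The three questions above can be written as SQL queries. The sketch below assumes a small
university schema (Students, Professors, Teaches, Enrolled) and invented sample rows; only
the student ID and course number come from the questions themselves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed schema and sample data for the demo.
CREATE TABLE Students   (sid TEXT PRIMARY KEY, sname TEXT);
CREATE TABLE Professors (pid TEXT PRIMARY KEY, pname TEXT, salary REAL);
CREATE TABLE Teaches    (pid TEXT, cid TEXT);
CREATE TABLE Enrolled   (sid TEXT, cid TEXT);

INSERT INTO Students   VALUES ('1234567', 'Kavya'), ('7654321', 'Raghu');
INSERT INTO Professors VALUES ('P1', 'Rao', 90000), ('P2', 'Devi', 70000);
INSERT INTO Teaches    VALUES ('P1', 'CS5647'), ('P2', 'CS5647');
INSERT INTO Enrolled   VALUES ('1234567', 'CS5647'), ('7654321', 'CS5647');
""")

# 1) Name of the student with student ID 1234567.
name = conn.execute(
    "SELECT sname FROM Students WHERE sid = ?", ("1234567",)
).fetchone()[0]

# 2) Average salary of professors who teach course CS5647.
avg_salary = conn.execute(
    "SELECT AVG(p.salary) FROM Professors p "
    "JOIN Teaches t ON t.pid = p.pid WHERE t.cid = ?",
    ("CS5647",),
).fetchone()[0]

# 3) Number of students enrolled in CS5647.
enrolled = conn.execute(
    "SELECT COUNT(*) FROM Enrolled WHERE cid = ?", ("CS5647",)
).fetchone()[0]

print(name, avg_salary, enrolled)
```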
A DBMS provides a specialized language, called the query language, in which queries can
be posed.
A feature of the relational model is that it supports a powerful query language, SQL.
SQL is a database query language used for storing and managing data in a relational DBMS.
RDBMSs (MySQL, Oracle, Informix, MS Access) use SQL as the standard database query
language.
There are two formal query languages:
1) Relational Calculus
2) Relational Algebra
Relational calculus is based on mathematical logic, and queries in this language have an
intuitive, precise meaning.
Relational algebra is based on a collection of operators for manipulating relations, and is
equivalent in power to the calculus.
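To make the idea of algebra operators concrete, here is a minimal sketch of two of them,
selection and projection, written as plain Python functions over a list of dictionaries.
The function names, the students relation, and its rows are assumptions for illustration;
a real DBMS implements these operators very differently.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes and drop duplicate tuples."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attributes}
        if reduced not in result:
            result.append(reduced)
    return result


# Assumed sample relation.
students = [
    {"sid": "1234567", "sname": "Kavya", "age": 23},
    {"sid": "7654321", "sname": "Raghu", "age": 25},
    {"sid": "1111111", "sname": "Sravya", "age": 23},
]

# Operators compose: project the name out of the selected tuples.
names = project(select(students, lambda r: r["sid"] == "1234567"), ["sname"])
ages = project(students, ["age"])
print(names, ages)
```

Each operator takes a relation and returns a relation, which is what allows operators to
be nested into larger queries.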
Transaction Management:
A transaction can be defined as a group of tasks that is executed as a single unit. A
single task is the minimum processing unit, which cannot be divided further.
Eg: Consider a database that holds information about airline reservations. Many agents
are looking up information about available seats on various flights and making new seat
reservations at the same time.
When several users access a database concurrently, the DBMS must order their requests
carefully to avoid conflicts.
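The all-or-nothing behaviour of a transaction can be sketched with the airline example.
The snippet is an illustration under assumptions: the Flights and Reservations tables, the
flight number, and the reserve helper are hypothetical, and a single connection is used, so
it shows rollback rather than true concurrency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Flights ("
    " flight_no TEXT PRIMARY KEY,"
    " seats_left INTEGER CHECK (seats_left >= 0))"
)
conn.execute("CREATE TABLE Reservations (flight_no TEXT, passenger TEXT)")
conn.execute("INSERT INTO Flights VALUES ('AI101', 1)")  # one seat left
conn.commit()


def reserve(passenger, flight_no):
    """Record the reservation and take a seat as one unit of work."""
    try:
        with conn:  # commits if both statements succeed, rolls back otherwise
            conn.execute(
                "INSERT INTO Reservations VALUES (?, ?)", (flight_no, passenger)
            )
            conn.execute(
                "UPDATE Flights SET seats_left = seats_left - 1 WHERE flight_no = ?",
                (flight_no,),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # no seat left: the reservation row is rolled back too


first = reserve("Raghu", "AI101")
second = reserve("Kavya", "AI101")
reservations = conn.execute("SELECT COUNT(*) FROM Reservations").fetchone()[0]
seats_left = conn.execute("SELECT seats_left FROM Flights").fetchone()[0]
print(first, second, reservations, seats_left)
```

The second call fails on the seat update, and the reservation row it had already inserted
disappears with the rollback, so the two tables never disagree.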
The DBMS accepts SQL commands generated from a variety of user interfaces.
The query evaluation engine produces a plan, executes it against the database, and
returns the answer.
1) When a user issues a query, the parser parses the request and translates it into an
internal form, and the query optimizer uses information about how the data is stored
to produce an efficient plan for evaluating the query.
2) An execution plan is a blueprint for evaluating a query, represented as a tree of
relational operators. (Operators serve as the building blocks for evaluating queries posed
against the data; evaluating them brings the needed data into main memory.)
1) The plan executor uses the operator evaluator to evaluate the plan.
2) The code that implements relational operators sits on top of the file and access
methods layer.
3) The file and access methods layer uses the request to locate the data asked for by
the user in the files.
4) The buffer manager is responsible for bringing data from disk into main memory
for execution.
5) The disk space manager is responsible for managing space on disk, including
providing space when data is modified.
6) The transaction manager ensures that transactions request and release locks according
to a locking protocol, and schedules the execution of transactions.
7) The lock manager keeps track of requests for locks and grants locks on database objects
when they become available.
8) The DBMS supports concurrency control and crash recovery by carefully scheduling
user requests and maintaining a log of all changes to the database.
9) The recovery manager maintains a log and restores the system to a consistent state
after a crash.
Responsibilities of DBA:
The DBA designs the conceptual schema (what relations to store) and the physical schema
(how to store them).
The DBA ensures that unauthorized data access is not permitted.
The DBA ensures security by granting different users permission to access only
certain views and relations.
The DBA ensures crash recovery and takes the necessary steps to restore the data to a
consistent state.
The DBA performs database tuning, i.e., takes responsibility for modifying the
database, in particular the conceptual and physical schemas, as users’ requirements
change.
Data Models:
A database model defines the logical design and structure of a database, and defines
how data will be stored, accessed, and updated in a database management system.
It describes the design of the database to reflect entities, attributes, relationships
among data, constraints, etc.
A data model is a collection of high-level data description constructs that hide many
low-level storage details. A DBMS allows a user to define the data to be stored in terms
of a data model.
Disadvantages:
Complex to implement
Difficult to manage
Lacks structural independence
Implementation’s limitations
Lack of standards
Eg: Schema:
Students(Sid: string, sname: string, age:integer)
The Entity Relationship Model
Widely accepted and adopted graphical tool for data modeling
Introduced by Chen in 1976
Graphical representation of entities and their relationships in a database structure
Entity relationship diagram (ERD)
o Uses graphic representations to model database components
1) Requirement Analysis:
Understand what data is to be stored in the database. This is done by the system
analyst of the enterprise/software company, through discussions with
groups/owners of the organization.
2) Conceptual Design:
The information gathered in the above step is used to develop a high-level description
of the data to be stored in the database, along with its constraints. This description
is expressed as an ER data model.
3) Logical Data Design:
The ER diagrams are converted into a database schema in the data model of the chosen
DBMS.
4) Schema Refinement and Normalization: The initial set of tables obtained from
logical data design may not be in its final form, so we refine the tables by applying
normalization techniques to get the final set of tables.
5) Physical Database Design: For the refined tables, the corresponding physical
files and indexes are designed.
6) Application and Security Design: Design methodologies like UML try to address the
complete software design and development cycle. Identify the entities and processes
involved in the application.
ER MODEL
It is a Graphical or Visual Representation of Data and describes how data is related to
each other.
What are the entities and relationships in the enterprise?
What type of information about these entities and relationships should we store in the
database?
What are the integrity constraints or business rules that hold?
A database schema in the ER model can be represented pictorially (ER diagrams).
An ER diagram can be mapped into a relational schema.
Entities - An entity is a real-world object that is represented in the database. Data are
stored about such entities. It can be any object, place, person or class.
Example – University DB- Student, Faculty, Department,
Employees, Courses (Entities)
Entity sets – An entity set is a set of entities of the same type that share the same
properties or attributes.
Example – In our University DB, the entity set Faculty is the set of all faculty members,
and the entity set Student is the set of all students.
Types of Attributes
1. Simple attributes
2. Composite attributes
3. Single valued attributes
4. Multi valued attributes
5. Derived attributes
6. Key attributes
7. Descriptive Attribute
Simple Attributes- Simple attributes are those attributes which can not be divided
further. Ex: age, class
Composite Attributes- Composite attributes are those attributes which are composed
of many other simple attributes. Ex: Address, name
Single Valued Attributes- Single valued attributes are those attributes which can
take only one value for a given entity from an entity set. Ex: gender, age
Multi Valued Attributes- Multi valued attributes are those attributes which can take
more than one value for a given entity from an entity set. Ex: mobile no, email-id
Derived Attributes- Derived attributes are those attributes which can be derived from
other attribute(s). Ex: Age is derived from DOB
Key Attributes- Key attributes are those attributes which can identify an entity
uniquely in an entity set.
Ex: Sno
A key is a minimal set of attributes whose values uniquely identify an entity in the
entity set. The sets {Sno}, {Sno, sname} and {Sno, saddr, sname} all identify a student
uniquely, but only {Sno} is minimal.
Descriptive Attributes-
Descriptive attributes are used to record information about the relationship, rather than about
any one of the participating entities.
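The attribute types listed above can be sketched as fields of a Python dataclass. The
Student and Address classes, their field names, and the sample values are assumptions made
for illustration; the point is only how each kind of attribute might be represented.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    """Composite attribute: built from simpler attributes."""
    street: str
    city: str


@dataclass
class Student:
    sno: int                    # key attribute: identifies the student
    gender: str                 # simple, single-valued attribute
    dob: date                   # stored attribute that age is derived from
    address: Address            # composite attribute
    mobile_nos: list = field(default_factory=list)  # multi-valued attribute

    def age_on(self, today):
        """Derived attribute: age is computed from dob, not stored."""
        years = today.year - self.dob.year
        if (today.month, today.day) < (self.dob.month, self.dob.day):
            years -= 1  # birthday has not happened yet this year
        return years


s = Student(1, "F", date(2000, 6, 15), Address("MG Road", "Hyderabad"),
            ["98756", "95678"])
age = s.age_on(date(2024, 6, 14))
print(age, s.address.city, len(s.mobile_nos))
```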
Types of Relationships:
An association among two or more entities is called a relationship. It is represented by a
diamond shape.
Unary Relationship
Binary Relationship
Ternary Relationship
Fig: Unary Relationship
Conceptual Schema:
Employees(ssn: integer, name: string, lot: real)
Departments(did: integer, dname: string, budget: real)
Manages(ssn: integer, did: integer, since: string)
Fig: Example of ER Model
Mapping Cardinality:
Cardinality tells how many times the entity of an entity set participates in a relationship
1) One-to-One
2) One-to-Many
3) Many-to-One
4) Many-to-Many
1) One-to-One: An entity in A is related to at most one entity in B, and an entity in B is
related to at most one entity in A.
Fig: One-to-One (entities P1 to P4 and C1 to C4)
2) One-to-Many: An entity in A is related to any number of entities in B, but an entity
in B is related to at most one entity in A.
Fig: One-to-Many (entities E1 to E5 and O1 to O3)
3) Many-to-One:
An entity in A is related to at most one entity in B, but an entity in B is related to any
number of entities in A.
Fig: Many-to-One (entities E1 to E5 and D1 to D3)
4) Many-to-Many:
An entity in A is related to any number of entities in B, and an entity in B is related to
any number of entities in A.
Fig: Many-to-Many (entities S1 to S4 and C1 to C4)
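The four definitions can be checked mechanically on a set of (A, B) pairs. The function
below is a sketch with an assumed name, and the sample pairs are invented, since the
connecting lines of the figures are not available. It reports only what a particular set
of pairs exhibits; the actual cardinality of a relationship is a design rule, not
something read off the data.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify (a, b) pairs by how many partners the entities on each side have."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    a_many = any(len(bs) > 1 for bs in a_to_b.values())   # some A has several Bs
    b_many = any(len(xs) > 1 for xs in b_to_a.values())   # some B has several As
    if a_many and b_many:
        return "many-to-many"
    if a_many:
        return "one-to-many"
    if b_many:
        return "many-to-one"
    return "one-to-one"


# Assumed sample pairings.
print(mapping_cardinality([("P1", "C1"), ("P2", "C2")]))
print(mapping_cardinality([("O1", "E1"), ("O1", "E2"), ("O2", "E3")]))
print(mapping_cardinality([("E1", "D1"), ("E2", "D1"), ("E3", "D2")]))
print(mapping_cardinality([("S1", "C1"), ("S1", "C2"), ("S2", "C1")]))
```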
Keys:
A Key is a minimal set of Attributes whose values uniquely identify an entity in the set.
Types of Keys:
1) Super Key
2) Candidate Key
3) Primary Key
4) Composite Key
5) Alternate Key
6) Foreign Key
Super Key:
A super key is a set of one or more columns (attributes) that uniquely identifies the
rows/tuples in a table/relation.
Student Table
SID  SNAME   Phone number  Age  CGPA
1    Raghu   98756         25   8.8
2    Sravya  95678         22   9.0
3    Kavya   98515         23   8.5
4    kavya   6556          23   9.0
5    Sravya  921451        26   8.5
Step 1 (single attributes): {sid} {sname} {phone number} {age} {cgpa}
Step 2 (pairs of attributes): {sid, sname} {sid, phone number} {sid, age} {sid, cgpa}
{sname, phone number} {sname, age} - Not Found, {sname, cgpa} - Not Found,
{phone number, age} {phone number, cgpa} {age, cgpa}
Step 3 (three or more attributes):
{sid, sname, phone number} ... etc.
Super Key Set: {sid, sname, phone number, cgpa}
Candidate Key:
A super key with no redundant attributes is known as a candidate key.
Or
A minimal set of fields/attributes that uniquely identifies each record in a table/relation.
1) A candidate key can never be NULL or empty. Its values should be unique.
2) There can be more than one candidate key for a relation or table.
3) A candidate key can be a combination of more than one attribute.
Candidate keys of the Student table: {SID}, {Phone number}
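The difference between a super key and a candidate key can be sketched as two checks on
the Student table above: uniqueness and minimality. The function names are assumptions.
Note the limitation: uniqueness in one sample table can only rule a key out. Whether a set
of attributes is really a key depends on the rules of the enterprise, not on five rows.

```python
from itertools import combinations

COLUMNS = ("sid", "sname", "phone", "age", "cgpa")
ROWS = [  # the Student table from the notes
    (1, "Raghu", "98756", 25, 8.8),
    (2, "Sravya", "95678", 22, 9.0),
    (3, "Kavya", "98515", 23, 8.5),
    (4, "kavya", "6556", 23, 9.0),
    (5, "Sravya", "921451", 26, 8.5),
]


def is_unique_on(attrs):
    """Super key test on this instance: no two rows agree on all of attrs."""
    idx = [COLUMNS.index(a) for a in attrs]
    seen = set()
    for row in ROWS:
        key = tuple(row[i] for i in idx)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_minimal(attrs):
    """Candidate key test: unique, and no proper subset is already unique."""
    if not is_unique_on(attrs):
        return False
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_unique_on(subset):
                return False  # a redundant attribute can be dropped
    return True


print(is_unique_on(("sid", "sname")), is_minimal(("sid", "sname")))
print(is_minimal(("sid",)), is_minimal(("phone",)), is_unique_on(("sname",)))
```

Here {sid, sname} passes the uniqueness check but fails the minimality check, because
{sid} alone is already unique.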
PRIMARY KEY
The candidate key that the database designer chooses to identify records is called the
primary key.
Eg: {SID}
Criteria (PK)
1) The primary key values should NOT BE NULL/EMPTY.
2) The values entered should be UNIQUE.
3) It should not contain any redundant attributes.
Foreign Key:
A foreign key is a column, or set of columns, of a table that refers to the primary key of
another table.
Eg: Student table - sid (PK)
Course table - cid (PK)
Enrolls - sid, cid
(sid references Student)
(cid references Course)
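The Student, Course, and Enrolls example can be sketched in SQL as follows. The non-key
columns and sample rows are assumptions. SQLite is used through Python's sqlite3 module,
and it enforces foreign keys only after PRAGMA foreign_keys = ON is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Student (sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE Course  (cid TEXT PRIMARY KEY, cname TEXT);
CREATE TABLE Enrolls (
    sid INTEGER REFERENCES Student (sid),   -- sid references Student
    cid TEXT    REFERENCES Course (cid)     -- cid references Course
);
INSERT INTO Student VALUES (1, 'Raghu');
INSERT INTO Course  VALUES ('C1', 'DBMS');
INSERT INTO Enrolls VALUES (1, 'C1');
""")

try:
    conn.execute("INSERT INTO Enrolls VALUES (99, 'C1')")  # there is no student 99
    dangling_allowed = True
except sqlite3.IntegrityError:
    dangling_allowed = False  # the reference to a missing student is rejected

enroll_count = conn.execute("SELECT COUNT(*) FROM Enrolls").fetchone()[0]
print(dangling_allowed, enroll_count)
```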
Composite Key:
A key that consists of 2 or more attributes that together uniquely identify any record in a
table is called a composite key.
The attributes that form the composite key are not keys independently or
individually.
Eg: sid, sub id, marks
Both sid and sub id are required to look up the marks.
Alternate Key:
Out of all candidate keys, only one gets selected as Primary key, remaining keys are known
as alternate keys or secondary keys.
Features of ER Model:
1) Key Constraints
2) Participation Constraints
3) Weak Entity
4) Class Hierarchies
5) Aggregation
Key Constraints: A key constraint is a condition or restriction on a relationship set. The
rule that each department has at most one manager is an example of a key constraint.
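One common way to express this key constraint in SQL is to make did the key of the table
that stores the Manages relationship, so that a department can appear in it only once. The
sketch below is an illustration: the column lists are trimmed and the sample rows are
assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employees   (ssn INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname TEXT, budget REAL);
CREATE TABLE Manages (
    ssn   INTEGER NOT NULL REFERENCES Employees (ssn),
    did   INTEGER PRIMARY KEY REFERENCES Departments (did),  -- one row per department
    since TEXT
);
INSERT INTO Employees   VALUES (1, 'Raghu'), (2, 'Sravya');
INSERT INTO Departments VALUES (10, 'CSE', 500000.0);
INSERT INTO Manages     VALUES (1, 10, '2020-01-01');
""")

try:
    # A second manager for department 10 would violate the key constraint.
    conn.execute("INSERT INTO Manages VALUES (2, 10, '2021-01-01')")
    second_manager_allowed = True
except sqlite3.IntegrityError:
    second_manager_allowed = False

managers = conn.execute("SELECT ssn FROM Manages WHERE did = 10").fetchall()
print(second_manager_allowed, managers)
```

An employee may still manage several departments under this design, because ssn is not
required to be unique in Manages.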
Weak entity:
A weak entity set is one that does not have a primary key of its own.
A weak entity type normally has a partial key, which is the set of attributes that
can uniquely identify the weak entities related to the same owner entity.
A weak entity can be identified uniquely only by considering the primary key
of another (owner) entity.
The owner entity set and the weak entity set must participate in a one-to-many
relationship set (one owner entity, many weak entities).
The weak entity set must have total participation in this identifying relationship
set.
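A weak entity set is often mapped to a table whose primary key combines the partial key
with the owner's primary key. The sketch below assumes an Employees owner and a Dependents
weak entity set with pname as the partial key; these names and rows are illustrative and
do not come from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employees (ssn INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Dependents (
    pname TEXT NOT NULL,              -- partial key
    age   INTEGER,
    ssn   INTEGER NOT NULL            -- total participation: an owner is required
          REFERENCES Employees (ssn) ON DELETE CASCADE,
    PRIMARY KEY (pname, ssn)          -- partial key + owner's primary key
);
INSERT INTO Employees  VALUES (1, 'Raghu'), (2, 'Sravya');
-- The same pname under two owners is fine; the owner's key tells them apart.
INSERT INTO Dependents VALUES ('Anu', 5, 1), ('Anu', 7, 2);
""")

before = conn.execute("SELECT COUNT(*) FROM Dependents").fetchone()[0]
conn.execute("DELETE FROM Employees WHERE ssn = 1")  # removing the owner...
remaining = conn.execute("SELECT pname, ssn FROM Dependents").fetchall()
print(before, remaining)  # ...removes its dependents as well
```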
a) Overlap Constraint: It determines whether two subclasses are allowed to contain the
same entity.
Eg: Can a person be both an Hour_Emp entity and a Contract_Emp entity? Ans: No.
2 Approaches:
1) Specialization
2) Generalization
1) Specialization:
Specialization is the process of identifying subsets of an entity set (the superclass) that
share some characteristic. Eg: Employees is specialized into subclasses.
The superclass is defined first, the subclasses are defined next, and subclass-specific
attributes and relationship sets are then added.
2) Generalization:
Generalization is a bottom-up approach in which two or more lower-level entity sets are
combined to form a higher-level entity set.
It is like the superclass and subclass system; the only difference is the approach,
which is bottom-up. Hence, entities are combined to form a more generalized entity.
Fig: Generalization – Bottom-Up Approach
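A class hierarchy of this kind has a close analogue in object-oriented code, which can
serve as a rough mental model. In the sketch below, the Employee superclass and the
HourlyEmp and ContractEmp subclasses, with their attributes, are assumptions chosen for
illustration.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """Superclass: attributes common to every employee."""
    ssn: int
    name: str


@dataclass
class HourlyEmp(Employee):
    """Subclass: inherits ssn and name, adds its own attributes."""
    hourly_wage: float
    hours_worked: float


@dataclass
class ContractEmp(Employee):
    """Subclass with a different specific attribute."""
    contract_id: str


h = HourlyEmp(1, "Raghu", 200.0, 40.0)
c = ContractEmp(2, "Sravya", "K-17")

# Every subclass entity is also a superclass entity.
print(isinstance(h, Employee), isinstance(c, Employee))
print(h.name, c.contract_id)
```

Writing Employee first and then the subclasses mirrors specialization; starting from
HourlyEmp and ContractEmp and factoring out their shared attributes mirrors generalization.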
Aggregation:
Aggregation is the process in which a relationship between two entities is treated as a
single entity.
Aggregation allows us to treat a relationship set as an entity set for purposes of
participation in other relationships.
Fig: Aggregation
1) Entity Vs Attribute
2) Entity Vs Relationship
3) Binary Vs Ternary Relationship
4) Aggregation Vs Ternary Relationships
Entity Vs Attribute:
Should address be an attribute of Employees or an entity (connected to Employees by
a relationship)?
It depends on the use we want to make of the address information, and on the semantics
of the data:
o If we have several addresses per employee, address must be an entity (since
attributes cannot be set-valued).
o If the structure (city, street, etc.) is important, e.g., we want to retrieve
employees in a given city, address must be modelled as an entity (since
attribute values are atomic).
Entity Vs Relationship:
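What if a manager gets a discretionary budget (dbudget) that covers all managed departments?
If dbudget is recorded as an attribute of the relationship, two problems arise:
Redundancy: dbudget is stored for each department managed by the manager.
Misleading: it suggests that dbudget is associated with the department-manager combination.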
Fig: Entity Vs Relationship
1st Requirement: We can impose a key constraint on Policies with respect to Covers, but
this has the side effect that a policy can cover only one dependent.
2nd Requirement: We can impose a total participation constraint on Policies.
IIIrd Requirement: In given fig, we cannot identify
Fig: Binary Vs Ternary Relationship