IV Sem Dbms Notes
For example, a university may have a database holding details of its students, such as the courses they are enrolled in, details about their
scholarships, the courses they have studied in previous years or are taking
this year, and examination results. There may also be a database containing
details relating to the next year’s admissions and a database containing
details of the staff working at the university, giving personal details and
salary related details for the payroll system.
TRADITIONAL FILE PROCESSING SYSTEM
The file processing system was an early attempt to computerize the manual
filing system that we are all familiar with. A file system is a method for
storing and organizing computer files and the data they contain to make it
easy to find and access them. File systems may use a storage device such as
a hard disk or CD-ROM and involve maintaining the physical location of the
files.
Characteristics of File Processing System
Here is a list of some important characteristics of a file processing system:
It is a group of files storing data of an organization.
Each file is independent of the others.
Each file is called a flat file.
Each file contains and processes information for one specific
function, such as accounting or inventory.
Files are designed and accessed using programs written in
programming languages such as COBOL, C, and C++.
The physical implementation and access procedures are written into
the application programs; therefore, physical changes result in intensive
rework on the part of the programmer.
As systems became more complex, file processing systems offered little
flexibility, presented many limitations, and were difficult to maintain.
Managing data with file systems is now outdated, but there are several
reasons for studying them:
An understanding of the relatively simple characteristics of file
systems makes the complexity of database design easier to
understand.
An awareness of the problems of file systems can help us avoid those
problems in DBMS software.
If we want to convert file system data to a database system,
knowledge of the file system and its limitations is useful.
The limitations of File System are explained below:
Program-Data Dependence:
The file descriptions are stored in the application programs that access
the given file. For this reason, when we want to change a file we must also
change all the application programs that access the file, and when we
change an application program we have to change the related file as well.
This is known as program-data dependence.
Duplication of Data:
In a file system the applications are developed independently, which
causes duplication of files. The duplication wastes storage space. It also
leads to loss of data integrity and metadata integrity: the same field name
in different files may represent different data items, and different field
names in different files may represent the same data item.
Limited Data Sharing:
Each application has its own private files, with little opportunity to share
data outside its own application. A requested report may require data
from several incompatible files in separate systems; generally, such a report
is not possible with a file system.
Lengthy Development Times:
There is little opportunity to use previous development efforts. For each
new application the developer has to start from scratch by designing new
file formats and descriptions.
Excessive Program Maintenance:
The preceding factors (program-data dependence, duplication of data,
lengthy development times, and limited data sharing) cause a heavy program
maintenance load.
BASIC DEFINITIONS
Data:
It is a collection of raw facts.
Data consists of text, numbers, images, audio, and video segments.
Data is the plural of the Latin term datum.
Example: In a college admission form the fields such as student name,
father’s name, dob, address and phone etc. are treated as data.
Information:
The processed data is called information.
Information increases the knowledge of the user who uses the data.
Data is the lowest level of knowledge and information is the second
level of knowledge.
Data by itself is not significant, but information is significant by
itself.
Observations and recordings produce data; analysis and
calculation produce information.
Example:
Suppose that we have entered a student's data, such as the student's name
and marks in three subjects. After performing calculations we can generate
information such as total marks, percentage, etc.
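For instance (an assumed illustration), if the marks entered are 70, 80, and 90, the derived information would be total marks = 70 + 80 + 90 = 240 and percentage = 240 / 300 x 100 = 80%.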
Database:
A shared collection of logically related data and its description,
designed to meet the information needs of an organization.
The description of the data is known as the system catalog (or data
dictionary or metadata—the “data about data”). It is the self-
describing nature of a database that provides program–data
independence.
For example, a salesperson may maintain a small database of
customer contacts on his or her laptop that consists of a few megabytes
of data.
A large corporation may build a very large database consisting of
several terabytes of data on a large mainframe computer that is used
for decision-support applications.
A data warehouse may contain petabytes of data.
DBMS (DATABASE MANAGEMENT SYSTEM)
It is software that manages the data stored in a database. It enables
the user to store, modify, and extract data/information from a
database.
SQL (Structured Query Language) provides three facilities for
definition, manipulation, and control. These are denoted as DDL, DML,
and DCL.
The DBMS serves as the mediator between the user and database.
The DBMS receives all application requests from user and provides
the response.
There are many different types of Database Management Systems
ranging from small systems that run on personal computers to huge
systems that run on mainframes.
DBMS provides controlled access to the database. For example, it may
provide:
A security system, which prevents unauthorized users from accessing
the database;
An integrity system, which maintains the consistency of stored
data;
A concurrency control system, which allows shared access of the
database;
A recovery control system, which restores the database to a
previous consistent state following a hardware or software failure;
A user-accessible catalog, which contains descriptions of the data
in the database.
Examples of DBMS software are MS Access, Oracle, and MS SQL
Server.
Meta Data:
Data that describe the properties or characteristics of other data.
Example: consider the following data
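As the original example table is not included in these notes, the following SQL sketch is an assumed illustration: the column definitions are metadata (data about data), while the inserted row is the data itself.
CREATE TABLE student
( name  char(20),    -- metadata: field name, data type, and size
  dob   date,
  marks number(3)
);
INSERT INTO student VALUES ('Ravi', DATE '2002-07-15', 78);   -- data: the raw facts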
COMPONENTS OF A DATABASE SYSTEM ENVIRONMENT
Hardware
Hardware identifies all the system's physical devices. It includes computers,
computer peripherals, network components etc.
Software
Software refers to the collection of programs used within the
database system. It includes:
1. Operating System
The operating System manages all the hardware components and
makes it possible for all other software to run on the computers.
UNIX, LINUX, Microsoft Windows etc are the popular operating
systems used in database environment.
2. DBMS Software
DBMS software manages the database within the database
system. Oracle Corporation's ORACLE, IBM's DB2, Sun's MySQL,
Microsoft's MS Access and SQL Server etc are the popular DBMS
(RDBMS) software used in the database environment.
3. Application Programs and Utilities
Application programs and utilities software are used to access and
manipulate the data in the database and to manage the operating
environment of the database.
People in a Database System Environment
People component includes the following five types of users in a database
system
System Administrators
Data designer
Database Administrators
System Analysts and Programmers
End Users
System Administrators oversee the database system's general
operations.
Data Designers (Architects) prepare the conceptual design.
Database Administrators (DBAs) physically implement and
maintain the database according to the logical design.
System Analysts and Programmers design and implement the
application programs.
End Users are the people who use the applications. For example,
in a banking system, the employees and the customers using the
ATM or online banking facilities are end users.
Procedures in a Database Environment
Procedures are the instructions and business rules that govern the design
and use of the database system.
Data in the Database
Data is the basic and most important entity in a database system; it is the
collection of facts stored in the database.
Roles in the Database Environment
There are four distinct types of people involved in a database environment:
Data & Database Administrators
Database Designers
Application Developers
End-Users
DA (data administrator) is responsible for the management of the data
resource:
Database planning
Development & maintenance of standards
Policies and procedures
Conceptual/logical database design
DBA (Database Administrator) is responsible for the physical realisation of
the database:
Physical database design & implementation
Security & integrity control
Maintenance of operational system
Ensuring satisfactory performance of the applications for users
Database designers are concerned with:
Identifying the data
Identifying the entities and their attributes
Identifying the relationships between the data
Understanding the constraints on the data (business rules)
The work of the logical database designers can be split into
two stages:
Conceptual database design
o Independent of implementation details such as the target
DBMS, application programs, and programming languages
Logical database design
o Targets a specific data model
o E.g., relational, network, hierarchical, or object-oriented
Physical database designer decides how the logical database design
is to be physically realised. It involves:
Mapping the logical database design into a set of tables &
integrity constraints
Selecting specific storage structures and access methods for
the data
Designing any security measures
Application Developers
They work from the specifications produced by systems analysts.
Each program may contain statements that request the DBMS to
perform some operation:
Retrieving data
Insert data
Delete data
Updating data
End Users
End users are the clients of the database; they can be
classified as:
o Naïve users
Typically unaware of the DBMS
o Sophisticated users
Familiar with the structure of the DBMS
May use a high-level query language to perform
required operation
History of Database Management Systems
It is believed that the lack of structural independence was the main cause
of the decline of the early hierarchical and network (navigational) DBMSs.
1970s-1990s: the emergence of the relational DBMS at the hands of
Edgar Codd. He worked at IBM, and he was unhappy with the navigational
model of the CODASYL approach. To him, a searching facility was very
useful, and it was absent. In 1970, he proposed a new approach to database
construction in his paper "A Relational Model of Data for Large Shared Data
Banks," which made the creation of relational DBMSs possible.
This was a new system for entering data and working with big databases,
where the idea was to use tables of records. All tables would then be linked
by one-to-one, one-to-many, or many-to-many relationships. When elements
took up space and were no longer useful, it was easy to remove them from
the original table, and all the other "entries" in other tables linked to that
record were removed as well. It is worth mentioning that two initial projects
were launched: System R at IBM and INGRES at the University of
California. In 1985, the object-oriented DBMS was developed, but it did not
achieve any booming commercial success because of the high, unjustified
costs of changing systems and formats.
In the 1990s, the DBMS took on a new object-oriented approach combined
with the relational DBMS. In this approach, the use of text, multimedia, the
Internet, and the Web in conjunction with the DBMS became possible.
ADVANTAGES OF THE DATABASE APPROACH
A user view is a logical description of some portion of the database that is
required by a user to perform some task. A user view is often a form or
report that comprises data from more than one table.
Increased Productivity:
There are two reasons for the rapid development of applications:
The programmer concentrates on the specific functions required for
new application, without having to worry about file design or low level
implementation details.
DBMS provides a number of high level productivity tools such as form
and report generators and high-level languages that automate the
activities of database design and implementation.
Enforcement of standards:
The standards include naming conventions, data quality standards and
uniform procedures for accessing, updating and protecting data.
DBMS provides powerful set of tools for developing and enforcing the above
standards.
Improved Data Quality:
The DBMS provides a number of tools and processes to improve data quality.
Two of the more important are the following.
Database designers can specify the integrity constraints that are
enforced by the DBMS. A constraint is a rule that can’t be violated by
database users.
One of the objectives of data warehouse environment is to clean up
operational data before they are placed in the data warehouse.
DISADVANTAGES OF THE DATABASE APPROACH
Greater impact of system failure:
This is similar to putting “all eggs in one basket”. Because the database is a
shared resource, it must be available to all users at all times, so a failure of
the system has a greater impact on the organization.
Complex Backup and Recovery procedures:
The organization must maintain complex procedures for backup and
recovery. Backup refers to maintaining an additional copy of the data to use
when the data in the database is damaged. Recovery procedures restore
the database when damage occurs.
The Three-Level ANSI-SPARC (DATABASE) Architecture
The architecture of most commercial DBMSs available today is based on
the ANSI-SPARC database architecture.
The ANSI-SPARC three-level architecture has three main levels:
1. Internal Level
2. Conceptual Level
3. External Level
These three levels provide data abstraction; that is, they hide the
low-level complexities from end users.
A database system should be efficient in performance and convenient
in use.
Using these three levels, it is possible to use complex structures at the
internal level for efficient operation and to provide a simpler, more convenient
interface at the external level.
1. Internal level:
This is the lowest level of data abstraction.
It describes how the data are actually stored on storage devices.
It is also known as physical level.
It provides internal view of physical storage of data.
It deals with complex low level data structures, file structures and
access methods in detail.
It also deals with data compression and encryption techniques, if
used.
2. Conceptual level:
This is the next higher level than internal level of data abstraction.
It describes what data are stored in the database and what
relationships exist among those data.
It is also known as Logical level.
It hides low level complexities of physical storage.
Database administrator and designers work at this level to
determine what data to keep in database.
Application developers also work on this level.
3. External Level:
This is the highest level of data abstraction.
It describes only that part of the entire database that is of concern to a
particular end user.
It is also known as the view level.
End users need to access only part of the database rather than the entire
database.
Different users need different views of the database, and so there can
be many view-level abstractions of the same database.
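As a small sketch of how the same data might appear at the three levels (the names below are illustrative, not taken from a particular figure):
External level: a view such as StaffInfo(name, age) seen by one group of users.
Conceptual level: the relation Staff(staffNo, fName, lName, DOB, salary, branchNo) together with its constraints.
Internal level: the stored record layout for Staff, including file organization, byte offsets, and any indexes.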
Advantages of the Three-Level Architecture:
The main objective of it is to provide data abstraction.
Same data can be accessed by different users with different
customized views.
The user is not concerned about the physical data storage details.
The physical storage structure can be changed without requiring changes
to the conceptual structure of the database or to the users' views.
Conceptual structure of the database can be changed without
affecting end users.
DATA INDEPENDENCE
A major objective for the three-level architecture is to provide data
independence, which means that upper levels are unaffected by changes to
lower levels. There are two levels of data independence:
1. Physical Data Independence
2. Logical Data Independence
These are described below:
1. Physical Data Independence:
Physical Data Independence is the ability to modify the physical
schema without requiring any change in application programs.
Modifications at the internal levels are occasionally necessary to
improve performance. Possible modifications at internal levels are
change in file structures, compression techniques, hashing
algorithms, storage devices, etc.
Physical data independence separates the conceptual level from
the internal level.
This makes it possible to provide a logical description of the database
without the need to specify physical structures.
Comparatively, it is easy to achieve physical data independence.
2. Logical Data Independence:
Logical data independence is the ability to modify the conceptual schema
without requiring any change in application programs.
Modification at the logical level is necessary whenever the logical
structures of the database are altered.
Logical data independence separates the external level from the
conceptual level.
Comparatively it is difficult to achieve logical data independence.
Application programs are heavily dependent on the logical structure of
the data they access, so any change in the logical structure also tends to
require the programs to change; this is why logical data independence is
harder to achieve.
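A rough SQL illustration (using the employee table defined later in these notes; the index name and the new column are assumptions): the first statement is a purely internal-level change, while the second changes the conceptual schema in a way that programs naming their columns explicitly can survive.
CREATE INDEX emp_dept_idx ON employee (dept);   -- internal-level change: no application program needs to change (physical data independence)
ALTER TABLE employee ADD email char(30);        -- conceptual-level change: programs using explicit column lists are unaffected (logical data independence)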
Database Languages
A data sublanguage consists of two parts: a Data Definition Language
(DDL) and a Data Manipulation Language (DML). The DDL is used to
specify the database schema and the DML is used to both read and
update the database. These languages are called data sublanguages
because they do not include constructs for all computing needs, such as
conditional or iterative statements, which are provided by the high-level
programming languages. Many DBMSs have a facility for embedding the
sublanguage in a high-level programming language such as COBOL,
Fortran, Pascal, Ada, C, C++, C#, Java, or Visual Basic. In this case, the
high-level language is sometimes referred to as the host language.
DATA DEFINITION LANGUAGE (DDL) COMMANDS.
The DDL commands are used to define a database including creating,
altering, and dropping tables and establishing constraints. DDL deals with
meta data.
The commands are explained below.
Create command:
The CREATE TABLE Statement is used to create tables to store data.
The Syntax for the CREATE TABLE Statement is:
CREATE TABLE table_name
(column_name1 datatype,
column_name2 datatype,
... column_nameN datatype
);
table_name - is the name of the table.
column_name1, column_name2.... - is the name of the columns
datatype - is the datatype for the column like char, date, number
etc.
For Example: If you want to create the employee table, the statement would
be like,
CREATE TABLE employee
( id number(5),
name char(20),
dept char(10),
age number(2),
salary number(10),
location char(10)
);
ALTER COMMAND:
The SQL ALTER TABLE command is used to modify the definition
(structure) of a table by modifying the definition of its columns. The
ALTER command is used to perform the following functions.
1) Add, drop, modify table columns
2) Add and drop constraints
3) Enable and Disable constraints
Syntax to add a column
ALTER TABLE table_name ADD column_name datatype;
For Example: To add a column "experience" to the employee table, the
query would be like
ALTER TABLE employee ADD experience number(3);
Syntax to drop a column
ALTER TABLE table_name DROP column_name;
For Example: To drop the column "location" from the employee table, the
query would be like
ALTER TABLE employee DROP
location; Syntax to modify a column
ALTER TABLE table_name MODIFY column_name datatype;
For Example: To modify the column salary in the employee table, the query
would be like
ALTER TABLE employee MODIFY salary number(15,2);
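A hedged sketch of the constraint-related uses of ALTER TABLE listed above, assuming the employee table and Oracle-style syntax (the constraint name is illustrative):
ALTER TABLE employee ADD CONSTRAINT emp_id_pk PRIMARY KEY (id);   -- add a constraint
ALTER TABLE employee DISABLE CONSTRAINT emp_id_pk;                -- disable the constraint
ALTER TABLE employee ENABLE CONSTRAINT emp_id_pk;                 -- enable it again
ALTER TABLE employee DROP CONSTRAINT emp_id_pk;                   -- drop the constraint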
DROP COMMAND:
It is used to delete the database objects like tables, indexes and views.
Syntax:
DROP object_type object_name;
Example : DROP table employee;
TRUNCATE command:
It is used to delete all rows from a table; the deletion is automatically committed and cannot be rolled back.
Syntax :
TRUNCATE table table_name;
Example:
TRUNCATE table employee;
RENAME COMMAND
The SQL RENAME command is used to change the name of the table or a
database object.
Syntax to rename a table
RENAME old_table_name To new_table_name;
For Example: To change the name of the table employee to my_employee,
the query would be like
RENAME employee TO my_employee;
DATA MANIPULATION COMMANDS (DML).
DML commands are used to maintain and access a database, including
inserting, updating, deleting, and retrieving data. DML deals with data. The
commands are explained below.
SELECT Command:
The SQL SELECT statement is used to query or retrieve data from a table in
the database. A query may retrieve information from specified columns or
from all of the columns in the table
Syntax of SQL SELECT Statement:
SELECT column_list FROM table-name
[WHERE Clause]
[GROUP BY clause]
[HAVING clause]
[ORDER BY clause];
table-name is the name of the table from which the information is
retrieved.
column_list includes one or more columns from which data is
retrieved.
The clauses within square brackets are optional.
Example:
SELECT first_name FROM student_details;
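A slightly fuller sketch using the employee table created earlier (the data values are assumed, and COUNT is an aggregate function described later in these notes) shows the optional clauses in use:
SELECT dept, COUNT(*)
FROM employee
WHERE age >= 25
GROUP BY dept
HAVING COUNT(*) > 1
ORDER BY dept;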
INSERT Command:
The INSERT Statement is used to add new rows of data to a table.
We can insert data to a table in two ways,
1) Inserting the data directly to a table.
Syntax for SQL INSERT is:
INSERT INTO TABLE_NAME
[ (col1, col2, col3,...colN)]
VALUES (value1, value2, value3,...valueN);
col1, col2,...colN -- the names of the columns in the table into which
you want to insert data.
Example:
INSERT INTO employee (id, name, dept, age, salary, location)
VALUES (105, 'Srinath', 'Sales', 27, 33000, 'Mysore');
2) Inserting data into a table from another table (through a SELECT statement).
Example:
INSERT INTO employee
SELECT * FROM temp_employee;
SQL UPDATE STATEMENT
The UPDATE Statement is used to modify the existing rows in a table.
Syntax:
UPDATE table_name
SET column_name1 = value1,
column_name2 = value2,
...
[WHERE condition];
table_name - the table to be updated.
column_name1, column_name2... - the columns whose values are changed.
value1, value2... - the new values.
Example:
UPDATE employee
SET location ='Mysore'
WHERE id = 101;
Delete command:
The DELETE Statement is used to delete rows from a table.
Syntax:
DELETE FROM table_name [WHERE condition];
table_name - the table from which rows are to be deleted.
Example: To delete an employee with id 100 from the employee table, the
sql delete query would be like,
DELETE FROM employee WHERE id = 100;
To delete all the rows from the employee table, the query would be:
DELETE FROM employee;
DATA MODELS
A data model describes the structure of the database: a collection of
conceptual tools for describing data, data relationships, data semantics, and
consistency constraints.
Data models fall into three different groups.
Record-based logical models
1. Relational model
2. Network model
3. Hierarchical model
Object-based logical models
1. The entity-relationship model
2. The object-oriented model
3. The semantic data model
4. The functional data model
Physical models
1. Unifying model
2. Frame-memory model
Record-based logical models
Record-based logical models are used to describe data at the logical
and view levels.
Record-based models are so named because the database is structured
as fixed-format records of several types.
Each record type defines a fixed number of fields, or attributes.
Each field is usually of a fixed length.
The three most widely used record-based models are:
Relational model:
The relational model was introduced by E.F. Codd in 1970. The basic data
structure of the relational model is the table, in which the information
about an entity is stored. The data are represented in rows and columns. Each
row is an instance of an entity type, and each column is an attribute of the
entity. In this model there are no physical links as there are in the
hierarchical and network models.
Advantages
o Structural independence
o Improved conceptual simplicity
o Easier database design, implementation, management, and use
o Ad hoc query capability with SQL
o Powerful database management system
Disadvantages
o Large hardware and system software overhead
o Possibility of Poor design and implementation
o May promote “islands of information” problems
Network model:
Data in the network model are represented by collections of record
and relationships among data are represented by links, which can be
viewed as pointers.
The records in the database are organized as collections of arbitrary
graphs.
Disadvantages
o System complexity
o Lack of structural independence
Hierarchical model:
In the hierarchical model the data and relationships among
the data are represented by records and links.
It is similar to the network model but differs in that records
are organized as collections of trees rather than arbitrary
graphs.
Hierarchical Model
Advantages of the hierarchical model:
1. Simplicity
2. Data Security and Data Integrity
3. Efficiency
Disadvantages of the hierarchical model:
1. Implementation Complexity
2. Lack of structural independence
Object-based logical models
Object-based logical models are used in describing data at the logical
and the view levels.
They provide fairly flexible structuring capabilities and allow data
constraints to be specified explicitly.
There are many different models more widely known models are:
Entity-relationship model:
It is the graphical representation of entities, attributes, and
relationships. It was developed by Peter Chen in 1976. ER models are
normally represented in an entity-relationship diagram (ERD). The ER
model is based on the following components.
The entity name is a noun and is generally written in capital letters
and is written in the singular form. An entity is represented in the
ERD by a rectangle.
An attribute is represented by an ellipse. Each entity is described by a
set of attributes.
Relationships describe associations among data. The name of the
relationship usually is a verb. For example, a PAINTER paints many
PAINTINGS. Relationship is represented by a diamond.
The following diagram is an example for entity relationship model.
• Advantages
– Exceptional conceptual simplicity
– Visual representation
– Effective communication tool
– Integrated with the relational database model
• Disadvantages
– Limited constraint representation
– Limited relationship representation
– No data manipulation language
– Loss of information content
Object-oriented model:
In the object oriented data model, both data and their relationships
(operations) are contained in a single structure known as an object. In turn
the OODM is the basis for the object-oriented database management
system (OODBMS). A traditional database stores just data and not
procedures. In contrast, an Object oriented database (OODB) stores objects.
An object consists of data and related methods. The OODM is said to be a
semantic (meaningful) data model.
The OO data model is based on the following components.
Object: It is an abstraction of real world entity. It is equivalent to ER model
entity.
Attribute: It describes the properties of an object.
Method: It specifies how an operation is performed on data.
Class: It includes the data members and a set of methods.
Inheritance: It is the ability of an object to inherit the attributes and methods of
the classes above it.
• Advantages
– Adds semantic content
– Visual presentation includes semantic content
– Database integrity
– Both structural and data independence
• Disadvantages
– Lack of Standards
– Complex navigational data access
– High system overhead slows transactions
Semantic model:
The Semantic Data Model (SDM), like other data models, is a way of
structuring data to represent it in a logical way. SDM differs from other data
models, however, in that it focuses on providing more meaning of the data
itself, rather than solely or primarily on the relationships and attributes of
the data.
SDM provides a high-level understanding of the data by abstracting it
further away from the physical aspects of data storage
Functional data model:
In conventional database systems, procedures, data structures, and actual
content are usually separated. Thus, a conventional database management
system (DBMS) provides users with the possibility to store, modify, or
retrieve data that are structured in accordance with the current database
schema.
It should be especially noted, that a DBMS retrieves data as they were
stored into the database and additional procedures can be applied to such
data as an independent level of application programs.
In contrast, the functional data model provides a unified approach to
manipulating both data and procedures. The main idea of the functional data
model is a definition of all components of an information system in the form
of functions. Thus, for example, the functional data model defines data
objects, attributes and relationships as so-called database functions.
Moreover, a Functional Data Manipulation Language is a set of data
manipulation functions which can be applied to database functions. Finally,
users are provided with a special mechanism, called the lambda calculus,
to define their own functions, which can be seamlessly combined with
database functions and data manipulation functions.
Physical Data Models
Physical data models describe how data is stored in the computer,
representing information such as record structures, record orderings, and
access paths. There are not as many physical data models as logical data
models; the most common ones are the unifying model and the frame-memory
model.
FUNCTIONS OF DBMS
There are several functions that a DBMS performs to ensure data integrity
and consistency of data in the database. The functions in the DBMS are
explained below
1. Data Dictionary Management
The data dictionary is a location where the DBMS stores definitions of the data
and their relationships (metadata). This function removes structural and
data dependency and provides data abstraction. The Data Dictionary is
hidden from the user and is used by Database Administrators and
Programmers.
2. Data Storage Management
This particular function is used for the storage of data and any related data
entry forms or screen definitions, report definitions etc. Users do not need to
know how data is stored or manipulated. The data storage structure affects
the speed of operations.
3. Data Transformation and Presentation
This function exists to transform any data entered into required data
structures. By using the data transformation and presentation function the
DBMS can determine the difference between logical and physical data
formats.
4. Security Management
This is one of the most important functions in the DBMS. Security
management determines specific users that are allowed to access the
database. Users are given a username and password. This function
determines what specific data any user can see or manage.
5. Multiuser Access Control
Data integrity and data consistency are the basis of this function. Multiuser
access control is a very useful tool in a DBMS, it enables multiple users to
access the database simultaneously without affecting the integrity of the
database.
6. Backup and Recovery Management
Backup and recovery functions are essential in a database. Recovery
management deals with recovering the database after a failure. Backup
management refers to maintaining an additional copy of the data for data
safety and integrity.
7. Data Integrity Management
The DBMS enforces rules to reduce data redundancy and maximize data
consistency; this makes data integrity management possible.
8. Database Access Languages and Application Programming Interfaces
The database approach provides non-procedural query languages such as SQL
for easy access to the database. It also provides APIs for programming
languages such as VB.NET and Java.
9. Database Communication Interfaces
A DBMS can provide access to the database using the Internet through Web
Browsers (Mozilla Firefox, Internet Explorer, Netscape).
Components of DBMS
DBMSs are highly complex and sophisticated pieces of software that aim to
provide the services described earlier. It is not possible to generalize the component structure
of a DBMS, as it varies greatly from system to system.
The major software components in a DBMS environment are described below:
Query processor. This is a major DBMS component that transforms
queries into a series of low-level instructions directed to the database
manager.
Database manager (DM). The DM interfaces with user-submitted
application programs and queries. The DM accepts queries and
examines the external and conceptual schemas to determine what
conceptual records are required to satisfy the request. The DM then
places a call to the file manager to perform the request.
File manager. The file manager manipulates the underlying storage
files and manages the allocation of storage space on disk. It
establishes and maintains the list of structures and indexes defined in
the internal schema. If hashed files are used, it calls on the hashing
functions to generate record addresses. However, the file manager
does not directly manage the physical input and output of data.
Rather, it passes the requests on to the appropriate access methods,
which either read data from or write data into the system buffer (or
cache).
DML preprocessor: This module converts DML statements embedded
in an application program into standard function calls in the host
language. The DML preprocessor must interact with the query
processor to generate the appropriate code.
DDL compiler :The DDL compiler converts DDL statements into a set of
tables containing metadata. These tables are then stored in the
system catalog while control information is stored in data file headers.
Catalog manager: The catalog manager manages access to and
maintains the system catalog. The system catalog is accessed by most
DBMS components.
The major software components for the database manager are as follows:
Authorization control. This module confirms whether the user has
the necessary authorization to carry out the required operation.
Command processor. Once the system has confirmed that the user has
authority to carry out the operation, control is passed to the command
processor.
Integrity checker. For an operation that changes the database, the
integrity checker checks whether the requested operation satisfies
all necessary integrity constraints (such as key constraints).
Query optimizer. This module determines an optimal strategy for the
query execution.
Transaction manager. This module performs the required processing of
operations that it receives from transactions.
Scheduler. This module is responsible for ensuring that concurrent
operations on the database proceed without conflicting with one
another. It controls the relative order in which transaction operations
are executed.
Recovery manager. This module ensures that the database remains in
a consistent state in the presence of failures. It is responsible for
transaction commit and abort.
Buffer manager. This module is responsible for the transfer of data
between main memory and secondary storage, such as disk and tape.
The recovery manager and the buffer manager are sometimes referred
to collectively as the data manager. The buffer manager is sometimes
known as the cache manager.
RELATIONAL DATA MODEL.
The relational model was introduced by E.F. Codd in 1970. The basic data
structure of the relational model is the table, in which the information
about an entity is stored. The data are represented in rows and columns. Each
row is an instance of an entity type, and each column is an attribute of the
entity. In this model there are no physical links as there are in the
hierarchical and network models.
Advantages
o Structural independence
o Improved conceptual simplicity
o Easier database design, implementation, management, and use
o Ad hoc query capability with SQL
o Powerful database management system
Disadvantages
o Large hardware and system software overhead
o Possibility of Poor design and implementation
o May promote “islands of information” problems
Terminology related to Relational Data Model:
The relational model is based on the mathematical concept of a relation,
which is physically represented as a table. Codd, a trained mathematician,
used terminology taken from mathematics, principally set theory and
predicate logic.
In the relational model, relations are used to hold information about the
objects to be represented in the database. A relation is represented as a two
dimensional table in which the rows of the table correspond to individual
records and the table columns correspond to attributes. Attributes can
appear in any order and the relation will still be the same relation, and
therefore will convey the same meaning.
Domain: A domain is the set of allowable values for one or more attributes.
Domains are an extremely powerful feature of the relational model. Every
attribute in a relation is defined on a domain. Domains may be distinct for
each attribute, or two or more attributes may be defined on the same domain.
Tuple: The elements of a relation are the rows, or tuples, in the table. In the Branch
relation, each row contains four values, one for each attribute. Tuples can
appear in any order and the relation will still be the same relation, and
therefore convey the same meaning.
The structure of a relation, together with a specification of the domains and
any other restrictions on possible values, is sometimes called its intension,
which is usually fixed, unless the meaning of a relation is changed to
include additional attributes. The tuples are called the extension (or state)
of a relation, which changes over time.
Degree: The degree of a relation is the number of attributes it contains. The
Branch relation has four attributes, or degree four. This means that each row
of the table is a four-tuple, containing four values. A relation with only one
attribute would have degree one and be called a unary relation or one-tuple.
A relation with two attributes is called binary, one with three attributes is
called ternary, and after that the term n-ary is usually used. The degree of
a relation is a property of the intension of the relation.
Cardinality The cardinality of a relation is the number of tuples it contains.
Alternative Terminology
The formal terms relation, tuple, and attribute correspond to the alternative
terms table, row, and column (or, in file-processing terminology, file, record,
and field).
Mathematical Relations
Suppose that we have two sets, D1 and D2, where D1 = {2, 4} and D2 = {1, 3,
5}. The Cartesian product of these two sets, written as D1 X D2, is the set
of all ordered pairs such that the first element is a member of D1 and the
second element is a member of D2. An alternative way of expressing this is
to find all combinations of elements with the first from D1 and the second
from D2. In our case, we have:
D1 X D2 = {(2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)}
Any subset of this Cartesian product is a relation. For example, we could
produce a relation R such that:
R = {(2, 1), (4, 1)}
Using these same sets, we could form another relation S in which the first
element is always twice the second. Thus, we could write S as:
S = {(x, y) | x ∈ D1, y ∈ D2, and x = 2y}
or, in this instance, S = {(2, 1)} as there is only one ordered pair in the
Cartesian product that satisfies this condition.
We can easily extend the notion of a relation to three sets. Let D1, D2, and
D3 be three sets. The Cartesian product D1 X D2 X D3 of these three sets is
the set of all ordered triples such that the first element is from D1, the
second element is from D2, and the third element is from D3. Any subset of
this Cartesian product is a relation.
For example, suppose we have:
D1 = {1, 3} D2 = {2, 4} D3 = {5, 6}
D1 X D2 X D3 = {(1, 2, 5), (1, 2, 6), (1, 4, 5), (1, 4, 6), (3, 2, 5), (3, 2, 6), (3, 4,
5), (3, 4,6)}
Any subset of these ordered triples is a relation. We can extend the three
sets and define a general relation on n domains. Let D1, D2, . . . , Dn be n
sets. Their Cartesian product is defined as:
D1 X D2 X . . . X Dn = {(d1, d2, . . . , dn)|d1 ∈ D1, d2 ∈ D2, . . . , dn ∈ Dn}
and is usually written more compactly as the product of the Di over i = 1, 2, . . . , n.
Any set of n-tuples from this Cartesian product is a relation on the n sets.
In defining these relations we have to specify the sets, or domains, from
which we choose values.
Relation schema
A named relation defined by a set of attribute and domain name
pairs.Let A1, A2, . . . , An be attributes with domains D1, D2, . . . , Dn.
Then the set {A1:D1, A2:D2, . . . , An:Dn} is a relation schema. A relation R
defined by a relation schema S is a set of mappings from the attribute
names to their corresponding domains. Thus, relation R is a set of n-tuples
(A1:d1, A2:d2, . . . , An:dn) such that d1 ∈ D1, d2 ∈ D2, . . . , dn ∈ Dn.
Each element in the n-tuple consists of an attribute and a value for that
attribute. Normally, when we write out a relation as a table, we list the
attribute names as column headings and write out the tuples as rows
having the form (d1, d2, . . . , dn), where each value is taken from the
appropriate domain. In this way, we can think of a relation in the relational
model as any subset of the Cartesian product of the domains of the
attributes. A table is simply a physical representation of such a relation.
Eg:
{(B005, 22 Deer Rd, London, SW1 4EH)}
More correctly it can be written as:
{(branchNo: B005, street: 22 Deer Rd, city: London, postcode: SW1 4EH)}
We refer to this as a relation instance.
Relational database schema
A set of relation schemas, each with a distinct name. If R1, R2, . . . , Rn are
a set of relation schemas, then we can write the relational database schema,
or simply relational schema, R, as:
R = {R1, R2, . . . , Rn}
Properties of a relation
1. A table is a two-dimensional structure composed of rows and
columns.
2. Each table row (tuple) represents a single entity occurrence within
the entity set and must be distinct.
3. Duplicate rows are not allowed in a relation.
4. Each table column represents an attribute, and each column has a
distinct name.
5. Each cell or column/row intersection in a relation should contain only
an atomic value – that is a single data value. Multiple values are not
allowed in the cells of a relation.
6. All values in a column must be in same data format. For example, if
the attribute is assigned an integer data format, all values in the
column representing that attribute must be integers.
7. Each column has a specific range of values known as the attribute
domain.
8. The order of the rows and columns is immaterial to the DBMS.
9. Each table must have an attribute or a combination of attributes that
uniquely identifies each row.
Representing Relational Database Schemas
A relational database consists of any number of normalized relations. The
relational schema for part of the DreamHome case study is:
Branch (branchNo, street, city, postcode)
Staff (staffNo, fName, lName, position, sex, DOB, salary, branchNo)
PropertyForRent (propertyNo, street, city, postcode, type, rooms, rent,
ownerNo, staffNo, branchNo)
Client (clientNo, fName, lName, telNo, prefType, maxRent, eMail)
PrivateOwner (ownerNo, fName, lName, address, telNo, eMail, password)
Viewing (clientNo, propertyNo, viewDate, comment)
Registration (clientNo, branchNo, staffNo, dateJoined)
The common convention for representing a relation schema is to give the
name of the relation followed by the attribute names in parentheses.
Normally, the primary key is underlined.
The conceptual model, or conceptual schema, is the set of all such schemas
for the database.
Null
A null can be taken to mean the logical value “unknown.” It can mean
that a value is not applicable to a particular tuple, or it could merely
mean that no value has yet been supplied.
Nulls are a way to deal with incomplete or exceptional data. However, a
null is not the same as a zero numeric value or a text string filled with
spaces; zeros and spaces are values, but a null represents the absence of
a value. Therefore, nulls should be treated differently from other values.
Some authors use the term “null value”; however, as a null is not a value
but represents the absence of a value, the term “null value” is
deprecated.
Nulls can cause implementation problems, arising from the fact that the
relational model is based on first-order predicate calculus, which is a two-
valued or Boolean logic—the only values allowed are true or false.
Allowing nulls means that we have to work with a higher-valued logic,
such as three- or four-valued logic
The incorporation of nulls in the relational model is a contentious issue.
Codd later regarded nulls as an integral part of the model
INTEGRITY RULES. (OR)
ENTITY INTEGRITY AND REFERENTIAL INTEGRITY
Relational database integrity rules are very important to good database
design. Many RDBMSs enforce integrity rules automatically. Our
application design implements entity and referential integrity rules
mentioned below:
Entity integrity:
Entity integrity is an integrity rule which states that every table must have a
primary key and that the column or columns chosen to be the primary key
should be unique and not null.
Eg: CUS_NUM in the CUSTOMER table cannot be null and should not be
repeated.
Referential integrity:
Referential integrity is the relational property that each foreign key value in
a table must exist as a primary key value in the referenced table, or else be null.
Example:
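Since the example figure is not reproduced here, the following SQL sketch is an assumed illustration of the two rules (CUS_NUM and the CUSTOMER table come from the note above; the ORDERS table and its columns are hypothetical):
CREATE TABLE customer
( cus_num  number(5) PRIMARY KEY,    -- entity integrity: unique and not null
  cus_name char(20)
);
CREATE TABLE orders
( ord_num  number(5) PRIMARY KEY,
  cus_num  number(5) REFERENCES customer(cus_num)   -- referential integrity: must match a customer or be null
);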
RELATIONAL KEYS.
The relational keys in DBMS are explained below.
Primary key
An attribute or combination of attributes that has the unique identification
property is called the primary key. It cannot contain null values or repeated
values.
Eg. STU_NUM in STUDENT Table is the primary key.
Composite primary key
Not every relation has a single-attribute primary key. It is possible that
some combination of attributes has the unique identification property. Such
a group of attributes is called a composite primary key.
Candidate key
In a relation, there can be more than one attribute combination possessing
the unique identification property. These combinations, which can act as
primary key, are called candidate keys.
Table having “EmpNo” and “SocSecurityNo” as candidate keys
Secondary key An attribute (or combination of attributes) used strictly for
data retrieval purposes.
Foreign key An attribute (or combination of attributes) in one table whose
values must either match the primary key in another table or be null.
Recursive Foreign Key: An attribute (or combination of attributes) in a table
which refers to the values of the primary key in the same table.
Super key:
An attribute or combination of attributes that uniquely identifies each row is
called a super key. A super key may contain extra attributes that are not
needed for unique identification; a candidate key is a minimal super key.
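As a brief illustration tying these terms together (using the EmpNo and SocSecurityNo attributes from the candidate-key example above, with Name added as an assumption): {EmpNo}, {SocSecurityNo}, and {EmpNo, Name} all identify each row uniquely and are therefore super keys; only the minimal ones, {EmpNo} and {SocSecurityNo}, are candidate keys; and one of them, say {EmpNo}, is chosen as the primary key.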
Views
A view is a virtual or derived relation: a relation that does not necessarily
exist in its own right, but may be dynamically derived from one or more base
relations. Thus, an external model can consist of both base (conceptual-
level) relations and views derived from the base relations.
Base relation
A named relation corresponding to an entity in the conceptual schema,
whose tuples are physically stored in the database.
We can define a view in terms of base relations: The dynamic result of one or
more relational operations operating on the base relations to produce
another relation. A view is a virtual relation that does not necessarily exist in
the database but can be produced upon request by a particular user, at the
time of request.
A view is a relation that appears to the user to exist, can be manipulated as
if it were a base relation, but does not necessarily exist in storage in the
sense that the base relations do. The contents of a view are defined as a
query on one or more base relations. Any operations on the view are
automatically translated into operations on the relations from which it is
derived. Views are dynamic, meaning that changes made to the base
relations that affect the view are immediately reflected in the view. When
users make permitted changes to the view, these changes are made to the
underlying relations.
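As an illustrative sketch (the view name is an assumption; the Staff relation and its attributes are those of the DreamHome schema shown earlier), a view that hides the salary attribute can be defined and then queried like a base relation:
CREATE VIEW staff_public AS
SELECT staffNo, fName, lName, position, sex, DOB, branchNo
FROM Staff;
SELECT * FROM staff_public WHERE branchNo = 'B005';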
Purpose of Views
The view mechanism is desirable for several reasons:
It provides a powerful and flexible security mechanism by hiding
parts of the database from certain users. Users are not aware of the
existence of any attributes or tuples that are missing from the view.
It permits users to access data in a way that is customized to their
needs, so that the same data can be seen by different users in
different ways, at the same time.
It can simplify complex operations on the base relations. For example,
if a view is defined as a combination (join) of two relations users may
now perform more simple operations on the view, which will be
translated by the DBMS into equivalent operations on the join.
A view should be designed to support the external model that the user finds
familiar. For example:
A user might need Branch tuples that contain the names of managers
as well as the other attributes already in Branch. This view is created
by combining the Branch relation with a restricted form of the Staff
relation where the staff position is “Manager.”
Some members of staff should see Staff tuples without the salary
attribute.
Attributes may be renamed or the order of attributes changed. For
example, the user accustomed to calling the branchNo attribute of
branches by the full name Branch Number may see that column
heading.
Some members of staff should see property records only for those
properties that they manage.
Updating Views
All updates to a base relation should be immediately reflected in all views
that reference that base relation. Similarly, if a view is updated, then the
underlying base relation should reflect the change. However, there are
restrictions on the types of modification that can be made through views. We
summarize here the conditions under which most systems determine
whether an update is allowed through a view:
Updates are allowed through a view defined using a simple query
involving a single base relation and containing either the primary key or a
candidate key of the base relation.
Updates are not allowed through views involving multiple base relations.
Updates are not allowed through views involving aggregation
or grouping operations.
The Relational Algebra
The relational algebra is a theoretical language with operations that work on
one or more relations to define another relation without changing the
original relation(s). Thus both the operands and the results are relations,
and so the output from one operation can become the input to another
operation. This ability allows expressions to be nested in the relational
algebra, just as we can nest arithmetic operations. This property is called
closure: relations are closed under the algebra, just as numbers are closed
under arithmetic operations.
The relational algebra is a relation-at-a-time (or set) language in which all
tuples, possibly from several relations, are manipulated in one statement
without looping. There are several variations of syntax for relational algebra
commands and we use a common symbolic notation for the commands and
present it informally.
Unary Operations:
Selection (or Restriction)
The Selection operation works on a single relation R and defines a
relation that contains only those tuples of R that satisfy the specified
condition (predicate).
Notation − σp(r)
Where σ stands for the selection predicate and r stands for the relation. p is a
propositional logic formula which may use connectors like and, or, and not.
These terms may use relational operators such as =, ≠, ≥, <, >, ≤.
For example −
σ subject = "database" (Books)
Selects tuples from Books where the subject is 'database'.
σ subject = "database" and price = "450" (Books)
Selects tuples from Books where the subject is 'database' and the price is 450.
Projection
The Projection operation works on a single relation R and defines a
relation that contains a vertical subset of R, extracting the values of
specified attributes and eliminating duplicates.
Notation : ∏A1, A2, An (r)
Where A1, A2 , An are attribute names of relation r.
Duplicate rows are automatically eliminated, as relation is a set.
For example −
∏subject, author (Books)
Selects and projects columns named as subject and author from the relation
Books.
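For comparison, rough SQL equivalents of the two unary operations above (assuming a Books table with subject, author, and price columns):
SELECT * FROM Books WHERE subject = 'database';   -- Selection
SELECT DISTINCT subject, author FROM Books;       -- Projection (duplicates eliminated)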
Set Operations
Union:
The union of two relations R and S defines a relation that contains all the
tuples of R, or S, or both R and S, duplicate tuples being eliminated. R
and S must be union-compatible
Notation − r U s
Where r and s are either database relations or relation result set (temporary
relation).
For a union operation to be valid, the following conditions must hold −
r, and s must have the same number of attributes.
Attribute domains must be compatible.
∏ author (Books) ∪ ∏ author (Articles)
Projects the names of the authors who have written either a book or an article, or both.
Duplicate tuples are automatically eliminated.
Intersection:
The intersection of two relations R and S defines a relation that contains all the
tuples that are in both R and S. R and S must be union-compatible.
Notation − r ∩ s
Finds all the tuples that are present in r and s.
∏ author (Books) ∩ ∏ author (Articles)
Provides the name of authors who have written books and articles.
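In SQL, the union and intersection examples above correspond roughly to the UNION and INTERSECT set operators (assuming Books and Articles tables that each have an author column):
SELECT author FROM Books
UNION
SELECT author FROM Articles;
SELECT author FROM Books
INTERSECT
SELECT author FROM Articles;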
Cartesian product:
The Cartesian product operation defines a relation that is the concatenation
of every tuple of relation R with every tuple of relation S.
Notation − r Χ s
Where r and s are relations; the output is defined as:
r Χ s = { q t | q ∈ r and t ∈ s}
Theta Join (⋈θ):
A theta join combines tuples from two relations provided they satisfy the theta condition θ.
Notation − R1 ⋈θ R2
R1 and R2 are relations having attributes (A1, A2, .., An) and (B1, B2, .., Bn)
such that the attributes don’t have anything in common, that is R1 ∩ R2 = Φ.
Theta join can use all kinds of comparison operators.
Student
SID    Name    Std
101    Alex    10
102    Maria   11

Subjects
Class   Subject
10      Math
10      English
11      Music
11      Sports

Student ⋈ Student.Std = Subjects.Class Subjects gives:
Student_detail
SID Name Std Class Subject
101 Alex 10 10 Math
101 Alex 10 10 English
102 Maria 11 11 Music
102 Maria 11 11 Sports
Equijoin
When Theta join uses only equality comparison operator, it is said to be
equijoin. The above example corresponds to equijoin.
Natural Join (⋈)
Natural join does not use any comparison operator. It does not concatenate
the way a Cartesian product does. We can perform a Natural Join only if
there is at least one common attribute that exists between two relations. In
addition, the attributes must have the same name and domain.
Natural join acts on those matching attributes where the values of attributes
in both the relations are same.
Example: Courses ⋈ HoD
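The Courses and HoD relations of this example are not reproduced in these notes, so the following small relations are assumed purely for illustration; the natural join matches tuples on the common Dept attribute:
Courses
CID    Course      Dept
CS01   Database    CS
ME01   Mechanics   ME
HoD
Dept   Head
CS     Alex
ME     Maya
Courses ⋈ HoD
Dept   CID    Course      Head
CS     CS01   Database    Alex
ME     ME01   Mechanics   Maya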
Outer Joins
Theta Join, Equijoin, and Natural Join are called inner joins. An inner join
includes only those tuples with matching attributes and the rest are
discarded in the resulting relation. Therefore, we need to use outer joins to
include all the tuples from the participating relations in the resulting
relation. There are three kinds of outer joins − left outer join, right outer
join, and full outer join.
Left Outer Join (R ⟕ S)
All the tuples from the Left relation, R, are included in the resulting relation.
If there are tuples in R without any matching tuple in the Right relation S,
then the S-attributes of the resulting relation are made NULL.
Left
A B
100 Database
101 Mechanics
102 Electronics
Right
A B
100 Alex
102 Maya
104 Mira
Left ⟕ Right
A     B            C     D
100   Database     100   Alex
101   Mechanics    ---   ---
102   Electronics  102   Maya

Right Outer Join (R ⟖ S)
All the tuples from the Right relation, S, are included in the resulting relation.
If there are tuples in S without any matching tuple in the Left relation R,
then the R-attributes of the resulting relation are made NULL.
Left ⟖ Right
A     B            C     D
100   Database     100   Alex
102   Electronics  102   Maya
---   ---          104   Mira

Full Outer Join (R ⟗ S)
All the tuples from both participating relations are included in the resulting
relation. Where a tuple has no match in the other relation, the missing
attributes are made NULL.
Left ⟗ Right
A     B            C     D
100   Database     100   Alex
101   Mechanics    ---   ---
102   Electronics  102   Maya
---   ---          104   Mira
Division Operation
Suppose relation R is defined over the attribute set A and relation S is defined
over the attribute set B such that B ⊆ A. Let C = A − B. The Division operation
R ÷ S defines a relation over the attributes C that consists of the set of tuples
from R that match the combination of every tuple in S.
For an example, see the tables Completed and DBProject and their division:
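The Completed and DBProject tables themselves are not reproduced in these notes, so the following small relations are assumed for illustration; the division returns the students who have completed every task listed in DBProject:
Completed
Student   Task
Fred      Database1
Fred      Database2
Fred      Compiler1
Eugene    Database1
Eugene    Compiler1
Sara      Database1
Sara      Database2
DBProject
Task
Database1
Database2
Completed ÷ DBProject
Student
Fred
Sara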
AGGREGATE FUNCTIONS
COUNT(): This function returns the number of rows in the table that
satisfy the condition specified in the WHERE clause. If the WHERE
condition is not specified, the query returns the total number of
rows in the table.
Example: SELECT COUNT (*) FROM employee
WHERE dept = 'Electronics';
The DISTINCT keyword eliminates duplicate values from the result. For example, to list each department only once:
SELECT DISTINCT dept FROM employee;
MAX(): This function is used to get the maximum value from a column.
To get the maximum salary drawn by an employee, the query would be:
SELECT MAX(salary) FROM employee;
MIN(): This function is used to get the minimum value from a column.
To get the minimum salary drawn by an employee, the query would be:
SELECT MIN(salary) FROM employee;
TYPES OF AGGREGATIONS
Scalar aggregate: If a single value is returned from an SQL query that
includes an aggregate function then it is scalar aggregate. The following
example returns single value. i.e. sum of the salaries of all employees.
SELECT SUM(SAL) FROM EMP;
Vector aggregate: If multiple values are returned from an SQL query that
includes an aggregate function then it is vector aggregation. The following
example returns multiple values i.e. sum of the salaries of all employees in
each department.
SELECT DEPTNO, SUM(SAL) FROM EMP GROUP BY DEPTNO;
UNIT - II
ERD DEFINITION & SYMBOLS USED IN ERD
ENTITY RELATIONSHIP MODELING
Entity-Relationship Model: A database can be modeled as a collection of
entities and relationships among entities in a graphical representation. This
model is called the Entity-Relationship Model. It was first developed by Peter
Chen. It has become popular because it is easy to understand.
The ER model is based on the following components:
Entities:
An entity may be a person, place, object, event, or even a concept about
which the organization wishes to maintain data in the database.
Examples: Person : CUSTOMER, EMPLOYEE
Place : WAREHOUSE, CITY, STORE
Object: PRODUCT, MACHINE
Event : SALE, ADMISSION
Concept : ACCOUNT, COURSE
Attributes :
The properties or characteristics of an entity type. For example the
attributes of CUSTOMER entity may be customer number, customer name,
address, phone number etc.
Relationship:
The meaningful association between entities is called a relationship. Relationships
may be of many types, such as one-to-one, one-to-many, and many-to-many.
Eg EMPLOYEE is assigned PARKING PLACE
ORDER contains ORDERLINE
STUDENT takes COURSE.
Entity set
An entity set is a set of entities of the same type that share the same
properties.
Example: set of all persons, companies, trees, holidays
Entity Relationship Diagram (E R diagram): ERD is the graphical
representation of an entity-relationship model. It includes the components
such as entities, relationships, attributes with different symbols.
47
Rectangles represent entity sets.
Diamonds represent relationship sets.
Lines link attributes to entity sets and entity sets to relationship sets.
Underline indicates primary key attributes
The basic ER model symbols are represented below
48
Strong entity
Weak entity
Associative entity
Strong Entity:
An entity whose existence is independent of other entities. Its primary key is
not derived from the attributes of other entities.
Weak Entity
An entity whose existence is dependent on another entity. Its primary key is
partially or fully derived from the related entity.
Example:
49
An associative entity
50
In ER Model attributes can be classified into the following types.
Simple and Composite Attribute
Single Valued and Multi Valued attribute
Stored and Derived Attributes
Identifier and composite identifier
Required and optional attributes
Simple and Composite Attribute
A simple attribute consists of a single atomic value and cannot be subdivided;
for example, the attributes age and sex are simple attributes. A composite
attribute is an attribute that can be further subdivided; for example, the
attribute ADDRESS can be subdivided into street, city, state, and zip code.
Simple Attribute: An attribute that consists of a single atomic value.
Example: Salary, age
Composite Attribute: An attribute whose value is not atomic.
Example: Address: 'House_no : City : State'
Name: 'First Name : Middle Name : Last Name'
Single Valued and Multi Valued attribute
A single-valued attribute can have only one value for a given entity. For example,
a person can have only one 'date of birth' and one 'age'. A single-valued attribute
can still be either simple or composite: 'date of birth' is a composite attribute
and 'age' is a simple attribute, but both are single valued.
Multi valued attributes can have multiple values. For instance a person may
have multiple phone numbers, multiple degrees etc.
Stored and Derived Attributes
The value for the derived attribute is derived from the stored attribute. For
example 'Date of birth' of a person is a stored attribute. The value for the
attribute 'AGE' can be derived by subtracting the 'Date of Birth'(DOB) from
the current date. Stored attribute supplies a value to the related attribute.
Stored Attribute: An attribute that supplies a value to the related attribute.
Example: Date of Birth
Derived Attribute: An attribute whose value is derived from a stored attribute.
Example: age, whose value is derived from the stored attribute Date of Birth.
Identifier and composite identifier
An identifier is an attribute that identifies each instance of entity uniquely.
For example student_number is an identifier in student table.
Composite identifier is an identifier which is formed with the combination of
two or more attributes. For example (emp_num, course_id) is a composite
identifier in CERTIFICATE entity.
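A minimal sketch of how a composite identifier can be declared in SQL, assuming a CERTIFICATE table with the attributes named in the example (the extra column and data types are assumptions):

CREATE TABLE certificate (
   emp_num        NUMBER(4),
   course_id      VARCHAR2(10),
   date_completed DATE,
   PRIMARY KEY (emp_num, course_id)   -- composite identifier
);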
Required and Optional attributes:
Required attribute is an attribute that must have a value, i.e. it cannot be
null.
Example: In EMPLOYEE Table, emp_name must not be null so we can say it
a required attribute.
Optional attribute is an attribute that may or may not have a value for each
instance of the entity. Often it accepts nulls.
Example: Email is an optional attribute for the STUDENT table, because not every
student may have an email address.
DIFFERENCE BETWEEN COMPOSITE IDENTIFIER & COMPOSITE
ATTRIBUTE
The composite identifier consists of two or more attributes to meet the unique
identification property. But the composite attribute can be divided into two
or more meaningful components (attributes).
composite identifier
53
DOMAIN (2 MARKS)
Domain is a set of possible values for an attribute. For example a domain of
marks attribute include the values from 0 to 100. Similarly a domain of
course attribute include the values { B.Sc., B.Com, B.A.,B.B.M}
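A domain can be enforced in SQL with CHECK constraints. A sketch assuming a RESULT table (table and column names are assumptions):

CREATE TABLE result (
   sid    NUMBER(4),
   marks  NUMBER(3) CHECK (marks BETWEEN 0 AND 100),                      -- domain 0..100
   course VARCHAR2(10) CHECK (course IN ('B.Sc.', 'B.Com', 'B.A.', 'B.B.M'))
);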
IMPLEMENTING MULTI VALUED ATTRIBUTE
An attribute that takes more than one value for each instance of entity type
is called multi valued attribute.
For example : In an EMPLOYEE entity, the attribute skill is a multi valued
attribute. Because each employee may have more than one skill. This
attribute can be represented as follows:
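A common way to implement a multi-valued attribute is a separate table that holds one row per value. A sketch assuming an EMPLOYEE parent table with primary key emp_num (names and data types are assumptions):

CREATE TABLE employee_skill (
   emp_num NUMBER(4) REFERENCES employee(emp_num),  -- PK of the parent as FK
   skill   VARCHAR2(20),
   PRIMARY KEY (emp_num, skill)                     -- one row per (employee, skill) pair
);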
54
The following are the advantages and disadvantages of storing and not
storing of derived attributes.
55
The PK of parent entity acts as foreign key in the child entity.
Example
56
In the above example, DEPENDENT is existence-dependent on EMPLOYEE. The
relationship between those entities is called an identifying relationship. Here
the PK of the child entity is derived from the parent entity.
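A sketch of how the identifying relationship can be implemented, assuming EMPLOYEE(emp_num) as the parent; the DEPENDENT primary key is partially derived from it (column names and types are assumptions):

CREATE TABLE dependent (
   emp_num  NUMBER(4) REFERENCES employee(emp_num),  -- PK of the parent used as FK
   dep_name VARCHAR2(20),
   relation VARCHAR2(10),
   PRIMARY KEY (emp_num, dep_name)                   -- partial key plus parent key
);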
EXISTENCE DEPENDENCE
An entity is said to be existence dependent if it can exist in the database only if
the related entity occurrence exists. For example, a DEPENDENT entity
instance exists only when its corresponding EMPLOYEE entity instance exists.
RELATIONSHIP PARTICIPATION (OR) OPTIONAL AND MANDATORY
PARTICIPATION OF ENTITIES IN RELATIONSHIPS
57
Example:
58
“Is_Married_To” is unary 1:1 relationship between the instances of
PERSON entity type. i.e One person is married to only one other
person.
“Manages’ is unary 1:M relationship between the instances of
EMPLOYEE. Because an employee manages one or more other
employees
Binary Relationship:
It is the relationship between the instances of two entity types. Binary
relationship is the most common type of relationship in data modeling. Here
the degree of the relationship is 2.
Example:
59
“Contains” is the binary 1:M relationship between the instances of
ORDER and ORDERLINE. It means an order may contain one or more
order lines.
“Registers_For” is the binary M:N relationship between the instances
of STUDENT and COURSE. It means a student may register for many
courses similarly a course may be registered by many students.
Ternary Relationship:
It is the simultaneous relationship among the instances of three entity
types. It is recommended to convert ternary relationships to associative
entities. Here the degree of the relationship is 3.
Example:
60
“Supplies” is a ternary relationship among the instances of VENDOR, PART,
and WAREHOUSE: a vendor supplies a particular part to a particular
warehouse.
61
Cardinality specifies the minimum and maximum number of relationship instances
in which an entity can participate; it is written next to the entities, using the
format (x,y). The first value represents the minimum number of associated
entities, and the second value represents the maximum number of associated
entities.
Minimum Cardinality: The minimum number of instances of one entity
that may be associated with each instance of other entity.
Maximum Cardinality: The maximum number of instances of one entity
that may be associated with each instance of other entity.
Example:
The below diagram shows connectivities and cardinalities in both chen and
crow’s foot notation.
62
NEED FOR NORMALIZATION & CHARACTERISTICS OF
NORMALIZED TABLES
Normalization: It is a process of decomposing a relation with anomalies to
form smaller and well structured relations. In Normalization there are three
important basic normal forms. (1NF, 2NF, 3NF).
Need for Normalization: The need of normalization arises due to the
following two reasons.
To produce well-structured relations that minimize data redundancy as
much as possible.
To reduce the chances of anomalies such as insertion, deletion, and
update anomalies.
Objective of Normalization:
The objective of Normalization is to ensure that each table matches with
the concept of well-structured relation. i.e. they must satisfy the following
characteristics.
Each table represents a single entity. For example, a STUDENT table
must contain data that is directly related to STUDENT.
No data item will be stored more than once unnecessarily in one or
more tables. This is to ensure that the data are updated only once in
one location in the database.
All non-key attributes in the table are dependent on entire primary
key. This is to ensure that the data are uniquely identifiable by a
primary key value.
Each table is free of insertion, update, or deletion anomalies. This is to
ensure the integrity and consistency of the data.
Characteristics of Normalized tables.
First normal form (1NF): The relation has no repeating groups, and the primary key is identified.
63
Boyce–Codd normal form (BCNF): The relation must be in 3NF and every determinant is a candidate key.
64
NORMALIZATION PROCESS WITH EXAMPLE
Normalization: It is a process of decomposing a relation with anomalies to
form smaller and well structured relations. In Normalization there are three
important basic normal forms. (1NF, 2NF, 3NF).
Conversion from 1NF to 2NF and 2NF to 3NF is explained below with an
example
65
The following table is not in 1NF
66
o Writing the dependent attributes after each new key.
67
The conversion results of 3NF.
68
(A determinant is any attribute whose value determines other
values within a row.)
If a table contains only one candidate key, the 3NF and the BCNF
are equivalent.
BCNF is a special case of 3NF.
Figure 1 illustrates a table that is in 3NF but not in BCNF.
Figure 2 shows how the table can be decomposed to conform to the
BCNF form.
69
Decomposition into BCNF
70
[Example relation: for the subject Maths, the single row stores multiple values
for colour (BLUE, RED, GREEN) and for publisher (S CHAND, VGS).]
When removing multiple values for the attributes, the relation becomes
as follows:
71
With the above decomposition, the table size is decreased from 24 cells to 16
cells of data. In this way the amount of redundancy is reduced in
fourth normal form.
ENHANCED ENTITY RELATIONSHIP MODELING (EERM) (OR)
USE OF SUPER TYPE/ SUB TYPE RELATIONSHIPS IN DATA
MODELING.
72
Example 1 in peter chen notation:
Employee super type with three subtypes
73
Example 2 in Crow’s Foot Notation
74
GENERALIZATION AND SPECIALIZATION IN SUPER TYPE AND SUB
TYPE RELATIONSHIPS. (OR)
THE VARIOUS APPROACHES TO DEVELOP SUPER TYPE
SUB TYPE RELATIONSHIPS
75
(b) – Generalization to VEHICLE supertype
76
The below example demonstrates Specialization process.
Here an organization wishes to store details about a PRODUCT. Some
attributes are applicable to one set of instances, while other
attributes/relationships are specific to another set of instances, so
specialization is done as follows.
77
SUB TYPE DISCRIMINATOR
Subtype Discriminator: An attribute of the supertype whose values
determine the target subtype(s)
Disjoint – a simple attribute with alternative values to indicate the
possible subtypes
Overlapping – a composite attribute whose subparts are related to
different subtypes. Each subpart contains a boolean value to
indicate whether or not the instance belongs to the associated
subtype
Example.
78
79
CONSTRAINTS ON SUPER TYPE SUB TYPE RELATIONSHIPS.
(OR)
DISJOINT NESS CONSTRAINT AND COMPLETENESS
CONSTRAINT WITH EXAMPLE.
Disjointness Constraints: Whether an instance of a supertype may
simultaneously be a member of two (or more) subtypes.
Disjoint Rule: An instance of the supertype can be member of only
ONE of the subtypes
Overlap Rule: An instance of the supertype could be member of
more than one of the subtypes
80
Completeness Constraints: Whether an instance of a supertype must also
be a member of at least one subtype
Total Specialization Rule: specifies that each entity instance of the
supertype must be a member of at least one subtype in the
relationship. It is represented with a double line in the ERD.
Partial Specialization Rule: specifies that an entity instance of the
supertype is allowed to belong to no subtype. It is represented with a
single line in the ERD.
81
Structural Constraints
The main type of constraint on relationships is called multiplicity.
The number (or range) of possible occurrences of an entity type that may
relate to a single occurrence of an associated entity type through a
particular relationship is denoted as multiplicity.
82
As we mentioned earlier, the most common degree for relationships is
binary. Binary relationships are generally referred to as being one-to-one
(1:1), one-to-many (1:*), or many-to-many (*:*). We understand these three
types of relationships using the following integrity constraints:
• a member of staff manages a branch (1:1);
• a member of staff oversees properties for rent (1:*);
• newspapers advertise properties for rent (*:*).
One-to-One (1:1) Relationships
The relationship Manages, which relates the Staff and Branch entity types.
The below figure displays two occurrences of the Manages relationship type
(denoted rl and r2) using a semantic net. Each relationship (rn) represents
the association between a single Staff entity occurrence and a single Branch
entity occurrence. We represent each entity occurrence using the values for
the primary key attributes of the Staff and Branch entities, namely staffNo
and branchNo.
83
Many-to-Many (*:*) Relationships
The relationship Advertises, which relates the Newspaper and
PropertyForRent entity types. The below figure displays four occurrences of
the Advertises relationship (denoted rl, r2, r3, and r4) using a semantic net.
Each relationship (rn) represents the association between a single
Newspaper entity occurrence and a single PropertyForRent entity
occurrence. We represent each entity occurrence using the values for the
primary key attributes of the Newspaper and PropertyForRent entity types,
namely newspaperName and propertyNo.
84
Participation: Determines whether all or only some entity occurrences
participate in a relationship
The participation constraint represents whether all entity occurrences are
involved in a particular relationship (referred to as mandatory participation)
or only some (referred to as optional participation). The participation of
entities in a relationship appears as the minimum values for the multiplicity
ranges on either side of the relationship. Optional participation is
represented as a minimum value of 0, and mandatory participation is shown
as a minimum value of 1. It is important to note that the participation for a
given entity in a relationship is represented by the minimum value on the
opposite side of the relationship; that is, the minimum value for the
multiplicity beside the related entity.
85
Problems with ER Models
Some problems may arise when creating an ER model. These problems are
referred to as connection traps, and normally these problems occur due to
a misinterpretation of the meaning of certain relationships. The two main
types of connection traps are called fan traps and chasm traps.
Fan Traps
A fan trap occurs when a model represents a relationship between entity types, but
the pathway between certain entity occurrences is ambiguous.
A fan trap may exist where two or more 1:* relationships fan out from the
same entity. A potential fan trap is illustrated in below Figure which shows
two 1:* relationships (Has and Operates) emanating from the same entity
called Division.
This model represents the facts that a single division operates one or more
branches and has one or more staff. However, a problem arises when we
want to know which members of staff work at a particular branch. To
appreciate the problem, we examine some occurrences of the Has and
Operates relationships using values for the primary key attributes of the
Staff, Division, and Branch entity types, as shown in below Figure.
If we attempt to answer the question: “At which branch does staff number
SG37 work?” we are unable to give a specific answer based on the current
structure. We can determine only that staff number SG37 works at Branch
B003 or B007. The inability to answer this question specifically is the result
of a fan trap associated with the misrepresentation of the correct
relationships between the Staff, Division, and Branch entities. We resolve
this fan trap by restructuring the original ER model to represent the correct
association between these entities, as shown in below Figure.
86
If we now examine occurrences of the Operates and Has relationships, as
shown in below Figure we are now in a position to answer the type of
question posed earlier. From this semantic net model, we can determine that
staff number SG37 works at branch number B003, which is part of division
Dl.
Chasm Traps
A chasm trap occurs when a model suggests the existence of a relationship
between entity types, but the pathway does not exist between certain
entity occurrences.
A chasm trap may occur where there are one or more relationships with a
minimum multiplicity of zero (that is, optional participation) forming part of
the pathway between related entities. A potential chasm trap is illustrated in
below Figure which shows relationships between the Branch, Staff, and
PropertyForRent entities.
This model represents the facts that a single branch has one or more staff
who oversee zero or more properties for rent. We also note that not all staff
oversee property, and not all properties are overseen by a member of staff. A
problem arises when we want to know which properties are available at each
branch. To appreciate the problem, we examine some occurrences of the Has
and Oversees relationships using values for the primary key attributes of the
Branch, Staff, and PropertyForRent entity types, as shown in below Figure.
87
If we attempt to answer the question: “At which branch is property number
PA14 available?” we are unable to answer this question, as this property is
not yet allocated to a member of staff working at a branch. The inability to
answer this question is considered to be a loss of information (as we know a
property must be available at a branch), and is the result of a chasm trap.
The multiplicity of both the Staff and PropertyForRent entities in the
Oversees relationship has a minimum value of zero, which means that some
properties cannot be associated with a branch through a member of staff.
Therefore, to solve this problem, we need to identify the missing
relationship, which in this case is the Offers relationship between the
Branch and PropertyForRent entities. The ER model shown in below Figure
represents the true association between these entities.
This model ensures that at all times, the properties associated with each
branch are known, including properties that are not yet allocated to a
member of staff. If we now examine occurrences of the Has, Oversees, and
Offers relationship types, as shown in below Figure, we are now able to
determine that property number PA14 is available at branch number B007.
88
The Database Design Methodology for Relational Databases
The three main phases of database design include the following:
• Conceptual design,
• Logical design,
• Physical database design.
The steps involved in the main phases of the database design methodology
are explained below.
Step 1: Build Conceptual Data Model
The first step in conceptual database design is to build a conceptual data
model of the data requirements of the enterprise. A conceptual data model
comprises:
• entity types;
• relationship types;
• attributes and attribute domains;
• primary keys and alternate keys;
• integrity constraints.
Step 1.1: Identify entity types
The first step in building a local conceptual data model is to define the main
89
objects that the users are interested in. We also look for major objects such
as people, places, or concepts of interest
Step 1.2: Identify relationship types
Identify the important relationships that exist between the entity types that
have been identified. Use Entity–Relationship (ER) modeling to visualize the
entity and relationships.
Step 1.3: Identify and associate attributes with entity or relationship
types
Associate attributes with the appropriate entity or relationship types.
Identify simple/composite attributes, single-valued/multi-valued attributes,
and derived attributes. Document attributes.
Step 1.4: Determine attribute domains
Determine domains for the attributes in the conceptual model. Document
attributes domains.
Step 1.5: Determine candidate, primary, and alternative key
attributes
Identify the candidate key(s) for each entity and, if there is more than one
candidate key, choose one to be the primary key. Document primary and
alternative keys for each strong entity.
Step 1.6: Consider use of enhanced modeling concepts (optional step)
Consider the use of enhanced modeling concepts, such as
specialization/generalization, aggregation, and composition.
Step 1.7: Check model for redundancy
Check for the presence of any redundancy in the model. Specifically, re-
examine one-to-one (1:1) relationships, remove redundant relationships,
and consider time dimension.
Step 1.8: Validate conceptual data model against user transactions
Ensure that the conceptual data model supports the required transactions.
Two possible approaches are describing the transactions and using
transaction pathways.
Step 1.9: Review conceptual data model with user
Review the conceptual data model with the user to ensure that the model is
a “true” representation of the data requirements of the enterprise.
Step 2: Build Logical Data Model
90
Build a logical data model from the conceptual data model and then validate
this model to ensure that it is structurally correct (using the technique of
normalization) and to ensure that it supports the required transactions.
Step 2.1: Derive relations for logical data model
Create relations from the conceptual data model to represent the entities,
relationships, and attributes that have been identified.
Step 2.2: Validate relations using normalization
Validate the relations in the logical data model using the technique of
normalization. The objective of this step is to ensure that each relation is in
at least Third Normal Form (3NF).
Step 2.3: Validate relations against user transactions
Ensure that the relations in the logical data model support the required
transactions.
Step 2.4: Check integrity constraints
Identify the integrity constraints, which include specifying the required data,
attribute domain constraints, multiplicity, entity integrity, referential
integrity, and general constraints. Document all integrity constraints.
Step 2.5: Review logical data model with user
Ensure that the users consider the logical data model to be a true
representation of the data requirements of the enterprise.
Step 2.6: Merge logical data models into global model
The methodology for Step 2 is presented so that it is applicable for the
design of simple and complex database systems.
Step 2.7: Check for future growth
Determine whether there are any significant changes likely in the
foreseeable future, and assess whether the logical data model can
accommodate these changes.
Step 3: Translate Logical Data Model for Target DBMS
Produce a relational database schema that can be implemented in the target
DBMS from the logical data model.
Step 3.1: Design base relations
Decide how to represent the base relations that have been identified in the
logical data model in the target DBMS. Document design of base relations.
Step 3.2: Design representation of derived data
91
Decide how to represent any derived data present in the logical data model
in the target DBMS. Document design of derived data.
Step 3.3: Design general constraints
Design the general constraints for the target DBMS. Document design of
general constraints.
Step 4: Design File Organizations and Indexes
Determine the optimal file organizations to store the base relations and the
indexes that are required to achieve acceptable performance, that is, the way
in which relations
and tuples will be held on secondary storage.
Step 4.1: Analyze transactions
Understand the functionality of the transactions that will run on the
database and analyze the important transactions.
Step 4.2: Choose file organizations
Determine an efficient file organization for each base relation.
Step 4.3: Choose indexes
Determine whether adding indexes will improve the performance of the
system.
Step 4.4: Estimate disk space requirements
Estimate the amount of disk space that will be required by the database.
Step 5: Design User Views
Design the user views that were identified during the requirements collection
and analysis stage of the relational database system development lifecycle.
Document design of user views.
Step 6: Design Security Mechanisms
Design the security measures for the database system as specified by the
users. Document design of security measures.
Step 7: Consider the Introduction of Controlled Redundancy
Determine whether introducing redundancy in a controlled manner by
relaxing the normalization rules will improve the performance of the system.
For example, consider duplicating attributes or joining relations together.
Document introduction of redundancy.
Step 8: Monitor and Tune the Operational System
Monitor the operational system and improve the performance of the system
to correct inappropriate design decisions or reflect changing requirements
92
UNIT III
SQL & TYPES OF SQL COMMANDS
SQL stands for Structured Query Language. SQL is used to communicate
with a database. According to ANSI (American National Standards Institute),
it is the standard language for relational database management systems.
SQL statements are used to perform tasks such as storing data, updating
data, deleting data in a database, or retrieving data from a database. Some
common relational database management systems that use SQL are: Oracle,
Sybase, Microsoft SQL Server.
Features:
SQL is relatively easy to learn.
Basic command set has a vocabulary of less than 100 words.
Nonprocedural language.
ANSI prescribes a standard SQL.
Different types of SQL commands are explained below.
Data Definition Language (DDL) statements are used to define the
database structure or schema. Some examples:
o CREATE - to create objects in the database
o ALTER - alters the structure of the database
o DROP - delete objects from the database
o TRUNCATE - remove all records from a table; all space
allocated for the records is also removed
o RENAME - rename an object
Data Manipulation Language (DML) statements are used for managing data
within schema objects. Some examples:
o SELECT - retrieve data from a database
o INSERT - insert data into a table
o UPDATE - updates existing data within a table
o DELETE - deletes records from a table; the space for the records
remains
Data Control Language (DCL) statements. Some examples:
o GRANT - gives user's access privileges to database
o REVOKE - withdraw access privileges given with the GRANT command
93
Transaction Control (TCL) statements are used to manage the changes
made by DML statements. It allows statements to be grouped together into
logical transactions.
o COMMIT - save work done
o SAVEPOINT - identify a point in a transaction to which you can later
roll back
o ROLLBACK - restore the database to its original state since the last COMMIT
SQL Identifiers
94
SQL truth values are mutually comparable and assignable. The value TRUE
is greater than the value FALSE, and any comparison involving the NULL
value or an UNKNOWN truth value returns an UNKNOWN result.
Character data
Character data consists of a sequence of characters from an
implementation-defined character set, that is, it is defined by the vendor of
the particular SQL dialect. Thus, the exact characters that can appear as
data values in a character type column will vary. ASCII and EBCDIC are two
sets in common use today.
For example, the branch number column branchNo of the Branch table,
which has a fixed length of four characters, is declared as:
branchNo CHAR(4)
The column address of the PrivateOwner table, which has a variable number
of characters up to a maximum of 30, is declared as:
address VARCHAR(30)
Exact Numeric Data
The exact numeric data type is used to define numbers with an exact
representation. The number consists of digits, an optional decimal point,
and an optional sign. An exact numeric data type consists of a precision
and a scale. The precision gives the total number of significant decimal
digits, that is, the total number of digits, including decimal places but
excluding the point itself. The scale gives the total number of decimal places.
For example, the exact numeric value –12.345 has precision 5 and scale 3. A
special case of exact numeric occurs with integers. There are several ways of
specifying an exact numeric data type:
NUMERIC [ precision [, scale] ]
DECIMAL [ precision [, scale] ]
INTEGER
SMALLINT
BIGINT
INTEGER can be abbreviated to INT and DECIMAL to DEC
Approximate numeric data
The approximate numeric data type is used for defining numbers that do not
have an exact representation, such as real numbers. Approximate numeric,
or floating point, notation is similar to scientific notation, in which a number
is written as a mantissa times some power of ten (the exponent). For
example, 10E3, +5.2E6, −0.2E–4. There are several ways of specifying an
approximate numeric data type:
FLOAT [precision]
REAL
DOUBLE PRECISION
96
The precision controls the precision of the mantissa. The precision of REAL
and DOUBLE PRECISION is implementation-defined.
Datetime data
The datetime data type is used to define points in time to a certain degree of
accuracy. Examples are dates, times, and times of day. The ISO standard
subdivides the datetime data type into YEAR, MONTH, DAY, HOUR,
MINUTE, SECOND, TIMEZONE_HOUR, and TIMEZONE_MINUTE. The latter
two fields specify the hour and minute part of the time zone offset from
Universal Coordinated Time (which used to be called Greenwich Mean Time).
Three types of datetime data type are supported:
DATE
TIME [timePrecision] [WITH TIME ZONE]
TIMESTAMP [timePrecision] [WITH TIME ZONE]
DATE is used to store calendar dates using the YEAR, MONTH, and DAY
fields.
TIME is used to store time using the HOUR, MINUTE, and SECOND fields
97
Example:
CREATE TABLE VENDOR (
V_CODE INTEGER NOT NULL UNIQUE,
V_NAME VARCHAR(35) NOT NULL,
V_CONTACT VARCHAR(15) NOT NULL,
V_AREACODE CHAR(3) NOT NULL,
V_PHONE CHAR(8) NOT NULL,
V_STATE CHAR(2) NOT NULL,
V_ORDER CHAR(1) NOT NULL,
PRIMARY KEY (V_CODE));
SQL CONSTRAINTS
Constraints are rules that cannot be violated by the user. Constraints
are used to ensure proper data entry into tables. They are specified when
creating or altering the table structures. The constraints used in SQL are listed
below.
Primary key constraint:
o Primary key attributes contain both a NOT NULL and a UNIQUE
specification
Foreign key constraint:
o RDBMS will automatically enforce referential integrity for
foreign keys. It requires that foreign key values match the
primary key values of the referenced table, or be NULL
NOT NULL constraint
o Ensures that a column does not accept nulls
UNIQUE constraint
o Ensures that all values in a column are unique
DEFAULT constraint
o Assigns a value to an attribute when a new row is added to a
table
CHECK constraint
o Validates data when an attribute value is entered
ON UPDATE CASCADE
98
o Ensures that a change in the PK will automatically be applied to all
FK references throughout the system
o The related option ON DELETE CASCADE is also available
Example:
SQL> create table emp99
(emp_num number(3),
emp_name varchar2(15) constraint emp99_emp_name_nn not null,
emp_city varchar2(10) default 'hyd',
emp_basic number(5) constraint
emp99_emp_basic_ck check (emp_basic>10000),
deptno number(2),
constraint emp99_emp_num_pk primary key(emp_num),
constraint emp99_deptno_fk foreign key(deptno) references
dept99(deptno));
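The foreign key in the example above references dept99(deptno). A sketch of that (assumed) parent table, so the example is self-contained:

SQL> create table dept99
(deptno number(2) constraint dept99_deptno_pk primary key,
dname varchar2(15));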
DATA DEFINITION LANGUAGE (DDL) COMMANDS.
The DDL commands are used to define a database including creating,
altering, and dropping tables and establishing constraints. DDL deals with
meta data.
The commands are explained below.
Create command:
The CREATE TABLE Statement is used to create tables to store data.
The Syntax for the CREATE TABLE Statement is:
99
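A sketch of the general form (column constraints are optional; the exact syntax varies slightly by DBMS):

CREATE TABLE table_name
( column_name1 datatype [constraint],
  column_name2 datatype [constraint],
  ...
  column_nameN datatype [constraint] );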
For Example: If you want to create the employee table, the statement would
be like,
CREATE TABLE employee
( id number(5),
name char(20),
dept char(10),
age number(2),
salary number(10),
location char(10)
);
ALTER COMMAND:
The SQL ALTER TABLE command is used to modify the definition
(structure) of a table by modifying the definition of its columns. The
ALTER command is used to perform the following functions.
1) Add, drop, modify table columns
2) Add and drop constraints
3) Enable and Disable constraints
Syntax to add a column
ALTER TABLE table_name ADD column_name datatype;
For Example: To add a column "experience" to the employee table, the
query would be like
ALTER TABLE employee ADD experience number(3);
Syntax to drop a column
ALTER TABLE table_name DROP column_name;
For Example: To drop the column "location" from the employee table, the
query would be like
ALTER TABLE employee DROP location;
Syntax to modify a column
ALTER TABLE table_name MODIFY column_name datatype;
For Example: To modify the column salary in the employee table, the query
would be like
ALTER TABLE employee MODIFY salary number(15,2);
DROP COMMAND:
It is used to delete the database objects like tables, indexes and views.
100
Syntax:
DROP object_type object_name;
Example : DROP table employee;
TRUNCATE command:
It is used to delete rows with auto commit
Syntax :
TRUNCATE table table_name;
Example:
TRUNCATE table employee;
RENAME COMMAND
The SQL RENAME command is used to change the name of the table or a
database object.
Syntax to rename a table
RENAME old_table_name To new_table_name;
For Example: To change the name of the table employee to my_employee,
the query would be like
RENAME employee TO my_employee;
DATA MANIPULATION COMMANDS (DML).
DML commands are used to maintain and access a database, including
updating, inserting, modifying and retrieving data. It deals with data. The
commands are explained below.
SELECT Command:
The SQL SELECT statement is used to query or retrieve data from a table in
the database. A query may retrieve information from specified columns or
from all of the columns in the table
Syntax of SQL SELECT Statement:
SELECT column_list FROM table-name
[WHERE Clause]
[GROUP BY clause]
[HAVING clause]
[ORDER BY clause];
table-name is the name of the table from which the information is
retrieved.
101
column_list includes one or more columns from which data is
retrieved.
The code within the brackets is optional.
Example:
SELECT first_name FROM student_details;
INSERT Command:
The INSERT Statement is used to add new rows of data to a table.
We can insert data into a table in two ways:
1) Inserting the data directly into a table.
Syntax for SQL INSERT is:
INSERT INTO TABLE_NAME
[ (col1, col2, col3,...colN)]
VALUES (value1, value2, value3,...valueN);
col1, col2,...colN -- the names of the columns in the table into which
you want to insert data.
Example:
INSERT INTO employee (id, name, dept, age, salary, location) VALUES (value1, value2, value3, value4, value5, value6);
2) Inserting data into a table from another table using a SELECT statement. Example:
INSERT INTO employee
SELECT * FROM temp_employee;
SQL UPDATE STATEMENT
The UPDATE Statement is used to modify the existing rows in a table.
102
Syntax
UPDATE table_name
SET column_name1 = value1,
column_name2 = value2, ...
[WHERE condition]
table_name - the table name which has to be updated.
column_name1, column_name2... - the columns that get changed.
value1, value2... - are the new values.
Example:
UPDATE employee
SET location ='Mysore'
WHERE id = 101;
Delete command:
The DELETE Statement is used to delete rows from a table.
Syntax:
DELETE FROM table_name [WHERE condition];
table_name -- the table from which rows are to be deleted.
Example: To delete an employee with id 100 from the employee table, the
sql delete query would be like,
DELETE FROM employee WHERE id = 100;
To delete all the rows from the employee table, the query would be like,
DELETE FROM employee;
DATA CONTROL LANGUAGE (DCL) COMMANDS.
DCL commands are used to control a database. They deal with accessibility
and transaction changes. It includes a set of privileges that can be granted
to or revoked from a user. The commands are : GRANT, REVOKE, COMMIT,
ROLLBACK and SAVEPOINT.
GRANT COMMAND:
GRANT is a command used to provide access or privileges on the database
objects to the users.
Syntax:
GRANT privilege_name
ON object_name
TO {user_name |PUBLIC |role_name}
[WITH GRANT OPTION];
103
privilege_name is the access right or privilege granted to the user.
Some of the access rights are ALL, EXECUTE, and SELECT.
object_name is the name of an database object like TABLE, VIEW,
STORED PROC and SEQUENCE.
user_name is the name of the user to whom an access right is
being granted.
PUBLIC is used to grant access rights to all users.
ROLES are a set of privileges grouped together.
WITH GRANT OPTION - allows a user to grant access rights to
other users.
For Eample: GRANT SELECT ON employee TO user1;
This command grants a SELECT permission on employee table to user1.
REVOKE Command:
The REVOKE command withdraws access privileges that were given with GRANT.
Example: REVOKE SELECT ON employee FROM user1;
COMMIT Command
COMMIT makes the changes done in the current transaction permanent.
Example: commit;
ROLLBACK
104
To rollback the changes done in a transaction give rollback statement.
Rollback restore the state of the database to the last commit point.
Example :
delete from emp;
rollback; /* undo the changes */
SAVEPOINT
Specify a point in a transaction to which later you can roll back.
Example
insert into emp (empno,ename,sal) values (109,’Sami’,3000);
savepoint a;
insert into dept values (10,’Sales’,’Hyd’);
Now if you give
rollback to a;
Then row dept will be roll backed. Now you can commit the row inserted
into emp table or rollback the transaction.
SQL GRANT REVOKE COMMANDS
GRANT privilege_name
ON object_name
TO {user_name |PUBLIC |role_name}
[WITH GRANT OPTION];
105
ROLES are a set of privileges grouped together.
WITH GRANT OPTION - allows a user to grant access rights to other
users.
REVOKE privilege_name
ON object_name
FROM {user_name |PUBLIC |role_name}
106
System Privileges      Description
The above rules also apply for the ALTER and DROP system privileges.
Object Privileges      Description
INSERT                 allows users to insert rows into a table.
SELECT                 allows users to select data from a database object.
UPDATE                 allows users to update data in a table.
EXECUTE                allows users to execute a stored procedure or a function.
Roles: Roles are a collection of privileges or access rights. When there are
many users in a database it becomes difficult to grant or revoke privileges to
users. Therefore, if you define roles, you can grant or revoke privileges to
users, thereby automatically granting or revoking privileges. You can either
create Roles or use the system roles pre-defined by oracle.
Some of the privileges granted to the system roles are as given below:
System
Role Privileges Granted to the Role
107
Creating Roles:
It's easier to GRANT or REVOKE privileges to the users through a role rather
than assigning a privilege directly to every user. If a role is identified by a
password, then, when you GRANT or REVOKE privileges to the role, you
definitely have to identify it with the password.
First, create a ROLE (named testing in this example); second, grant a CREATE TABLE
privilege to the ROLE testing. You can add more privileges to the ROLE.
To revoke a CREATE TABLE privilege from testing ROLE, you can write:
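A sketch of the statements described above (the role name testing and the user name user1 are assumptions):

CREATE ROLE testing;
GRANT CREATE TABLE TO testing;      -- add a privilege to the role
GRANT testing TO user1;             -- give the role, and hence its privileges, to a user
REVOKE CREATE TABLE FROM testing;   -- take the privilege back from the role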
108
SPECIAL OPERATORS IN SQL
The special operatores in SQL are "IN", "BETWEEN...AND", "IS NULL",
"LIKE".
Operator Description
The LIKE operator is used to list all rows in a table whose column values
match a specified pattern.
For example: To select all the students whose name begins with 'S'
SELECT first_name, last_name
FROM student_details
WHERE first_name LIKE 'S%';
The output would be similar
to:
first_name last_name
Stephen Fleming
Shekar Gowda
The BETWEEN ... AND operator is used to compare data for a range of
values.
109
For Example: to find the names of the students between age 10 to 15 years,
the query would be like,
SELECT first_name, last_name, age
FROM student_details
WHERE age BETWEEN 10 AND 15;
The output would be similar to:
Rahul Sharma 10
Anajali Bhagwat 12
Shekar Gowda 15
SQL IN Operator:
The IN operator is used when you want to compare a column with more
than one value. It is similar to an OR condition.
For example: If you want to find the names of students who are studying
either Maths or Science, the query would be like,
SELECT first_name, last_name, subject
FROM student_details
WHERE subject IN ('Maths', 'Science');
The output would be similar to:
SQL IS NULL Operator:
A column value is NULL if it does not exist. The IS NULL operator is used to
display all the rows for columns that do not have a value.
For Example: If you want to find the names of students who do not
participate in any games, the query would be as given below:
110
SELECT first_name, last_name
FROM student_details
WHERE games IS NULL;
There would be no output, because every student in the student_details table
participates in a game; otherwise, the names of the students who do not
participate in any games would be displayed.
EXISTS Operators:
EXISTS simply tests whether the inner query returns any row. If it does,
then the outer query proceeds. If not, the outer query does not execute, and
the entire SQL statement returns nothing.
The syntax for EXISTS is:
SELECT "column_name1"
FROM "table_name1"
WHERE EXISTS
(SELECT *
FROM "table_name2"
WHERE [Condition]);
Example:
SELECT SUM(Sales) FROM Store_Information
WHERE EXISTS
(SELECT * FROM Geography
WHERE region_name = 'West');
Concatenation (||) operator:
This operator concatenates character strings.
Example : SELECT 'Name is ' || ename FROM emp;
Distinct Operator:
This operator lists the distinct values in a column.
Example : Select DISTINCT job from EMP;
WILD CARD CHARACTERS IN SQL
In SQL, there are two wildcard characters:
1. % (percent sign) represents zero, one, or more characters.
2. _ (underscore) represents exactly one character.
Wildcards are used with the LIKE keyword in SQL.
Below are some wildcard examples:
111
• 'A_Z': All strings that start with 'A', have exactly one other character, and end with 'Z'. For
example, 'ABZ' and 'A2Z' would both satisfy the condition, while 'AKKZ'
would not (because there are two characters between A and Z instead of
one).
• 'ABC%': All strings that start with 'ABC'. For example, 'ABCD' and
'ABCABC' would both satisfy the condition.
SORTING THE RECORDS OF A TABLE
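The output shown below presumably comes from a query that sorts the employee table by salary. A sketch of such a query (table and column names are assumptions based on the later examples):

SELECT name, salary
FROM employee
ORDER BY salary;   -- ascending order is the default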
Soumya 20000
Ramesh 25000
Priya 30000
Hrithik 35000
Harsha 35000
112
If you want to sort the data in descending order, you must explicitly specify
it as shown below.
SELECT name, salary
FROM employee
ORDER BY name, salary DESC;
The above query sorts only the column 'salary' in descending order and the
column 'name' by ascending order.
If you want to select both name and salary in descending order, the query
would be as given below.
SELECT name, salary
FROM employee
ORDER BY name DESC, salary DESC;
SQL CLAUSES
The clauses in SQL are explained below:
SELECT CLAUSE
The most commonly used SQL command is SELECT statement. The SQL
SELECT statement is used to query or retrieve data from a table in the
database.
WHERE CLAUSE
The WHERE Clause is used when you want to retrieve specific information
from a table excluding other irrelevant data.
ORDER BY CLAUSE
The ORDER BY clause is used in a SELECT statement to sort results either
in ascending or descending order. Oracle sorts query results in ascending
order by default.
GROUP BY CLAUSE
The SQL GROUP BY Clause is used along with the group functions to
retrieve data grouped according to one or more columns.
HAVING CLAUSE
Having clause is used to filter data based on the group functions. This is
similar to WHERE condition but is used with group functions. Group
functions cannot be used in WHERE Clause but can be used in HAVING
clause.
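A sketch combining GROUP BY and HAVING on the employee table used elsewhere in these notes (the threshold value is an assumption):

SELECT dept, SUM(salary)
FROM employee
GROUP BY dept
HAVING SUM(salary) > 40000;   -- keeps only departments whose total salary exceeds 40000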
UNION Clause
Returns all distinct rows selected by either query.
113
UNION ALL Clause
Returns all rows selected by either query, including all duplicates.
INTERSECT Clause
Returns all distinct rows selected by both queries.
MINUS Clause
Returns all distinct rows selected by the first query but not the
second (examples of these set operators are sketched below).
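A sketch of the set operators, assuming two queries with compatible columns (employee and temp_employee are the tables used in earlier examples):

SELECT name FROM employee
UNION
SELECT name FROM temp_employee;      -- distinct rows from either query

SELECT name FROM employee
INTERSECT
SELECT name FROM temp_employee;      -- rows common to both queries

SELECT name FROM employee
MINUS
SELECT name FROM temp_employee;      -- rows in the first query but not the second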
AGGREGATE (GROUP) FUNCTIONS.
SQL GROUP Functions
Group functions are built-in SQL functions that operate on groups of
rows and return one value for the entire group. These functions
are: COUNT, MAX, MIN, AVG, SUM, DISTINCT
COUNT (): This function returns the number of rows in the table that
satisfies the condition specified in the WHERE condition. If the WHERE
condition is not specified, then the query returns the total number of
rows in the table.
Example: SELECT COUNT (*) FROM employee
WHERE dept = 'Electronics';
MAX(): This function is used to get the maximum value from a column.
To get the maximum salary drawn by an employee, the query would be:
MIN(): This function is used to get the minimum value from a column.
To get the minimum salary drawn by an employee, the query would be:
114
AVG(): This function is used to get the average value of a numeric
column.
To get the average salary, the query would be
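Sketches of the queries described above, assuming the employee table with a salary column:

SELECT MAX(salary) FROM employee;   -- maximum salary
SELECT MIN(salary) FROM employee;   -- minimum salary
SELECT AVG(salary) FROM employee;   -- average salary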
GROUPING ON DATA.
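The per-department totals shown below can be produced with a query such as the following (a sketch using the employee table):

SELECT dept, SUM(salary)
FROM employee
GROUP BY dept;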
dept salary
Electrical 25000
Electronics 55000
NOTE: The GROUP BY clause should contain all the columns in the
select list except those used along with the group functions.
SELECT location, dept, SUM (salary)
FROM employee
GROUP BY location, dept;
The output would be like:
115
Bangalore Electrical 25000
Electronics 55000
Aeronautics 35000
InfoTech 30000
116
CREATE VIEW view_name
AS
SELECT column_list
FROM table_name [WHERE condition];
view_name is the name of the VIEW.
The SELECT statement is used to define the columns and rows that
you want to display in the view.
For Example: to create a view on the product table the sql query would be
like
CREATE VIEW view_product AS
SELECT product_id, product_name FROM product;
Advantages of views:
1. View the data without storing the data into the object.
2. Provides the most current data from table
3. Restrict the view of a table i.e. can hide some of columns in the tables.
4. Join two or more tables and show it as one object to user.
5. Restrict the access of a table so that nobody can insert the rows into the
table.
Disadvantages of views:
1. Cannot always use DML operations on a view.
2. When the table is dropped, the view becomes inactive; it depends on the table
object.
3. It is an object, so it occupies space.
UPDATABLE VIEWS
The views on which we can perform INSERT, UPDATE and DELETE statements
are treated as updatable views. These operations are then performed on the base
table upon which the view is defined. However, certain views may not be updated,
for example a view containing DISTINCT values, where a single row in the
view may represent several rows in the base table (a sketch follows the list below).
A view is not updatable if any of the following conditions are true:
The keyword DISTINCT is used in the view definition
The select list contains components other than column specifications,
or contains more than one specification of the same column
The FROM clause specifies more than one table reference or refers to a
non-updatable view
The GROUP BY clause is used in the view definition
The HAVING clause is used in the view definition
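A sketch of a view that is not updatable because it uses GROUP BY and an aggregate (the view and column names are assumptions):

CREATE VIEW dept_totals AS
SELECT dept, SUM(salary) AS total_sal
FROM employee
GROUP BY dept;
-- An UPDATE such as the following would be rejected, because each view row
-- summarizes several base-table rows:
-- UPDATE dept_totals SET total_sal = 0;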
117
SUB QUERY AND ITS PROPERTIES
Subquery or Inner query or Nested query is a query in a query. A subquery
is usually added in the WHERE Clause of the sql statement. Most of the
time, a subquery is used when you know how to search for a value using a
SELECT statement, but do not know the exact value.
Subqueries are an alternate way of returning data from multiple tables.
Subqueries can be used with the following sql statements along with the
comparision operators like =, <, >, >=, <= etc.
SELECT
INSERT
UPDATE
DELETE
Properties of Sub-Query
A sub query is a SELECT statement inside another query.
A sub query must be enclosed in parentheses.
A sub query must be placed on the right-hand side of the comparison
operator.
The first query in the SQL statement is known as the outer query.
The query inside the SQL statement is known as the inner query.
The inner query is executed first.
The output of the inner query is used as input for the outer query.
A sub query cannot contain an ORDER BY clause.
A query can contain more than one sub-queries.
Example: the following are the examples of nested queries.
To find the second max salary of the employees in emp table.
Select max(sal) from emp where sal< (select max(sal) from emp);
To print the name of the employee who draws the maximum salary:
Select ename from emp where sal = (select max(sal) from emp);
MULTI-ROW SUB QUERY OPERATORS (OR)
ANY, IN AND ALL OPERATORS
Multiple row subquery returns one or more rows to the outer SQL
statement.
118
The outer query may use the IN, ANY, or ALL operator to handle a sub query
that returns multiple rows
Example (correlated subquery): the following query lists the employees who were
hired on the same date as at least one other employee.
Select ename from emp e where 1 < (select count(*) from emp where
hiredate = e.hiredate);
The following query lists the names of the employees who draw the first 3
maximum salaries:
119
Select * from emp e where 3 > (select count(*) from emp where sal > e.sal);
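Sketches of the IN, ANY and ALL forms, using the emp table as in the examples above (the department numbers are assumptions):

SELECT ename FROM emp
WHERE deptno IN (10, 20);                                 -- IN: matches any value in the list

SELECT ename FROM emp
WHERE sal > ALL (SELECT sal FROM emp WHERE deptno = 30);  -- more than every employee in dept 30

SELECT ename FROM emp
WHERE sal > ANY (SELECT sal FROM emp WHERE deptno = 30);  -- more than at least one employee in dept 30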
DUAL TABLE
This is a single row and single column dummy table provided by
oracle. This is used to perform mathematical calculations without
using a table.
Example: SELECT * FROM dual;
Output:
DUMMY
-----
X
SQL FUNCTIONS.
There are two types of functions in Oracle.
1) Single Row Functions: Single row or Scalar functions return a value
for every row that is processed in a query.
2) Group Functions: The group functions are used to calculate
aggregate values like total or average, which return just one total or
one average value after processing a group of rows.
There are four types of single row functions. They are:
1) Numeric Functions: These are functions that accept numeric input
and return numeric values.
2) Character or Text Functions: These are functions that accept
character input and can return both character and number values.
3) Date Functions: These are functions that take values that are of
datatype DATE as input and return values of datatype DATE, except
for the MONTHS_BETWEEN function, which returns a number.
4) Conversion Functions: These are functions that help us to convert
a value in one form to another form. For Example: a null value into an
actual value, or a value from one datatype to another datatype like
NVL, TO_CHAR, TO_NUMBER, TO_DATE etc.
You can combine more than one function together in an expression.
This is known as nesting of functions.
1) Numeric Functions:
120
Numeric functions are used to perform operations on numbers. They accept
numeric values as input and return numeric values as output. Few of the
Numeric functions are:
FLOOR (x): the integer value that is less than or equal to the number 'x'.
The following examples explain the usage of the above numeric functions:
ABS (x)      ABS (1) = 1        ABS (-1) = 1
CEIL (x)     CEIL (2.83) = 3    CEIL (2.49) = 3
FLOOR (x)    FLOOR (2.83) = 2   FLOOR (2.49) = 2
121
Character or text functions are used to manipulate text strings. They accept
strings or characters as input and can return both character and number
values as output.
Few of the character or text functions are as given below:
TRIM (trim_text FROM string_value): removes all occurrences of 'trim_text' from
the left and right of 'string_value'; 'trim_text' can also be only one character long.
122
Function Name                        Example                   Return Value
UPPER (string_value)                 UPPER ('Good Morning')    GOOD MORNING
LPAD (string_value, n, pad_value)    LPAD ('Good', 6, '*')     **Good
RPAD (string_value, n, pad_value)    RPAD ('Good', 6, '*')     Good**
3) Date Functions:
These are functions that take values that are of datatype DATE as input and
return values of datatypes DATE, except for the MONTHS_BETWEEN
function, which returns a number as output.
Few date functions are as given below.
123
MONTHS_BETWEEN (x1, x2): the number of months between the dates x1 and x2.
The below table provides the examples for the above functions
Function Name        Example                                      Return Value
MONTHS_BETWEEN ()    MONTHS_BETWEEN ('16-Sep-81', '16-Dec-81')    -3
NEXT_DAY ()          NEXT_DAY ('01-Jun-08', 'Wednesday')          04-JUN-08
4) Conversion Functions:
These are functions that help us to convert a value in one form to another
form. For Ex: a null value into an actual value, or a value from one datatype
to another datatype like NVL, TO_CHAR, TO_NUMBER, TO_DATE.
124
Few of the conversion functions available in oracle are:
The below table provides the examples for the above functions
Function Name    Example                                 Return Value
TO_CHAR ()       TO_CHAR (SYSDATE, 'Day, Month YYYY')    Monday, June 2008
SEQUENCE IN ORACLE
In Oracle, you can create an autonumber field by using sequences. A
sequence is an object in Oracle that is used to generate a number sequence.
This can be useful when you need to create a unique number to act as a
primary key.
Characteristics of Sequences:
Oracle sequences are an independent object in the database.
(Sequences are not a data type)
Oracle sequences have a name and can be used anywhere a value is
expected.
125
Oracle sequences are not tied to a table or a column.
Oracle sequences generate a numeric value that can be assigned to
any column in any table.
The table attribute to which you assigned a value based on a sequence
can be edited and modified.
An oracle sequence can be created and deleted anytime.
The syntax for a sequence is:
For example:
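A sketch of the CREATE SEQUENCE syntax and of the supplier_seq example described below (the sequence name is taken from the description):

CREATE SEQUENCE sequence_name
  [START WITH n] [INCREMENT BY n]
  [MINVALUE n | NOMINVALUE] [MAXVALUE n | NOMAXVALUE]
  [CACHE n | NOCACHE];

CREATE SEQUENCE supplier_seq
  MINVALUE 1
  START WITH 1
  INCREMENT BY 1
  CACHE 20;
-- supplier_seq.NEXTVAL returns the next number in the sequence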
This would create a sequence object called supplier_seq. The first sequence
number that it would use is 1 and each subsequent number would
increment by 1 (i.e., 2, 3, 4, ...). It will cache up to 20 values for performance.
Each PL/SQL program consists of SQL and PL/SQL statements which form
a PL/SQL block.
A PL/SQL Block consists of three sections:
The Declaration section (optional).
The Execution section (mandatory).
126
The Exception (or Error) Handling section (optional).
Declaration Section:
The Declaration section of a PL/SQL Block starts with the reserved keyword
DECLARE. This section is optional and is used to declare variables,
constants, records etc.
Execution Section:
The Execution section of a PL/SQL Block starts with the reserved keyword
BEGIN and ends with END. This is a mandatory section. It contains the
program logic to perform any task.
Exception Section:
The Exception section of a PL/SQL Block starts with the reserved keyword
EXCEPTION. This section is optional. Any errors in the program can be
handled in this section, so that the PL/SQL Blocks terminates gracefully. If
the PL/SQL Block contains exceptions that cannot be handled, the Block
terminates abruptly with errors.
Every statement in the above three sections must end with a semicolon ; .
Sample PL/SQL Block appears as follows:
DECLARE
   Variable declaration
BEGIN
   Program execution
EXCEPTION
   Exception handling
END;
Example
DECLARE
   A Number := 50;
   B Number := 0;
BEGIN
   DBMS_OUTPUT.PUT_LINE('A+B= ' || (A+B));
END;
ADVANTAGES OF PL/SQL
Advantages of PL/SQL
127
Block Structures: PL SQL consists of blocks of code, which can be
nested within each other. Each block forms a unit of a task or a
logical module. PL/SQL Blocks can be stored in the database and
reused.
Procedural Language Capability: PL SQL consists of procedural
language constructs such as conditional statements (if else
statements) and loops like (FOR loops).
Better Performance: PL SQL engine processes multiple SQL
statements simultaneously as a single block, thereby reducing
network traffic.
Error Handling: PL/SQL handles errors or exceptions effectively
during the execution of a PL/SQL program. Once an exception is
caught, specific actions can be taken depending upon the type of the
exception or it can be displayed to the user with a message.
PL-SQL ARCHITECTURE
Steps involved in PL-SQL Block execution are listed below:
First, PL/SQL block is submitted to PL/SQL engine.
The PL/SQL engine accepts it as input and separates SQL statements.
PL/SQL engine executes procedural statements.
It sends SQL statements to the SQL engine in the oracle database.
The SQL engine executes the SQL Statements.
The results are combined and presented
PL-SQL Architecture:
128
COMMENTS IN PL/SQL
Comments are non-executable statements in a program. The comments are
used to increase the understandability of the program. Two different ways
to comments in pl-sql are given below:
We use two hyphens ( -- ) for single-line comments.
We use /* */ for multi-line comments in programs.
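For example (both styles shown):

-- this is a single-line comment
/* this is a
   multi-line comment */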
PL-SQL DATA TYPES
Data type: It tells what type of value a variable or an attribute can hold.
Specifying the data type for attributes and variables in a program is
mandatory. The common data types used in SQL are listed below:
Simple data types:
These types include binary_integer, decimal, int, long, number, float, real,
char, string, date, Boolean.
Composite data types:
Record, table.
Attribute type:
%type (simple), e.g. x emp.empno%type;
%rowtype (composite), e.g. r emp%rowtype;
Large Object type:
CLOB, BLOB, BFILE
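A small sketch of how the anchored (%type / %rowtype) declarations might look inside a block; the standard EMP demo table with EMPNO and ENAME columns is assumed:

DECLARE
   v_empno  emp.empno%TYPE;   -- %type: takes the datatype of one column
   r_emp    emp%ROWTYPE;      -- %rowtype: record matching a whole row
BEGIN
   SELECT * INTO r_emp FROM emp WHERE empno = 7369;
   v_empno := r_emp.empno;
   dbms_output.put_line(v_empno || ' ' || r_emp.ename);
END;
/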
CONTROL STRUCTURES IN PL/SQL WITH EXAMPLE.
Control Structure:
The control structure controls the execution flow in the program. PL/SQL
supports conditional control structures, loop control structures, jump control
statements, and case control statements.
Conditional control statements:
129
Simple IF Statement
Syntax:
IF <condition> THEN
   <statements>;
END IF;
Example:
IF A > 40 THEN
   B := A - 40;
END IF;
IF..ELSE Statement
Syntax:
IF <condition> THEN
   <statements>;
ELSE
   <statements>;
END IF;
Example:
IF a > 40 THEN
   b := a - 40;
ELSE
   b := 0;
END IF;
IF..ELSIF Statement
Syntax:
IF <condition-1> THEN
   <statements>;
ELSIF <condition-2> THEN
   <statements>;
ELSIF <condition-3> THEN
   <statements>;
....................
ELSE
   <statements>;
END IF;
Example:
IF Score >= 90 THEN
   LetterGrade := 'A';
ELSIF Score >= 80 THEN
   LetterGrade := 'B';
ELSIF Score >= 70 THEN
   LetterGrade := 'C';
ELSIF Score >= 60 THEN
   LetterGrade := 'D';
ELSE
   LetterGrade := 'E';
END IF;
130
Simple Loop
Syntax:
LOOP
   <sequence of statements>
   EXIT WHEN <condition>;
END LOOP;
Example:
LOOP
   dbms_output.put_line('Hello');
   i := i + 1;
   EXIT WHEN i > 5;
END LOOP;
While Loop
Syntax:
WHILE <condition>
LOOP
   <action>
END LOOP;
Example:
WHILE i < 5
LOOP
   dbms_output.put_line('Hello');
   i := i + 1;
END LOOP;
FOR Loop
Syntax:
FOR variable IN [REVERSE] start..end
LOOP
   <action>
END LOOP;
Example:
FOR i IN 1..5
LOOP
   dbms_output.put_line('Hello');
END LOOP;
131
GOTO Statement
Syntax:
GOTO <<Code_Block_Name>>;
-----------
-----------
<<Code_Block_Name>>
-----------
-----------
Example Code:
DECLARE
BEGIN
   FOR i IN 1..5
   LOOP
      dbms_output.put_line(i);
      IF i = 4 THEN
         GOTO label1;
      END IF;
   END LOOP;
   <<label1>>
   dbms_output.put_line('Row Filled');
END;
SWITCH Statement
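PL/SQL has no SWITCH keyword; the corresponding construct is the CASE statement. A minimal sketch (the variable grade and the messages are purely illustrative):

DECLARE
   grade CHAR(1) := 'B';
BEGIN
   CASE grade
      WHEN 'A' THEN dbms_output.put_line('Excellent');
      WHEN 'B' THEN dbms_output.put_line('Very good');
      WHEN 'C' THEN dbms_output.put_line('Good');
      ELSE dbms_output.put_line('No such grade');
   END CASE;
END;
/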
EXCEPTION
In PL/SQL, a warning or error is called an exception. Handling run-time
errors is called exception handling. Exceptions can be internally defined
(by the run-time system) or user defined. Examples of internally defined
exceptions include division by zero and out of memory. Some common
internal exceptions have predefined names, such as ZERO_DIVIDE and
STORAGE_ERROR.
When an error occurs, an exception is raised. That is, normal execution
stops and control transfers to the exception-handling part of your PL/SQL
block or subprogram. Internal exceptions are raised implicitly
(automatically) by the run-time system. User-defined exceptions must be
raised explicitly by RAISE statements, which can also raise predefined
exceptions.
To handle raised exceptions, you write separate routines called exception
handlers.
Built-in Exception Syntax:
EXCEPTION
   WHEN <built-in exception> THEN
      <user-defined action to be taken>;
Example Code:
DECLARE
V_NAME VARCHAR2(20);
BEGIN
SELECT EMP_NAME
INTO V_NAME
FROM EMP99
WHERE EMP_CITY = 'DELHI';
EXCEPTION
WHEN NO_DATA_FOUND
THEN
DBMS_OUTPUT.PUT_LINE('NO EMP EXIST WITH THE GIVEN
CITY');
END;
Built-in Exception List
DUP_VAL_ON_INDEX
NO_DATA_FOUND
TOO_MANY_ROWS
133
VALUE_ERROR
User-defined Exceptions
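The declaration and executable sections of this example appear to have been lost; a plausible reconstruction, consistent with the handlers that follow and with the sample run shown afterwards (the customers table and the &cc_id substitution variable are assumptions carried over from the earlier examples), is:

DECLARE
   c_id customers.id%type := &cc_id;
   c_name customers.name%type;
   ex_invalid_id EXCEPTION;          -- user-defined exception
BEGIN
   IF c_id <= 0 THEN
      RAISE ex_invalid_id;           -- raised explicitly with RAISE
   ELSE
      SELECT name INTO c_name FROM customers WHERE id = c_id;
      dbms_output.put_line('Name: ' || c_name);
   END IF;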
EXCEPTION
WHEN ex_invalid_id THEN
dbms_output.put_line('ID must be greater than zero!');
WHEN no_data_found THEN
dbms_output.put_line('No such
customer!');
WHEN others THEN
dbms_output.put_line('Error!');
END;
/
134
When the above code is executed at the SQL prompt, it produces the
following result −
Enter value for cc_id: -6 (let's enter a value -6)
old 2: c_id customers.id%type := &cc_id;
new 2: c_id customers.id%type := -6;
ID must be greater than zero!
136
A stored procedure can be executed in two ways.
From the SQL prompt:
EXECUTE [or EXEC] procedure_name;
Within another procedure or PL/SQL block, simply use the procedure name:
procedure_name;
Example:
CREATE OR REPLACE PROCEDURE UPDATESAL1 (N NUMBER, A NUMBER) IS
BEGIN
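   -- The body of this example procedure was lost in extraction. A plausible
   -- sketch, assuming UPDATESAL1 adds amount A to the salary of employee N
   -- in an EMP table with EMPNO and SAL columns, is:
   UPDATE emp
   SET    sal = sal + A
   WHERE  empno = N;
END;
/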
FUNCTION IN PL/SQL
A function is a named PL/SQL Block which is similar to a procedure. The
major difference between a procedure and a function is, a function must
always return a value, but a procedure may or may not return a value.
1) Return Type: The header section defines the return type of the function.
The return datatype can be any Oracle datatype, such as varchar, number,
etc.
2) The execution and exception section both should return a value which is
of the datatype defined in the header section.
137
Example:
CREATE OR REPLACE FUNCTION FACT (N NUMBER)
RETURN NUMBER
IS
I NUMBER(10);
F NUMBER :=1;
BEGIN
FOR I IN 1.. N LOOP
F:= F*I;
END LOOP;
RETURN F;
END;
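A possible way to invoke the function (a sketch; it assumes server output has been enabled with SET SERVEROUTPUT ON):

SELECT FACT(5) FROM dual;            -- returns 120
-- or, from an anonymous block:
BEGIN
   DBMS_OUTPUT.PUT_LINE('5! = ' || FACT(5));
END;
/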
TRIGGER
A trigger is a PL/SQL block structure which is fired when a DML statement
(INSERT, DELETE, or UPDATE) is executed on a database table. A trigger is
triggered automatically when an associated DML statement is executed.
Syntax of Triggers
The Syntax for creating a trigger is:
CREATE [OR REPLACE ] TRIGGER trigger_name
{BEFORE | AFTER | INSTEAD OF }
{INSERT [OR] | UPDATE [OR] | DELETE}
[OF col_name]
ON table_name
[REFERENCING OLD AS o NEW AS n]
[FOR EACH ROW]
WHEN (condition)
BEGIN
--- sql statements
END;
CREATE [OR REPLACE ] TRIGGER trigger_name - This clause creates a
trigger with the given name or overwrites an existing trigger with the
same name.
138
{BEFORE | AFTER | INSTEAD OF } - This clause indicates when the
trigger should be fired, for example before or after updating a table.
INSTEAD OF is used to create a trigger on a view; BEFORE and AFTER
cannot be used to create a trigger on a view.
{INSERT [OR] | UPDATE [OR] | DELETE} - This clause determines the
triggering event. More than one triggering event can be used, separated
by the OR keyword. The trigger gets fired on all the specified
triggering events.
[OF col_name] - This clause is used with update triggers. This clause is
used when you want to trigger an event only when a specific column is
updated.
[ON table_name] - This clause identifies the name of the table or view
to which the trigger is associated.
[REFERENCING OLD AS o NEW AS n] - This clause is used to reference
the old and new values of the data being changed. By default, you
reference the values as :old.column_name or :new.column_name. The
reference names can also be changed from old (or new) to any other
user-defined name. You cannot reference old values when inserting a
record, or new values when deleting a record, because they do not
exist.
[FOR EACH ROW] - This clause is used to determine whether a trigger
must fire when each row gets affected ( i.e. a Row Level Trigger) or just
once when the entire sql statement is executed(i.e.statement level
Trigger).
WHEN (condition) - This clause is valid only for row level triggers. The
trigger is fired only for rows that satisfy the condition specified.
PROCESS OF CREATING A TRIGGER WITH EXAMPLE
For Example: The price of a product changes constantly. It is important to
maintain the history of the prices of the products.
139
unit_price number(7,2) );
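The example listing was largely lost in extraction; only the unit_price column fragment above survives. A sketch of what such a price-history trigger typically looks like (the names product, product_price_history and price_history_trigger are assumptions, not the original listing) is:

CREATE TABLE product
( product_id    NUMBER(5),
  product_name  VARCHAR2(32),
  supplier_name VARCHAR2(32),
  unit_price    NUMBER(7,2) );

CREATE TABLE product_price_history
( product_id    NUMBER(5),
  product_name  VARCHAR2(32),
  supplier_name VARCHAR2(32),
  unit_price    NUMBER(7,2) );

CREATE OR REPLACE TRIGGER price_history_trigger
BEFORE UPDATE OF unit_price
ON product
FOR EACH ROW
BEGIN
   -- keep a copy of the old price before it is overwritten
   INSERT INTO product_price_history
   VALUES (:old.product_id, :old.product_name,
           :old.supplier_name, :old.unit_price);
END;
/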
140
TYPES OF TRIGGERS
There are two types of triggers, based on the level at which they are triggered.
1) Row level trigger - An event is triggered for each row updated, inserted or
deleted.
2) Statement level trigger - An event is triggered for each sql statement
executed
CHARACTERISTICS OF TRIGGERS
Characteristics of triggers:
A trigger is invoked before or after a row is inserted, updated or
deleted.
Trigger may be row level or statement level
A trigger does not take parameters.
A trigger is table dependent.
Each table may have one or more triggers.
ADVANTAGES OF TRIGGERS
Advantages of triggers:
It can be used to enforce complex constraints
It can be used to ensure proper data entry. i.e. to maintain data
integrity.
It can be used to keep track of transactions on a table.
It can be used to interrupt a transaction when it is not appropriate.
It can replicate a table for back up process.
It can be used to generate values for derived columns automatically.
CURSOR AND DIFFERENT TYPES OF CURSORS
A cursor is a temporary work area created in the system memory when a
SQL statement is executed. A cursor contains information on a select
statement and the rows of data accessed by it. This temporary work area is
used to store the data retrieved from the database, and manipulate this
data. A cursor can hold more than one row, but can process only one row at
a time. The set of rows the cursor holds is called the active set.
There are two types of cursors in PL/SQL:
141
Implicit cursors:
These are created by default when DML statements like, INSERT, UPDATE,
and DELETE statements are executed. They are also created when a
SELECT statement that returns just one row is executed.
Explicit cursors:
They must be created when you are executing a SELECT statement that
returns more than one row. Even though the cursor stores multiple records,
only one record can be processed at a time, which is called the current row.
When you fetch a row, the current row position moves to the next row.
Both implicit and explicit cursors have the same functionality, but they
differ in the way they are accessed.
IMPLICIT CURSOR WITH EXAMPLE
Implicit Cursors:
When you execute DML statements like DELETE, INSERT, UPDATE and
SELECT statements, implicit statements are created to process these
statements.
Oracle provides few attributes called as implicit cursor attributes to check
the status of DML operations. The cursor attributes available are %FOUND,
%NOTFOUND, %ROWCOUNT, and %ISOPEN.
For example, When you execute INSERT, UPDATE, or DELETE statements
the cursor attributes tell us whether any rows are affected and how many
have been affected.
When a SELECT... INTO statement is executed in a PL/SQL Block, implicit
cursor attributes can be used to find out whether any row has been
returned by the SELECT statement. PL/SQL returns an error when no data
is selected.
The status of the cursor for each of these attributes is described below.
%FOUND (checked as SQL%FOUND):
Returns TRUE if a DML statement (INSERT, DELETE, UPDATE) affects at least
one row, or if a SELECT ... INTO statement returns at least one row.
Returns FALSE if the DML statement affects no rows, or if the
SELECT ... INTO statement returns no row.
%NOTFOUND (checked as SQL%NOTFOUND):
Returns FALSE if a DML statement affects at least one row, or if a
SELECT ... INTO statement returns at least one row.
Returns TRUE if the DML statement affects no rows, or if the
SELECT ... INTO statement returns no row.
%ROWCOUNT (checked as SQL%ROWCOUNT):
Returns the number of rows affected by the DML operation (INSERT, DELETE,
UPDATE, SELECT).
For Example: Consider the PL/SQL Block that uses implicit cursor
attributes as shown below:
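The block itself did not survive extraction; a sketch consistent with the explanation that follows (an employee table with a salary column, and an arbitrary raise of 1000, are assumptions) is:

DECLARE
   var_rows NUMBER;
BEGIN
   UPDATE employee SET salary = salary + 1000;
   IF SQL%NOTFOUND THEN
      dbms_output.put_line('None of the salaries were updated');
   ELSIF SQL%FOUND THEN
      var_rows := SQL%ROWCOUNT;        -- how many rows the UPDATE touched
      dbms_output.put_line('Salaries for ' || var_rows || ' employees are updated');
   END IF;
END;
/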
In the above PL/SQL block, the salaries of all the employees in the
'employee' table are updated. If none of the employees' salaries are updated,
we get the message 'None of the salaries were updated'. Otherwise we get a
message such as 'Salaries for 1000 employees are updated' if there
are 1000 rows in the 'employee' table.
143
CURSORS WITH EXAMPLE.
Explicit Cursors
144
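The introductory text for this section appears to be missing; the usual form of an explicit cursor declaration in the DECLARE section (a general sketch) is:

CURSOR cursor_name IS
   SELECT ... ;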
Process to access an Explicit Cursor:
These are the three steps in accessing the cursor.
1) Open the cursor.
2) Fetch the records in the cursor one at a time.
3) Close the cursor.
General Syntax to open a cursor is:
OPEN cursor_name;
General Syntax to fetch records from a cursor is:
FETCH cursor_name INTO record_name;
OR
FETCH cursor_name INTO variable_list;
General Syntax to close a cursor is:
CLOSE cursor_name;
DECLARE
   variables;
   records;
   create a cursor;
BEGIN
   OPEN cursor;
   FETCH cursor;
   process the records;
   CLOSE cursor;
END;
Example 1:
DECLARE
   emp_rec emp_tbl%rowtype;
   CURSOR emp_cur IS
      SELECT *
      FROM emp_tbl
      WHERE salary > 10;
BEGIN
   OPEN emp_cur;
   FETCH emp_cur INTO emp_rec;
   dbms_output.put_line(emp_rec.first_name || ' ' || emp_rec.last_name);
   CLOSE emp_cur;
END;
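When the cursor can return several rows, the fetch is normally placed inside a loop; a sketch reusing the same emp_tbl assumptions as Example 1, and the %NOTFOUND attribute described in the next section, is:

DECLARE
   emp_rec emp_tbl%rowtype;
   CURSOR emp_cur IS
      SELECT * FROM emp_tbl WHERE salary > 10;
BEGIN
   OPEN emp_cur;
   LOOP
      FETCH emp_cur INTO emp_rec;
      EXIT WHEN emp_cur%NOTFOUND;     -- stop when there are no more rows
      dbms_output.put_line(emp_rec.first_name || ' ' || emp_rec.last_name);
   END LOOP;
   CLOSE emp_cur;
END;
/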
ATTRIBUTES AVAILABLE TO CHECK THE STATUS OF AN
EXPLICIT CURSOR
These are the attributes available to check the status of an explicit cursor.
%FOUND (checked as cursor_name%FOUND):
TRUE if the fetch statement returns at least one row; FALSE if it does not
return a row.
%NOTFOUND (checked as cursor_name%NOTFOUND):
TRUE if the fetch statement does not return a row; FALSE if it returns at
least one row.
%ROWCOUNT (checked as cursor_name%ROWCOUNT):
The number of rows fetched so far by the fetch statement. If no row is
returned, the PL/SQL statement returns an error.
%ISOPEN (checked as cursor_name%ISOPEN):
TRUE if the cursor is already open in the program; FALSE if the cursor is
not opened in the program.
Packages in PL/SQL
Packages are schema objects that groups logically related PL/SQL types,
variables, and subprograms.
A package will have two mandatory parts −
146
Package specification
Package body or definition
Package Specification
The specification is the interface to the package. It just DECLARES the
types, variables, constants, exceptions, cursors, and subprograms that can
be referenced from outside the package. In other words, it contains all
information about the content of the package, but excludes the code for the
subprograms.
All objects placed in the specification are called public objects. Any
subprogram not in the package specification but coded in the package body
is called a private object.
The following code snippet shows a package specification having a single
procedure. You can have many global variables defined and multiple
procedures or functions inside a package.
CREATE PACKAGE cust_sal AS
PROCEDURE find_sal(c_id customers.id%type);
END cust_sal;
/
When the above code is executed at the SQL prompt, it produces the
following result −
Package created.
Package Body
The package body has the codes for various methods declared in the
package specification and other private declarations, which are hidden from
the code outside the package.
The CREATE PACKAGE BODY Statement is used for creating the package
body. The following code snippet shows the package body declaration for
the cust_sal package created above. I assumed that we already have
CUSTOMERS table created in our database as mentioned in the PL/SQL -
Variables chapter.
CREATE OR REPLACE PACKAGE BODY cust_sal AS
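   -- (The rest of this listing was lost in extraction; a plausible sketch of
   --  the body, consistent with the sample run shown later, follows.)
   PROCEDURE find_sal(c_id customers.id%type) IS
      c_sal customers.salary%type;
   BEGIN
      SELECT salary INTO c_sal
      FROM customers
      WHERE id = c_id;
      dbms_output.put_line('Salary: ' || c_sal);
   END find_sal;
END cust_sal;
/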
147
When the above code is executed at the SQL prompt, it produces the
following result −
Package body created.
Using the Package Elements
The package elements (variables, procedures or functions) are accessed with
the following syntax −
package_name.element_name;
Consider, we already have created the above package in our database
schema, the following program uses the find_sal method of the cust_sal
package −
DECLARE
code customers.id%type := &cc_id;
BEGIN
cust_sal.find_sal(code);
END;
/
When the above code is executed at the SQL prompt, it prompts to enter the
customer ID and when you enter an ID, it displays the corresponding salary
as follows −
Enter value for cc_id: 1
Salary: 3000
148
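The heading and the opening of the next package specification were lost in extraction; a plausible reconstruction of the missing part, consistent with the package body and the usage example that follow, is:

CREATE OR REPLACE PACKAGE c_package AS
   -- Adds a customer
   PROCEDURE addCustomer(c_id customers.id%type,
      c_name customers.name%type,
      c_age customers.age%type,
      c_addr customers.address%type,
      c_sal customers.salary%type);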
-- Removes a customer
PROCEDURE delCustomer(c_id customers.id%TYPE);
--Lists all customers
PROCEDURE listCustomer;
END c_package;
/
When the above code is executed at the SQL prompt, it creates the above
package and displays the following result −
Package created.
Creating the Package Body
CREATE OR REPLACE PACKAGE BODY c_package AS
PROCEDURE addCustomer(c_id customers.id%type,
   c_name customers.name%type,
   c_age customers.age%type,
   c_addr customers.address%type,
   c_sal customers.salary%type)
IS
BEGIN
   INSERT INTO customers (id, name, age, address, salary)
   VALUES (c_id, c_name, c_age, c_addr, c_sal);
END addCustomer;
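   -- (The delCustomer body declared in the specification and used in the
   --  example below is missing here; a plausible sketch is:)
   PROCEDURE delCustomer(c_id customers.id%TYPE) IS
   BEGIN
      DELETE FROM customers
      WHERE id = c_id;
   END delCustomer;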
PROCEDURE listCustomer IS
   CURSOR c_customers IS
      SELECT name FROM customers;
   TYPE c_list IS TABLE OF customers.name%type;
   name_list c_list := c_list();
   counter integer := 0;
BEGIN
   FOR n IN c_customers LOOP
      counter := counter + 1;
      name_list.extend;
      name_list(counter) := n.name;
      dbms_output.put_line('Customer(' || counter || '): ' || name_list(counter));
   END LOOP;
END listCustomer;
END c_package;
/
When the above code is executed at the SQL prompt, it produces the
following result −
Package body created.
Using The Package
The following program uses the methods declared and defined in the
package c_package.
DECLARE
code customers.id%type:= 8;
BEGIN
c_package.addcustomer(7, 'Rajnish', 25, 'Chennai', 3500);
c_package.addcustomer(8, 'Subham', 32, 'Delhi', 7500);
c_package.listcustomer;
c_package.delcustomer(code);
c_package.listcustomer;
END;
/
When the above code is executed at the SQL prompt, it produces the
following result −
Customer(1): Ramesh
Customer(2): Khilan
Customer(3): kaushik
Customer(4): Chaitali
Customer(5): Hardik
Customer(6): Komal
Customer(7): Rajnish
Customer(8): Subham
Customer(1): Ramesh
Customer(2): Khilan
Customer(3): kaushik
Customer(4): Chaitali
Customer(5): Hardik
Customer(6): Komal
Customer(7): Rajnish
151
PL/SQL Recursive Functions
A subprogram is recursive when it calls itself. For example, the factorial of a
number n can be defined recursively:
n! = n*(n-1)!
= n*(n-1)*(n-2)!
...
= n*(n-1)*(n-2)*(n-3)... 1
DECLARE
num number;
factorial number;
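   -- (The declaration of the recursive function itself is missing from these
   --  notes; a plausible local definition, consistent with the output shown
   --  below, is:)
   FUNCTION fact(x NUMBER) RETURN NUMBER IS
   BEGIN
      IF x = 0 THEN
         RETURN 1;                 -- base case: 0! = 1
      ELSE
         RETURN x * fact(x - 1);   -- recursive step
      END IF;
   END fact;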
BEGIN
num:= 6;
factorial := fact(num);
dbms_output.put_line(' Factorial '|| num || ' is ' || factorial);
END;
/
When the above code is executed at the SQL prompt, it produces the
following result −
Factorial 6 is 720
152
UNIT – IV
TRANSACTION & ITS PROPERTIES
A database transaction is a logical unit of work that must be either entirely
completed or aborted. A unit of work may encapsulate a number of
operations over a database, i.e. reading a database object, writing, acquiring
a lock, etc.
Example :
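The example itself is not reproduced in these notes; a typical illustration, assuming an account table with acc_no and balance columns, is a funds transfer that must commit or roll back as a whole:

-- Transfer 100 from account A1 to account A2 as one logical unit of work
UPDATE account SET balance = balance - 100 WHERE acc_no = 'A1';
UPDATE account SET balance = balance + 100 WHERE acc_no = 'A2';
COMMIT;       -- make both changes permanent together
-- If an error occurs before COMMIT, ROLLBACK undoes both updates.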
Properties of Transactions:
Atomicity:
All transaction operations must be completed
Incomplete transactions aborted
Consistency:
Every transaction must leave the database in a consistent (correct)
state, i.e. maintain the predetermined integrity rules of the database.
A transaction must transform a database from one consistent state to
another consistent state. An aborted transaction does not change the
state.
Isolation :
All the transactions are independent.
Transactions cannot interfere with each other. Thus, each transaction
is unaware of the concurrently running transactions.
Durability
153
Effects of successful transactions must persist through crashes, i.e.
after the successful completion of the transaction, the changes to the
database must persist even in the case of a database failure.
Serializability
Conducts transactions in serial order
Important in multi-user and distributed databases
TRANSACTION STATE DIAGRAM.
The transaction can be in any one of the following states:
Active State: A database transaction is in this state while its statements
are being executed.
Partially Committed State: A database transaction enters this state when
its final statement has been executed but the updates/changes are not yet
committed. At this point the transaction has finished its execution, but it is
still possible for it to be aborted, because the output from the execution may
remain temporarily in main memory and an event such as a hardware
failure may erase it.
Failed State: this is the state where the transaction cannot execute due to
some error or failure.
Aborted State: this state arises when the transaction has failed. An aborted
transaction must have no effect on the database and thus any changes it
made
to the database have to be undone or in technical terms, rolled back.
Committed State: A database transaction enters the committed state when
enough information has been written to disk after completing its execution
with success.
154
CONCURRENCY CONTROL.
The coordination of the simultaneous execution of transactions in a
multiprocessing database is known as concurrency control.
The objective of concurrency control is to ensure the serializability of
transactions in a multiuser database environment.
155
Uncommitted Dependency Problem (Dirty Read Problem):
The uncommitted dependency problem occurs when one transaction is allowed
to see the intermediate results of another transaction before that transaction
has committed; if the second transaction subsequently rolls back, the first has
read a value that never officially existed.
Another problem can occur when a transaction T rereads a data item it has
previously read but, in between, another transaction has modified it. Thus,
T receives two different values for the same data item. This is sometimes
referred to as a nonrepeatable (or fuzzy) read. A similar problem can occur
if transaction T executes a query that retrieves a set of tuples from a relation
satisfying a certain predicate, re-executes the query at a later time, but finds
that the retrieved set contains an additional (phantom) tuple that has been
inserted by another transaction in the meantime. This is sometimes referred
to as a phantom read.
Serializability
The objective of a concurrency control protocol is to schedule
transactions in such a way as to avoid any interference between them
and hence prevent various problems.
One obvious solution is to allow only one transaction to execute at a
time: one transaction is committed before the next transaction is allowed
to begin.
The aim of a multi-user DBMS is also to maximize the degree of
concurrency or parallelism in the system, so that transactions that can
execute without interfering with one another can run in parallel.
Schedule:
A sequence of the operations by a set of concurrent transactions that
preserves the order of the operations in each of the individual transactions.
157
Serial schedule:
A schedule where the operations of each transaction are executed
consecutively without any interleaved operations from other transactions
Nonserial schedule:
A schedule where the operations from a set of concurrent transactions
are interleaved
Problems such as lost update, dirty read and inconsistent retrieval result
from the mismanagement of concurrency, which leaves the database in an
inconsistent state in the first two examples and presents the user with the
wrong result in the third. Serial execution
prevents such problems from occurring. No matter which serial schedule is
chosen, serial execution never leaves the database in an inconsistent state,
so every serial execution is considered correct, although different results
may be produced. The objective of serializability is to find nonserial
schedules that allow transactions to execute concurrently without
interfering with one another, and thereby produce a database state that
could be produced by a serial execution. If a set of transactions executes
concurrently, we say that the (nonserial) schedule is correct if it produces the
same result as some serial execution. Such a schedule is called serializable.
To prevent inconsistency from transactions interfering with one another, it
is essential to guarantee serializability of concurrent transactions. In
serializability, the ordering of read and write operations is important:
If two transactions only read a data item, they do not conflict and
order is not important.
If two transactions either read or write completely separate data items,
they do not conflict and order is not important.
If one transaction writes a data item and another either reads or
writes the same data item, the order of execution is important.
Conflict serializability
Consider the schedule S1 shown in below figure containing operations
from two concurrently executing transactions T7 and T8. Because the
write
operation on balx in T8 does not conflict with the subsequent read
operation on baly in T7, we can change the order of these operations to
produce the equivalent schedule S2 shown in Figure (b). If we also now
change the order of the following non-conflicting operations, we produce
the equivalent serial schedule S3 shown in Figure (c):
• Change the order of the write(balx) of T8 with the write(baly) of T7.
• Change the order of the read(balx) of T8 with the read(baly) of T7.
• Change the order of the read(balx) of T8 with the write(baly) of T7.
Schedule S3 is a serial schedule and, because S1 and S2 are equivalent to
S3, S1 and S2 are serializable schedules.
Under the constrained write rule (that is, a transaction updates a data
item based on its old value, which is first read by the transaction), a
precedence (or serialization) graph can be produced to test for conflict
serializability. For a schedule S, a precedence graph is a directed graph G =
(N, E) that consists of a set of nodes N and a set of directed edges E, which
is constructed as follows:
Create a node for each transaction.
Create a directed edge Ti → Tj, if Tj reads the value of an item written
by Ti.
Create a directed edge Ti → Tj, if Tj writes a value into an item after it
has been read by Ti.
Create a directed edge Ti → Tj, if Tj writes a value into an item after it
has been written by Ti.
If an edge Ti → Tj exists in the precedence graph for S, then in any serial
schedule S′ equivalent to S, Ti must appear before Tj. If the precedence
graph contains a cycle, the schedule is not conflict serializable.
Example for Nonconflict serializable schedule
Consider the two transactions shown in the below figure. Transaction T9 is
transferring £100 from one account with balance balx to another account
with balance baly, while T10 is increasing the balance of these two accounts
by 10%.
The precedence graph for this schedule, shown in below Figure has a cycle
and so is not conflict serializable.
160
View serializability
There are several other types of serializability that offer less precise
definitions of schedule equivalence than that offered by conflict
serializability. One less restrictive definition is called view serializability.
Two schedules S1 and S2 consisting of the same operations from n
transactions T1, T2, . . . , Tn are view equivalent if the following three
conditions hold:
For each data item x, if transaction Ti reads the initial value of x in
schedule S1, then transaction Ti must also read the initial value of x
in schedule S2.
For each read operation on data item x by transaction Ti in schedule
S1, if the value read by Ti has been written by transaction Tj, then
transaction Ti must also read the value of x produced by transaction
Tj in schedule S2.
For each data item x, if the last write operation on x was performed by
transaction Ti in schedule S1, the same transaction must perform the
final write on data item x in schedule S2.
A schedule is view serializable if it is view equivalent to a serial schedule.
Every conflict serializable schedule is view serializable, although the
converse is not true. For example, the schedule shown in below figure is
view serializable, although it is not conflict serializable.
161
In this example, transactions T12 and T13 do not conform to the
constrained write rule; in other words, they perform blind writes. It can be
shown that any view serializable schedule that is not conflict serializable
contains one or more blind writes.
Testing for view serializability:
Testing for view serializability is much more complex than testing for conflict
serializability. To test for view serializability we need a method to decide
whether an edge should be inserted into the precedence graph. The
approach we take is to construct a labeled precedence graph for the
schedule as follows:
1. Create a node for each transaction.
2. Create a node labeled Tbw. Tbw is a dummy transaction inserted at
the beginning of the schedule containing a write operation for each
data item accessed in the schedule.
162
3. Create a node labeled Tfr. Tfr is a dummy transaction added at the end
of the schedule containing a read operation for each data item accessed
in the schedule.
4. Create a directed edge Ti → Tj if Tj reads the value of an item written
by Ti.
5. Remove all directed edges incident on transaction Ti for which there is
no path from Ti to Tfr.
6. For each data item that Tj reads that has been written by Ti, and Tk
writes (Tk ≠ Tbw), then:
a) If Ti = Tbw and Tj ≠ Tfr, then create a directed edge Tj → Tk.
b) If Ti ≠ Tbw and Tj = Tfr, then create a directed edge Tk → Ti.
c) If Ti ≠ Tbw and Tj ≠ Tfr, then create a pair of directed edges
Tk →x Ti and Tj →x Tk, where x is a unique positive integer that has not
been used for labeling an earlier directed edge. This rule is a more
general case of the preceding two rules, indicating that if transaction
Ti writes an item that Tj subsequently reads, then any transaction, Tk,
that writes the same item must either precede Ti or succeed Tj.
Applying the first five rules to the above mentioned schedule produces
the precedence graph shown in below Figure
Applying rule 6(a), we add the edges T11 → T12 and T11 → T13, both labeled 0;
applying rule 6(b), we add the edges T11 → T13 (which is already present) and
T12 → T13, again both labeled 0. The final graph is shown in the below figure.
163
Based on this labeled precedence graph, the test for view serializability is as
follows:
(1) If the graph contains no cycles, the schedule is view serializable.
(2) The presence of a cycle, however, is not a sufficient condition to conclude
that the schedule is not view serializable. The actual test is based on the
observation that rule 6(c) generates m distinct directed edge pairs, resulting
in 2^m different graphs containing just one edge from each pair. If any one of
these graphs is acyclic, then the corresponding schedule is view serializable
and the serializability order is determined by the topological sorting of the
graph with the dummy transactions Tbw and Tfr removed.
Recoverability
164
But instead of the commit operation at the end of transaction T9, assume
that T9 decides to roll back the effects of the transaction. T10 has read the
update to balx performed by T9, and has itself updated balx and committed
the change. Strictly speaking, we should undo transaction T10, because it
has used a value for balx that has been undone. However, the durability
property does not allow this. In other words, this schedule is a
nonrecoverable schedule, which should not be allowed. This leads to the
definition of a recoverable schedule.
Recoverable schedule
A schedule in which for each pair of transactions Ti and Tj, if Tj reads a data
item previously written by Ti, then the commit operation of Ti precedes the
commit operation of Tj.
Concurrency control techniques
Serializability can be achieved in several ways; however, the two main
concurrency control techniques that allow transactions to execute safely in
parallel subject to certain constraints are called locking and timestamping.
Locking and timestamping are conservative (or pessimistic) approaches in
that they cause transactions to be delayed in case they conflict with other
transactions at some time in the future. Optimistic methods are based on
the premise that conflict is rare, so they allow transactions to proceed
unsynchronized and check for conflicts only at the end, when a transaction
commits.
165
Locking :
A procedure used to control concurrent access to data. When one
transaction is accessing the database, a lock may deny access to other
transactions to prevent incorrect results.
Shared lock If a transaction has a shared lock on a data item, it can read
the item but not update it.
Exclusive lock If a transaction has an exclusive lock on a data item, it can
both read and update the item.
Locks are used in the following way:
• Any transaction that needs to access a data item must first lock the
item, requesting a shared lock for read-only access or an exclusive
lock for both read and write access.
• If the item is not already locked by another transaction, the lock will
be granted.
• If the item is currently locked, the DBMS determines whether the
request is compatible with the existing lock. If a shared lock is
requested on an item that already has a shared lock on it, the request
will be granted; otherwise, the transaction must wait until the existing
lock is released.
A transaction continues to hold a lock until it explicitly releases it
either during execution or when it terminates (aborts or commits). It
is only when the exclusive lock has been released that the effects of
the write operation will be made visible to other transactions.
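Most SQL dialects acquire these locks implicitly, but a transaction can also request them explicitly; a small Oracle-flavoured sketch (the account table and column names are placeholders):

-- Exclusive row-level lock: the selected rows stay locked until COMMIT/ROLLBACK
SELECT balance FROM account WHERE acc_no = 'A1' FOR UPDATE;
UPDATE account SET balance = balance - 100 WHERE acc_no = 'A1';
COMMIT;                              -- releases the lock

-- Shared table-level lock: other transactions may read but not update the table
LOCK TABLE account IN SHARE MODE;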
166
A valid schedule that may be employed using the previous locking rules for
the above transaction is:
The problem in this example is that the schedule releases the locks that are
held by a transaction as soon as the associated read/write is executed and
that lock item (say balx) no longer needs to be accessed. However, the
transaction itself is locking other items (baly), after it releases its lock on
balx. Although this may seem to allow greater concurrency, it permits
transactions to interfere with one another, resulting in the loss of total
isolation and atomicity. To guarantee serializability, we must follow an
additional protocol concerning the positioning of the lock and unlock
operations in every transaction. The best-known protocol is two-phase
locking (2PL).
167
Two-phase locking (2PL)
According to the rules of this protocol, every transaction can be divided into
two phases: first a growing phase, in which it acquires all the locks needed
but cannot release any locks, and then a shrinking phase, in which it
releases its locks but cannot acquire any new locks. There is no requirement
that all locks be obtained simultaneously. Normally, the transaction
acquires some locks, does some processing, and goes on to acquire
additional locks as needed. However, it never releases any lock until it has
reached a stage where no new locks are needed. The rules are:
• A transaction must acquire a lock on an item before operating on the
item. The lock may be read or write, depending on the type of access
needed.
• Once the transaction releases a lock, it can never acquire any new
locks.
If upgrading of locks is allowed, upgrading can take place only during the
growing phase and may require that the transaction wait until another
transaction releases a shared lock on the item. Downgrading can take place
only during the shrinking phase.
The two phase locking prevents the problems that arise in concurrent access
of same data item.
168
Deadlocks :
169
queued until the lock can be granted.
170
A deadlock is also called a circular waiting condition where two transactions
are waiting (directly or indirectly) for each other. Thus in a deadlock, two
transactions are mutually excluded from accessing the next record required
to complete their transactions, also called a deadly embrace.
Example:
A deadlock exists between two transactions A and B in the following example:
Transaction A = access data items X and Y
Transaction B = access data items Y and X
Here, Transaction-A has acquired a lock on X and is waiting to acquire a lock
on Y, while Transaction-B has acquired a lock on Y and is waiting to acquire
a lock on X. Neither of them can execute further.
Transaction-A Time Transaction-B
--- t0 ---
Lock (X) (acquired lock on X) t1 ---
--- t2 Lock (Y) (acquired lock on Y)
Lock (Y) (request lock on Y) t3 ---
Wait t4 Lock (X) (request lock on X)
Wait t5 Wait
Wait t6 Wait
Wait t7 Wait
171
Time-Stamp Methods for Concurrency control:
1. Granule Timestamps :
Granule timestamp is a record of the timestamp of the last transaction to
access it. Each granule accessed by an active transaction must have a
granule timestamp.
A separate record of last Read and Write accesses may be kept. Granule
timestamps may cause additional Write operations for Read accesses if
they are stored with the granules. The problem can be avoided by
maintaining granule timestamps as an in-memory table. The table may be
of limited size, since conflicts may only occur between current
transactions. An entry in a granule timestamp table consists of the granule
identifier and the transaction timestamp. The record containing the largest
(latest) granule timestamp removed from the table is also maintained. A
search for a granule timestamp, using the granule identifier, will either be
successful or will use the largest removed timestamp.
2. Timestamp Ordering:
Following are the three basic variants of timestamp-based methods of
concurrency control:
Total timestamp ordering
Partial timestamp ordering
Multiversion timestamp ordering
(a) Total timestamp ordering :
The total timestamp ordering algorithm depends on maintaining access to
granules in timestamp order by aborting one of the transactions involved in
any conflicting access. No distinction is made between Read and Write
access, so only a single value is required for each granule timestamp .
(b) Partial timestamp ordering :
In a partial timestamp ordering, only non-permutable actions are ordered to
improve upon the total timestamp ordering. In this case, both Read and
Write granule timestamps are stored.
The algorithm allows the granule to be read by any transaction younger
than the last transaction that updated the granule. A transaction is aborted
if it tries to update a granule that has previously been accessed by a
younger transaction. The partial timestamp ordering algorithm aborts fewer
transactions than the total timestamp ordering algorithm, at the cost of
extra storage for granule timestamps
(c) Multiversion Timestamp ordering :
The multiversion timestamp ordering algorithm stores several versions of an
updated granule, allowing transactions to see a consistent set of versions
for all granules it accesses. So, it reduces the conflicts that result in
transaction restarts to those where there is a Write-Write conflict. Each
update of a granule creates a new version, with an associated granule
timestamp.
A transaction that requires read access to the granule sees the youngest
version that is older than the transaction. That is, the version having a
timestamp equal to or immediately below the transaction's timestamp.
3. Conflict Resolution in Timestamps :
To deal with conflicts in timestamp algorithms, some transactions
involved in conflicts are made to wait and to abort others.
Following are the main strategies of conflict resolution in timestamps:
WAIT-DIE:
The older transaction waits for the younger if the younger
has accessed the granule first.
The younger transaction is aborted (dies) and restarted if it tries
to access a granule after an older concurrent transaction.
WOUND-WAIT:
The older transaction pre-empts the younger by suspending
(wounding) it if the younger transaction tries to access a granule
after an older concurrent transaction.
An older transaction will wait for a younger one to commit if
the younger has accessed a granule that both want.
Optimistic techniques divide a transaction into three phases:
a. Read phase.
b. Validation or certification phase.
c. Write phase.
a. Read phase :
In a Read phase, the updates are prepared using private (or local) copies (or
versions) of the granule. In this phase, the transaction reads values of
committed data from the database, executes the needed computations, and
makes the updates to a private copy of the database values. All update
operations of the transaction are recorded in a temporary update file, which
is not accessed by the remaining transactions.
b. Validation or certification phase :
In the Validation phase, it is conventional to allocate a timestamp to each
transaction at the end of its Read phase to determine the set of transactions
that must be examined by the validation procedure. This set of transactions
consists of those that have finished their Read phases since the start of the
transaction being verified.
c. Write phase :
In a Write phase, the changes are permanently applied to the database and
the updated granules are made public. Otherwise, the updates are discarded
and the transaction is restarted. This phase is only for the Read-Write
transactions and not for Read-only transactions.
Advantages:
i. This technique is very efficient when conflicts are rare. The occasional
conflicts result in a transaction roll back.
ii. The rollback involves only the local copy of data; the database is not
involved and thus there will not be any cascading rollbacks.
Disadvantages:
i. Only suitable for environments where there are few conflicts and no
long transactions.
ii. Acceptable for mostly Read or Query database systems that require
very few update transactions.
176
The size of the data items chosen as the unit of protection by the concurrency
control program is called GRANULARITY. Locking can take place at the
following levels:
Database level.
Table level.
Page level.
Row (Tuple) level.
Attributes (fields) level.
i. Database level Locking :
At database level locking, the entire database is locked. Thus, it prevents
the use of any tables in the database by transaction T2 while transaction T1
is being executed. Database level of locking is suitable for batch processes.
Being very slow, it is unsuitable for on-line multi-user DBMSs.
177
Database Recovery
When a transaction enters the system and starts execution, it writes a log
record about it −
<Tn, Start>
When the transaction modifies an item X, it writes a log record as follows −
<Tn, X, V1, V2>
This records that Tn has changed the value of X from V1 to V2. When the
transaction finishes, it logs −
<Tn, commit>
The database can be modified using two approaches −
Deferred database modification − All logs are written on to the
stable storage and the database is updated when a transaction
commits.
Immediate database modification − Each log follows an actual
database modification. That is, the database is modified immediately
after every operation.
Recovery with Concurrent Transactions
When more than one transaction are being executed in parallel, the logs are
interleaved. At the time of recovery, it would become hard for the recovery
system to backtrack all logs, and then start recovering. To ease this
situation, most modern DBMS use the concept of 'checkpoints'.
Checkpoint
Keeping and maintaining logs in real time and in real environment may fill
out all the memory space available in the system. As time passes, the log
file may grow too big to be handled at all. Checkpoint is a mechanism
where all the previous logs are removed from the system and stored
permanently in a storage disk. Checkpoint declares a point before which
the DBMS was in consistent state, and all the transactions were committed.
Recovery
When a system with concurrent transactions crashes and recovers, it
behaves in the following manner −
The recovery system reads the logs backwards from the end to the
last checkpoint.
It maintains two lists, an undo-list and a redo-list.
179
If the recovery system sees a log with <Tn, Start> and <Tn, Commit> or
just <Tn, Commit>, it puts the transaction in the redo-list.
If the recovery system sees a log with <Tn, Start> but no commit or
abort log is found, it puts the transaction in the undo-list.
All the transactions in the undo-list are then undone and their logs are
removed. All the transactions in the redo-list and their previous logs are
removed and then redone before saving their logs.
180
• Run an alternative subtransaction, called a contingency
subtransaction. In our example, if the hotel reservation at the Hilton
fails, an alternative booking may be possible at another hotel, for
example, the Sheraton.
• Abort.
181
Database Security
1. Database Security:
Information is very critical asset. Organizations create so much information
and they use database systems to handle the information within them to
automate various functions. Due to information importance, information
protection is a critical component of the database management system.
Information security is the goal of a database management system (DBMS),
also called database security.
2. Importance of Database Security
In this information technology age, it is compulsory for all types of
institutions and companies to make their information assets available
online through databases. However, they must have a policy that
divides users into levels and defines to what extent each level can
access the information. It is vital not to give opportunities to
mischievous intruders.
Databases are used to provide personnel information, customer
information, credit card numbers, financial data and business
transactions, etc. The information is very sensitive and highly
confidential and must be prevented from disclosure by other
competitors and unauthorized persons.
It is important to define who can access what data, who is allowed and
who is restricted, whether passwords are used and how to maintain it,
what sort of firewalls and anti-malware solutions to use, how to train
the staff and to enforce data security.
Furthermore, the backup continuity plan should be laid out so that
even though the systems fail, the business can be carried out without
delay.
While constructing the infrastructure security of a company, database
security should be well considered.
Databases are crucial to most enterprises these days; damage to a
database has a tragic impact on the organization. Unsecured systems
hurt both the company itself and its clients.
3. Database Security Threats:
Database security begins with physical security for the systems that host
the database management system (DBMS). Database Management system is
not safe from intrusion, corruption, or destruction by people who have
physical access to the computers. Once physical security has been
established, the database must be protected from unauthorized access by
authorized users as well as unauthorized users. There are three main
objectives when designing a secure database system, and anything that
prevents a database management system from achieving these goals would
be considered a threat to database security. There are many internal and
external threats to database systems. Some of the threats are as follows:
182
3.1 Integrity:
Database integrity refers that information be protected from improper
modification. Modification includes creation, insertion, modification,
changing the status of data, and deletion. Integrity is lost if unauthorized
changes are made intentionally or through accidental acts. For example,
Students cannot be allowed to modify their grades.
3.2 Availability:
Authorized user or program should not be denied access. For example, an
instructor who wishes to change a student grade should be allowed to do so.
3.3 Secrecy:
Data should not be disclosed to unauthorized users. For example, a student
should not be allowed to see and change other student grades.
3.4 Denial of service attack:
This attack makes a database server greatly slower or even not available to
user at all. DoS attack does not result in the disclosure or loss of the
database information; it can cost the victims much time and money.
3.5 Sniff attack:
To accommodate the e-commerce and advantage of distributed systems,
database is designed in a client-server mode. Attackers can use sniffer
software to monitor data streams, and acquire some confidential
information. For example, the credit card number of a customer.
3.6 Spoofing attack:
Attackers forge a legal web application to access the database, and then
retrieve data from the database and use it for bad transactions. The most
common spoofing attacks are TCP used to get the IP addresses and DNS
spoofing used to get the mapping between IP address and DNS name.
3.7 Trojan Horse:
It is a malicious program that embeds into the system. It can modify the
database and reside in operating system.
To achieve these objectives, a clear and consistent security policy should be
developed to define what security measure must be enforced. We must
determine what part of data is to be protected and which users get access to
which part of the information. The security mechanisms of the underlying
database management system, as well as external mechanism, such as
securing access to buildings, must be utilized to enforce the policy.
4. Database Security Countermeasures:
To protect the database system from the above mentioned threats. Here are
some countermeasures which are as follows:
4.1 Access Control:
A database for an organization contains a great deal of information and
usually has several users. Most of them need to access only a small part of
the database. A policy defines the requirements that are to be implemented
within hardware and software and those that are external to the system,
including physical, personal, and procedural controls.
4.2 Flow Control:
Flow control provides the flow of information among accessible objects. Flow
controls check that information contained in objects does not flow explicitly
or implicitly into less protected objects.
4.3 Encryption:
An encryption algorithm should be applied to the data, using a user-
specified encryption key. The output of the algorithm is the encrypted
version. There is also a decryption algorithm, which takes the encrypted
data and a decryption key as input and then returns the original data.
4.4 RAID:
Redundant Array of Independent Disks, which protects against data loss due
to disk failure.
4.5 Authentication:
Access to the database is a matter of authentication. It provides the
guidelines how the database is accessed. Every access should be monitored.
4.6 Backup:
At every instant, backup should be done. In case of any disaster,
Organizations can retrieve their data.
184
Database Management Systems Lab
Consider the relational schema for part of the DreamHome case
study is:
Branch ( branchNo , street, city, postcode)
Staff ( staffNo , fName, lName, position, sex, DOB, salary, branchNo)
PropertyForRent ( propertyNo , street, city, postcode, type, rooms,
rent, ownerNo, staffNo, branchNo)
Client ( clientNo , fName, lName, telNo, prefType, maxRent, eMail)
PrivateOwner ( ownerNo , fName, lName, address, telNo , eMail,
password)
Viewing ( clientNo , propertyNo , viewDate, comment)
Registration ( clientNo , branchNo , staffNo, dateJoined)
Create a database with name “DreamHome” and now create all the tables
listed above with constraints.
185
List full details of all staff.
SELECT * FROM Staff
List all staff with a salary greater than £10000.
SELECT staffNo, fName, lName, salary FROM Staff WHERE salary > 10000
;
List the property numbers of all properties that have been
viewed.
SELECT DISTINCT propertyNo FROM Viewing ;
Produce a list of salaries for all staff, showing only the
staffNo, fName, IName, and salary details.
186
SELECT staffNo, fName AS FirstName, lName AS LastName, Salary FROM
Staff ;
List all cities where there is either a branch office or a
property for rent.
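No SQL is shown for this query in the notes; one way to write it against the schema above (a sketch) is:

SELECT city FROM Branch
UNION
SELECT city FROM PropertyForRent;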
List the names and comments of all clients who have viewed a
property for rent.
SELECT C.clientNo, c.fName, c.lName, v.propertyNo, v.comment
FROM Client C, Viewing V
WHERE C.clientNo = V.clientNo ;
189
Find all staff whose salary is larger than the salary of at
least one member of staff at branch B003.
SELECT staffNo, fName, lName, position, salary
FROM Staff
WHERE salary > SOME(SELECT salary
FROM Staff
WHERE branchNo = 'B003');
Find all staff whose salary is larger than the salary of every
member of staff at branch B003
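The answer for this query is missing from the notes; a sketch using an ALL subquery, in the same style as the previous query, is:

SELECT staffNo, fName, lName, position, salary
FROM Staff
WHERE salary > ALL (SELECT salary
                    FROM Staff
                    WHERE branchNo = 'B003');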
190
GROUP BY S.branchNo, S.staffNo
191
ORDER BY S.branchNo, S.staffNo ;
List all branch offices and any properties that are in the
same city.
SELECT B.branchNo, B.city AS branchCity, P.propertyNo, P.city AS
propertyCity
FROM Branch B LEFT OUTER JOIN PropertyForRent P ON B.city = P.city ;
List all properties and any branch offices that are in the
same city.
SELECT B.branchNo, B.city AS branchCity, P.propertyNo, P.city AS
propertyCity
FROM Branch B RIGHT OUTER JOIN PropertyForRent P ON B.city = P.city
;
List the branch offices and properties that are in the same
city along with any unmatched branches or properties.
SELECT B.branchNo, B.city AS branchCity, P.propertyNo, P.city AS
propertyCity
FROM Branch B FULL JOIN PropertyForRent P ON B.city = P.city ;
Find all staff who work in a London branch office.
SELECT staffNo, fName, lName, position
FROM Staff S
WHERE EXISTS (SELECT *
FROM Branch B
WHERE S.branchNo = B.branchNo AND city = 'London');
Construct a list of all cities where there is either a branch
office or a property.
193
insert into guest values(10001, 'John Kay', '56 High St,
London');
insert into guest values(10002, 'Mike Ritchie', '18 Tain St,
London');
insert into guest values(10003, 'Mary Tregear', '5 Tarbot Rd,
Aberdeen');
insert into guest values(10004, 'Joe Keogh', '2 Fergus Dr,
Aberdeen');
insert into guest values(10005, 'Carol Farrel', '6 Achray St,
Glasgow');
insert into guest values(10006, 'Tina Murphy', '63 Well St,
Glasgow');
insert into guest values(10007, 'Tony Shaw', '12 Park Pl,
Glasgow');
194
SELECT * FROM hotel;
4. List all double or family rooms with a price below £40.00 per night,
in
ascending order of price.
SELECT * FROM room
WHERE price < 40 AND type IN ('Double', 'Family')
ORDER BY price;
5. List the bookings for which no date_To has been specified.
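The answer is not shown in the notes; a sketch (assuming the booking table has a dateto column, as in the later queries) is:

SELECT * FROM booking
WHERE dateto IS NULL;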
195
hotelno = (SELECT hotelno FROM hotel
WHERE hotelname = 'Grosvenor'));
12. List the details of all rooms at the Grosvenor Hotel, including the
name of
the guest staying in the room
SELECT r.*, XXX.guestname FROM room r LEFT JOIN
(SELECT g.guestname, h.hotelno, b.roomno FROM Guest g,
Booking b, Hotel h
WHERE g.guestno = b.guestno AND b.hotelno = h.hotelno AND
h.hotelname= 'Grosvenor' AND
b.datefrom <= CURRENT_DATE AND b.dateto >= CURRENT_DATE) AS
XXX
ON r.hotelno = XXX.hotelno AND r.roomno = XXX.roomno;
13. What is the total income from bookings for the Grosvenor Hotel
today?
SELECT SUM(price) FROM booking b, room r, hotel h
WHERE (b.datefrom <= CURRENT_DATE AND
b.dateto >= CURRENT_DATE) AND
r.hotelno = h.hotelno AND b.hotelno = r.hotelno AND
r.roomno = b.roomno AND h.hotelname = 'Grosvenor';
14. List the rooms that are currently unoccupied at the Grosvenor
Hotel.
SELECT * FROM room r
WHERE roomno NOT IN
(SELECT roomno FROM booking b, hotel h
WHERE (datefrom <= CURRENT_DATE AND
dateto >= CURRENT_DATE) AND
b.hotelno = h.hotelno AND hotelname = 'Grosvenor');
15. What is the lost income from unoccupied rooms at the Grosvenor
Hotel?
SELECT SUM(price) FROM room r
WHERE roomno NOT IN
(SELECT roomno FROM booking b, hotel h
WHERE (datefrom <= CURRENT_DATE AND
dateto >= CURRENT_DATE) AND
b.hotelno = h.hotelno AND hotelname = 'Grosvenor');
16. List the number of rooms in each hotel.
SELECT hotelno, COUNT(roomno) AS count FROM room
GROUP BY hotelno;
17. List the number of rooms in each hotel in London.
SELECT hotel.hotelno, COUNT(roomno)
AS count FROM hotel, room
196
WHERE room.hotelno = hotel.hotelno
AND city LIKE '%London%'
GROUP BY hotelno;
18. What is the average number of bookings for each hotel in August?
SELECT AVG(X) AS AveNumBook FROM
(SELECT hotelno, COUNT(hotelno) AS X
FROM booking b
WHERE (b.datefrom >= DATE'2004-08-01' AND b.datefrom <=
DATE'2004-08-31')
GROUP BY hotelno);
19. What is the most commonly booked room type for each hotel in
London?
SELECT MAX(X) AS MostlyBook
FROM (SELECT type, COUNT(type) AS X
FROM booking b, hotel h, room r
WHERE r.roomno = b.roomno AND b.hotelno = h.hotelno AND
h.city LIKE '%London%'
GROUP BY type);
20. What is the lost income from unoccupied rooms at each hotel
today?
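The SQL for this one is missing; a per-hotel variant of the previous query (a sketch) is:

SELECT hotelno, SUM(price) AS lostincome
FROM room r
WHERE roomno NOT IN
      (SELECT roomno FROM booking b
       WHERE b.hotelno = r.hotelno AND
             datefrom <= CURRENT_DATE AND dateto >= CURRENT_DATE)
GROUP BY hotelno;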
197
CREATE TABLE sailors ( sid integer not null,
sname varchar(32),
rating integer,
age real,
CONSTRAINT PK_sailors PRIMARY KEY (sid) );
SELECT S.sid
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color =
'red';
5. Find the names of sailors who have reserved a red boat.
SELECT S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color =
'red';
7. Find the names of sailors who have reserved at least one boat.
SELECT sname
FROM Sailors S, Reserves R
WHERE S.sid = R.sid
8. Find the names of sailors who have reserved at least two boats.
SELECT sname
FROM sailors s, Reserves r1, reserves r2
WHERE s.sid=r1.sid AND s.sid=r2.sid AND
r1.bid<>r2.bid
9. Compute increments for the ratings of persons who have sailed two
different boats on the same day.
SELECT S.sname, S.rating+1 AS rating
FROM Sailors S, Reserves R1, Reserves R2
WHERE S.sid = R1.sid AND S.sid = R2.sid
AND R1.day = R2.day AND R1.bid <> R2.bid;
10. Find the ages of sailors whose name begins and ends with B and
has at least three characters.
SELECT S.age
FROM Sailors S
WHERE S.sname LIKE 'B_%B';
11. Find the names of sailors who have reserved a red or a green boat.
SELECT S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid
      AND (B.color = 'red' OR B.color = 'green');
12. Find the names of sailors who have reserved a red and a green
boat.
SELECT S.sname
FROM Sailors S, Reserves R1, Boats B1, Reserves R2, Boats B2
WHERE S.sid = R1.sid AND R1.bid = B1.bid AND B1.color = 'red'
      AND S.sid = R2.sid AND R2.bid = B2.bid AND B2.color = 'green';
13. Find the sids of all sailors who have reserved red boats but not
green boats.
SELECT S.sid
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
EXCEPT
SELECT S2.sid
FROM Sailors S2, Reserves R2, Boats B2
WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color = 'green';
14. Find all sids of sailors who have a rating of 10 or have reserved
boat 104.
(SELECT sid FROM Sailors
WHERE rating = 10)
UNION
(SELECT sid
FROM Reserves
WHERE bid = 104);
15. Find the names of sailors who have not reserved a red boat.
SELECT S.sname
FROM Sailors S
WHERE S.sid NOT IN (SELECT R.sid
                    FROM Reserves R, Boats B
                    WHERE R.bid = B.bid AND B.color = 'red');
16. Find sailors whose rating is better than some sailor called Horatio.
SELECT S.sid FROM Sailors S
WHERE S.rating > ANY (SELECT S2.rating FROM Sailors S2
                      WHERE S2.sname = 'Horatio');
17. Find the names of sailors who have reserved all boats.
200
SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS ( ( SELECT B.bid
FROM Boats B)
EXCEPT
( SELECT R.bid
FROM Reserves R
WHERE R.sid = S.sid ) )
18. Find the names of sailors who have reserved at least two boats.
201
WHERE S.sid = R.sid AND s.age > 20 AND R.bid = B.bid AND
B.color <> 'red';
25. Find the name and age of the oldest sailor. (SELECT S.sname, MAX(S.age)
is illegal, because sname is neither aggregated nor grouped, so a subquery is
used instead.)
SELECT S.sname, S.age
FROM Sailors S
WHERE S.age = (SELECT MAX(S2.age) FROM Sailors S2);
26. Count the number of different sailor names.
SELECT COUNT (DISTINCT S.sname)
FROM Sailors S;
27. Find the names of sailors who are older than the oldest sailor with a
rating
of 10.
SELECT S.sname
FROM Sailors S
WHERE S.age > (SELECT MAX(S2.age)
FROM Sailors S2
WHERE S2.rating = 10);
29. Find the age of the youngest sailor for each rating level.
SELECT S.rating, MIN(S.age)
FROM Sailors S
GROUP BY S.rating;
30. Find age of the youngest sailor who is eligible to vote for each rating
level with at least 2 such sailors.
SELECT S.rating, MIN(S.age) AS minage
FROM Sailors S
WHERE S.age >=18
GROUP BY S.rating
HAVING COUNT(*) > 1;
202
31. Find the average age of sailors for each rating level that has at least
two sailors.
SELECT S.rating, AVG(S.age) AS average
FROM Sailors S
GROUP BY S.rating
HAVING COUNT(*) > 1;
32. For each red boat, find the number of reservations for this boat.
SELECT B.bid, COUNT(*) AS sailorcount
FROM Boats B, Reserves R
WHERE R.bid = B.bid AND B.color = 'red'
GROUP BY B.bid;
33. Find the average age of sailors who are of voting age (i.e., at least
18 years old) for each rating level that has at least two sailors.
SELECT S.rating, AVG(S.age) AS average
FROM Sailors S
WHERE S.age >= 18
GROUP BY S.rating
HAVING 1<(SELECT COUNT(*)
FROM Sailors S2
WHERE S.rating=S2.rating AND S2.age>=18);
34. Delete the records of sailors who have rating 8 (deleting some rows
in a table).
Delete from Sailors where rating =8;
203