Introduction Chapter

Uploaded by

Musiclover Huu
Copyright
© © All Rights Reserved

Introduction to Database Systems

Chapter 1
1.1 Introduction

An organization must have accurate and reliable data (information) for effective decision
making. Data (information) is the backbone and most critical resource of an organization that
enables managers and organizations to gain a competitive edge. In this age of information
explosion, where people are bombarded with data, getting the right information, in the right
amount, at the right time is not an easy task. So, only those organizations will survive that
successfully manage information.
A database system simplifies the tasks of managing the data and extracting useful
information in a timely fashion. A database is an integrated collection of related files,
along with the details of the interpretation of the data. A Database Management System
(DBMS) is a software system or program that allows access to data contained in a database.
The objective of the DBMS is to provide a convenient and effective method of defining,
storing, and retrieving the information stored in the database.
Databases and database management systems have become essential for managing
businesses, governments, schools, universities, banks, etc.

1.2 Basic Definitions and Concepts

In an organization, data is the most basic resource. To run the organization efficiently,
the proper organization and management of data is essential. The major terms used in
databases and database systems are formally defined in this section.

1.2.1 Data
The term data may be defined as known facts that can be recorded and stored on computer
media. It is also defined as raw facts from which the required information is produced.

Introduction to Database Management System
1.2.2 Information
Data and information are closely related and are often used interchangeably. Information is
nothing but refined data. In other words, information is processed, organized or
summarized data. According to Burch et al., “Information is data that have been put into
a meaningful and useful context and communicated to a recipient who uses it to make
decisions”. Information consists of data, images, text, documents and voice, but always in
a meaningful context. So we can say that information is something more than mere data.
Data are processed to create information. The recipient receives the information and then
makes a decision and takes an action, which may trigger other actions.

[Figure: Data (input) -> Processing -> Information (output) -> Decision (user)]
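
The flow from data to information to decision can be illustrated with a minimal Python sketch. The sales figures, the region names and the function name summarize are hypothetical and chosen only for illustration; the point is that the raw records become useful to a decision maker only after they have been processed.

```python
from collections import defaultdict

# Raw data: individual sales facts (hypothetical sample values).
raw_sales = [
    ("north", 120.0),
    ("south", 80.0),
    ("north", 45.5),
    ("south", 20.0),
]

def summarize(sales):
    """Process raw records into per-region totals (the 'information')."""
    totals = defaultdict(float)
    for region, amount in sales:
        totals[region] += amount
    return dict(totals)

# Processing step: data in, information out.
information = summarize(raw_sales)

# Decision step: the recipient picks the region with the highest total.
best_region = max(information, key=information.get)

print(information)
print(best_region)
```

The same raw records could be summarized differently (by month, by product) for a different recipient, which is why relevancy depends on who uses the information.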

These days, there is no lack of data, but there is a lack of quality information. Quality
information is information that is accurate, timely and relevant; these are the
three key attributes of information.
1. Accuracy : It means that the information is free from errors, and it clearly and
accurately reflects the meaning of the data on which it is based. It also means it is free
from bias and conveys an accurate picture to the recipient.
2. Timeliness : It means that the recipients receive the information when they need it
and within the required time frame.
3. Relevancy : It means the usefulness of the piece of information for the corresponding
persons. It is a very subjective matter. Some information that is relevant for one
person might not be relevant for another and vice versa, e.g., the price of a printer is
irrelevant for a person who wants to purchase a computer.
So, organizations that have a good information system, which produces information that is
accurate, timely and relevant, will survive, and those that do not realize the importance of
information will soon be out of business.

1.2.3 Meta Data


Meta data is data about data. It describes objects in the database and makes it
easier for those objects to be accessed or manipulated. The meta data describes the
database structure, sizes of data types, constraints, applications, authorisation etc., and is
used as an integral tool for information resource management. There are three main types
of meta data :
1. Descriptive meta data : It describes a resource for purposes such as discovery and
identification. In traditional library cataloguing, which is a form of meta data, the title,
abstract, author and keywords are examples of descriptive meta data.
2. Structural meta data : It describes how compound objects are put together. An
example is how pages are ordered to form chapters.
3. Administrative meta data : It provides information to help manage a resource, such
as when and how it was created, file type and other technical information, and who
can access it. There are several subsets of administrative meta data.
1.2.4 Data Dictionary
The data dictionary contains information about the data stored in the database and is consulted
by the DBMS before any manipulation operation on the database. It is an integral part of
the database management system and stores meta data, i.e., information about the database,
attribute names and definitions for each table in the database. It helps the DBA in the
management of the database, user view definitions as well as their use.
Data dictionary is generated for each database and generally stores and manages the
following types of information :
1. The complete information about physical database design e.g. storage structures, access
paths and file sizes etc.
2. The information about the database users, their responsibilities and access rights of
each user.
3. The complete information about the schema of the database.
4. The high level descriptions of the database transactions, applications and the
information about the relationships of users to the transactions.
5. The information about the relationship between the data items referenced by the
database transactions. This information is helpful in determining which transactions
are affected when some data definitions are modified.
Data dictionaries are of two types : active data dictionary and passive data dictionary.
1. Active Data Dictionary : It is managed automatically by the database management
system (DBMS) and is always consistent with the current structure and definition
of the database. Most RDBMSs maintain active data dictionaries.
2. Passive Data Dictionary : It is used only for documentation purposes, and the data
about fields, files and people are maintained in the dictionary for cross references.
It is generally managed by the users of the system and is modified whenever the
structure of the database is changed. The passive dictionary may not be consistent
with the structure of the database, since modifications are performed manually
by the user. Because passive dictionaries are maintained by the users, they may also
contain information about organisational data that is not computerized.
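
As an illustration of an active data dictionary, the following minimal sketch uses Python's standard sqlite3 module. SQLite is used here only as a convenient stand-in for a DBMS and is not discussed in this chapter; the subscriber table and the helper names list_tables and describe are hypothetical. SQLite keeps its own catalog (the sqlite_master table and the PRAGMA table_info command), and the sketch shows that this catalog follows a structural change without any manual update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscriber ("
    " phone TEXT PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " address TEXT)"
)

def list_tables():
    # sqlite_master is SQLite's built-in catalog of schema objects.
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [name for (name,) in rows]

def describe(table):
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
    # for each column; keep only the name and declared type.
    # The table name is interpolated directly, so pass only trusted names.
    rows = conn.execute(f"PRAGMA table_info({table})")
    return [(row[1], row[2]) for row in rows]

tables = list_tables()
before = describe("subscriber")

# Changing the structure updates the catalog with no manual bookkeeping.
conn.execute("ALTER TABLE subscriber ADD COLUMN city TEXT")
after = describe("subscriber")

print(tables)
print(before)
print(after)
```

A passive dictionary, by contrast, would be a separate document or file that someone has to remember to edit after the ALTER TABLE statement.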

1.2.5 Database
A database is a collection of interrelated data stored together with controlled redundancy
to serve one or more applications in an optimal way. The data are stored in such a way
that they are independent of the programs used by the people for accessing the data. The
approach used in adding new data and in modifying and retrieving the existing data from the
database is a common and controlled one.
It is also defined as a collection of logically related data stored together that is designed
to meet the information requirements of an organization. We can also define it as an electronic
filing system.
An example of a database is a telephone directory that contains the names, addresses and
telephone numbers of people, stored in computer storage.
Databases are organized by fields, records and files. These are described briefly as
follows :
1.2.5.1 Fields
A field is the smallest unit of data that has meaning to its users and is also called a data
item or data element. Name, Address and Telephone number are examples of fields. Each
field is represented in the database by a value.
1.2.5.2 Records
A record is a collection of logically related fields, where each field has a fixed
number of bytes and a fixed data type. Alternatively, we can say a record is one complete
set of fields in which each field has some value. The complete information about a particular
phone number in the database represents a record. Records are of two types : fixed length
records and variable length records.
1.2.5.3 Files
A file is a collection of related records. Generally, all the records in a file are of the same
size and record type, but this is not always true. The records in a file may be of fixed length
or variable length depending upon the size of the records contained in the file. The telephone
directory containing records about the different telephone holders is an example of a file. More
detail is available in chapter 3.
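
The relationship between fields, fixed length records and files can be sketched in Python with the standard struct module. The field widths (20, 30 and 12 bytes), the sample subscribers and the function names are assumptions made for this illustration only.

```python
import struct

# Hypothetical fixed-length layout: three fields of 20, 30 and 12 bytes.
RECORD = struct.Struct("20s30s12s")

def pack_record(name, address, phone):
    # Each field occupies a fixed number of bytes; struct pads short
    # values with NUL bytes and truncates values that are too long.
    return RECORD.pack(name.encode(), address.encode(), phone.encode())

def unpack_record(raw):
    # Strip the NUL padding to recover the field values.
    return tuple(f.rstrip(b"\x00").decode() for f in RECORD.unpack(raw))

# A "file" is a sequence of records stored one after another.
file_bytes = b"".join([
    pack_record("Asha Rao", "12 Lake Road", "555-0101"),
    pack_record("Ben Cole", "7 Hill Street", "555-0102"),
])

def read_record(data, index):
    # Because every record has the same size, record i starts at i * size.
    start = index * RECORD.size
    return unpack_record(data[start:start + RECORD.size])

print(RECORD.size)
print(read_record(file_bytes, 1))
```

With variable length records this direct offset calculation is no longer possible, and the file needs length prefixes or delimiters to find where each record starts.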

1.2.6 Components of a Database


A Database consists of four components as shown in Figure 1.1.

[Figure: a physical database made up of data items, relationships, constraints and schema]

Figure 1.1. Components of Database.

1. Data item : It is defined as a distinct piece of information and is explained in the
previous section.
2. Relationships : It represents a correspondence between various data elements.
3. Constraints : These are the predicates that define correct database states.
4. Schema : It describes the organization of data and relationships within the database.
The schema consists of definitions of the various types of record in the database,
the data-items they contain and the sets into which they are grouped. The storage
structure of the database is described by the storage schema. The conceptual schema
defines the stored data structure. The external schema defines a view of the database
for particular users.
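
The four components can be seen together in a small sketch using Python's standard sqlite3 module (SQLite is an assumed stand-in DBMS; the department and employee tables and the helper try_insert are hypothetical). The column definitions are the data items, the REFERENCES clause expresses a relationship, NOT NULL and CHECK are constraints, and the CREATE TABLE statements as a whole form the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    salary  REAL CHECK (salary >= 0),
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
""")
conn.execute("INSERT INTO department VALUES (1, 'Payroll')")

def try_insert(sql):
    """Run one INSERT and report whether the constraints allowed it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

valid = try_insert("INSERT INTO employee VALUES (1, 'Asha', 5000, 1)")
negative_salary = try_insert("INSERT INTO employee VALUES (2, 'Ben', -10, 1)")
missing_department = try_insert("INSERT INTO employee VALUES (3, 'Cy', 100, 99)")

print(valid, negative_salary, missing_department)
```

Only the first row describes a correct database state; the other two are refused by the CHECK constraint and by the relationship to the department table respectively.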

1.2.7 Database Management System (DBMS)


DBMS is a program or group of programs that work in conjunction with the operating
system to create, process, store, retrieve, control and manage the data. It acts as an interface
between the application program and the data stored in the database.
Alternatively, it can be defined as a computerized record-keeping system that stores
information and allows the users to add, delete, modify, retrieve and update that information.
The DBMS performs the following five primary functions :
1. Define, create and organise a database : The DBMS establishes the logical relationships
among different data elements in a database and also defines schemas and subschemas
using the DDL.
2. Input data : It performs the function of entering the data into the database through
an input device (such as a data screen or a voice-activated system) with the help of the
user.
3. Process data : It performs the function of manipulation and processing of the data
stored in the database using the DML.
4. Maintain data integrity and security : It restricts access to the database to
authorised users in order to maintain data integrity and security.
5. Query database : It provides decision makers with the information they need to
make important decisions. This information is provided by querying the database
using SQL.
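
Several of these functions can be traced in a short sketch with Python's standard sqlite3 module. SQLite and the product table are assumptions made for illustration. Function 4 is not shown, because SQLite has no user accounts or access rights; a multi-user DBMS would handle that part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 1. Define: a DDL statement creates the structure.
conn.execute(
    "CREATE TABLE product ("
    " id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL)"
)

# 2. Input: DML inserts rows; the ? placeholders keep data out of the SQL text.
conn.executemany(
    "INSERT INTO product (name, price) VALUES (?, ?)",
    [("printer", 150.0), ("computer", 900.0), ("mouse", 20.0)],
)

# 3. Process: DML updates and deletes stored rows.
conn.execute("UPDATE product SET price = price * 2 WHERE name = ?", ("mouse",))
conn.execute("DELETE FROM product WHERE name = ?", ("printer",))
conn.commit()

# 5. Query: a SELECT statement retrieves information for the user.
rows = conn.execute(
    "SELECT name, price FROM product ORDER BY price").fetchall()

print(rows)
```

The application never opens or parses a data file itself; every step goes through statements handed to the DBMS.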

1.2.8 Components of DBMS


A DBMS has three main components: the Data Definition Language (DDL), the Data
Manipulation Language and query facilities (DML/SQL), and software for controlled access
of the database. These are shown in Figure 1.2 and are defined as follows :
[Figure: users and application programs access the database system through the DBMS
components: the Data Definition Language (DDL), software to process queries and programs
(DML/SQL), and software for controlled access of stored data. Below the DBMS are the
meta data and the physical database.]

Figure 1.2. Components of DBMS.


1.2.8.1 Data Definition Language (DDL)
It allows the users to define the database, specify the data types, data structures and
the constraints on the data to be stored in the database. More about DDL in section 1.5.
1.2.8.2 Data Manipulation Language (DML) and Query Language
DML allows users to insert, update, delete and retrieve data from the database. SQL
provides general query facility. More about DML and SQL in section 1.5.
1.2.8.3 Software for Controlled Access of Database
This software provides controlled access to the database by the users,
concurrency control to allow shared access to the database, and a recovery control system
to restore the database in case of hardware or software failure.

The DBMS software together with the database is called a Database System.
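
One facet of recovery control can be sketched with transactions in Python's standard sqlite3 module (SQLite, the account table and the transfer function are assumptions for illustration). If a failure occurs partway through a group of updates, the DBMS undoes the partial work so that the database returns to its previous consistent state. Recovery from real hardware crashes relies on additional mechanisms that this sketch does not cover.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()

def transfer(src, dst, amount):
    """Move money between accounts as one all-or-nothing unit."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if it raises an exception.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(1, 2, 30.0)        # both updates are kept
failed = transfer(1, 2, 500.0)   # second update violates CHECK; both are undone

balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 had already been applied when the error occurred, yet it does not survive, because the whole transaction is rolled back.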

1.3 Traditional File System Versus DataBase Systems

Conventionally, data were stored and processed using traditional file processing systems.
In these traditional file systems, each file is independent of the other files, and data in different
files can be integrated only by writing an individual program for each application. The data
and the application programs that use the data are so arranged that any change to the data
requires modifying all the programs that use the data. This is because each program is hard-coded
with specific information about the file, such as data types and data sizes. Sometimes it is not even
possible to identify all the programs using that data, and they have to be found on a trial-and-error basis.
A file processing system of an organization is shown in Figure 1.3. Each functional area in
the organization creates, processes and disseminates its own files. Applications such as inventory
and payroll generate separate files and do not communicate with each other.

[Figure: four functional areas (Marketing, Manufacturing, Inventory, Payroll), each with
its own application program and its own separate set of files]

FIGURE 1.3. Traditional file system.

No doubt such an arrangement was simple to operate and had better local control, but
the data of the organization is dispersed throughout the functional sub-systems. These days,
databases are preferred because of the many disadvantages of traditional file systems.
1.3.1 Disadvantages of Traditional File System
A traditional file system has the following disadvantages:
1. Data Redundancy : Since each application has its own data file, the same data may
have to be recorded and stored in many files. For example, the personnel file and the payroll
file both contain data on employee name, designation etc. The result is unnecessary
duplicate or redundant data items. This redundancy requires additional
storage space, costs extra time and money, and requires additional effort to keep
all files up to date.
2. Data Inconsistency : Data redundancy leads to data inconsistency, especially when
data is to be updated. Data inconsistency occurs when the same data items that
appear in more than one file are not updated simultaneously in each and every
file. For example, when an employee is promoted from Clerk to Superintendent, the
change may be made immediately in the payroll file but not necessarily
in the provident fund file. This results in two different designations for an employee at
the same time. Over a period of time, such discrepancies degrade the quality of the
information contained in the data files, which affects the accuracy of reports.
3. Lack of Data Integration : Since the data files are independent, users face difficulty in
getting information on any ad hoc query that requires accessing the data stored in
many files. In such a case, complicated programs have to be developed to retrieve
data from every file, or the users have to collect the required information manually.
4. Program Dependence : The reports produced by the file processing system are
program dependent, which means that if any change in the format or structure of data
and records in the file is to be made, the programs have to be modified correspondingly.
Also, a new program will have to be developed to produce a new report.
5. Data Dependence : The applications/programs in a file processing system are data
dependent, i.e., the file organization, its physical location and retrieval from the storage
media are dictated by the requirements of the particular application. For example, in a
payroll application, the file may be organised on employee records sorted on their
last name, which implies that any employee's record has to be accessed through
the last name only.
6. Limited Data Sharing : There are limited data sharing possibilities with the traditional
file system. Each application has its own private files and users have little opportunity
to share the data outside their own applications. Complex programs have to be
written to obtain data from several incompatible files.
7. Poor Data Control : There is no centralised control at the data element level,
hence a traditional file system is decentralised in nature. It is possible that
a data field has multiple names defined by the different departments of an
organization, depending on the file it is in. This situation leads to different
meanings for a data field in different contexts, or the same meaning for different fields.
This causes poor data control.
8. Problem of Security : It is very difficult to enforce security checks and access rights
in a traditional file system, since application programs are added in an ad hoc manner.
9. Data Manipulation Capability is Inadequate : The data manipulation capability is
very limited in traditional file systems since they do not provide strong relationships
between data in different files.
10. Needs Excessive Programming : An excessive programming effort is needed to
develop a new application program due to the very high interdependence between
program and data in a file system. Each new application requires that the developers
start from scratch by designing new file formats and descriptions and then writing
the file access logic for each new file.
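
The redundancy and inconsistency problems described in this list can be reproduced in a few lines of Python. The two dictionaries below stand in for the separate payroll and provident fund files; the employee data and function names are hypothetical.

```python
# Each application keeps its own private copy of the employee data.
payroll_file = {"E01": {"name": "R. Singh", "designation": "Clerk"}}
provident_fund_file = {"E01": {"name": "R. Singh", "designation": "Clerk"}}

def promote_in_payroll(emp_id, new_designation):
    # The payroll program knows only about its own file, so the copy held
    # by the provident fund application is left untouched.
    payroll_file[emp_id]["designation"] = new_designation

def find_inconsistencies(file_a, file_b, field):
    """Return the ids whose value for `field` differs between two files."""
    return [
        emp_id
        for emp_id in file_a
        if emp_id in file_b and file_a[emp_id][field] != file_b[emp_id][field]
    ]

promote_in_payroll("E01", "Superintendent")
conflicts = find_inconsistencies(
    payroll_file, provident_fund_file, "designation")

print(conflicts)
```

Detecting the conflict required yet another special-purpose program, which is the kind of extra effort the list above attributes to file processing systems.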

1.3.2 Database Systems or Database System Environment


The DBMS software together with the Database is called a database system. In other words,
it can be defined as an organization of components that define and regulate the collection,
storage, management and use of data in a database. Furthermore, it is a system whose
overall purpose is to record and maintain information. A database system consists of four
major components as shown in Figure 1.4.
1. Data
2. Hardware
3. Software
4. Users

[Figure: users reach the database through application programs and the DBMS; the parts
are labelled data, hardware, software and users]

FIGURE 1.4. Database system.

1. Data : All the data in the system is stored in a single database. The data in the
database is both shared and integrated. Sharing of data means that individual pieces of data
in the database are shared among different users, and every user can access the same piece
of data, possibly for different purposes. Integration of data means that the database can be
viewed as a unification of several distinct files, with redundancy among the files controlled.
2. Hardware : The hardware consists of the secondary storage devices like disks, drums
and so on, where the database resides, together with other devices. There are two types of
hardware. The first is the processor and main memory, which support running the DBMS.
The second is the secondary storage devices, e.g., hard disks and magnetic disks, which are
used to hold the stored data.
3. Software : A layer or interface of software exists between the physical database and
the users. This layer is called the DBMS. All requests from the users to access the database
are handled by the DBMS. Thus, the DBMS shields the database users from hardware details.
Furthermore, the DBMS provides other facilities such as accessing and updating the data in
the files and adding and deleting the files themselves.
4. Users : The users are the people interacting with the database system in any way.
There are four types of users interacting with the database systems. These are Application
Programmers, online users, end users or naive users and finally the Database Administrator
(DBA). More about users in section 1.4.

1.3.3 Advantages of Database Systems (DBMS’s)


The database systems provide the following advantages over the traditional file system:
1. Controlled redundancy : In a traditional file system, each application program has
its own data, which causes duplication of common data items in more than one file.
This duplication/redundancy requires multiple updates for a single transaction and
wastes a lot of storage space. We cannot eliminate all redundancy, for technical
reasons. But in a database this duplication can be carefully controlled, which means
the database system is aware of the redundancy and assumes the responsibility
for propagating updates.
2. Data consistency : The problem of updating multiple files in a traditional file system
leads to inaccurate data, as different files may contain different information about the same
data item at a given point of time. This gives users incorrect or contradictory
information. In database systems, this problem of inconsistent data is automatically
solved by controlling the redundancy.
3. Program data independence : Traditional file systems are generally data dependent,
which implies that the data organization and access strategies are dictated by the needs
of the specific application and the application programs are developed accordingly.
Database systems, however, provide independence between the stored data
and the application programs, which allows for changes at one level of the data without
affecting the others. This property of database systems allows the data to be changed without
changing the application programs that process the data.
4. Sharing of data : In database systems, the data is centrally controlled and can be shared
by all authorized users. Sharing of data means not only that existing application
programs can share the data in the database, but also that new application programs can
be developed to operate on the existing data. Furthermore, the requirements of the
new application programs may be satisfied without creating any new file.
5. Enforcement of standards : Since the data in a database system is stored in one central
place, standards can easily be enforced by the DBA. This ensures standardised data formats
to facilitate data transfers between systems. Applicable standards might include any
or all of the following: departmental, installation, organizational, industry, corporate,
national or international.
6. Improved data integrity : Data integrity means that the data contained in the
database is both accurate and consistent. Centralized control allows
adequate checks to be incorporated to provide data integrity. One integrity check
that should be incorporated in the database is to ensure that if there is a reference
to a certain object, that object must exist.
7. Improved security : Database security means protecting the data contained in the
database from unauthorised users. The DBA ensures that proper access procedures
are followed, including proper authentication schemes for access to the DBMS and
additional checks before permitting access to sensitive data. The level of security
could be different for various types of data and operations.
8. Data access is efficient : The database system utilizes different sophisticated techniques
to access the stored data very efficiently.
9. Conflicting requirements can be balanced : The DBA resolves the conflicting
requirements of various users and applications by knowing the overall requirements
of the organization. The DBA can structure the system to provide an overall service
that is best for the organization.
10. Improved backup and recovery facility : Through its backup and recovery subsystem,
the database system provides the facilities for recovering from hardware or software
failures. The recovery subsystem of the database system ensures that the database
is restored to the state it was in before the program started executing, in case of
system crash.
11. Minimal program maintenance : In a traditional file system, the application programs
with the description of data and the logic for accessing the data are built individually.
Thus, changes to the data formats or access methods result in the need to modify
the application programs. Therefore, a high maintenance effort is required. This effort is
reduced to a minimum in database systems due to the independence of data and application
programs.
12. Data quality is high : The quality of data in database systems is very high
compared to traditional file systems. This is possible due to the presence of tools
and processes in the database system.
13. Good data accessibility and responsiveness : Database systems provide query
languages or report writers that allow the users to ask ad hoc queries to obtain
the needed information immediately, without the requirement to write application
programs (as in the case of a file system) that access the information from the database.
This is possible due to integration in database systems.
14. Concurrency control : Database systems are designed to manage simultaneous
(concurrent) access to the database by many users. They also prevent any loss of
information or loss of integrity due to these concurrent accesses.
15. Economical to scale : In database systems, the operational data of an organization is
stored in a central database. The application programs that work on this data can be
built at much lower cost than in a traditional file system. This reduces the overall
costs of operation and management of the database, which leads to economical
scaling.
16. Increased programmer productivity : The database system provides many standard
functions that the programmer would generally have to write in a file system. The
availability of these functions allows programmers to concentrate on the specific
functionality required by the users without worrying about the implementation
details. This increases the overall productivity of the programmer and also reduces
the development time and cost.
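
Controlled redundancy and data consistency can be illustrated with a minimal sketch using Python's standard sqlite3 module. SQLite, the three tables and the helper designation_seen_by are assumptions made for this example. The designation is stored once, in the employee table, and the payroll and provident fund data refer to it instead of keeping their own copies, so a single update is seen consistently by both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    emp_id TEXT PRIMARY KEY, name TEXT, designation TEXT);
CREATE TABLE payroll (
    emp_id TEXT REFERENCES employee(emp_id), basic_pay REAL);
CREATE TABLE provident_fund (
    emp_id TEXT REFERENCES employee(emp_id), balance REAL);
INSERT INTO employee VALUES ('E01', 'R. Singh', 'Clerk');
INSERT INTO payroll VALUES ('E01', 3000);
INSERT INTO provident_fund VALUES ('E01', 12000);
""")

# The promotion is recorded exactly once.
conn.execute(
    "UPDATE employee SET designation = ? WHERE emp_id = ?",
    ("Superintendent", "E01"))
conn.commit()

def designation_seen_by(table, emp_id):
    """Look up the designation as a given application's table sees it."""
    # Table names cannot be SQL parameters, so accept only known names.
    if table not in ("payroll", "provident_fund"):
        raise ValueError("unknown table")
    row = conn.execute(
        f"SELECT e.designation FROM {table} AS t"
        " JOIN employee AS e ON e.emp_id = t.emp_id"
        " WHERE t.emp_id = ?", (emp_id,)).fetchone()
    return row[0]

payroll_view = designation_seen_by("payroll", "E01")
fund_view = designation_seen_by("provident_fund", "E01")
print(payroll_view, fund_view)
```

The only duplicated value is the emp_id key, which is the kind of deliberate, controlled redundancy the first advantage describes.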

1.3.4 Disadvantages of Database Systems


In contrast to many advantages of the database systems, there are some disadvantages as
well. The disadvantages of a database system are as follows :
1. Complexity increases : The data structure may become more complex because of the
centralised database supporting many applications in an organization. This may lead
to difficulties in its management and may require professionals for management.
2. Requirement of more disk space : The wide functionality and greater complexity
increase the size of the DBMS. Thus, it requires much more space to store and run than
a traditional file system.
3. Additional cost of hardware : The cost of installing a database system is much higher.
It depends on the environment and functionality, the size of the hardware and the maintenance
costs of the hardware.
4. Cost of conversion : The cost of conversion from an old file system to a new database
system is very high. In some cases the cost of conversion is so high that the cost of the
DBMS and extra hardware becomes insignificant in comparison. It also includes the cost of training
manpower and hiring specialized manpower to convert and run the system.
5. Need of additional and specialized manpower : Any organization having database
systems needs to hire and train its manpower on a regular basis to design and
implement databases and to provide database administration services.
6. Need for backup and recovery : For a database system to be accurate and available
at all times, a procedure has to be developed and used for providing backup
copies to all its users when damage occurs.
7. Organizational conflict : A centralised and shared database system requires a
consensus on data definitions and ownership as well as responsibilities for accurate
data maintenance.
8. Higher installation and management cost : Large and complete database systems
are more costly. They require trained manpower to operate the system and have
additional annual maintenance and support costs.

1.4 DBMS Users

The users of a database system can be classified into various categories depending upon
their interaction with the DBMS and their degree of expertise in it.
1.4.1 End Users or Naive Users
The end users or naive users use the database system through a menu-oriented application
program, where the type and range of response is always displayed on the screen. The user
need not be aware of the presence of the database system and is instructed through each
step. A user of an ATM falls in this category.

1.4.2 Online Users


These users communicate with the database directly through an online terminal or
indirectly through an application program and user interface. They know about the existence
of the database system and may have some knowledge about the limited interaction they
are permitted.

1.4.3 Application Programmers


These are the professional programmers or software developers who develop the application
programs or user interfaces for the naive and online users. These programmers must have
the knowledge of programming languages such as Assembly, C, C++, Java, or SQL, etc.,
since the application programs are written in these languages.

1.4.4 Database Administrator


The Database Administrator (DBA) is a person who has complete control over the database of
an enterprise. The DBA is responsible for the overall performance of the database, is free to take
decisions about the database and provides technical support. The DBA is concerned with the back end
of any project. Some of the main responsibilities of the DBA are as follows :
1. Deciding the conceptual schema or contents of database : The DBA decides the data
fields, tables, queries, data types, attributes, relations and entities; in other words, the DBA
is responsible for the overall logical design of the database.
2. Deciding the internal schema or structure of physical storage : The DBA decides how
the data is actually stored and represented at the level of physical
storage.
3. Deciding users : The DBA gives users permission to use the database. Without
proper permission, no one can access data from the database.
4. Deciding user view : The DBA decides different views for different users.
5. Granting of authorities : The DBA decides which user can use which portion of the database
and grants the corresponding rights to access data. A user can use only the data for which
an access right has been granted.
6. Deciding constraints : The DBA decides various constraints over the database for maintaining
consistency and validity in the database.
7. Security : Security is a major concern in a database. The DBA takes various steps to
make data more secure against various disasters and unauthorized access.
8. Monitoring the performance : The DBA is responsible for the overall performance of the database.
The DBA regularly monitors the database to maintain its performance and tries to improve it.
9. Backup : The DBA takes regular backups of the database, so that they can be used after a system
failure. Backups are also used for checking data for consistency.
10. Removal of dump and maintenance of free space : The DBA is responsible for removing
unnecessary data from storage and maintaining enough free space for daily operations.
The DBA can also increase storage capacity when necessary.
11. Checks : The DBA also decides various security and validation checks over the database to
ensure consistency.
12. Liaising with users : Another task of the DBA is to liaise with users, ensure
the availability of the data they require and write the necessary external schemas.
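
The backup responsibility (item 9) can be sketched with the backup method of Python's standard sqlite3 connections. SQLite, the in-memory databases and the orders table are assumptions made for this illustration; a production DBMS would use its own backup tools and write the copy to separate storage.

```python
import sqlite3

# The live database with some committed data.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
live.execute("INSERT INTO orders (item) VALUES ('printer')")
live.commit()

# Take a backup: copy the whole live database into a second database.
backup_copy = sqlite3.connect(":memory:")
live.backup(backup_copy)

# Simulate damage to the live database after the backup was taken.
live.execute("DELETE FROM orders")
live.commit()

remaining = live.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
restored = backup_copy.execute("SELECT item FROM orders").fetchall()

print(remaining)
print(restored)
```

Only the data present at the time of the backup can be recovered this way, which is why the backups have to be taken regularly.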

1.5 DataBase or DBMS Languages

The DBMS provides different languages and interfaces for each category of users to express
database queries and updates. When the design of the database is complete and the DBMS
has been chosen to implement it, the first thing to be done is to specify the conceptual and
internal schemas for the database and the corresponding mappings. The following five
languages are available for defining the schemas and working with the data.
1. Data Definition Language (DDL)
2. Storage Definition Language (SDL)
3. View Definition Language (VDL)
4. Data Manipulation Language (DML)
5. Fourth-Generation Language (4-GL)

1.5.1 Data Definition Language (DDL)


It is used to specify the conceptual schema of a database using a set of definitions. It
supports the definition or declaration of database objects. Many techniques are available for
writing DDL. One widely used technique is writing DDL into a text file. More about DDL in Chapter 7.
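
As a minimal sketch of the idea, the DDL below is held as plain text (as it might be in a text file) and handed to a DBMS. The choice of SQL, of SQLite through Python's standard sqlite3 module, and the table and column names are assumptions made for illustration only.

```python
import sqlite3

# DDL kept as plain text, as it might be stored in a text file.
DDL = """
CREATE TABLE item (
    item_id   TEXT PRIMARY KEY,
    item_desc TEXT NOT NULL,
    item_cost INTEGER NOT NULL
);
"""

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript(DDL)

# The definition is now recorded in the catalog; no data rows exist yet.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
columns = [row[1] for row in conn.execute("PRAGMA table_info(item)")]
row_count = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
print(tables, columns, row_count)
```

Running the DDL changes only the description of the database; storing actual rows is the job of the DML.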

1.5.2 Storage Definition Language (SDL)


It is used to specify the internal schema of the database. A set of SDL statements specifies
the storage structures and access methods used by the database system. These statements
carry the implementation details of the database schemas, which are usually hidden from
the users.

1.5.3 View Definition Language (VDL)


It is used to specify users' views and their mappings to the conceptual schema. In many
DBMSs, however, the DDL is used to specify both the conceptual and the external schemas.
There are two views of data: the logical view, which is the one perceived by the programmer,
and the physical view, which is the data as stored on storage devices.
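
A user view and its mapping to the conceptual schema can be sketched with an SQL view. This is only an illustration under assumptions: SQLite via Python's sqlite3 module stands in for the DBMS, and the view name, the renamed columns and the sample customers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT,
    customer_city TEXT,
    customer_bal  INTEGER
);
INSERT INTO customer VALUES (1, 'Asha', 'Rohtak', 500);
INSERT INTO customer VALUES (2, 'Ravi', 'Sirsa', 0);

-- User view: hides the id and the balance and renames the remaining columns.
CREATE VIEW customer_directory AS
    SELECT customer_name AS name, customer_city AS city
    FROM customer;
""")

cursor = conn.execute("SELECT * FROM customer_directory ORDER BY name")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

The SELECT inside the view definition plays the role of the mapping: it states how each column of the user's view is derived from the conceptual schema.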

1.5.4 Data Manipulation Language (DML)


It provides a set of operations to support the basic data manipulation operations on the data
held in the database. It is used to query, update or retrieve data stored in a database. The
part of the DML that provides data retrieval is called the query language.
The DML is of two types :
(i) Procedural DML : It allows the user to tell the system what data is needed and how
to retrieve it.
(ii) Non-procedural DML : It allows the user to state what data is needed, rather than
how it is to be retrieved. More about DML in Chapter 7.
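
The difference between the two styles can be sketched as follows. The example assumes SQLite via Python's sqlite3 module; the SQL query stands for the non-procedural style, while the hand-written loop only imitates the procedural style (it is not the syntax of any particular procedural DML). The rows are modelled on the ITEM data used in this chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (item_id TEXT, item_desc TEXT, item_cost INTEGER)")
conn.executemany("INSERT INTO item VALUES (?, ?, ?)", [
    ("1111A", "Nut", 3), ("1112A", "Bolt", 5),
    ("1113A", "Belt", 100), ("1144B", "Screw", 2),
])

# Non-procedural: state WHAT is wanted; the DBMS decides how to find it.
declarative = [row[0] for row in conn.execute(
    "SELECT item_desc FROM item WHERE item_cost < 5 ORDER BY item_desc")]

# Procedural style: spell out HOW, i.e. visit every record, test it, collect, sort.
procedural = []
for item_id, item_desc, item_cost in conn.execute("SELECT * FROM item"):
    if item_cost < 5:
        procedural.append(item_desc)
procedural.sort()

print(declarative, procedural)
```

Both approaches return the same descriptions; only the amount of retrieval logic the user has to write differs.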

1.5.5 Fourth-Generation Language (4-GL)


It is a compact, efficient and non-procedural programming language used to improve the
efficiency and productivity of the DBMS. In this, the user defines what is to be done and
not how it is to be done. The 4-GL has the following components :
(a) Query languages
(b) Report
(c) Spread sheets
(d) Database languages
(e) Application generators
(f) High level languages to generate application programs.
Structured Query Language (SQL) is an example of a 4-GL. More about SQL in Chapter 7.

1.6 Schemas, Subschema and Instances

The plans of the database and data stored in the database are most important for an
organization, since database is designed to provide information to the organization. The data
stored in the database changes regularly but the plans remain static for longer periods of time.

1.6.1 Schema
A schema is a plan of the database that gives the names of the entities and attributes and
the relationships among them. A schema includes the definition of the database name, the
record types and the components that make up the records. Alternatively, it is defined as
a framework into which the values of the data items are fitted. The values fitted into the
framework change regularly but the format of the schema remains the same. For example,
consider the database consisting of three files ITEM, CUSTOMER and SALES. The data structure
diagram for this schema is shown in Figure 1.5. The schema is shown in database language.
Generally, a schema can be partitioned into two categories, i.e., (i) Logical schema and
(ii) Physical schema.
(i) The logical schema is concerned with exploiting the data structures offered by the
DBMS so that the schema becomes understandable to the computer. It is important
as programs use it to construct applications.
(ii) The physical schema is concerned with the manner in which the conceptual database
gets represented in the computer as a stored database. It is hidden behind the logical
schema and can usually be modified without affecting the application programs.
DBMSs provide DDL and DSDL to specify the logical and the physical schema respectively.

Schema name is ITEM_SALES_REC

type ITEM = record
    ITEM_ID: string;            (attributes/data items)
    ITEM_DESC: string;
    ITEM_COST: integer;
end
type CUSTOMER = record
    CUSTOMER_ID: integer;
    CUSTOMER_NAME: string;
    CUSTOMER_ADD: string;
    CUSTOMER_CITY: string;
    CUSTOMER_BAL: integer;
end
type SALES = record
    CUSTOMER_ID: integer;
    ITEM_ID: string;
    ITEM_QTY: integer;
    ITEM_PRICE: integer;
end

FIGURE 1.5. Data structure diagram for the item sales record.

1.6.2 Subschema
A subschema is a subset of the schema having the same properties that a schema has. It
identifies a subset of areas, sets, records, and data names defined in the database schema
available to user sessions. The subschema allows the user to view only that part of the
database that is of interest to that user. The subschema defines the portion of the database as
seen by the application programs, and the application programs can have different views of
the data stored in the database.
Different application programs can change their respective subschemas without affecting
one another's subschemas or views.
The Subschema Definition Language (SDL) is used to specify a subschema in the DBMS. Note
that this abbreviation is shared with the Storage Definition Language of Section 1.5.2.

1.6.3 Instances
The data in the database at a particular moment of time is called an instance or a database
state. In a given instance, each schema construct has its own current set of instances. Many
instances or database states can be constructed to correspond to a particular database schema.
Every time we update (i.e., insert, delete or modify) the value of a data item in a record,
the database changes from one state into another. Figure 1.6 shows an instance of
the ITEM relation in a database schema.
ITEM

ITEM_ID     ITEM_DESC     ITEM_COST
1111A       Nut           3
1112A       Bolt          5
1113A       Belt          100
1144B       Screw         2
FIGURE 1.6. An instance/database state of the ITEM relation.
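
The distinction between a schema and its instances can be sketched in a few lines. The example assumes SQLite via Python's sqlite3 module; the helper names schema and instance are hypothetical. The rows are modelled on the ITEM data of this section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (item_id TEXT PRIMARY KEY, item_desc TEXT, item_cost INTEGER)")

def schema(connection):
    """Column names and declared types: the plan, which rarely changes."""
    return [(row[1], row[2]) for row in connection.execute("PRAGMA table_info(item)")]

def instance(connection):
    """The rows present right now: one database state."""
    return connection.execute("SELECT * FROM item ORDER BY item_id").fetchall()

schema_before = schema(conn)
conn.executemany("INSERT INTO item VALUES (?, ?, ?)", [
    ("1111A", "Nut", 3), ("1112A", "Bolt", 5),
    ("1113A", "Belt", 100), ("1144B", "Screw", 2),
])
state_1 = instance(conn)

# Each update moves the database from one state to another.
conn.execute("UPDATE item SET item_cost = 6 WHERE item_id = '1112A'")
conn.execute("DELETE FROM item WHERE item_id = '1144B'")
state_2 = instance(conn)
schema_after = schema(conn)

print(schema_before == schema_after, state_1 != state_2)
```

The two states differ, while the schema read before and after the updates is identical.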

1.7 Three Level Architecture of Database Systems (DBMS)

The architecture is a framework for describing database concepts and specifying the structure
of a database system. The three-level architecture was suggested by ANSI/SPARC. Here the
database is divided into three levels: the external level, the conceptual level and the internal
level, as shown in Figure 1.7.
External level    : External view 1, External view 2, ..., External view N
                    (views and mappings are maintained by the DBA)
Mappings          : External/conceptual mapping 1, 2, ..., N
Conceptual level  : Conceptual view (Database Management System)
Mapping           : Conceptual/internal mapping
Internal level    : Internal view (physical storage of data)

FIGURE 1.7. Three level architecture of DBMS.

1.7.1 Levels or Views


The three levels or views are discussed below:
(i) Internal Level : The internal level describes the actual physical storage of data, that is,
the way in which the data is actually stored in memory. This level is not relational because
data is stored according to various coding schemes instead of in tabular form (in tables). This
is the low-level representation of the entire database. The internal view is described by means
of an internal schema.
The internal level is concerned with the following aspects:
– Storage space allocation
– Access paths
– Data compression and encryption techniques
– Record placement etc.
The internal level covers the data structures and file organizations used to
store data on storage devices.
(ii) Conceptual Level : The conceptual level, also known as the logical level,
describes the overall logical structure of the whole database for a community of users. This
level is relational because the data visible at this level are relational tables and the operators
are relational operators. This level represents the entire contents of the database in an abstract
form in comparison with the physical level. Here the conceptual schema is defined, which hides
the actual physical storage and concentrates on the relational model of the database.
(iii) External Level : The external level is concerned with individual users. This level
describes the actual view of data seen by individual users. The external schema is defined
by the DBA for every user. The remaining part of the database is hidden from that user. This
means that users can access only the data that is of interest to them. In other words, a user
can access only that part of the database for which the DBA has granted authorization. This
level is also relational or very close to it.

1.7.2 Different Mappings in Three Level Architecture of DBMS


The process of transforming requests and results between the three levels is called mapping.
The database management system is responsible for the mappings between the internal,
conceptual and external schemas.
There are two types of mappings:
1. Conceptual/Internal mapping.
2. The External/Conceptual mapping.
1. The Conceptual/Internal Mapping : This mapping defines the correspondence or
operations between the conceptual view and the physical view. It specifies how the data is
retrieved from physical storage and shown at conceptual level and vice-versa. It specifies
how conceptual records and fields are represented at the internal level. It also allows any
differences in entity names, attribute names and their orders, data types etc., to be resolved.
2. The External/Conceptual Mapping : This mapping defines the correspondence between
a particular external view and the conceptual view. It specifies how the data is retrieved from
the conceptual level and shown at the external level, because at the external level some part
of the database is hidden from a particular user, the names of data fields may be changed, etc.
There is one mapping between the conceptual and internal levels, and there can be several
mappings between the external and conceptual levels. Physical data independence is achieved
through the conceptual/internal mapping while logical data independence is achieved through
the external/conceptual mappings. The information about the mappings among the various
schema levels is included in the system catalog of the DBMS. When the schema is changed at
some level, the schema at the next higher level remains unchanged; only the mapping between
the two levels is changed.
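
The two mappings can be pictured as two small translation functions. The sketch below is purely illustrative: the fixed-width byte layout standing for the internal level, the function names and the renamed external fields are all assumptions, not the mechanism of any particular DBMS.

```python
import struct

# Internal level (assumed layout): 5-byte id, 10-byte description, 4-byte integer cost.
RECORD = struct.Struct("<5s10si")

stored = [
    RECORD.pack(b"1111A", b"Nut", 3),
    RECORD.pack(b"1112A", b"Bolt", 5),
]

def conceptual_from_internal(raw):
    """Conceptual/internal mapping: stored bytes -> conceptual record."""
    item_id, desc, cost = RECORD.unpack(raw)
    return {
        "ITEM_ID": item_id.decode(),
        "ITEM_DESC": desc.rstrip(b"\x00").decode(),  # drop the padding bytes
        "ITEM_COST": cost,
    }

def external_from_conceptual(record):
    """External/conceptual mapping: hide the cost and rename fields for one user."""
    return {"code": record["ITEM_ID"], "name": record["ITEM_DESC"]}

conceptual = [conceptual_from_internal(raw) for raw in stored]
external = [external_from_conceptual(rec) for rec in conceptual]
print(conceptual[0], external[1])
```

If the byte layout changed, only conceptual_from_internal would need rewriting; the external view and anything built on it would stay as they are.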
1.7.3 Advantages of Three-level Architecture
The motive behind the three-level architecture is to isolate each user's view of the database
from the way the database is physically stored or represented. The advantages of the three-level
architecture are as follows :
1. Each user is able to access the same data but has a different customized view of
the data as per the requirement.
2. Changes to the physical storage organization do not affect the internal structure
of the database, e.g., moving the database to a new storage device.
3. To use the database, the user need not be concerned with the physical data storage
details.
4. The conceptual structure of the database can be changed by the DBA without affecting
any user.
5. The database storage structure can be changed by the DBA without affecting the
user's view.

1.7.4 Data Independence


It is defined as the ability of a database system to change the schema at one level
without having to change the schema at the next higher level. It can also be defined as the
immunity of the application programs to change in the physical representation and access
techniques of the database. The above definition says that the application programs do
not depend on any particular physical representation or access technique of the database.
The DBMS achieves data independence by the use of the three-level architecture. Data
independence is of two types:
1. Physical Data Independence : It indicates that the physical storage structures or
devices used for storing the data could be changed without changing the conceptual view
or any of the external views. Only the mapping between the conceptual and internal levels
is changed. Thus, in physical data independence, the conceptual schema insulates the users
from changes in the physical storage of the data.
2. Logical Data Independence : It indicates that the conceptual schema can be changed
without changing the existing external schemas. Only the mapping between the external and
conceptual levels is changed; it absorbs all the changes to the conceptual schema. In a DBMS
that supports logical data independence, changes to the conceptual schema are possible without
making any change in the existing external schemas or rewriting the application programs.
Logical data independence also insulates application programs from operations like combining
two records into one or splitting an existing record into more than one record.
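
Both kinds of data independence can be sketched with one application query that never changes. The example assumes SQLite via Python's sqlite3 module; the view customer_view plays the role of the external schema, and all table, index and customer names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY,
                       customer_name TEXT, customer_city TEXT);
INSERT INTO customer VALUES (1, 'Asha', 'Rohtak'), (2, 'Ravi', 'Sirsa');
CREATE VIEW customer_view AS
    SELECT customer_id, customer_name, customer_city FROM customer;
""")

# The application only ever issues this query against its external view.
APP_QUERY = ("SELECT customer_name, customer_city "
             "FROM customer_view ORDER BY customer_id")
baseline = conn.execute(APP_QUERY).fetchall()

# Physical change: add an access path (an index). No schema above it changes.
conn.execute("CREATE INDEX idx_customer_city ON customer (customer_city)")
after_physical = conn.execute(APP_QUERY).fetchall()

# Logical change: split one record type into two, then redefine only the mapping.
conn.executescript("""
DROP VIEW customer_view;
CREATE TABLE customer_core (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE customer_address (customer_id INTEGER PRIMARY KEY, customer_city TEXT);
INSERT INTO customer_core SELECT customer_id, customer_name FROM customer;
INSERT INTO customer_address SELECT customer_id, customer_city FROM customer;
DROP TABLE customer;
CREATE VIEW customer_view AS
    SELECT c.customer_id, c.customer_name, a.customer_city
    FROM customer_core AS c
    JOIN customer_address AS a ON a.customer_id = c.customer_id;
""")
after_logical = conn.execute(APP_QUERY).fetchall()

print(baseline == after_physical == after_logical)
```

In both cases the application query is untouched; in the second case the view definition is the only thing rewritten, which is the mapping absorbing the change.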

1.8 Data Models

A data model is a collection of concepts that can be used to describe the structure of the
database including data types, relationships and the constraints that apply on the data.
A data model helps in understanding the meaning of the data and ensures that we
understand:
– The data requirements of each user.
– The use of data across various applications.
– The nature of data independent of its physical representations.
A data model supports communication between the users and database designers. The major
use of data model is to understand the meaning of the data and to facilitate communication
about the user requirements.

Characteristics of Data Models


A data model must possess the following characteristics so that the best possible data
representation can be obtained.
(i) Diagrammatic representation of the data model.
(ii) Simplicity in designing i.e., Data and their relationships can be expressed and
distinguished easily.
(iii) Application independent, so that different applications can share it.
(iv) Data representation must be without duplication.
(v) Bottom-up approach must be followed.
(vi) Consistency and structure validation must be maintained.

1.8.1 Types of Data Models


The various data models can be divided into three categories, namely:
(i) Record Based Data Models.
(ii) Object Based Data Models.
(iii) Physical Data Models.
(i) Record Based Data Models : These models represent data by using the record
structures. These models lie between the object based data models and the physical
data models. These data models can be further categorised into three types:
(a) Hierarchical Data Model
(b) Network Data Model
(c) Relational Data Model.
(ii) Object Based Data Models : These models are used in describing the data at the
logical and user view levels. These models allow the users to specify the constraints
on the data explicitly. These data models can be further categorised into four types:
(a) Entity Relationship Model (ER-Model)
(b) Object Oriented Model
(c) Semantic Data Model
(d) Functional Data Model.
The models are discussed in the coming sections.
(iii) Physical Data Models : These models provide the concepts that describe the details
of how the data is stored in the computer along with their record structures, access
paths and ordering. Only specialized or professional users can use these models.
These data models can be divided into two types:
(a) Unifying Model.
(b) Frame Memory Model.
1.8.1.1 Record based Data Models
Record based data models represent data by using record structures. They are used
to describe data at the conceptual view level. They are so named because the database is
structured in fixed-format records of several types. The use of fixed-length records simplifies
the physical level implementation of the database. These models lie between the object based
data models and the physical data models. These models provide concepts that may be
understood by the end users. These data models do not implement the full detail of the
data storage on a computer system. Thus, these models are used to specify the overall logical
structure of the database and to provide a high-level description of the implementation. They
are generally used in traditional DBMSs and are also known as 'Representational Data Models'.
The various categories of record based data models are as follows:
(i) Hierarchical Data Model
(ii) Network Data Model
(iii) Relational Data Model.
(i) Hierarchical Data Model : Hierarchical Data Model is one of the oldest database
models. The hierarchical model became popular with the introduction of IBM’s Information
Management System (IMS).
The hierarchical data model organizes records in a tree structure, i.e., a hierarchy of parent
and child record relationships. This model employs two main concepts : Record and Parent
Child Relationship. A record is a collection of field values that provide information about an
entity.
A Parent Child Relationship type is a 1 : N relationship between two record types. The
record type on the 1 side is called the parent record type and the one on the N side is called
the child record type. In terms of the tree data structure, a record type corresponds to a node
of the tree and a relationship type corresponds to an edge of the tree.
The model requires that each child record be linked to only one parent and that a child
can be reached only through its parent.

WORLD
Continent               : ASIA, EUROPE, AUSTRALIA, etc.
Country (Grand Parent)  : INDIA, CHINA, PAKISTAN, etc.
State (Parent)          : PUNJAB, HARYANA, RAJASTHAN, etc.
District (Child)        : ROHTAK, SIRSA, HISSAR, etc.

FIGURE 1.8. Hierarchical Model.


In Figure 1.8, ‘World’ acts as the root of the tree structure and has many
children such as Asia, Europe and Australia. Each of these children can act as a parent for
different countries; for example, the continent Asia acts as a parent for countries like India,
China and Pakistan. Similarly, each country can act as a parent for different states; for example,
India acts as a parent for the states Punjab, Haryana and Rajasthan. The same pattern continues
at the lower levels. Consider the child ‘Rohtak’, which has the parent ‘Haryana’, which in turn
has the parent ‘India’, and so on. Here ‘India’ acts as a grandparent of the child ‘Rohtak’.
The major advantages of the Hierarchical Model are that it is simple and efficient, maintains
data integrity and is the first model that provides the concept of data security. The major
disadvantages of the Hierarchical Model are that it is complex to implement, lacks structural
independence, and suffers from operational anomalies and data management problems.
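
The single-parent rule and root-to-child access can be sketched as a small tree. The class and function names below are hypothetical, and the records are taken from the World/Asia/India/Haryana/Rohtak branch discussed above.

```python
class Record:
    """A node in the hierarchy: exactly one parent (None for the root)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path_from_root(self):
        """Names on the path from the root down to this record."""
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return list(reversed(path))


def find(root, name):
    """Depth-first search: a child is reachable only through its parents."""
    if root.name == name:
        return root
    for child in root.children:
        hit = find(child, name)
        if hit is not None:
            return hit
    return None


world = Record("World")
asia = Record("Asia", world)
india = Record("India", asia)
haryana = Record("Haryana", india)
rohtak = Record("Rohtak", haryana)

path = find(world, "Rohtak").path_from_root()
grandparent = rohtak.parent.parent.name
print(path, grandparent)
```

Every lookup starts at the root and walks down, which is one way to see why access paths in this model are fixed by the tree structure.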
(ii) Network Data Model : As a result of limitations in the hierarchical model, designers
developed the Network Model. The ability of this model to handle many to many (N : N)
relations between its records is the main feature distinguishing it from the hierarchical model.
Thus, this model permits a child record to have more than one parent. In this model, directed
graphs, in which a node can have more than one parent, are used instead of a tree structure.
This model was basically designed to handle non-hierarchical relationships.
The relationships between specific records of 1 : 1 (one to one), 1 : N (one to many) or
N : N (many to many) are explicitly defined in the database definition of this model.
The Network Model was standardized as the CODASYL DBTG (Conference on Data
Systems Languages, Data Base Task Group) model.
There are two basic data structures in this model: records and sets. The records contain
the detailed information regarding the data and are classified into record types. A set type
represents a relationship between record types, and this model uses linked lists to represent
these relationships. Each set type definition consists of three basic elements : a name for the
set type, an owner record type (like a parent) and a member record type (like a child).
To represent a many to many relationship in this model, the relationship is decomposed
into two one to many (1 : N) relationships by introducing an additional record type called
an Intersection Record or Connection Record.
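
The decomposition of a many to many relationship can be sketched as follows. This is an illustration under assumptions: the record types Customer, Item and Sale and the sample data are hypothetical, and ordinary Python lists stand in for the linked lists that connect an owner to its members.

```python
class Customer:
    def __init__(self, name):
        self.name = name
        self.sales = []          # members of the set owned by this customer


class Item:
    def __init__(self, desc):
        self.desc = desc
        self.sales = []          # members of the set owned by this item


class Sale:
    """Intersection record: a member of one customer set and one item set."""

    def __init__(self, customer, item, qty):
        self.customer, self.item, self.qty = customer, item, qty
        customer.sales.append(self)   # 1 : N link, customer -> sales
        item.sales.append(self)       # 1 : N link, item -> sales


asha, ravi = Customer("Asha"), Customer("Ravi")
nut, bolt = Item("Nut"), Item("Bolt")
Sale(asha, nut, 10)
Sale(asha, bolt, 4)
Sale(ravi, bolt, 1)

# Navigate owner -> intersection record -> other owner, in either direction.
items_of_asha = [sale.item.desc for sale in asha.sales]
buyers_of_bolt = [sale.customer.name for sale in bolt.sales]
print(items_of_asha, buyers_of_bolt)
```

A customer can be linked to many items and an item to many customers, yet every individual link is one to many, because each Sale record has exactly one owner in each set.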
The major advantages of the Network Model are that it is conceptually simple, handles
more relationship types, promotes database integrity, offers data access flexibility and
conforms to standards.
The major disadvantages of the Network Model are that it is complex and lacks structural
independence.
(iii) Relational Data Model : The Relational Model was first introduced by Dr. Edgar
Frank Codd, an Oxford-trained mathematician, in 1970 while he was working at an IBM
research centre. The Relational Model is considered one of the most popular developments in
database technology because it can be used for representing most real world objects and the
relationships between them.
The main significance of the model is the absolute separation of the logical view and the
physical view of the data. The physical view in the relational model is implementation
dependent and not further defined.
The logical view of data in the relational model is set oriented. A relational set is an unordered
group of items. The fields in the items are the columns. The columns in a table have names.
The rows are unordered and unnamed. A database consists of one or more tables plus a
catalogue describing the database.
The relational model consists of three components:
1. A structural component—A set of tables (also called relations) and set of domains
that defines the way data can be represented.
2. A set of rules for maintaining the integrity of the database.
3. A manipulative component consisting of a set of high-level operations which act
upon and produce whole tables.
In the relational model the data is represented in the form of tables; the word table is used
interchangeably with the word relation. Each table consists of rows, also known as tuples
(a tuple represents a collection of information about an item, e.g., a student record), and
columns, also known as attributes (an attribute represents a characteristic of an item, e.g., a
student's name and phone number). There are relationships between different tables. This model
doesn't require any information that specifies how the data should be stored physically.
The major advantages of the Relational Model are structural independence, improved
conceptual simplicity, ad hoc query capability and a powerful DBMS. The major disadvantages
of the relational model are substantial hardware and software overhead and the fact that it can
facilitate poor design and implementation.
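
Relationships between tables, and operations that take whole tables and produce a whole table, can be sketched with a join. The example assumes SQLite via Python's sqlite3 module; the tables follow the ITEM, CUSTOMER and SALES files of this chapter in simplified form, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE item (item_id TEXT PRIMARY KEY, item_desc TEXT, item_cost INTEGER);
-- Relationships are carried by values: sales rows embed identifiers
-- of rows in the other two tables.
CREATE TABLE sales (
    customer_id INTEGER REFERENCES customer (customer_id),
    item_id     TEXT    REFERENCES item (item_id),
    item_qty    INTEGER
);
INSERT INTO customer VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO item VALUES ('1111A', 'Nut', 3), ('1112A', 'Bolt', 5);
INSERT INTO sales VALUES (1, '1111A', 10), (1, '1112A', 4), (2, '1112A', 1);
""")

# One high-level operation over three tables; the result is again a table.
report = conn.execute("""
    SELECT c.customer_name, i.item_desc, s.item_qty * i.item_cost AS amount
    FROM sales AS s
    JOIN customer AS c ON c.customer_id = s.customer_id
    JOIN item AS i ON i.item_id = s.item_id
    ORDER BY c.customer_name, i.item_desc
""").fetchall()
print(report)
```

Nothing in the query says how the rows are stored or in what order they are visited; that is left to the DBMS.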
1.8.1.2 Object Based Data Models
Object Based Data Models are also known as conceptual models and are used for defining
concepts including entities, attributes and the relationships between them. These models are
used in describing data at the logical and user view levels. These models allow the constraints
to be specified on the data explicitly by the users.
An entity is a distinct object which has existence in the real world. It will be implemented
as a table in a database.
An attribute is a property of an entity; in other words, an attribute is a single atomic unit
of information that describes something about its entity. It will be implemented as a column
or field in the database.
The associations or links between the various entities are known as relationships.
There are 4 types of object based data models. These are:
(a) Entity-relationship (E-R) Model
(b) Object-Oriented Model
(c) Semantic Data Model
(d) Functional Data Model
These are discussed as follows:
(a) Entity-Relationship (E-R) Model : The E-R model is a high-level conceptual data model
developed by Chen in 1976 to facilitate database design. The E-R model is a generalization
of the earlier commercial models such as the hierarchical and network models. It also allows
the representation of the various constraints as well as their relationships.
The relationship between entity sets is represented by a name. An E-R relationship is of
1 : 1, 1 : N or N : N type, which tells the mapping from one entity set to another.
The E-R model is shown diagrammatically using entity-relationship (E-R) diagrams, which
represent the elements of the conceptual model and show the meanings and relationships
between those elements independent of any particular DBMS. The various features of E-R
model are:
(i) E-R Model can be easily converted into relations (tables).
(ii) E-R Model is used for purpose of good database design by database developer.
(iii) It is helpful as a problem decomposition tool as it shows entities and the relationship
between those entities.
(iv) It is an iterative process.
(v) It is very simple and easy to understand by various types of users.
The major advantages of the E-R model are that it is conceptually simple, has a visual
representation, is an effective communication tool and can be integrated with the relational
data model.
The major disadvantages of the E-R model are limited constraint representation,
limited relationship representation, no data manipulation language and loss of information
content.
(b) Object-Oriented Data Model : The object-oriented data model is a logical data model that
captures the semantics of objects supported in object-oriented programming. It is based
on a collection of objects, attributes and relationships which together form the static properties.
It also consists of the integrity rules over objects and dynamic properties such as operations
or rules defining new database states.
An object is a collection of data and methods. When different objects of same type are
grouped together they form a class. This model is used basically for multimedia applications
as well as data with complex relationships. The object model is represented graphically
with object diagrams containing object classes. Classes are arranged into hierarchies sharing
common structure and behaviour and are associated with other classes.
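
The ideas of an object as data plus methods, and of classes arranged in a hierarchy, can be sketched in a few lines. The class names and the multimedia attributes below are hypothetical and are chosen only to echo the multimedia use mentioned above.

```python
class MediaObject:
    """Base class: data (title) plus behaviour shared by all media objects."""

    def __init__(self, title):
        self.title = title

    def kind(self):
        return "media"

    def describe(self):
        # kind() resolves to the subclass version at run time.
        return f"{self.title} [{self.kind()}]"


class Picture(MediaObject):
    def __init__(self, title, width, height):
        super().__init__(title)
        self.width, self.height = width, height

    def kind(self):
        return f"picture {self.width}x{self.height}"


class VoiceClip(MediaObject):
    def __init__(self, title, seconds):
        super().__init__(title)
        self.seconds = seconds

    def kind(self):
        return f"voice clip, {self.seconds}s"


library = [Picture("Logo", 64, 64), VoiceClip("Greeting", 12)]
descriptions = [obj.describe() for obj in library]
print(descriptions)
```

Both subclasses share the structure and behaviour of MediaObject and add their own, which is the class hierarchy the object model describes.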

Advantages of Object-Oriented Data Models


The various advantages of object-oriented data model are as follows:
(i) Capability to handle various data types : Object-oriented databases have the
capability to store various types of data such as text, video, pictures, voice, etc.
(ii) Improved data access : Object oriented data models represent relationships explicitly.
This improves the data access.
(iii) Improved productivity : Object-oriented data models provide various features such
as inheritance, polymorphism and dynamic binding that allow the users to compose
objects. These features increase the productivity of the database developer.
(iv) Integrated application development system : Object-oriented data model is capable
of combining object-oriented programming with database technology which provides
an integrated application development system.

Disadvantages of Object-Oriented Data Models


The various disadvantages of object-oriented data models are as follows:
(i) Not suitable for all applications : Object-oriented data models are used where there
is a need to manage complex relationships among data objects. They are generally
suited for applications such as e-commerce, engineering and science etc. and not for
all applications.
(ii) No precise definition : It is difficult to define what constitutes an object-oriented
DBMS since the name has been applied to a wide variety of products.
(iii) Difficult to maintain : The definition of an object has to be changed periodically,
and existing databases must then be migrated to conform to the new object definition.
Changing object definitions and migrating databases both create problems.
(c) Semantic Data Models : These models are used to express greater interdependencies
among entities of interest. These interdependencies enable the models to represent the
semantics of the data in the database. This class of data models is influenced by the work
done by artificial intelligence researchers. Semantic data models are developed to organize
and represent knowledge rather than data. Mainframe databases are increasingly adopting
semantic data models, and their use on PCs is also growing. In coming times database
management systems will be partially or fully intelligent.
(d) Functional Data Model : The functional data model describes those aspects of a
system concerned with the transformation of values: functions, mappings, constraints and
functional dependencies. The functional data model describes the computations within a
system. It shows how the output values in a computation are derived from the input values
without regard for the order in which the values are computed. It also includes constraints
among values. It consists of multiple data flow diagrams. Data flow diagrams show the
dependencies between values and the computation of output values from input values and
functions, without regard for when the functions are executed. Traditional computing concepts
such as expression trees are examples of functional models.
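
An expression tree makes this concrete: the tree records only which values an output depends on, not when each step runs. The node classes, the evaluate function and the formula amount = qty * price + tax are hypothetical names and values chosen for illustration.

```python
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class Input:
    """Leaf node: a named input value."""
    name: str


@dataclass(frozen=True)
class Apply:
    """Inner node: an operator applied to two sub-expressions."""
    op: str
    left: object
    right: object


OPERATORS = {"+": operator.add, "*": operator.mul}


def evaluate(node, inputs):
    """Derive the output value from the input values by walking the tree."""
    if isinstance(node, Input):
        return inputs[node.name]
    return OPERATORS[node.op](evaluate(node.left, inputs),
                              evaluate(node.right, inputs))


# amount = qty * price + tax
amount = Apply("+", Apply("*", Input("qty"), Input("price")), Input("tax"))
result = evaluate(amount, {"qty": 4, "price": 5, "tax": 2})
print(result)
```

The same tree can be evaluated for any set of inputs, and the two operands of each node could be computed in either order without changing the result.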

1.8.2 Comparison of Various Data Models


The most commonly used data models are compared on the basis of various properties. The
comparison table is given below.
Property                  Hierarchical       Network        Relational               E-R Diagram           Object-oriented
1. Data element           Files, records     Files, records Tables/tuples            Objects, entity sets  Objects
   organization
2. Identity               Record based       Record based   Value based              Value based           Record based
3. Data independence      Yes                Yes            Yes                      Yes                   Yes
4. Relationship           Logical proximity  Intersecting   Identifiers of rows in   Relational extenders  Logical
   organization           in a linearised    networks       one table are embedded   that support          containment
                          tree                              as attribute values in   specialized
                                                            another table            applications
5. Access language        Procedural         Procedural     Non-procedural           Non-procedural        Procedural
6. Structural             No                 No             Yes                      Yes                   Yes
   independence
1.8.3 Which Data Models to Use?
So far we have discussed a large number of data models. Data models are essential as they
provide access techniques and data structures for defining a DBMS. In other words, a data
model describes the logical organization of data along with the operations that manipulate it.
Of the large number of data models available, the one which is best for an organization
depends upon the following factors:
– The size of the database.
– The costs involved.
– The volume of daily transactions that will be done.
– The estimated number of queries that the organization will make against the database
to enquire about the data.
– The data requirements of the organization using it.
From the available record based data models, the relational data model is the one most
commonly used by organizations, for the following reasons:
1. It increases the productivity of application programmers in designing the database.
Whenever changes are made to the database there is no need to change the
application programs, because the logical level is separated from the physical level.
2. It is useful for representing most of the real world objects and relationships between
them.
3. It provides very powerful search, selection and maintenance of data.
4. It hides the physical level details from the end users so end users are not bothered
by physical storage.
5. It provides data integrity and security so that data is not accessed by unauthorized
users and data is always accurate.
6. It provides adhoc query capability.
Some of the common DBMS using Relational model are MS-Access, Informix, Ingres,
Oracle etc.
The hierarchical data model is used in organizations whose databases consist of a large
number of one-to-many relationships. Because of its restriction to one-to-many relationships,
the complexity of its tree structure diagrams and its lack of declarative querying facilities, the
hierarchical model has lost its importance.
The network data model is used in organizations whose databases consist of a large
number of many-to-many relationships, but because of its complex nature it is also not
preferred.
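The difference between the two models can be sketched in a few lines of Python. This is only an
illustration under assumed data: the department, employee, student and course names are
hypothetical, and plain dictionaries and lists stand in for the record structures of a real
hierarchical or network DBMS.

```python
# Hierarchical model: every child record has exactly one parent, so the data forms a tree.
department_tree = {
    "Sales": ["Asha", "Ravi"],   # one department owns many employees
    "Accounts": ["Meena"],
}

def parent_of(employee):
    """Navigate from a child record to its single parent by walking the tree."""
    for dept, employees in department_tree.items():
        if employee in employees:
            return dept
    return None

# Network model: a record may have several owners, so many-to-many links are allowed.
# Each (student, course) pair is stored as an explicit link record.
enrolments = [("Asha", "DBMS"), ("Asha", "Networks"), ("Ravi", "DBMS")]

def courses_of(student):
    """Follow the links from one student to all of that student's courses."""
    return sorted(course for s, course in enrolments if s == student)

def students_of(course):
    """Follow the same links in the other direction."""
    return sorted(student for student, c in enrolments if c == course)

print(parent_of("Ravi"), courses_of("Asha"), students_of("DBMS"))
```

In both cases the program has to navigate the structure record by record, which is the kind of
procedural access that declarative querying in the relational model avoids.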
Many current DBMSs also support object-oriented data modelling techniques, which are used by
a large number of organizations. For example, the latest versions of Oracle are object-relational
hybrids because they support both relational and object-oriented features.

1.9 Types of Database Systems

Database systems can be classified in three ways:


(i) According to the number of users
(ii) According to the type of use
(iii) According to database site locations
The various types of database systems are as follows:

1.9.1 According to the Number of Users


According to the number of users, the database systems can be further subdivided into two
categories, namely:
(a) Single-user database systems
(b) Multiuser database systems.
(a) Single-user database systems : In a single-user database system, the database resides
on the hard disk of a PC. All the applications run on the same PC and directly access the
database. In single-user database systems, the application is the DBMS. A single user accesses
the applications, and the business rules are enforced in the applications running on the PC. A
single-user database system is shown in Figure 1.9. An example is dBASE files on a PC.
[Figure: a microcomputer or workstation containing the user interface, DBMS, operating system,
database files (storage) and communications.]
FIGURE 1.9. Single user database system.
(b) Multiuser database systems : In a multiuser database system, many PCs are connected
through a Local Area Network (LAN) and a file server stores a copy of the database files.
Each PC on the LAN is given a volume name on the file server. Applications run on each
PC that is connected to the LAN and access the same set of files on the file server. The
application is the DBMS, and each user runs a copy of the same application and accesses
the same files. The applications must handle concurrency control, and the business rules
are enforced in the application. An example is MS-Access or Oracle files on a file server.
A multiuser database system is shown in Figure 1.10.

[Figure: a LAN server (operating system, database files (storage), communications) connected
through a Local Area Network (LAN) to several PCs, each with its own user interface, DBMS,
operating system and communications.]
FIGURE 1.10. Multiuser database system.
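The statement that the applications must handle concurrency control can be made concrete with a
small sketch. It is an illustration only: the SharedFile class is a hypothetical stand-in for a
database file on the file server, the interleaving in the first function is written out by hand so
that the lost update is reproducible, and a threading lock stands in for whatever file or record
locking a real application would use.

```python
import threading

class SharedFile:
    """Hypothetical stand-in for a database file stored on the file server."""
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()

def unsafe_interleaving(shared):
    # Both PCs read the same old value before either writes, so one update is lost.
    read_a = shared.balance
    read_b = shared.balance
    shared.balance = read_a + 100   # PC A deposits 100
    shared.balance = read_b + 50    # PC B deposits 50 and overwrites A's write
    return shared.balance

def deposit(shared, amount):
    # Each application takes the lock so that its read-modify-write is not interleaved.
    with shared.lock:
        current = shared.balance
        shared.balance = current + amount

def safe_run(shared):
    threads = [threading.Thread(target=deposit, args=(shared, amt)) for amt in (100, 50)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.balance

lost = unsafe_interleaving(SharedFile(1000))   # 1050: the deposit of 100 has vanished
kept = safe_run(SharedFile(1000))              # 1150: both deposits are applied
print(lost, kept)
```

Because every copy of the application must apply the same locking discipline, a mistake in any
one copy can corrupt the shared files.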


Advantages of Multiuser Database System
A multiuser database system has many advantages. Some of them are as follows:
(i) Ability to share data among various users.
(ii) Cost of storage is now divided among various users.
(iii) Low cost since most components are now commodity items.

Disadvantages of Multiuser Database System


The major disadvantage of the multiuser database system is that it has a limited data sharing
ability i.e., only a few users can share the data at most.

1.9.2 According to the Type of Use


According to the type of use, the database systems can be further subdivided into three
categories, namely:
(a) Production or Transactional Database Systems
(b) Decision Support Database Systems
(c) Data Warehouses.
(a) Production or Transactional Database Systems : Production database systems
are used for managing the supply chain and for tracking the production of items in factories,
inventories of items in warehouses/stores and orders for items. Transactional database
systems are used for purchases on credit cards and the generation of monthly statements. They
are also used in banks for customer information, accounts, loans and banking transactions.
(b) Decision Support Database Systems : Decision support database systems are interactive,
computer-based systems that aid users in judgement and choice activities. They provide data
storage and retrieval but enhance the traditional information access and retrieval functions
with support for model building and model based reasoning. They support framing, modelling
and problem solving. Typical application areas of DSSs are management and planning in
business, health care, the military and any area in which management will encounter complex
decision situations. DSSs are generally used for strategic and tactical decisions faced by
upper-level management, i.e., decisions with a reasonably low frequency and high potential
consequences.
A database system serves as a databank for the DSS. It stores large quantities of data
that are relevant to the class of problems for which the DSS has been designed and provides
logical data structures with which the users interact. The database system is capable of
informing the user of the types of data that are available and how to gain access to them.
(c) Data Warehouses : A data warehouse is a relational database management system
(RDBMS) designed specifically for query and analysis rather than for transaction processing.
It can be loosely defined as any centralised data repository which can be queried for business
benefit.
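The contrast between transactional use and warehouse-style use can be sketched as follows. This
is a minimal illustration, assuming Python's built-in sqlite3 module; the account and txn tables,
the purchase function and the sample amounts are hypothetical. A real data warehouse would be a
separate repository loaded from the transactional systems, whereas here a single in-memory
database plays both roles for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
conn.execute("CREATE TABLE txn (account_id INTEGER, month TEXT, amount REAL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500.0), (2, 300.0)])
conn.commit()

def purchase(account_id, month, amount):
    """Transactional use: a short unit of work that fully succeeds or is rolled back."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, account_id)
        )
        conn.execute("INSERT INTO txn VALUES (?, ?, ?)", (account_id, month, amount))

purchase(1, "2024-01", 120.0)
purchase(1, "2024-02", 80.0)
purchase(2, "2024-01", 50.0)

# Warehouse-style use: a read-only query that summarises many rows at once.
monthly = conn.execute(
    "SELECT month, SUM(amount) FROM txn GROUP BY month ORDER BY month"
).fetchall()
balance_1 = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(monthly, balance_1)
```

The transactional calls touch one or two rows each and must be fast and reliable; the summary
query reads many rows and supports the kind of analysis a decision support system relies on.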

1.9.3 According to Database Site Locations


According to database site locations, database systems can be further subdivided into four
categories namely:
(a) Centralized database systems
(b) Parallel database systems
(c) Distributed database systems
(d) Client/Server database systems.
(a) Centralized database systems : A centralized database system consists of a single
processor together with its associated data storage devices and other peripherals. The database
files reside on a personal computer (small enterprise) or on a mainframe computer (large
enterprise). The applications are run on the same PC or mainframe computer. Multiple users
access the applications through simple terminals that have no processing power of their
own. The user interface consists of text-mode screens, and the business rules are enforced in the
applications running on the mainframe or PC. An example of a centralized database system
is a DB2 database and COBOL application programs running on an IBM 390.
The centralized database system is shown in Figure 1.11.
[Figure: a PC or mainframe host (user interface, DBMS, operating system, database files
(storage)) connected to four terminals, each consisting of a display and a keyboard.]
FIGURE 1.11. Centralized database system.

Advantages of Centralized Database System


A centralized database system has many advantages. Some of them are as follows:
(i) The control over applications and security is excellent.
(ii) The incremental cost per user is very low.
(iii) The centralized systems are highly reliable due to proven mainframe technology.
(iv) Many functions such as query, backup, update etc., are easier to accomplish.

Disadvantages of Centralized Database System


The various disadvantages of centralized database system are as follows:
(i) The users are not able to effectively manipulate data outside of standard applications.
(ii) The system is not able to effectively serve advanced user interfaces.
(iii) The failure of the central computer blocks every user from using the system until the
system comes back.
(iv) The communication costs from the terminals to the central computer are a matter of
concern.
(b) Parallel database systems : A parallel database system can be defined as a database
system implemented on a tightly coupled or a loosely coupled multiprocessor.
Parallel database systems link multiple smaller machines to achieve the same throughput as
a single larger machine, often with greater scalability and reliability than a single-processor
database system. Parallel database systems are used in applications that have to query
extremely large databases or process an extremely large number of transactions per
second. There are three main architectures for parallel database systems. These are:
(i) Shared memory architecture
(ii) Shared disk architecture
(iii) Shared nothing architecture.
More about these types is discussed in Chapter 12.
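As a preview of the shared nothing architecture, the following minimal sketch partitions a table
across nodes and lets each node work only on its own partition. It is an illustration under stated
assumptions: NUM_NODES, the order rows and the function names are hypothetical, and a thread
pool merely simulates independent processors (Python threads do not give true parallel speed-up
for this kind of computation).

```python
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 3  # hypothetical number of processors, each with its own memory and disk

def partition(rows, num_nodes):
    """Hash-partition rows by key so that every node owns a disjoint slice of the data."""
    nodes = [[] for _ in range(num_nodes)]
    for key, value in rows:
        nodes[key % num_nodes].append((key, value))
    return nodes

def local_sum(node_rows):
    """Work done independently on one node, touching only its own partition."""
    return sum(value for _, value in node_rows)

rows = [(order_id, order_id * 10) for order_id in range(1, 11)]  # made-up orders
nodes = partition(rows, NUM_NODES)

with ThreadPoolExecutor(max_workers=NUM_NODES) as pool:
    partial_sums = list(pool.map(local_sum, nodes))

total = sum(partial_sums)  # a coordinator combines the partial results
print(partial_sums, total)
```

Because no node reads another node's partition, nodes can be added without contention for
shared memory or disks, at the price of a final step that combines the partial results.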

Advantages of Parallel Database Systems


There are many advantages of parallel database systems. Some of these are as follows:
(i) They are very useful in applications where large databases have to be queried
or where an extremely large number of transactions per second has to be processed.
(ii) The response time is very low, i.e., individual requests are answered quickly.
(iii) The throughput is very high.
(iv) The input/output and processing speeds are very high.
(v) They have greater scalability and reliability than a single-processor system.

Disadvantages of Parallel Database Systems


The various disadvantages of parallel database systems are as follows:
(i) Start-up cost and start-up time adversely affect the overall speed-up.
(ii) Because processes executed in parallel share resources, a slowdown may result
after each new process is started, as it competes with existing processes for those resources.
(c) Distributed database systems : A distributed database system is a database system
in which the data is spread across a variety of different databases. These are managed by a
variety of DBMSs that run on various types of machines having different operating
systems. These machines are widely spread and are connected through communication
networks. Each machine can have its own data and applications, and can access data stored
on other machines. Thus, each machine acts as a server as well as a client.
A distributed database system is therefore a combination of logically interrelated databases
distributed over a computer network and the distributed database management system
(DDBMS). A distributed database system can be homogeneous or heterogeneous. A distributed
database system is shown in Figure 1.12.
[Figure: four client/server machines, each with its own database, connected to one another
through a network.]
FIGURE 1.12. Distributed database system.
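The idea that each machine holds its own data and acts as both server and client can be sketched
as follows. This is an illustration only: the Site class, the site names and the stored rows are
hypothetical, and a direct method call stands in for a request sent over the communication
network.

```python
class Site:
    """One machine in the distributed system: holds local data and can ask peers for the rest."""
    def __init__(self, name, local_rows):
        self.name = name
        self.local_rows = dict(local_rows)
        self.peers = []

    def serve(self, key):
        # Server role: answer a request from local storage only.
        return self.local_rows.get(key)

    def lookup(self, key):
        # Client role: try local data first, then ask each peer in turn.
        if key in self.local_rows:
            return self.local_rows[key], self.name
        for peer in self.peers:
            value = peer.serve(key)
            if value is not None:
                return value, peer.name
        return None, None

delhi = Site("Delhi", {101: "Asha"})
mumbai = Site("Mumbai", {202: "Ravi"})
delhi.peers, mumbai.peers = [mumbai], [delhi]

print(delhi.lookup(101))   # answered locally
print(delhi.lookup(202))   # answered by the Mumbai site
```

The user at the Delhi site issues the same lookup in both cases and does not need to know where
the data is actually stored.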


Advantages of Distributed Database Systems
The various advantages of distributed database systems are as follows:
1. Improved sharing ability
2. Local autonomy
3. Availability
4. Reliability
5. Improved performance
6. Easier expansion
7. Reduced communications overhead and better response time
8. More economical
9. Direct user interaction
10. No single point of failure
11. Processor independence.

Disadvantages of Distributed Database Systems


The various disadvantages of distributed database systems are as follows:
1. Architectural complexity
2. Lack of standards
3. Lack of professional support
4. Data integrity problems
5. Problem of security
6. High cost
7. Complex database design.
(d) Client/Server Database System : With the development of technology, hardware
became cheaper and cheaper and more personal computers came into use. As a result,
enterprises started to use client-server technology instead of centralized systems. In client-
server technology, there is a server which acts as the whole database management system
and some clients or personal computers which are connected with the server through a network
interface. The complete architecture is shown in Figure 1.13.

Components of Client-Server Architecture


There are three major components of client server architecture:
1. Server
2. Client
3. Network interface

[Figure: four clients connected through a network to a single server.]
FIGURE 1.13. Client server database system.

1. Server : The server is the DBMS itself. It consists of the DBMS and supports all basic DBMS
functions. The server components of the DBMS are installed at the server. It acts as a monitor of
all its clients and distributes the workload to other computers. Clients must follow the
instructions of their server.
Functions of Server : The server performs various functions, which are as follows.
1. It supports all basic DBMS functions.
2. It monitors all its clients.
3. It distributes the workload over clients.
4. It solves problems which are not solved by clients.
5. It maintains security and privacy.
6. It prevents unauthorized access to data.
2. Clients : A client machine is a personal computer or workstation which provides services
to both the server and the users. It must follow the instructions of its server. The client
components of the DBMS are installed at the client site. Clients take instructions from the server
and help it by taking over part of its load. When a user wants to execute a query on a client, the
client first takes the data from the server, then executes the query on its own hardware and
returns the result to the server. As a result, the server is free to run more complex applications.
3. Network Interface : Clients are connected to the server by a network interface. It connects
the server interface with the user interface so that the server can run its applications
over its clients.
In the client-server architecture, there can be more than one server. Sometimes one server is
used as a database server, another as an application server, another as a backup server, etc.
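The division of work between server and client described above can be sketched in a few lines.
This is a simplified illustration, not a real client/server DBMS: the Server and Client classes, the
item table and the user names are hypothetical, Python's built-in sqlite3 module stands in for the
DBMS at the server, and a direct object reference replaces the network interface.

```python
import sqlite3

class Server:
    """Holds the DBMS and prevents unauthorized access (a simplified stand-in)."""
    def __init__(self, authorised_users):
        self.authorised_users = set(authorised_users)
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE item (name TEXT, qty INTEGER)")
        self.db.executemany("INSERT INTO item VALUES (?, ?)", [("pen", 40), ("ink", 5)])

    def handle(self, user, sql, params=()):
        if user not in self.authorised_users:
            raise PermissionError(f"{user} is not authorised")
        return self.db.execute(sql, params).fetchall()

class Client:
    """Runs on the user's PC; fetches data from the server and processes it locally."""
    def __init__(self, user, server):
        self.user = user
        self.server = server  # a direct reference replaces the network interface here

    def low_stock(self, threshold):
        rows = self.server.handle(self.user, "SELECT name, qty FROM item ORDER BY name")
        # Filtering on the client's own hardware takes this part of the load off the server.
        return [name for name, qty in rows if qty < threshold]

server = Server(authorised_users=["asha"])
result = Client("asha", server).low_stock(10)
print(result)
```

A request from a user who is not in the authorised list raises PermissionError at the server,
which corresponds to server functions 5 and 6 in the list above.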
Advantages of Client-Server Database System
1. It increases the overall performance of the DBMS.
2. Load can be distributed among clients.
3. It provides better user interface.
4. It is used to develop highly complex applications.
5. Clients with different operating systems can be connected with each other.
6. Single copy of DBMS is shared.
7. It reduces cost.
Disadvantages of Client-Server Database System
1. The network is error prone.
2. It is a kind of centralized system. If the server crashes or fails, data may be lost.
3. Recovery is difficult, and handling concurrency control places an additional burden on
the DBMS server.
4. Programming cost is high.
5. The implementation is more complex, since one needs to deal with the middleware
and the network.

1.10 Comparison between Client/Server and Distributed Database System

   Client/Server Database System               Distributed Database System

1. Different platforms are often            1. Different platforms can be managed
   difficult to manage.                        easily.
2. The application is usually distributed   2. The application is distributed across
   across clients.                             sites.
3. The whole system comes to a halt if      3. Failure of one site doesn’t bring the
   the server crashes.                         entire system down, as the system may
                                               be able to reroute that site’s requests
                                               to another site.
4. Maintenance cost is low.                 4. Maintenance cost is much higher.
5. Access to data can be easily             5. Not only does access to the replicated
   controlled.                                 data have to be controlled at multiple
                                               locations, but the network also has to
                                               be made secure.
6. New sites cannot be added easily.        6. New sites can be added with little or
                                               no problem.
7. Speed of database access is good.        7. Speed of database access is much better.
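Point 3 of the comparison can be illustrated with a minimal sketch of rerouting. It is only an
illustration under assumed names: the Replica class, the SiteDown exception and the sample data
are hypothetical, and the sketch assumes that every site holds a copy of the requested data.

```python
class SiteDown(Exception):
    """Raised when a request reaches a site that has failed."""

class Replica:
    def __init__(self, name, data, up=True):
        self.name, self.data, self.up = name, dict(data), up

    def read(self, key):
        if not self.up:
            raise SiteDown(self.name)
        return self.data[key]

def client_server_read(server, key):
    # Single server: if it is down, the request simply fails.
    return server.read(key)

def distributed_read(sites, key):
    # Try each site holding a copy and reroute to the next one on failure.
    for site in sites:
        try:
            return site.read(key), site.name
        except SiteDown:
            continue
    raise SiteDown("all sites")

sites = [Replica("Site A", {1: "order-1"}, up=False), Replica("Site B", {1: "order-1"})]
value, served_by = distributed_read(sites, 1)
print(value, served_by)
```

With Site A down, the distributed read is still answered by Site B, whereas the same request
sent to Site A as a lone server raises SiteDown.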

Test Your Knowledge

True/False
1. A database actually consists of three parts: information, the logical structure of that information,
and tables.
2. A data dictionary, or relation, is a two-dimensional table used to store data within a relational
database.
3. A database management system (DBMS) allows you to specify the logical organization for a
database and access and use the information within a database.
4. A physical view represents how the users view the data.
5. A database may have numerous physical views.
6. Fixed length record sometimes wastes space while variable length record does not waste space.
7. A database is any collection of related data.
8. A DBMS is a software system to facilitate the creation and maintenance of a computerized
database.
9. End-users can be categorized into casual, designer, or parametric users.
