DBMS (R20) UNIT - 1-1
UNIT I: Introduction
Syllabus:
Introduction: Database system, Characteristics (Database Vs File System), Database Users
(Actors on Scene, Workers behind the scene), Advantages of Database systems, Database
applications. Brief introduction of different Data Models; Concepts of Schema, Instance and
data independence; Three tier schema architecture for data independence; Database system
structure, environment, Centralized and Client Server architecture for the database.
Objectives:
After studying this unit, you will be able to:
Define database management system
Explain database system applications
State the characteristics of the database approach
Understand different data models
Discuss the advantages and disadvantages of database
Discuss the database architecture
DATABASE MANAGEMENT SYSTEMS UNIT – I : INTRODUCTION
Introduction
Information storage and retrieval have become very important in our day-to-day life. The
old era of manual systems is over in most places. For example, database systems may be used
to book airline tickets or to deposit money in the bank. A database system automates most of
these operations. A very good example is the billing system used for items purchased in a
supermarket, which runs with the help of a database application package. Inventory systems
used in a drug store or in a manufacturing industry are further examples of database
applications. Many similar examples could be added to this list.
Apart from these traditional database systems, more sophisticated database systems are used
on the Internet, where a large amount of information is stored and retrieved with efficient
search engines. For instance, http://www.google.com is a famous web site that enables users
to search for information on the net. A database can store anything from text data to very
complex data such as audio and video.
What is Data?
Raw facts are called data. The word “raw” indicates that they have not been processed.
Ex: 89 is data.
What is information?
Processed data is known as information. Ex: Marks: 89; with this context the number becomes information.
What is Knowledge?
1. Knowledge refers to the practical use of information.
2. Knowledge necessarily involves a personal experience.
DATA/INFORMATION PROCESSING
The process of converting data (raw facts) into meaningful information is called
data/information processing.
Data
Data is the raw material from which useful information is derived. The word data is the plural
of datum, but it is commonly used in both singular and plural forms. It is defined as raw facts
or observations. It takes a variety of forms, including numeric data, text, voice and images.
Data is a collection of facts that is unorganized but can be organized into useful
information. The terms data and information come up in our daily life and are often used
interchangeably.
Example: Weights, prices, costs, number of items sold etc.
Information
Information is data that have been processed in such a way as to increase the knowledge of the
person who uses them. The terms data and information are closely related: data are raw material
resources that are processed into finished information products.
In practice, a database today may contain either data or information.
Data Processing
The process of converting the facts into meaningful information is known as data processing.
Data processing is also known as information processing.
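A minimal sketch of data processing in Python: the raw marks below are data, and the summary computed from them is information. The marks, the subject label and the function name process are made-up values for illustration.

```python
# Raw facts (data): numbers with no context attached.
raw_marks = [89, 72, 95, 64]


def process(marks, subject):
    """Turn raw marks into a small, meaningful summary (information)."""
    return {
        "subject": subject,
        "students": len(marks),
        "highest": max(marks),
        "average": sum(marks) / len(marks),
    }


report = process(raw_marks, "DBMS")  # "DBMS" is an illustrative label
print(report)
```

The same numbers could be processed in other ways (grades, ranks, totals); what makes the result information is that it carries context the reader can act on.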
Metadata
Metadata are data that describe the properties or characteristics of other data.
Data become useful only when placed in some context. The primary mechanism for
providing context for data is metadata. Some of these properties include data definitions,
data structures and rules or constraints. Metadata describe the properties of data but do not
include the data themselves.
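The difference between data and metadata can be seen in a small sketch that uses Python's standard sqlite3 module. The marks table and its columns are assumptions chosen for illustration; SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marks (roll_no INTEGER PRIMARY KEY, score INTEGER NOT NULL)")
conn.execute("INSERT INTO marks VALUES (1, 89)")

# Data: the stored rows.
data = conn.execute("SELECT roll_no, score FROM marks").fetchall()

# Metadata: column names, types and constraints, but not the rows themselves.
# PRAGMA table_info returns (cid, name, type, notnull, default, pk) per column.
metadata = [
    (name, col_type, bool(notnull))
    for _cid, name, col_type, notnull, _default, _pk
    in conn.execute("PRAGMA table_info(marks)")
]

print(data)
print(metadata)
```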
What is RDBMS?
• RDBMS stands for Relational Database Management Systems.
Many widely used database management systems, such as MS SQL Server, IBM DB2, Oracle, MySQL
and Microsoft Access, are based on the relational model and use SQL as their query language.
• RDBMS applications store data in a tabular form.
It is a computerized system whose overall purpose is to store information and to allow users to
retrieve and update that information on demand.
Dr.B.Srinivas, Assoc.Prof., CSE(AIML&DS), ACET, Surampalem
• Users of the system can perform a variety of operations involving such files, for example:
Adding new files to the database
Inserting data into existing files
Retrieving data from existing files
Deleting data from existing files
Changing data in existing files
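The operations listed above can be sketched with Python's standard sqlite3 module, where a "file" corresponds to a table. The items table and its rows are hypothetical and serve only as an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Adding a new "file" (table) to the database
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price REAL)")

# Inserting data into the existing table
conn.executemany(
    "INSERT INTO items (name, price) VALUES (?, ?)",
    [("soap", 30.0), ("rice", 55.5), ("salt", 12.0)],
)

# Changing data in the existing table
conn.execute("UPDATE items SET price = ? WHERE name = ?", (60.0, "rice"))

# Deleting data from the existing table
conn.execute("DELETE FROM items WHERE name = ?", ("salt",))

# Retrieving data from the existing table
rows = conn.execute("SELECT name, price FROM items ORDER BY id").fetchall()
print(rows)
```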
1980s:
Research relational prototypes evolve into commercial systems
SQL becomes industry standard
Parallel and distributed database systems
Object-oriented database systems
1990s:
Large decision support and data-mining applications
Large multi-terabyte data warehouses
Emergence of Web commerce
2000s:
XML and XQuery standards
Automated database administration
Increasing use of highly parallel database systems
Web-scale distributed data storage systems
File System:
The file system is basically a way of arranging the files in a storage medium like a hard disk. The file
system organizes the files and helps in the retrieval of files when they are required. File systems
consist of different files which are grouped into directories. The directories further contain other
folders and files. The file system performs basic operations like management, file naming, giving
access rules, etc.
Example: NTFS(New Technology File System), EXT(Extended File System).
In earlier days, databases were built directly on top of file systems. This approach has many
disadvantages:
1. Primary memory is not large enough to hold large data sets. When data is kept on other
storage devices such as disks and tapes, the relevant data must be brought into main memory
for processing, which adds to the cost of each operation. Addressing very large data sets
directly is also a problem under a 32-bit or 64-bit addressing mechanism.
2. Programs must be written to answer each user request against the data stored in files, and
these programs are complex because of the large volume of data to be searched.
3. Data can become inconsistent, and providing concurrent access is complex.
4. Not sufficiently flexible to enforce security policies in which different users have permission to
access different subsets of the data.
A DBMS is a piece of software that is designed to make the preceding tasks easier. By storing
data in a DBMS, rather than as a collection of operating system Files, we can use the DBMS's
features to manage the data in a robust and efficient manner.
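A rough sketch of the contrast, assuming a small CSV text as a stand-in for an operating system file. With the file, the program itself must parse and scan every record; with a DBMS, the program states what it wants and the DBMS decides how to find it. All names and values here are hypothetical.

```python
import csv
import io
import sqlite3

# Stand-in for a data file on disk (contents are illustrative).
FILE_TEXT = "sid,name,gpa\n1,Asha,8.1\n2,Ravi,6.9\n3,Meena,9.0\n"


def names_above_from_file(min_gpa):
    """File approach: the program parses and scans every record itself."""
    reader = csv.DictReader(io.StringIO(FILE_TEXT))
    return [row["name"] for row in reader if float(row["gpa"]) > min_gpa]


def names_above_from_dbms(conn, min_gpa):
    """DBMS approach: a declarative query; the DBMS chooses the access path."""
    cur = conn.execute(
        "SELECT name FROM students WHERE gpa > ? ORDER BY sid", (min_gpa,)
    )
    return [name for (name,) in cur]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (sid INTEGER PRIMARY KEY, name TEXT, gpa REAL)")
records = [
    (int(r["sid"]), r["name"], float(r["gpa"]))
    for r in csv.DictReader(io.StringIO(FILE_TEXT))
]
conn.executemany("INSERT INTO students VALUES (?, ?, ?)", records)

print(names_above_from_file(8.0))
print(names_above_from_dbms(conn, 8.0))
```

Both functions return the same names here; the difference is that every new kind of request against the file needs new scanning code, while the DBMS version only needs a different query.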
Database Management System is basically software that manages the collection of related data. It is
used for storing data and retrieving the data effectively when it is needed. It also provides proper
security measures for protecting the data from unauthorized access. In Database Management
System the data can be fetched by SQL queries and relational algebra. It also provides mechanisms
for data recovery and data backup.
The following are the major advantages of using a Database Management System (DBMS):
Data independence: Application programs should be as independent as possible from details
of data representation and storage. The DBMS can provide an abstract view of the data to
insulate application code from such details.
Efficient data access: A DBMS utilizes a variety of sophisticated techniques to store and
retrieve data efficiently. This feature is especially important if the data is stored on external
storage devices.
Data integrity and security: The DBMS can enforce integrity constraints on the data. The
DBMS can enforce access controls that govern what data is visible to different classes of users.
Data administration: When several users share the data, centralizing the administration of data
can offer significant improvements. It can be used for organizing the data representation to
minimize redundancy and for fine-tuning the storage of the data to make retrieval efficient.
Concurrent access and crash recovery: A DBMS schedules concurrent accesses to the data in
such a manner that users can think of the data as being accessed by only one user at a time.
Further, the DBMS protects users from the effects of system failures.
Reduced application development time: Clearly, the DBMS supports many important
functions that are common to many applications accessing data stored in the DBMS.
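As one illustration of the integrity point above, the sketch below declares constraints once in the schema and lets the DBMS reject bad rows, instead of repeating the checks in every application. It uses Python's standard sqlite3 module; the accounts table, its CHECK rule and the helper try_insert are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE accounts (
        acc_no  INTEGER PRIMARY KEY,
        holder  TEXT NOT NULL,
        balance REAL NOT NULL CHECK (balance >= 0)
    )
    """
)
conn.execute("INSERT INTO accounts VALUES (1, 'Asha', 500.0)")


def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO accounts VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted_valid = try_insert((2, "Ravi", 100.0))       # satisfies every constraint
accepted_negative = try_insert((3, "Meena", -50.0))   # violates the CHECK rule
accepted_duplicate = try_insert((1, "Kiran", 10.0))   # repeats an existing key
row_count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
print(accepted_valid, accepted_negative, accepted_duplicate, row_count)
```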
Backup and recovery operations are complex in a DBMS environment, and the complexity increases
in a concurrent multi-user database system. A database system also requires a certain amount of
controlled redundancy and duplication to enable access to related data items.
People who work with a database can be categorized as database users or database
administrators.
• There are four different types of database-system users, differentiated by the way they expect to
interact with the system. Different types of user interfaces have been designed for the different
types of users.
(i) Naïve users:
o Unsophisticated users who interact with the system by invoking one of the application
programs that have been written previously.
For example, a clerk in the university who needs to add a new instructor to department A
invokes a program called new_hire.
o The typical user interface for naïve users is a forms interface, where the user can fill in
appropriate fields of the form. Naïve users may also simply read reports generated from the
database.
(ii) Application programmers
o Computer professionals who write application programs.
o Rapid application development (RAD) tools are tools that enable an application programmer to
construct forms and reports with minimal programming effort.
(iii) Sophisticated users
o Interact with the system without writing programs.
o They form their requests either using a database query language or by using tools such as
data analysis software.
Ex: Analysts who submit queries to explore data in the database.
(iv) Specialized users
o Sophisticated users who write specialized database applications that do not fit into the
traditional data-processing framework. Among these applications are:
computer-aided design systems
knowledge-base and expert systems
systems that store data with complex data types (for example, graphics data and audio data)
environment-modeling systems
2. Database Administrator
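o One of the main reasons for using a DBMS is to have central control of both the data and the
programs that access it. A person who has such central control over the system is called a
database administrator (DBA).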
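The user roles described above can be sketched in Python with the standard sqlite3 module. The program name new_hire comes from the text; its signature, the instructor table and the sample names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")


def new_hire(name, dept):
    """Application program written in advance by an application programmer.

    A naive user only supplies the form fields; no SQL is visible to them.
    """
    cur = conn.execute(
        "INSERT INTO instructor (name, dept) VALUES (?, ?)", (name, dept)
    )
    conn.commit()
    return cur.lastrowid


# Naive user (for example a clerk): fills in a form, which calls the program.
new_hire("Lakshmi", "A")
new_hire("Kumar", "B")
new_hire("Anil", "A")

# Sophisticated user (for example an analyst): writes a query directly.
per_dept = conn.execute(
    "SELECT dept, COUNT(*) FROM instructor GROUP BY dept ORDER BY dept"
).fetchall()
print(per_dept)
```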
A schema is a description of a particular collection of data, using the given data model. The
relational model of data is the most widely used model today.
Main concept: relation, basically a table with rows and columns. Every relation has a schema,
which describes the columns, or fields.
Data Model is a collection of high-level data description constructs that hide many low-level
storage details. A DBMS allows a user to define the data to be stored in terms of a data model.
Most database management systems today are based on the relational data model. Relational
DBMS products include IBM’s DB2, Informix, Oracle, Sybase, Microsoft’s Access, FoxBase, Paradox,
Tandem and Teradata.
• Conceptual (high-level, semantic) data models: Provide concepts that are close to the way
many users perceive data (Also called entity-based or object-based data models).
• Physical (low-level, internal) data models: Provide concepts that describe details of how data
is stored in the computer.
• Implementation (representational) data models: Provide concepts that fall between the
above two.
Advantages:
Hierarchical model is simple to construct and operate on.
Corresponds to a number of naturally hierarchical domains, e.g., assemblies in
manufacturing, personnel organization in companies.
Language is simple; uses constructs like GET, GET UNIQUE, GET NEXT, GET NEXT WITHIN
PARENT, etc.
Disadvantages:
Navigational and procedural nature of processing.
Database is visualized as a linear arrangement of records.
Little scope for “query optimization”.
Supports only one-to-many relationships.
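A minimal sketch of hierarchical, navigational access in Python. It is not the syntax of any real hierarchical DBMS; the Record class, the path_to function and the sample organization are hypothetical. Each record has exactly one parent, and a record is reached by walking down from the root.

```python
class Record:
    """A record in a hierarchy: exactly one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def path_to(root, target):
    """Navigate from the root, one level at a time, until target is found."""
    if root.name == target:
        return [root.name]
    for child in root.children:  # roughly "get next within parent"
        sub_path = path_to(child, target)
        if sub_path:
            return [root.name] + sub_path
    return []


company = Record("Company")
sales = Record("Sales", company)
engineering = Record("Engineering", company)
Record("Asha", sales)
Record("Ravi", engineering)

print(path_to(company, "Ravi"))
```

The program, not the system, decides the order in which records are visited, which is why this style is called navigational and procedural.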
2. Network model:
This model is an extension of the hierarchical data model. In this model data elements are linked
by a graph structure. A parent-child relationship also exists in this model, but a child data
element can have more than one parent element or no parent at all.
Advantages:
Network model is able to model complex relationships and represents semantics of
add/delete on the relationships.
Can handle most situations for modeling using record types and relationship types.
Language is navigational; uses constructs like FIND, FIND member, FIND owner, FIND
NEXT within set, GET etc. Programmers can do optimal navigation through the database.
Disadvantages:
Navigational and procedural nature of processing.
Database contains a complex array of pointers that are expensive and difficult to update
when inserting and deleting.
Little scope for automated “query optimization”.
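The graph structure of the network model can be sketched in Python as follows. The Network class and the record names are hypothetical; the point is that one member record may have several owners, and that every insert or delete has to update links in both directions.

```python
from collections import defaultdict


class Network:
    """Records linked in a graph: a member record may have several owners."""

    def __init__(self):
        self.members = defaultdict(list)  # owner -> member records
        self.owners = defaultdict(list)   # member -> owner records

    def connect(self, owner, member):
        self.members[owner].append(member)
        self.owners[member].append(owner)

    def disconnect(self, owner, member):
        # Both directions must be updated, which is what makes link upkeep costly.
        self.members[owner].remove(member)
        self.owners[member].remove(owner)


db = Network()
db.connect("Course:DBMS", "Student:Asha")
db.connect("Club:Chess", "Student:Asha")  # a second parent for the same child
db.connect("Course:DBMS", "Student:Ravi")

print(db.owners["Student:Asha"])   # similar in spirit to "FIND owner"
print(db.members["Course:DBMS"])   # similar in spirit to "FIND member"

db.disconnect("Club:Chess", "Student:Asha")
print(db.owners["Student:Asha"])
```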
3. Relational model:
The relational model was invented by E. F. Codd at IBM in 1970. The relational model represents
data and relationships among data by a collection of tables, each of which has a number of rows
and columns. The relational data model is used widely around the world for data storage and
processing.
A relation is basically a table with rows and columns.
Every relation has a schema, which describes the columns, or fields.
Student information in a university database may be stored in a relation with the following
schema
Students (sid: string, name: string, login: string, age: integer, gpa: real)
The choice of relations, and the choice of fields for each relation, is not always obvious, and
the process of arriving at a good conceptual schema is called conceptual database design.
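The Students schema above can be written down and populated in a short sketch with Python's standard sqlite3 module. Mapping string to TEXT, integer to INTEGER and real to REAL is an assumption about how the types translate to SQLite, and the two rows are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: describes the columns (fields) of the Students relation.
conn.execute(
    """
    CREATE TABLE Students (
        sid   TEXT,
        name  TEXT,
        login TEXT,
        age   INTEGER,
        gpa   REAL
    )
    """
)

# Instance: the rows present at this moment (values are made up).
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@ee", 18, 3.2),
    ],
)

columns = [row[1] for row in conn.execute("PRAGMA table_info(Students)")]
instance = conn.execute("SELECT sid, name FROM Students ORDER BY sid").fetchall()
print(columns)
print(instance)
```

The schema stays the same as rows are added or removed; only the instance changes.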
A database system is partitioned into modules that deal with each of the responsibilities of the
overall system.
The functional components of a database system can be broadly divided into the storage manager
and the query processor components.
Also some data structures are required as part of the physical system implementation:
1. Data Files: The data files store the database itself.
2. Data Dictionary: It stores metadata about the structure of the database and is used heavily.
3. Indices: They provide fast access to data items that hold particular values.
4. Statistical Data: It stores statistical information about the data in the database. This
information is used by the query processor to select efficient ways to execute a query.
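These structures can be observed in SQLite through Python's standard sqlite3 module. This is a sketch of one particular system, not a description of every DBMS: sqlite_master is SQLite's catalog table, ANALYZE is its command for gathering statistics, and the students table and index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (sid INTEGER PRIMARY KEY, login TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO students (login, gpa) VALUES (?, ?)",
    [(f"user{i}", 5.0 + i % 5) for i in range(1000)],
)

# Index: fast access to rows that hold a particular login value.
conn.execute("CREATE INDEX idx_students_login ON students (login)")

# Statistical data: ANALYZE gathers statistics for the query planner.
conn.execute("ANALYZE")

# Data dictionary: SQLite keeps its catalog in the sqlite_master table.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master "
    "WHERE name NOT LIKE 'sqlite_%' ORDER BY name"
).fetchall()

# Ask the query processor how it would run a lookup.
# The last column of each plan row is a human-readable description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT gpa FROM students WHERE login = ?", ("user42",)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)

print(catalog)
print(plan_text)
```

The exact wording of the plan differs between SQLite versions, but it names the index when the query processor decides to use it.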
Constructing the database is the process of storing the data itself on some storage medium that is
controlled by the DBMS.
Manipulating a database includes such functions as querying the database to retrieve specific
data, updating the database to reflect changes in the mini world, and generating reports from the
data.
Sharing a database allows multiple users and programs to access the database concurrently.
Other important functions provided by the DBMS include protecting the database and
maintaining it over a long period of time.
Protection includes both system protection against hardware or software malfunction (or crashes),
and security protection against unauthorized or malicious access. A typical large database may
have a life cycle of many years, so the DBMS must be able to maintain the database system by
allowing the system to evolve as requirements change over time. We can call the database and
DBMS software together a database system.
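One way a DBMS protects data against failures is by treating a group of updates as a single all-or-nothing transaction. The sketch below uses Python's standard sqlite3 module and simulates a failure with an exception; it does not reproduce a real hardware crash. The accounts table and the transfer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(amount, fail_midway=False):
    """Move money from A to B as one all-or-nothing transaction."""
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = 'A'", (amount,)
        )
        if fail_midway:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = 'B'", (amount,)
        )
        conn.commit()
    except RuntimeError:
        conn.rollback()  # undo the partial work


def balances():
    return dict(conn.execute("SELECT name, balance FROM accounts"))


transfer(30, fail_midway=True)
after_failure = balances()
transfer(30)
after_success = balances()
print(after_failure)
print(after_success)
```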
The architecture of a DBMS can be seen as either single tier or multi-tier. The tiers are classified as
follows:
1- tier architecture
2- tier architecture
3- tier architecture (n-tier architecture)
2- tier architecture:
The two-tier architecture is based on the client-server model. Communication takes place
directly between client and server, with no intermediary between them. In the early stages of
client-server computing, this was called the two-tier computing model: the client was
considered the data capture and validation tier, and the server the data storage tier.
3- tier architecture:
A 3-tier architecture separates its tiers from each other based on the complexity of the users and
how they use the data present in the database. It is the most widely used architecture to design a
DBMS.
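A rough sketch of the three-tier separation, written as three layers of Python functions in one process. The tier names (data, application, presentation) follow common usage and are an assumption here, as are the students table and the pass rule. In a real deployment the tiers would normally run as separate programs, often on separate machines.

```python
import sqlite3

# Data tier: the DBMS and the stored data.
_db = sqlite3.connect(":memory:")
_db.execute("CREATE TABLE students (sid INTEGER PRIMARY KEY, name TEXT, gpa REAL)")
_db.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Asha", 8.1), (2, "Ravi", 6.9)],
)


def data_tier_fetch(sid):
    """Only this layer issues SQL."""
    return _db.execute(
        "SELECT name, gpa FROM students WHERE sid = ?", (sid,)
    ).fetchone()


# Application tier: business rules; the only layer that talks to the data tier.
def application_tier_result(sid):
    row = data_tier_fetch(sid)
    if row is None:
        return None
    name, gpa = row
    # The 7.0 threshold is a hypothetical business rule.
    return {"name": name, "status": "PASS" if gpa >= 7.0 else "REVIEW"}


# Presentation tier: what the end user sees; it knows nothing about SQL.
def presentation_tier_show(sid):
    result = application_tier_result(sid)
    if result is None:
        return "No such student"
    return f"{result['name']}: {result['status']}"


print(presentation_tier_show(1))
print(presentation_tier_show(2))
print(presentation_tier_show(3))
```

Because the presentation tier never touches the database directly, the storage details can change without affecting what the end user works with.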
The resources provided by specialized servers can be accessed by many client machines. The
client machines provide the user with the appropriate interfaces to utilize these servers, as well
A server is a machine that can provide services to the client machines, such as file access,
printing, archiving, or database access. In the general case, some machines install only client
software, others only server software, and still others may include both client and server
software. However, it is more common for client and server software to run on separate
machines.
In client/server architecture, the user interface programs and application programs can run
on the client side. When DBMS access is required, the program establishes a connection to the
DBMS (which is on the server side); once the connection is created, the client program can
communicate with the DBMS. A standard called Open Database Connectivity (ODBC) provides an
application programming interface (API), which allows client-side programs to call the DBMS,
as long as both client and server machines have the necessary software installed. Most DBMS
vendors provide ODBC drivers for their systems.