
1.1 Introduction To Databases: Database

The document provides an introduction to databases and database management systems. It discusses that [1] a database is a collection of related data organized in a specific way, [2] a DBMS allows for defining, constructing, manipulating, and sharing databases among users and applications, and [3] key characteristics of the database approach include self-describing nature, insulation between programs and data, support of multiple views, and sharing of data among multiple users. The document also outlines various roles involved in large database systems, such as database administrators, designers, and different types of end users.

Uploaded by

Harshitha

1.1 Introduction to Databases
Databases play a critical role in almost all areas where computers are used, including business,
electronic commerce, engineering, medicine, law, education, and library science.

Database:

A database is a collection of related data together with the way that data is organized (for example,
an indexed address book, a Microsoft Access or Excel file, or data structured according to the ER
model or the relational model).

Data

Data means known facts that can be recorded and that have implicit meaning.

A database has the following implicit properties:

a) A database represents some aspect of the real world, sometimes called the miniworld.
Changes to the miniworld are reflected in the database.

b) A database is a logically coherent collection of data with some inherent meaning.

c) A database is designed, built, and populated with data for a specific purpose.

For a database to be accurate and reliable at all times, it must be a true reflection of the miniworld
that it represents; therefore, changes must be reflected in the database as soon as possible.

A database can be of any size and complexity.

A database may be generated and maintained manually (for example, a library card catalog) or it may
be computerized (for example, records covering a large volume of books).

Database Management System (DBMS)

A database management system (DBMS) is a collection of programs that enables users to create and
maintain a database.

Definition:

The DBMS is a general-purpose software system that facilitates the processes of defining,
constructing, manipulating, and sharing databases among various users and applications.

Defining a database involves specifying the data types, structures, and constraints of the data to be
stored in the database.
Constructing the database is the process of storing the data on some storage medium that is controlled
by the DBMS.

Manipulating a database includes functions such as querying the database to retrieve specific data,
updating the database to reflect changes in the miniworld, and generating reports from the data.

Sharing a database allows multiple users and programs to access the database simultaneously.
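The defining, constructing, and manipulating steps above can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the `student` table, its columns, and the sample rows are assumptions made for the example, and SQLite stands in for a general-purpose DBMS.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained.
con = sqlite3.connect(":memory:")

# Defining: specify data types, structure, and constraints.
con.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " credits INTEGER NOT NULL CHECK (credits >= 0))"
)

# Constructing: store the data under the control of the DBMS.
con.executemany(
    "INSERT INTO student (id, name, credits) VALUES (?, ?, ?)",
    [(1, "Asha", 40), (2, "Ravi", 32)],
)
con.commit()

# Manipulating: update the database to reflect a change in the miniworld ...
con.execute("UPDATE student SET credits = credits + 4 WHERE id = ?", (2,))
con.commit()

# ... and query it to retrieve specific data.
rows = con.execute(
    "SELECT name, credits FROM student WHERE credits >= ? ORDER BY name", (36,)
).fetchall()
print(rows)
```

Sharing is not shown here, because a single in-memory connection has only one user.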

Application program:

An application program accesses the database by sending queries or requests for data to the DBMS.

Query

A query is used to retrieve some data from the database.

Transaction

A transaction may cause some data to be read and some data to be written into the database.
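The difference between a query and a transaction can be sketched as follows. The `account` table and the function names `get_balance` and `transfer` are hypothetical, chosen only for this illustration; the sketch uses Python's `sqlite3` module.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
con.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
con.commit()

def get_balance(con, account_id):
    """A query: it only reads data."""
    row = con.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]

def transfer(con, source, target, amount):
    """A transaction: it reads and writes; all changes commit or none do."""
    try:
        with con:  # commits on success, rolls back if an exception escapes
            if get_balance(con, source) < amount:
                raise ValueError("insufficient funds")
            con.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
            con.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
        return True
    except ValueError:
        return False

ok = transfer(con, 1, 2, 30)         # succeeds and is committed
rejected = transfer(con, 2, 1, 500)  # fails and leaves the data unchanged
balances = (get_balance(con, 1), get_balance(con, 2))
```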

The DBMS functionality also includes protecting the database and maintaining it over a long period of
time. Protection includes system protection against hardware or software malfunction (or crashes) and
security protection against unauthorized or malicious access. Maintenance means allowing the database
system to evolve as requirements change over time.

Database system

The database and the DBMS software together are called a database system.


Fig 1: A simplified database system environment.

1.2 Characteristics of the Database Approach

File system vs. Database

File system | DBMS
--- | ---
A collection of individual files accessed by application programs. | A collection of programs that enables users to create and maintain a database.
Each application is free to name data elements independently. | A single repository maintains data that is defined once and then accessed by various users.
Has more data redundancy. | Has less data redundancy.
Has less flexibility in accessing data. | Has more flexibility in accessing data.
Does not provide data consistency. | Provides data consistency.
Is less complex. | Is more complex.
Provides less security while accessing data. | Provides more security while accessing data.
Useful for storing and maintaining a small amount of data. | Useful for storing and maintaining a large amount of data.
May contain more unstructured or unrelated data. | Contains structured data with well-defined data formats.
Does not provide backup and recovery. | Provides backup and recovery.

The main characteristics of the database approach

The main characteristics of the database approach are:

A. Self-describing nature of a database system
B. Insulation between programs and data, and data abstraction
C. Support of multiple views of the data
D. Sharing of data and multiuser transaction processing

Self-describing nature of a database system

A fundamental characteristic of the database approach is that the database system contains not only
the database itself but also a complete definition or description of the database structure and
constraints.

Metadata (data about the data)

The DBMS catalog contains information such as the structure of each file, the type and storage format
of each data item, and various constraints on the data. This catalog information is called metadata.
It is used by the DBMS software and also by database users who need information about the database
structure.
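As a small illustration of a catalog, SQLite keeps its table definitions in a table named `sqlite_master` and reports column details through `PRAGMA table_info`. The `student` table below is an assumption made for the example; other DBMSs expose their catalogs under different names.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL, major TEXT)"
)

# sqlite_master is SQLite's catalog table: it stores the definition of every table.
tables = [
    row[0]
    for row in con.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, declared type, notnull flag, default value, primary-key position).
columns = [
    (row[1], row[2], bool(row[3]))
    for row in con.execute("PRAGMA table_info(student)")
]

print(tables)
print(columns)
```

The program learns the structure of the data by asking the DBMS, not by having it hard-coded.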

Insulation between Programs and Data, and Data Abstraction

In traditional file processing, the structure of data files is embedded in the application programs, so
any changes to the structure of a file may require changing all programs that access that file.

In a DBMS, the structure of data files is stored in the DBMS catalog, separately from the access
programs, so changes to the structure of the database do not require changing all programs that
access it. This is called program-data independence.
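A minimal sketch of program-data independence, assuming a hypothetical `student` table and a hypothetical application function `student_names`: the structure of the table changes, but the program that reads it does not.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.row_factory = sqlite3.Row  # lets the program refer to columns by name
con.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
con.execute("INSERT INTO student (id, name) VALUES (1, 'Asha')")

def student_names(con):
    """Application code: it names the columns it needs, nothing more."""
    return [row["name"] for row in con.execute("SELECT name FROM student ORDER BY id")]

before = student_names(con)

# The structure changes: a new data item is added to every student record.
con.execute("ALTER TABLE student ADD COLUMN birth_date TEXT")

after = student_names(con)  # the unchanged program still works
```

In a file-processing program that reads fixed-length records, the same change would typically require editing the record layout in every program that reads the file.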

In some types of database systems, such as object-oriented database systems, users can define
operations on data as part of the database definitions. An operation is specified in two parts:

1. Interface (the operation name and the data types of its arguments)

2. Implementation (the code part)

The implementation part can be changed without affecting the interface. This is called
program-operation independence.

The characteristic that allows program-data independence and program-operation independence is
called data abstraction.
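Program-operation independence can be sketched in plain Python. The `Student` class and the operation name `calculate_gpa` are hypothetical; the point is only that callers depend on the interface while the implementation behind it is replaced.

```python
from dataclasses import dataclass

@dataclass
class Student:
    name: str
    grade_points: list  # one entry per completed course

def gpa_v1(student: Student) -> float:
    """First implementation: explicit loop."""
    total = 0.0
    for points in student.grade_points:
        total += points
    return total / len(student.grade_points)

def gpa_v2(student: Student) -> float:
    """Replacement implementation: same interface, different code."""
    return sum(student.grade_points) / len(student.grade_points)

# Callers depend only on the interface: an operation name plus argument types.
calculate_gpa = gpa_v1
first = calculate_gpa(Student("Asha", [4.0, 3.0, 3.5]))

calculate_gpa = gpa_v2  # swap the implementation; the calling code is untouched
second = calculate_gpa(Student("Asha", [4.0, 3.0, 3.5]))
```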

Support of Multiple Views of the Data

A database has many users and each user may require a different perspective or view of the database.

A view may be a subset of the database or it may contain virtual data that is derived from the database
files but is not explicitly stored.
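Both kinds of view can be sketched with `CREATE VIEW` in SQLite. The `grade` table, the view names, and the sample data are assumptions made for this example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE grade (student TEXT, course TEXT, points REAL)")
con.executemany(
    "INSERT INTO grade VALUES (?, ?, ?)",
    [("Asha", "DB", 4.0), ("Asha", "OS", 3.0), ("Ravi", "DB", 2.0)],
)

# A view that is a subset of the database: only the rows for the DB course.
con.execute(
    "CREATE VIEW db_grades AS SELECT student, points FROM grade WHERE course = 'DB'"
)

# A view with virtual data: the average is derived on demand, never stored.
con.execute(
    "CREATE VIEW student_average AS "
    "SELECT student, AVG(points) AS average FROM grade GROUP BY student"
)

subset = con.execute("SELECT * FROM db_grades ORDER BY student").fetchall()
derived = con.execute("SELECT * FROM student_average ORDER BY student").fetchall()
```

A user who is given only `student_average` sees averages without ever seeing individual course grades.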

Sharing of Data and Multiuser Transaction Processing

A multiuser DBMS must allow multiple users to access the database at the same time.

A transaction is an executing program or process that includes one or more database accesses, such
as reading or updating of database records.

Each transaction is supposed to execute a logically correct database access, without interference from
other transactions.

A fundamental role of multiuser DBMS software is to ensure that concurrent transactions operate
correctly and efficiently.

The DBMS must include concurrency control software to ensure that several users trying to update
the same data do so in a controlled manner, so that the result of the updates is correct. An airline
reservation system, where several agents may try to book the same seat, is a typical example.
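One way an application can lean on the DBMS for a correct result in the reservation case is to put the availability check and the update into a single statement inside a transaction. The sketch below assumes a hypothetical `flight` table and `reserve_seat` function, and it runs the two requests one after the other on a single connection, so it illustrates the idea and is not a real concurrency test.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE flight (id TEXT PRIMARY KEY, seats_left INTEGER NOT NULL)")
con.execute("INSERT INTO flight VALUES ('AI101', 1)")
con.commit()

def reserve_seat(con, flight_id):
    """Reserve one seat; return True only if a seat was actually available."""
    with con:  # one transaction: commit on success, roll back on error
        cursor = con.execute(
            # The check and the decrement happen in one statement, so two
            # agents cannot both take the last seat.
            "UPDATE flight SET seats_left = seats_left - 1 "
            "WHERE id = ? AND seats_left > 0",
            (flight_id,),
        )
        return cursor.rowcount == 1

first_agent = reserve_seat(con, "AI101")
second_agent = reserve_seat(con, "AI101")
seats_left = con.execute(
    "SELECT seats_left FROM flight WHERE id = 'AI101'"
).fetchone()[0]
```

A program that instead read `seats_left`, checked it in application code, and wrote back the new value in separate steps could let two agents sell the same seat unless the DBMS isolates the two transactions.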

Actors on the Scene

For a small personal database, one person typically defines, constructs, and manipulates the
database, and there is no sharing.

In large organizations, many people are involved in the design, use, and maintenance of a large
database with hundreds of users; we call them the actors on the scene.

They are:

1. Database Administrators
2. Database Designers
3. End Users
   a) Casual end users
   b) Naive or parametric end users
   c) Sophisticated end users
   d) Standalone users
4. System Analysts and Application Programmers (Software Engineers)

1. Database Administrators:

In a database environment, the primary resource is the database itself, and the secondary resource is
the DBMS and related software. Administering these resources is the responsibility of the database
administrator (DBA).

The DBA is responsible for authorizing access to the database, coordinating and monitoring its use,
and acquiring software and hardware resources as needed.

2. Database Designers:

Database designers are responsible for identifying the data to be stored in the database and for
choosing appropriate structures to represent and store this data.

Database designers must communicate with all prospective database users in order to understand
their requirements and to create a design that meets these requirements.

3. End Users

End users are the people whose jobs require access to the database for querying, updating, and
generating reports.

There are several categories of end users:

3.1 Casual end users

Casual end users occasionally access the database, but they may need different information each
time. They are typically middle- or high-level managers.

3.2 Naive or parametric end users

Naive or parametric end users make up a sizable portion of database end users.

Their main job function is constantly querying and updating the database, using standard types of
queries and updates.

For example:

• Bank tellers check account balances and post withdrawals and deposits.
• Reservation agents for airlines, hotels, and car rental companies check availability for a
  given request and make reservations.

3.3 Sophisticated end users

Sophisticated end users include engineers, scientists, business analysts, and others who thoroughly
familiarize themselves with the facilities of the DBMS in order to implement their own applications to
meet their complex requirements.

3.4 Standalone users

Standalone users maintain personal databases by using ready-made program packages (for example,
tax packages) that provide easy-to-use menu-based or graphics-based interfaces.

4. System Analysts and Application Programmers

System analysts determine the requirements of end users, especially naive or parametric end
users, and develop specifications for standard canned transactions that meet these requirements.

Application programmers (software engineers) implement these specifications as programs; then
they test, debug, document, and maintain these canned transactions.
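A canned transaction can be sketched as a function whose SQL is fixed in advance and whose only inputs are parameters supplied by the end user. The bank-teller functions `check_balance` and `post_deposit` and the `account` table are hypothetical names used for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
con.execute("INSERT INTO account VALUES (1, 100)")
con.commit()

def check_balance(con, account_id):
    """Canned query: the teller supplies only the account number."""
    row = con.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no account {account_id}")
    return row[0]

def post_deposit(con, account_id, amount):
    """Canned update: the teller supplies the account number and the amount."""
    if amount <= 0:
        raise ValueError("deposit must be positive")
    with con:  # commit on success, roll back on error
        con.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?",
            (amount, account_id),
        )
    return check_balance(con, account_id)

new_balance = post_deposit(con, 1, 25)
```

The end user never writes SQL; the `?` placeholders are the "parameters" that give parametric end users their name.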

Workers behind the Scene

Some people are associated with the design, development, and operation of the DBMS software and
system environment rather than with the database content itself. They are called workers behind
the scene.

They are:

1. DBMS system designers and implementers
2. Tool developers
3. Operators and maintenance personnel

1. DBMS system designers and implementers

DBMS system designers and implementers design and implement the DBMS modules and
interfaces as a software package.

These modules include:

• Implementing the catalog
• Query language processing
• Interface processing
• Accessing and buffering data
• Controlling concurrency
• Handling data recovery and security

2. Tool developers

Tool developers design and implement software packages that facilitate database modeling and
design, database system design, and performance improvement.

Tools are optional packages that are often purchased separately. They include packages for:

• Database design
• Performance monitoring
• Natural language or graphical interfaces
• Prototyping
• Simulation
• Test data generation

In many cases, independent software vendors develop and market these tools.

3. Operators and maintenance personnel

Operators and maintenance personnel (system administration personnel) are responsible for
the running and maintenance of the hardware and software environment for the database
system.

Advantages of Using the DBMS Approach

1) Controlling Redundancy
2) Restricting Unauthorized Access
3) Providing Persistent Storage for Program Objects
4) Providing Storage Structures and Search Techniques for Efficient Query Processing
5) Providing Backup and Recovery
6) Providing Multiple User Interfaces
7) Representing Complex Relationships among Data
8) Enforcing Integrity Constraints
9) Permitting Inferencing and Actions Using Rules

Controlling Redundancy

Redundancy means storing the same data multiple times.

If data is stored redundantly, several problems arise:

• Effort is duplicated.
• Storage space is wasted.
• Data may become inconsistent.

Ideally, a database design stores each logical data item, such as a student's name or birth date,
in only one place in the database. This is known as data normalization, and it ensures consistency
and saves storage space.

In some cases, controlled redundancy is deliberately used to improve the performance of queries.
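The inconsistency problem and the normalized alternative can be sketched side by side. All table names and sample values below are assumptions made for the illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Redundant design: the student's name is repeated in every enrollment row.
con.execute(
    "CREATE TABLE enrollment_flat (student_id INTEGER, student_name TEXT, course TEXT)"
)
con.executemany(
    "INSERT INTO enrollment_flat VALUES (?, ?, ?)",
    [(1, "Asha Rao", "DB"), (1, "Asha Rao", "OS")],
)
# An update that misses one copy leaves the data inconsistent.
con.execute("UPDATE enrollment_flat SET student_name = 'Asha Kumar' WHERE course = 'DB'")
names_flat = {
    row[0]
    for row in con.execute(
        "SELECT student_name FROM enrollment_flat WHERE student_id = 1"
    )
}

# Normalized design: the name is stored once and referenced by id.
con.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
con.execute(
    "CREATE TABLE enrollment (student_id INTEGER REFERENCES student(id), course TEXT)"
)
con.execute("INSERT INTO student VALUES (1, 'Asha Rao')")
con.executemany("INSERT INTO enrollment VALUES (?, ?)", [(1, "DB"), (1, "OS")])
con.execute("UPDATE student SET name = 'Asha Kumar' WHERE id = 1")
names_normalized = {
    row[0]
    for row in con.execute(
        "SELECT s.name FROM enrollment e JOIN student s ON s.id = e.student_id"
    )
}
```

After the updates, the flat table holds two different names for the same student, while the normalized design can only ever hold one.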

Restricting Unauthorized Access


When multiple users share a large database, it is likely that most users will not be authorized to access
all information in the database.

For example, financial data is often considered confidential, and only authorized persons are allowed
to access such data. In addition, some users may only be permitted to retrieve data.

A DBMS should provide a security and authorization subsystem that controls access to the database,
typically by requiring user names and passwords.
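As a rough sketch of a "retrieve only" restriction, Python's `sqlite3` module lets a program register an authorizer callback that SQLite consults before it performs each operation. This is a stand-in chosen because it runs without a server: multiuser DBMSs usually manage such restrictions through user accounts and privileges instead. The `salary` table and the callback name are assumptions for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE salary (employee TEXT, amount INTEGER)")
con.execute("INSERT INTO salary VALUES ('Asha', 50000)")
con.commit()

# Operations that change data; a retrieve-only user is denied all of them.
WRITE_ACTIONS = {sqlite3.SQLITE_INSERT, sqlite3.SQLITE_UPDATE, sqlite3.SQLITE_DELETE}

def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Called by SQLite for each operation a statement wants to perform."""
    return sqlite3.SQLITE_DENY if action in WRITE_ACTIONS else sqlite3.SQLITE_OK

con.set_authorizer(read_only_authorizer)

# Retrieval is still allowed.
rows = con.execute("SELECT employee, amount FROM salary").fetchall()

try:
    con.execute("UPDATE salary SET amount = 0")
    update_allowed = True
except sqlite3.Error:
    update_allowed = False  # rejected by the authorizer
```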

Providing Persistent Storage for Program Objects

Databases can be used to provide persistent storage for program objects and data structures. This
is one of the main features of object-oriented database systems.

The values of program variables or objects are discarded once a program terminates, unless the
programmer explicitly stores them in permanent files, which often involves converting these complex
structures into a format suitable for file storage. When the need arises to read this data once more,
the programmer must convert from the file format to the program variable or object structure.
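The manual conversion described above can be sketched with a small object saved to a JSON file and rebuilt from it. The `Student` class and the file name are hypothetical; JSON is just one possible file format.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass

@dataclass
class Student:
    name: str
    courses: list

original = Student("Asha", ["DB", "OS"])
path = os.path.join(tempfile.mkdtemp(), "student.json")

# Storing: convert the object into a format suitable for a file.
with open(path, "w", encoding="utf-8") as f:
    json.dump(asdict(original), f)

# Reading back: convert from the file format to the object structure.
with open(path, encoding="utf-8") as f:
    restored = Student(**json.load(f))
```

Both conversion steps are the programmer's responsibility here, and both must be revised whenever the class changes.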

Object-oriented database systems are compatible with programming languages such as C++ and
Java, and the DBMS software automatically performs any necessary conversions.

Impedance mismatch problem: in traditional database systems, the data structures provided by the
DBMS are often incompatible with the programming language's data structures.

Providing Storage Structures and Search Techniques for Efficient Query Processing

Because the database is generally stored on disk, the DBMS must provide specialized data structures
and search techniques to speed up disk search for the desired records.

Indexes are used for this purpose. Indexes are typically based on tree data structures or hash data
structures that are suitably modified for disk search.

A DBMS often has a buffering or caching module that maintains parts of the database in main memory
buffers.

The query processing and optimization module of the DBMS is responsible for choosing an efficient
query execution plan for each query based on the existing storage structures.
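The effect of an index on the chosen execution plan can be observed in SQLite with `EXPLAIN QUERY PLAN`. The table, the index name `idx_student_name`, and the data are assumptions for this sketch, and the exact wording of the plan text varies between SQLite versions, so the code only collects the plan descriptions as strings.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, major TEXT)")
con.executemany(
    "INSERT INTO student (name, major) VALUES (?, ?)",
    [(f"student{i}", "CS") for i in range(1000)],
)

QUERY = "SELECT * FROM student WHERE name = ?"

def plan(con):
    """Return the optimizer's plan for QUERY as one string."""
    rows = con.execute("EXPLAIN QUERY PLAN " + QUERY, ("student500",)).fetchall()
    return " ".join(row[-1] for row in rows)  # the last column is the description

plan_without_index = plan(con)  # typically a full scan of the table

con.execute("CREATE INDEX idx_student_name ON student (name)")
plan_with_index = plan(con)     # typically a search that uses idx_student_name

print(plan_without_index)
print(plan_with_index)
```

The query text is identical in both cases; only the storage structures available to the optimizer changed.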

Providing Backup and Recovery

A DBMS must provide facilities for recovering from hardware or software failures. The backup and
recovery subsystem of the DBMS is responsible for recovery.
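A very small backup-and-restore cycle can be sketched with the `backup` method of Python's `sqlite3` connections. This copies the whole database; it is an illustration of the idea only, and the recovery subsystem of a full DBMS generally does more than this (for example, restoring the state before an interrupted transaction). The `student` table is an assumption for the example.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
source.execute("INSERT INTO student VALUES (1, 'Asha')")
source.commit()

# Backup: copy the whole database into a second one.
backup = sqlite3.connect(":memory:")
source.backup(backup)

# Simulate a failure that destroys data in the working database.
source.execute("DELETE FROM student")
source.commit()
lost = source.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# Recovery: restore the working database from the backup copy.
backup.backup(source)
recovered = source.execute("SELECT id, name FROM student").fetchall()
```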

Providing Multiple User Interfaces

A DBMS should provide a variety of user interfaces. These include


a) Query languages for casual users,
b) Programming language interfaces for application programmers,
c) Forms and command codes for parametric users, and
d) Menu-driven interfaces and natural language interfaces for standalone users.

Representing Complex Relationships among Data

A DBMS must have the capability to represent a variety of complex relationships among the data, to
define new relationships as they arise, and to retrieve and update related data easily and efficiently.

Enforcing Integrity Constraints

A DBMS should provide capabilities for defining and enforcing integrity constraints.

The simplest type of integrity constraint involves specifying a data type for each data item.

Other constraints include:

a) Referential integrity constraint
b) Uniqueness constraint
c) Check constraint
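The three constraint types listed above, together with data types, can be declared in a table definition and are then enforced by the DBMS. The sketch below uses SQLite; the tables, columns, and the helper `is_rejected` are hypothetical names for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
con.execute("CREATE TABLE department (code TEXT PRIMARY KEY)")
con.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY,"
    " roll_no TEXT NOT NULL UNIQUE,"                    # uniqueness constraint
    " credits INTEGER NOT NULL CHECK (credits >= 0),"   # check constraint
    " dept TEXT NOT NULL REFERENCES department(code))"  # referential integrity
)
con.execute("INSERT INTO department VALUES ('CS')")
con.execute("INSERT INTO student (roll_no, credits, dept) VALUES ('R1', 20, 'CS')")

def is_rejected(sql):
    """Return True if the DBMS refuses the statement as a constraint violation."""
    try:
        con.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

violations = {
    # 'R1' is already taken.
    "uniqueness": is_rejected(
        "INSERT INTO student (roll_no, credits, dept) VALUES ('R1', 5, 'CS')"
    ),
    # Negative credits fail the CHECK condition.
    "check": is_rejected(
        "INSERT INTO student (roll_no, credits, dept) VALUES ('R2', -1, 'CS')"
    ),
    # Department 'EE' does not exist.
    "referential": is_rejected(
        "INSERT INTO student (roll_no, credits, dept) VALUES ('R3', 5, 'EE')"
    ),
}
```

Because the rules live in the database definition, every application that writes to `student` is subject to them.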

Permitting Inferencing and Actions Using Rules

Some database systems provide capabilities for defining deduction rules for inferencing new
information from the stored database facts. Such systems are called deductive database systems.

For example, a rule might state that a student is detained, that is, not allowed to enter the 3rd
year, if the student has not secured the required course credits.
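A rule of this kind can be sketched in plain Python as a function that derives new information from stored facts. The credit threshold, the student names, and the function name `detained` are all hypothetical; a deductive database system would express the rule in its own rule language.

```python
REQUIRED_CREDITS = 60  # hypothetical threshold for entering the third year

# Stored facts: the credits each student has secured so far.
credits_earned = {"Asha": 72, "Ravi": 48, "Meena": 60}

def detained(facts, required=REQUIRED_CREDITS):
    """Rule: a student is detained if credits earned < required credits."""
    return sorted(name for name, credits in facts.items() if credits < required)

detained_students = detained(credits_earned)
print(detained_students)
```

The list of detained students is never stored; it is inferred from the facts each time the rule is applied, so changing the threshold changes the answer without touching the data.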

A Brief History of Database Applications

Early Database Applications Using Hierarchical and Network Systems

Early database applications maintained records in large organizations such as corporations,
universities, hospitals, and banks. In many of these applications, there were large numbers of records
of similar structure.

These systems did not provide sufficient data abstraction and program-data independence capabilities.

Another shortcoming of early systems was that they provided only programming language interfaces.
This made it time-consuming and expensive to implement new queries and transactions.

Providing Data Abstraction and Application Flexibility with Relational Databases

Relational databases were originally proposed to separate the physical storage of data from its
conceptual representation and to provide a mathematical foundation for data representation and
querying.

The relational data model also introduced high-level query languages that provided an alternative to
programming language interfaces, making it much faster to write new queries.

Relational databases now exist on almost all types of computers, from small personal computers to
large servers.

Object-Oriented Applications and the Need for More Complex Databases

The emergence of object-oriented programming languages and the need to store and share complex,
structured objects led to the development of object-oriented databases (OODBs).

OODBs are mainly used in specialized applications, such as engineering design, multimedia publishing,
and manufacturing systems.

Interchanging Data on the Web for E-Commerce Using XML

eXtensible Markup Language (XML) combines concepts from the models used in document systems
with database modeling concepts.

XML is considered to be the primary standard for interchanging data among various types of
databases and Web pages.
