1.1 Introduction To Databases: Database
1 Introduction to Databases
Databases play a critical role in almost all areas where computers are used, including business,
electronic commerce, engineering, medicine, law, education, and library science.
Database:
A database is a collection of related data and the way it is organized (for example, an indexed
address book, Microsoft Access or Excel, the ER model, and the relational model).
Data
Data means known facts that can be recorded and that have implicit meaning.
a) A database represents some aspect of the real world, sometimes called the miniworld.
Changes to the miniworld are reflected in the database.
c) A database is designed, built, and populated with data for a specific purpose.
For a database to be accurate and reliable at all times, it must be a true reflection of the miniworld
that it represents; therefore, changes must be reflected in the database as soon as possible.
A database management system (DBMS) is a collection of programs that enables users to create and
maintain a database.
Definition:
The DBMS is a general-purpose software system that facilitates the processes of defining,
constructing, manipulating, and sharing databases among various users and applications.
Defining a database involves specifying the data types, structures, and constraints of the data to be
stored in the database.
Constructing the database is the process of storing the data on some storage medium that is controlled
by the DBMS.
Manipulating a database includes functions such as querying the database to retrieve specific data,
updating the database to reflect changes in the miniworld, and generating reports from the data.
Sharing a database allows multiple users and programs to access the database simultaneously.
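The defining, constructing, and manipulating steps above can be sketched with Python's built-in
sqlite3 module. This is only an illustration under stated assumptions: an in-memory SQLite database
stands in for the DBMS, and the table student and its columns are hypothetical names chosen for the
example. Sharing is not shown, because an in-memory database is visible to a single connection only.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the DBMS.
conn = sqlite3.connect(":memory:")

# Defining: specify data types, structures, and constraints.
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        year       INTEGER CHECK (year BETWEEN 1 AND 4)
    )
    """
)

# Constructing: store the data on a medium controlled by the DBMS.
conn.executemany(
    "INSERT INTO student (student_id, name, year) VALUES (?, ?, ?)",
    [(1, "Asha", 1), (2, "Ravi", 2)],
)
conn.commit()

# Manipulating: update the database to reflect a change in the miniworld ...
conn.execute("UPDATE student SET year = 2 WHERE student_id = 1")
conn.commit()

# ... and query it to retrieve specific data.
second_years = [
    row[0]
    for row in conn.execute("SELECT name FROM student WHERE year = 2 ORDER BY name")
]
print(second_years)
```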
Application program:
An application program accesses the database by sending queries or requests for data to the DBMS.
Query
Transaction
A transaction may cause some data to be read and some data to be written into the database.
The DBMS functionality includes protecting the database and maintaining it over a long period of time.
Protection includes system protection against hardware or software malfunction (or crashes) and
security protection against unauthorized or malicious access. Maintenance means allowing the
database system to evolve as requirements change over time.
Database system
A fundamental characteristic of the database approach is that the database system contains not only
the database itself but also a complete definition or description of the database structure and
constraints.
The DBMS catalog contains information such as the structure of each file, the type and storage format
of each data item, and various constraints on the data. This catalog information is called metadata.
It is used by the DBMS software and also by database users who need information about the
database structure.
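As a small illustration of a catalog, SQLite keeps its metadata in a table named sqlite_master and
exposes column descriptions through PRAGMA table_info. The sketch below reads both; the table
course and its columns are hypothetical, and other DBMSs organize their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (course_no TEXT PRIMARY KEY, title TEXT NOT NULL, credits INTEGER)"
)

# sqlite_master is SQLite's catalog: one row per table, index, view, or trigger.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(course)")]

print(table_names)
print(columns)
```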
In traditional file processing, the structure of data files is embedded in the application programs, so
any changes to the structure of a file may require changing all programs that access that file.
In a DBMS, the structure of data files is stored in the DBMS catalog separately from the access
programs, so a change to the structure of a file does not require changing all programs that access
it. This property is called program-data independence.
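Program-data independence can be seen in a small sqlite3 sketch. The function student_names plays
the role of an access program (a hypothetical name, as are the table and columns); it keeps working
unchanged after a new column is added to the table it reads.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Asha')")


def student_names(connection):
    """An 'access program' that asks the DBMS for the name column only."""
    return [
        row[0]
        for row in connection.execute("SELECT name FROM student ORDER BY student_id")
    ]


names_before = student_names(conn)

# The file structure changes: a new data item is added to every record.
conn.execute("ALTER TABLE student ADD COLUMN birth_date TEXT")

# The access program is not modified and returns the same result.
names_after = student_names(conn)
print(names_before, names_after)
```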
In some types of database systems, such as object-oriented database systems, users can define
operations on data as part of the database definitions. An operation is specified in two parts: its
interface (or signature) and its implementation (or method).
A database has many users and each user may require a different perspective or view of the database.
A view may be a subset of the database or it may contain virtual data that is derived from the database
files but is not explicitly stored.
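A view that contains derived (virtual) data can be sketched in sqlite3 as follows. The table
grade_report and the view student_credits are hypothetical names; the total is computed from the
stored rows each time the view is queried and is not stored itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE grade_report (student_id INTEGER, course_no TEXT, credits INTEGER);
    INSERT INTO grade_report VALUES (1, 'CS101', 4), (1, 'MA101', 3), (2, 'CS101', 4);

    -- The view stores no rows of its own; total_credits is derived on demand.
    CREATE VIEW student_credits AS
        SELECT student_id, SUM(credits) AS total_credits
        FROM grade_report
        GROUP BY student_id;
    """
)

totals = conn.execute(
    "SELECT student_id, total_credits FROM student_credits ORDER BY student_id"
).fetchall()
print(totals)
```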
A multiuser DBMS must allow multiple users to access the database at the same time.
A transaction is an executing program or process that includes one or more database accesses, such
as reading or updating of database records.
Each transaction is supposed to execute a logically correct database access, without interference from
other transactions.
A fundamental role of multiuser DBMS software is to ensure that concurrent transactions operate
correctly and efficiently.
The DBMS must include concurrency control software to ensure that several users trying to update
the same data do so in a controlled manner so that the result of the updates is correct (for example,
in an airline reservation system).
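The airline reservation case can be sketched as a single transaction in sqlite3. This is a minimal
illustration, not a full concurrency control scheme: it runs on one connection, and the tables
flight and booking and the function book_seat are hypothetical. The point is that the seat check and
the decrement happen in one statement inside a transaction, so the last seat cannot be sold twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flight (flight_no TEXT PRIMARY KEY, seats_left INTEGER NOT NULL)")
conn.execute("CREATE TABLE booking (flight_no TEXT, passenger TEXT)")
conn.execute("INSERT INTO flight VALUES ('AI101', 1)")
conn.commit()


def book_seat(connection, flight_no, passenger):
    """Reserve one seat; return True on success, False if the flight is full."""
    # 'with connection' commits if the block succeeds and rolls back on an exception.
    with connection:
        cur = connection.execute(
            "UPDATE flight SET seats_left = seats_left - 1 "
            "WHERE flight_no = ? AND seats_left > 0",
            (flight_no,),
        )
        if cur.rowcount == 0:  # no row qualified: no seat was available
            return False
        connection.execute("INSERT INTO booking VALUES (?, ?)", (flight_no, passenger))
        return True


first = book_seat(conn, "AI101", "Asha")
second = book_seat(conn, "AI101", "Ravi")
bookings = conn.execute("SELECT passenger FROM booking").fetchall()
print(first, second, bookings)
```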
For a small personal database, one person typically defines, constructs, and manipulates the
database, and there is no sharing.
In large organizations, many people are involved in the design, use, and maintenance of a large
database with hundreds of users; we call them the actors on the scene.
They are
1. Database Administrators
2. Database Designers
3. End Users
a) Casual end users
b) Naive or parametric end users
c) Sophisticated end users
d) Standalone users
4. System Analysts and Application Programmers (Software Engineers)
1. Database Administrators:
In a database environment, the primary resource is the database itself, and the secondary resource is
the DBMS and related software. Administering these resources is the responsibility of the database
administrator (DBA).
The DBA is responsible for authorizing access to the database, coordinating and monitoring its use,
and acquiring software and hardware resources as needed.
2. Database Designers:
Database designers are responsible for identifying the data to be stored in the database and for
choosing appropriate structures to represent and store this data.
Database designers must communicate with all prospective database users in order to understand
their requirements and to create a design that meets these requirements.
3. End Users
End users are the people whose jobs require access to the database for querying, updating, and
generating reports.
Casual end users occasionally access the database, but they may need different information each
time. They are typically middle- or high-level managers.
Naive or parametric end users make up a sizable portion of database end users.
Their main job function is constantly querying and updating the database, using standard types of
queries and updates.
For example
Bank tellers check account balances and post withdrawals and deposits.
Reservation agents for airlines, hotels, and car rental companies check availability for a
given request and make reservations.
3.3 Sophisticated end users
Sophisticated end users include engineers, scientists, business analysts, and others who thoroughly
familiarize themselves with the facilities of the DBMS in order to implement their own applications to
meet their complex requirements.
Standalone users maintain personal databases by using ready-made program packages (e.g., tax
packages) that provide easy-to-use menu-based or graphics-based interfaces.
System analysts determine the requirements of end users, especially naive and parametric end
users, and develop specifications for standard canned transactions that meet these requirements.
Some people are associated with the design, development, and operation of the DBMS software and
system environment but are not interested in the database content itself. They are called the
workers behind the scene.
They are
DBMS system designers and implementers design and implement the DBMS modules and
interfaces as a software package.
Tool developers design and implement software packages that help database modeling and
design, database system design, and improved performance.
In many cases, independent software vendors develop and market these tools.
Operators and maintenance personnel (system administration personnel) are responsible for
the running and maintenance of the hardware and software environment for the database
system.
1) Controlling Redundancy
2) Restricting Unauthorized Access
3) Providing Persistent Storage for Program Objects
4) Providing Storage Structures and Search Techniques for Efficient Query Processing
5) Providing Backup and Recovery
6) Providing Multiple User Interfaces
7) Representing Complex Relationships among Data
8) Enforcing Integrity Constraints
Controlling Redundancy
Duplication of effort
Storage space is wasted
Data may become inconsistent
We should have a database design that stores each logical data item, such as a student's name or
birth date, in only one place in the database. This is known as data normalization, and it ensures
consistency and saves storage space.
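A minimal sqlite3 sketch of this idea follows, with hypothetical tables student and grade_report.
The student's name is stored once, so correcting it takes one update, and every report that joins
the two tables sees the corrected value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- The name is stored in exactly one place.
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Grade records refer to the student by key instead of repeating the name.
    CREATE TABLE grade_report (
        student_id INTEGER REFERENCES student(student_id),
        course_no  TEXT,
        grade      TEXT
    );

    INSERT INTO student VALUES (1, 'Asha');
    INSERT INTO grade_report VALUES (1, 'CS101', 'A'), (1, 'MA101', 'B');
    """
)

# One update corrects the name everywhere it is reported.
conn.execute("UPDATE student SET name = 'Asha K.' WHERE student_id = 1")

report = conn.execute(
    """
    SELECT s.name, g.course_no, g.grade
    FROM grade_report AS g
    JOIN student AS s ON s.student_id = g.student_id
    ORDER BY g.course_no
    """
).fetchall()
print(report)
```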
For example, financial data is often considered confidential, and only authorized persons are allowed
to access such data. In addition, some users may be permitted only to retrieve data.
A DBMS should provide a security and authorization subsystem that controls access to the
database, for example by requiring user names and passwords.
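The retrieve-only restriction can be sketched with the authorizer hook of Python's sqlite3 module.
This is an illustration under assumptions: SQLite has no user accounts or passwords, so the sketch
only shows a session limited to retrieval, and the table salary and the function
read_only_authorizer are hypothetical names. Server DBMSs usually handle this through accounts and
granted privileges instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salary (employee TEXT, amount INTEGER)")
conn.execute("INSERT INTO salary VALUES ('Asha', 50000)")
conn.commit()

# Action codes that modify stored data.
WRITE_ACTIONS = {sqlite3.SQLITE_INSERT, sqlite3.SQLITE_UPDATE, sqlite3.SQLITE_DELETE}


def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Deny data modification; allow every other action."""
    if action in WRITE_ACTIONS:
        return sqlite3.SQLITE_DENY
    return sqlite3.SQLITE_OK


# From here on, this connection behaves like a retrieve-only user.
conn.set_authorizer(read_only_authorizer)

rows = conn.execute("SELECT employee, amount FROM salary").fetchall()

try:
    conn.execute("UPDATE salary SET amount = 60000 WHERE employee = 'Asha'")
    update_allowed = True
except sqlite3.DatabaseError:
    update_allowed = False

print(rows, update_allowed)
```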
Databases can be used to provide persistent storage for program objects and data structures. This
is one of the main features of object-oriented database systems.
The values of program variables or objects are discarded once a program terminates, unless the
programmer explicitly stores them in permanent files, which often involves converting these complex
structures into a format suitable for file storage. When the need arises to read this data once more,
the programmer must convert from the file format to the program variable or object structure.
Object-oriented database systems are compatible with programming languages such as C++ and
Java, and the DBMS software automatically performs any necessary conversions.
Traditional database systems often suffered from the impedance mismatch problem: the data
structures provided by the DBMS were incompatible with the programming language's data structures.
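The manual conversion described above can be sketched in Python. The class Student and the file
name are hypothetical, and JSON is just one possible file format; the sketch shows the two
conversions a programmer has to write when no DBMS does them automatically.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class Student:
    name: str
    courses: list


original = Student(name="Asha", courses=["CS101", "MA101"])
path = os.path.join(tempfile.mkdtemp(), "student.json")

# Program object -> file format (without this step the object is lost at exit).
with open(path, "w", encoding="utf-8") as f:
    json.dump(asdict(original), f)

# File format -> program object, as a later run of the program would do.
with open(path, encoding="utf-8") as f:
    restored = Student(**json.load(f))

print(restored == original)
```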
Because the database is generally stored on disk, the DBMS must provide specialized data
structures and search techniques to speed up disk search for the desired records.
Indexes are used for this purpose. Indexes are typically based on tree data structures or hash data
structures that are suitably modified for disk search.
A DBMS often has a buffering or caching module that maintains parts of the database in main
memory buffers.
The query processing and optimization module of the DBMS is responsible for choosing an efficient
query execution plan for each query based on the existing storage structures.
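The effect of an index on the chosen execution plan can be observed in SQLite, whose indexes are
B-trees. The sketch below is an illustration with hypothetical table and index names; it asks SQLite
to describe its plan for the same query before and after the index is created. The exact wording of
the plan text varies between SQLite versions, so only the index name is inspected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO student (name, year) VALUES (?, ?)",
    [(f"student{i}", 1 + i % 4) for i in range(1000)],
)

QUERY = "SELECT student_id, year FROM student WHERE name = ?"


def plan(connection):
    """Return the optimizer's plan description for QUERY as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("student42",)).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the plan text


plan_without_index = plan(conn)  # no index on name: the table is scanned

conn.execute("CREATE INDEX idx_student_name ON student (name)")
plan_with_index = plan(conn)  # the optimizer can now search via the index

print(plan_without_index)
print(plan_with_index)
```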
A DBMS must provide facilities for recovering from hardware or software failures. The backup and
recovery subsystem of the DBMS is responsible for recovery.
A DBMS must have the capability to represent a variety of complex relationships among the data, to
define new relationships as they arise, and to retrieve and update related data easily and efficiently.
A DBMS should provide capabilities for defining and enforcing integrity constraints.
The simplest type of integrity constraint involves specifying a data type for each data item.
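Integrity constraints can be sketched in sqlite3 as follows. The table and the helper try_insert
are hypothetical. One assumption is worth stating: by default SQLite does not strictly enforce a
column's declared data type, so the example adds a CHECK on typeof(year) to make the type
constraint explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        year       INTEGER NOT NULL
                   CHECK (typeof(year) = 'integer' AND year BETWEEN 1 AND 4)
    )
    """
)


def try_insert(name, year):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("INSERT INTO student (name, year) VALUES (?, ?)", (name, year))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("Asha", 2),          # valid row
    try_insert("Ravi", 7),          # year outside the allowed range
    try_insert(None, 1),            # name must not be NULL
    try_insert("Meena", "second"),  # year is not an integer
]
print(results)
```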
Some database systems provide capabilities for defining deduction rules for inferencing new
information from the stored database facts. Such systems are called deductive database systems.
For example, a rule may state that a student who does not secure the required course credits is
detained and cannot enter the 3rd year.
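The detention rule can be sketched in plain Python as a function that derives new information from
stored facts. The threshold REQUIRED_CREDITS, the student names, and the credit values are
hypothetical. Real deductive database systems state such rules declaratively rather than in
application code, so this shows only the idea.

```python
# Hypothetical threshold; the actual requirement depends on the institution.
REQUIRED_CREDITS = 40

# Stored facts: credits earned by each student so far.
earned_credits = {"Asha": 44, "Ravi": 31, "Meena": 40}


def detained(facts, required=REQUIRED_CREDITS):
    """Rule: a student is detained if the earned credits are below the requirement."""
    return sorted(name for name, credits in facts.items() if credits < required)


# The derived fact is computed from the stored facts, not stored itself.
detained_students = detained(earned_credits)
print(detained_students)
```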
A Brief History of Database Applications
Early database systems did not provide sufficient data abstraction and program-data independence
capabilities.
Another shortcoming of early systems was that they provided only programming language interfaces.
This made it time-consuming and expensive to implement new queries and transactions.
Relational databases were originally proposed to separate the physical storage of data from its
conceptual representation and to provide a mathematical foundation for data representation and
querying.
The relational data model also introduced high-level query languages that provided an alternative to
programming language interfaces, making it much faster to write new queries.
Relational databases now exist on almost all types of computers, from small personal computers to
large servers.
The emergence of object-oriented programming languages and the need to store and share complex,
structured objects led to the development of object-oriented databases (OODBs).
OODBs are mainly used in specialized applications, such as engineering design, multimedia
publishing, and manufacturing systems.
eXtensible Markup Language (XML) combines concepts from the models used in document systems
with database modeling concepts.
XML is considered to be the primary standard for interchanging data among various types of
databases and Web pages.