
5 Unnamed 03 05 2023

This document provides an introduction to databases and database management systems. It defines key terms like data, database, and DBMS. It describes different types of databases and issues with using a file system to manage data. The characteristics of the database approach are outlined, including self-describing nature, insulation between programs and data, data abstraction, and support for multiple views and concurrent users. Common DBMS functionality and examples of database users and models are also summarized.

Uploaded by

Vaishnavi Jagtap
Copyright: © All Rights Reserved

Unit 1: Introduction

Basic Definitions
Data:
Known facts that can be recorded and have an implicit meaning.
Database:
A collection of related data.
Database Management System (DBMS):
A collection of programs that enables users to create and maintain a database.
A software package/system that facilitates the creation and maintenance of a computerized database.
Types of Databases and Database Applications
Numeric and Textual Databases
Multimedia Databases
Geographic Information Systems (GIS)
Data Warehouses
Real-time and Active Databases
Drawbacks of using a file system
Data redundancy and inconsistency
  Multiple file formats, duplication of information in different files
Difficulty in accessing data
  Need to write a new program to carry out each new task
Integrity problems
  Integrity constraints provide a way of ensuring that changes made to the database by authorized users do not result in a loss of data consistency.
  Hard to add new constraints or change existing ones
Atomicity of updates
  Failures may leave the database in an inconsistent state with partial updates carried out
  E.g. a transfer of funds from one account to another should either complete or not happen at all
Concurrent access by multiple users
  Concurrent access is needed for performance
  Uncontrolled concurrent accesses can lead to inconsistencies
  E.g. two people reading a balance and updating it at the same time
Database System:
  The DBMS software and the database.
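The funds-transfer case mentioned under atomicity can be sketched with Python's built-in sqlite3 module. This is only an illustration: the account table, the CHECK rule, and the transfer helper are assumptions made for the example, not part of the slides. The credit is applied first on purpose, so that a failing debit would leave a partial update behind if the DBMS did not roll it back.

```python
import sqlite3

# Hypothetical schema for illustration: one table of account balances.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK rule, so the credit was rolled back too.
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

Calling transfer(conn, 1, 2, 500) returns False and leaves both balances as they were, while transfer(conn, 1, 2, 30) applies both updates together.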
Characteristics of the Database Approach
Self-describing nature of a database system:
  A DBMS catalog stores the description of the database, which contains information such as the structure of each file, the type and storage format of each data item, and various constraints on the data.
  The description is called meta-data.
  This allows the DBMS software to work with different databases.
Insulation between programs and data:
  Called program-data independence.
  Allows changing data storage structures and operations without having to change the DBMS access programs.
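As a rough illustration of the self-describing property, SQLite (used here through Python's sqlite3 module) exposes its catalog as an ordinary table named sqlite_master, and PRAGMA table_info reports the columns of a table. The student table and its columns are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_number INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " major TEXT)"
)

# SQLite keeps its catalog in sqlite_master: one row per schema object.
catalog_rows = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    (row[1], row[2], bool(row[3]))
    for row in conn.execute("PRAGMA table_info(student)")
]
```

Here the program learns the structure of the student table from the meta-data at run time instead of having it hard-coded, which is what lets the same DBMS software work with different databases.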
Data abstraction:
  A data model is used to hide storage details and present the users with a conceptual view of the database.
Support of multiple views of the data:
  Each user may see a different view of the database, which describes only the data of interest to that user.
Sharing of data and multiuser transaction processing:
  Allows a set of concurrent users to retrieve and to update the database.
  Concurrency control within the DBMS guarantees that each transaction is correctly executed or completely aborted.
  OLTP (Online Transaction Processing) is a major part of database applications.
Typical DBMS Functionality
Define a database: in terms of data types, structures and constraints
Construct or load the database on a secondary storage medium
Manipulate the database: querying, generating reports, insertions, deletions and modifications to its content
Access the database through Web applications
Concurrent processing and sharing by a set of users and programs, while keeping all data valid and consistent

Example of a Database
(with a Conceptual Data Model)

Mini-world for the example: part of a UNIVERSITY environment.
Some mini-world entities:
STUDENTs
COURSEs
SECTIONs (of COURSEs)
(academic) DEPARTMENTs
INSTRUCTORs

Note: The above could be expressed in the ENTITY-RELATIONSHIP data model.

Some mini-world relationships:
SECTIONs are of specific COURSEs
STUDENTs take SECTIONs
COURSEs have prerequisite COURSEs
INSTRUCTORs teach SECTIONs
COURSEs are offered by DEPARTMENTs
STUDENTs major in DEPARTMENTs

Note: The above could be expressed in the ENTITY-RELATIONSHIP data model.
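One possible relational rendering of the UNIVERSITY mini-world is sketched below with Python's sqlite3 module. The table names, column names, and sample rows are assumptions chosen for illustration; the slides name only the entities and relationships. Each relationship becomes either a foreign key or a linking table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
CREATE TABLE department (dept_code TEXT PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE instructor (instructor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    major TEXT REFERENCES department(dept_code)       -- STUDENTs major in DEPARTMENTs
);
CREATE TABLE course (
    course_no TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    dept_code TEXT NOT NULL REFERENCES department(dept_code)  -- offered by DEPARTMENTs
);
CREATE TABLE prerequisite (                            -- COURSEs have prerequisite COURSEs
    course_no TEXT NOT NULL REFERENCES course(course_no),
    prereq_no TEXT NOT NULL REFERENCES course(course_no),
    PRIMARY KEY (course_no, prereq_no)
);
CREATE TABLE section (
    section_id INTEGER PRIMARY KEY,
    course_no TEXT NOT NULL REFERENCES course(course_no),         -- SECTIONs are of COURSEs
    instructor_id INTEGER REFERENCES instructor(instructor_id)    -- INSTRUCTORs teach SECTIONs
);
CREATE TABLE enrollment (                              -- STUDENTs take SECTIONs
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    section_id INTEGER NOT NULL REFERENCES section(section_id),
    PRIMARY KEY (student_id, section_id)
);
""")

# Made-up sample rows, inserted parent-first so every reference resolves.
conn.execute("INSERT INTO department VALUES ('CS', 'Computer Science')")
conn.execute("INSERT INTO instructor VALUES (1, 'Instructor A')")
conn.execute("INSERT INTO student VALUES (17, 'Student A', 'CS')")
conn.execute("INSERT INTO course VALUES ('CS1310', 'Intro to Computer Science', 'CS')")
conn.execute("INSERT INTO section VALUES (85, 'CS1310', 1)")
conn.execute("INSERT INTO enrollment VALUES (17, 85)")

# "Which courses does each student take?" follows the relationships with joins.
taken = conn.execute(
    "SELECT s.name, c.title FROM enrollment e"
    " JOIN student s ON s.student_id = e.student_id"
    " JOIN section sec ON sec.section_id = e.section_id"
    " JOIN course c ON c.course_no = sec.course_no"
).fetchall()
```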

Database Users
Users may be divided into those who actually use and control the content (called “Actors on the Scene”) and those who enable the database to be developed and the DBMS software to be designed and implemented (called “Workers Behind the Scene”).

Actors on the scene
Database administrators: responsible for authorizing access to the database, coordinating and monitoring its use, acquiring software and hardware resources, controlling its use, and monitoring the efficiency of operations.
Database designers: responsible for defining the content, the structure, the constraints, and the functions or transactions against the database. They must communicate with the end-users and understand their needs.
End-users: they use the data for queries and reports, and some of them actually update the database content.

Categories of End-users
Casual: access the database occasionally when needed.
Naive or parametric: they make up a large section of the end-user population. They use previously well-defined functions in the form of “canned transactions” against the database.
  Examples are bank tellers or reservation clerks who do this activity for an entire shift of operations.
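A canned transaction can be pictured as a fixed, pre-written operation for which the clerk supplies only the parameters. The sketch below uses Python's sqlite3 module; the seat table and the reserve_seat function are hypothetical names invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seat (flight TEXT, seat_no TEXT, passenger TEXT,"
    " PRIMARY KEY (flight, seat_no))"
)
conn.executemany(
    "INSERT INTO seat VALUES (?, ?, NULL)",
    [("XY100", "1A"), ("XY100", "1B")],
)
conn.commit()


def reserve_seat(conn, flight, seat_no, passenger):
    """Canned transaction: the clerk supplies only the three parameters."""
    with conn:  # one transaction per reservation
        cur = conn.execute(
            "UPDATE seat SET passenger = ?"
            " WHERE flight = ? AND seat_no = ? AND passenger IS NULL",
            (passenger, flight, seat_no),
        )
    return cur.rowcount == 1  # False if the seat is missing or already taken
```

The clerk never writes SQL: the query text is fixed in advance, and only the bound values change from one call to the next.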

Sophisticated: these include business analysts, scientists, engineers, and others thoroughly familiar with the system capabilities. Many use tools in the form of software packages that work closely with the stored database.
Stand-alone: mostly maintain personal databases using ready-to-use packaged applications. An example is a tax program user who creates his or her own internal database.

Workers Behind the Scene
DBMS system designers and implementers: design and implement the DBMS modules and interfaces as a software package.
Tool developers: design and implement tools, i.e. software packages that facilitate database modeling and design and improved performance.
Operators and maintenance personnel: responsible for the actual running and maintenance of the hardware and software environment for the database system.

Advantages of Using the Database Approach
Controlling redundancy

Sharing of data among multiple users.

Restricting unauthorized access to data.

Providing backup and recovery services.

Providing multiple interfaces to different classes of users.

Representing complex relationships among data.

Enforcing integrity constraints on the database.


DBMS Software
1. Oracle
2. Microsoft SQL Server
3. IBM DB2
4. SAP Sybase ASE
5. PostgreSQL
6. MySQL
7. Teradata
8. Informix
9. Ingres
10. MariaDB
Largest Databases in the World
1. The World Data Centre for Climate
2. National Energy Research Scientific Computing Center
3. AT&T (telecommunication)
4. Google
5. Sprint (telecommunication)
6. ChoicePoint
7. YouTube
8. Amazon
9. Central Intelligence Agency (CIA)
10. Library of Congress
When not to use a DBMS
Main inhibitors (costs) of using a DBMS:
  High initial investment in hardware, software and training.
  Overhead for providing generality, security, concurrency control, recovery, and integrity functions.
When a DBMS may be unnecessary:
  If the database and applications are simple, well defined, and not expected to change.
  If there are stringent real-time requirements that may not be met because of DBMS overhead.
  If access to data by multiple users is not required.

When no DBMS may suffice:
  If the database system is not able to handle the complexity of data because of modeling limitations.
  If the database users need special operations not supported by the DBMS.

Data Models
Data Model:
  A collection of concepts to describe the structure of a database, the operations for manipulating these structures, and certain constraints that the database should obey.
Data Model Structure and Constraints:
  Constructs are used to define the database structure.
  Constructs typically include elements (and their data types) as well as groups of elements (e.g. entity, record, table), and relationships among such groups.
  Constraints specify some restrictions on valid data; these constraints must be enforced at all times.
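To make "constraints must be enforced at all times" concrete, the sketch below declares a few constraints in SQLite (via Python's sqlite3 module) and lets the DBMS reject rows that violate them. The grade_report table, its columns, and the try_insert helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grade_report ("
    " student_id INTEGER NOT NULL,"
    " section_id INTEGER NOT NULL,"
    " grade TEXT CHECK (grade IN ('A', 'B', 'C', 'D', 'F')),"
    " PRIMARY KEY (student_id, section_id))"
)


def try_insert(row):
    """Return None on success, or the constraint error message on rejection."""
    try:
        with conn:
            conn.execute("INSERT INTO grade_report VALUES (?, ?, ?)", row)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


results = [
    try_insert((17, 85, "A")),    # valid row
    try_insert((17, 85, "B")),    # duplicate key for the same student and section
    try_insert((18, 85, "Z")),    # grade outside the allowed set
    try_insert((None, 85, "A")),  # missing required value
]
row_count = conn.execute("SELECT COUNT(*) FROM grade_report").fetchone()[0]
```

Only the first insert succeeds; the other three are refused by the DBMS itself, so no application program has to repeat these checks.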
Categories of Data Models
Conceptual (high-level, semantic) data models:
  Provide concepts that are close to the way many users perceive data. (Also called entity-based or object-based data models.)
Physical (low-level, internal) data models:
  Provide concepts that describe details of how data is stored in the computer. These are usually specified through DBMS design and administration manuals.
Implementation (representational) data models:
  Provide concepts that fall between the above two, used by many commercial DBMS implementations (e.g. relational data models used in many commercial systems).
Schemas versus Instances
Database Schema:
  The description of a database.
  Includes descriptions of the database structure, data types, and the constraints on the database.

Database State / Database Instance:
  The data stored in a database at a particular moment in time, i.e. the collection of all the data in the database.
  Also called database instance (or occurrence or snapshot).
  The term instance is also applied to individual database components, e.g. record instance, table instance, entity instance.
Initial Database State:
  Refers to the database state when it is initially loaded into the system (the first record values are inserted and validated).
Valid State:
  A state that satisfies the structure and constraints of the database.
Distinction
  The database schema changes very infrequently.
  The database state changes every time the database is updated.
  Schema is also called intension.
  State is also called extension.
Three-schema architecture
Architecture goal: to separate the user applications from the physical database.
Important characteristics of the database approach:
  Insulation of programs and data
  Support of multiple user views
  Use of a catalog to store the database description (schema)
The three-schema architecture was proposed to help achieve these characteristics.
Defines DBMS schemas at three levels:
  Internal schema at the internal level to describe physical storage structures and access paths. Typically uses a physical data model.
  Conceptual schema at the conceptual level to describe the structure and constraints for the whole database for a community of users. Uses a conceptual or an implementation data model.
  External schemas at the external level to describe the various user views. Usually uses the same data model as the conceptual level.
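The three levels can be loosely mapped onto SQL objects, as in the sketch below using Python's sqlite3 module. The mapping is an approximation made for illustration: a table stands in for the conceptual schema, an index for an internal-level access path, and views for external schemas. The table, index, and view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual level: the whole STUDENT relation.
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    major TEXT,
    gpa REAL
);
-- Internal level: an access path (SQLite decides the actual storage layout).
CREATE INDEX idx_student_major ON student(major);
-- External level: two user views, each exposing only the data of interest.
CREATE VIEW advisor_view AS SELECT student_id, name, major FROM student;
CREATE VIEW honors_view AS SELECT name, gpa FROM student WHERE gpa >= 3.5;
""")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(17, "Student A", "CS", 3.8), (8, "Student B", "MATH", 2.9)],
)

advisor_rows = conn.execute(
    "SELECT * FROM advisor_view ORDER BY student_id"
).fetchall()
honors_rows = conn.execute("SELECT * FROM honors_view").fetchall()
```

The advisor's view never shows grade averages and the honors view never shows majors, although both are derived from the same stored data.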
Data Independence
The capacity to change the schema at one level of a database system without having to change the schema at the next higher level.
Logical Data Independence: the capacity to change the conceptual schema without having to change the external schemas and their application programs.
Physical Data Independence: the capacity to change the internal schema without having to change the conceptual schema.
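Both kinds of data independence can be observed in a small experiment, sketched here with Python's sqlite3 module. The student table, the student_names view, and the by_major function are hypothetical names for the example. An index is added (an internal-level change) and a column is added (a conceptual-level change), while the application query and the view definition stay untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT)"
)
conn.execute("CREATE VIEW student_names AS SELECT student_id, name FROM student")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(17, "Student A", "CS"), (8, "Student B", "MATH")],
)


def by_major(major):
    # Application query written against the conceptual schema only.
    return conn.execute(
        "SELECT name FROM student WHERE major = ? ORDER BY name", (major,)
    ).fetchall()


def view_rows():
    return conn.execute(
        "SELECT * FROM student_names ORDER BY student_id"
    ).fetchall()


view_before, query_before = view_rows(), by_major("CS")

# Internal-level change: add an access path. The query text is untouched.
conn.execute("CREATE INDEX idx_student_major ON student(major)")
# Conceptual-level change: add a column. The external view is untouched.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")

view_after, query_after = view_rows(), by_major("CS")
```

The results before and after the two schema changes are identical, which is the behaviour the definitions above describe.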
Once the design of a database is completed and a DBMS is chosen to implement the database, the first order of the day is to specify conceptual and internal schemas for the database and any mappings between the two.
In many DBMSs where no strict separation of levels is maintained, one language, called the data definition language (DDL), is used by the DBA and by database designers to define both schemas. This language is used to define data structures, especially database schemas. Its statements are used to create, alter, or drop data structures. CREATE, ALTER, and DROP are some examples of DDL statements.

The DBMS will have a DDL compiler whose function is to process DDL statements in order to identify descriptions of the schema constructs and to store the schema description in the DBMS catalog.
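The effect of DDL statements on the catalog can be sketched with Python's sqlite3 module, assuming SQLite as the DBMS. The course table and the catalog and columns helpers are hypothetical names for the example. Each CREATE, ALTER, or DROP statement changes the stored schema description, which is read back here from sqlite_master and PRAGMA table_info.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def catalog():
    """Names of the tables currently described in SQLite's catalog."""
    return [
        r[0]
        for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]


def columns(table):
    # `table` is trusted here; PRAGMA arguments cannot be bound as parameters.
    return [r[1] for r in conn.execute(f"PRAGMA table_info({table})")]


conn.execute("CREATE TABLE course (course_no TEXT PRIMARY KEY, title TEXT)")
after_create = (catalog(), columns("course"))

conn.execute("ALTER TABLE course ADD COLUMN credit_hours INTEGER")
after_alter = columns("course")

conn.execute("DROP TABLE course")
after_drop = catalog()
```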
In DBMSs where a clear separation is maintained between the conceptual and internal levels, the DDL is used to specify the conceptual schema only. Another language, the storage definition language (SDL), is used to specify the internal schema. It defines the physical structure of the database: how many bytes each field uses, the order of the fields, how records will be accessed, and so on.
The mappings between the two schemas may be specified in either one of these languages. For a true three-schema architecture, we would need a third language, the view definition language (VDL), to specify user views and their mappings to the conceptual schema, but in most DBMSs the DDL is used to define both conceptual and external schemas.
Once the database schemas are compiled and the database is populated with data, users must have some means to manipulate the database. Typical manipulations include retrieval, insertion, deletion, and modification of the data. The DBMS provides a data manipulation language (DML) for these purposes.
In current DBMSs, the preceding types of languages are usually not considered distinct languages; rather, a comprehensive integrated language is used that includes constructs for conceptual schema definition, view definition, and data manipulation. Storage definition is typically kept separate, since it is used for defining physical storage structures to fine-tune the performance of the database system, and it is usually utilized by the DBA staff.
A typical example of a comprehensive database language is the SQL relational database language, which represents a combination of DDL, VDL, and DML, as well as statements for constraint specification and schema evolution. The SDL was a component in earlier versions of SQL but has been removed from the language to keep it at the conceptual and external levels only.
There are two main types of DMLs. A high-level or nonprocedural DML can be used on its own to specify complex database operations in a concise manner. Many DBMSs allow high-level DML statements either to be entered interactively from a terminal (or monitor) or to be embedded in a general-purpose programming language. In the latter case, DML statements must be identified within the program so that they can be extracted by a pre-compiler and processed by the DBMS.
A low-level or procedural DML must be embedded in a general-purpose programming language. It typically retrieves individual records from the database and processes each one separately, so it relies on host-language constructs such as loops. Low-level DMLs are also called record-at-a-time DMLs because of this property.
High-level DMLs, such as SQL, can specify and retrieve many records in a single DML statement and are hence called set-at-a-time or set-oriented DMLs.
A query in a high-level DML often specifies which data to retrieve rather than how to retrieve it; hence, such languages are also called declarative.
Whenever DML commands, whether high-level or low-level, are embedded in a general-purpose programming language, that language is called the host language and the DML is called the data sublanguage.
On the other hand, a high-level DML used in a stand-alone interactive manner is called a query language.
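The contrast between set-at-a-time and record-at-a-time processing can be sketched with Python as the host language and SQL as the data sublanguage (via the sqlite3 module). This is an approximation: SQL is not a true low-level DML, so the second half only imitates the record-at-a-time style with a host-language loop. The employee table and its rows are made up for the example.

```python
import sqlite3


def make_db():
    """Build a fresh in-memory database with three sample employees."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, dept TEXT, salary INTEGER)"
    )
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, "R&D", 1000), (2, "R&D", 2000), (3, "Sales", 1500)],
    )
    return conn


def salaries(conn):
    return conn.execute(
        "SELECT emp_id, salary FROM employee ORDER BY emp_id"
    ).fetchall()


# Set-at-a-time: one declarative statement says which rows change, not how to find them.
set_db = make_db()
set_db.execute("UPDATE employee SET salary = salary + 100 WHERE dept = 'R&D'")

# Record-at-a-time style: the host program loops and handles each record itself.
rec_db = make_db()
rows = rec_db.execute("SELECT emp_id, dept, salary FROM employee").fetchall()
for emp_id, dept, salary in rows:
    if dept == "R&D":
        rec_db.execute(
            "UPDATE employee SET salary = ? WHERE emp_id = ?",
            (salary + 100, emp_id),
        )
```

Both databases end up in the same state; the difference is whether the selection logic lives in a single statement handed to the DBMS or in the host program's loop.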
DBMS Interfaces
User-friendly interfaces provided by a DBMS may include the following:
Menu-Based Interfaces for Web Clients or Browsing. These interfaces present the user with lists of options (called menus) that lead the user through the formulation of a request.
Forms-Based Interfaces. A forms-based interface displays a form to each user.
Graphical User Interfaces. A GUI typically displays a schema to the user in diagrammatic form.
Natural Language Interfaces. These interfaces accept requests written in English or some other language and attempt to understand them.
Speech Input and Output. Limited use of speech as an input query and speech as an answer to a question or result of a request is becoming commonplace.
Interfaces for Parametric Users. Parametric users, such as bank tellers, often have a small set of operations that they must perform repeatedly.
Interfaces for the DBA. Most database systems contain privileged commands that can be used only by the DBA staff. These include commands for creating accounts, setting system parameters, granting account authorization, changing a schema, and reorganizing the storage structures of a database.
DBMS Components

A database system, being a complex software system, is partitioned into several software components that handle various tasks such as:
  data definition and manipulation,
  security and data integrity,
  data recovery and concurrency control, and
  performance optimization.
Data definition:
The DBMS provides functions to define the structure of the data. These functions include defining and modifying the record structure, the data type of fields, and the various constraints to be satisfied by the data in each field.
It is the responsibility of the database administrator to define the database, and to make changes to its definition (if required) using the DDL and other privileged commands.
The DDL compiler component of the DBMS processes these schema definitions and stores the schema descriptions in the DBMS catalog (data dictionary). Other DBMS components then refer to the catalog information as and when required.
Data manipulation:
Once the data structure is defined, data needs to be manipulated. The manipulation of data includes insertion, deletion, and modification of records. The functions that perform these operations are also part of the DBMS.
The queries that are defined as a part of the application programs are known as planned queries. The application programs are submitted to a precompiler, which extracts the DML commands from the application program and sends them to the DML compiler for compilation. The rest of the program is sent to the host language compiler. The object code for both the DML commands and the rest of the program is linked and sent to the query evaluation engine for execution.
Ad hoc queries that are executed as and when the need arises are known as unplanned queries (interactive queries). These queries are compiled by the query compiler and then optimized by the query optimizer. The query optimizer consults the data dictionary for statistical and other physical information about the stored data. The optimized query is finally passed to the query evaluation engine for execution.
Naive users of the database can also query and update the database by using predefined application program interfaces. The object code of these queries is also passed to the query evaluation engine for processing.
Data security and integrity:
The DBMS contains functions that handle the security and integrity of data stored in the database. Since these functions can be easily invoked by the application, the application programmer need not code them in the programs.
Concurrency and data recovery:
The DBMS also contains functions that deal with the concurrent access of records by multiple users and the recovery of data after a system failure.
Performance optimization:
The DBMS has a set of functions that optimize the performance of queries by evaluating the different execution plans of a query and choosing the best among them.
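Plan selection can be observed in SQLite through its EXPLAIN QUERY PLAN statement, sketched below with Python's sqlite3 module. The student table, the index name, and the plan helper are assumptions made for the example, and the exact wording of the plan text differs between SQLite versions, so the code only looks for the index name in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(i, f"Student {i}", "CS" if i % 2 else "MATH") for i in range(1000)],
)

QUERY = "SELECT name FROM student WHERE major = ?"


def plan(sql, params):
    """Return the human-readable detail text of the plan SQLite chose."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # detail is the last column


plan_without_index = plan(QUERY, ("CS",))
conn.execute("CREATE INDEX idx_student_major ON student(major)")
plan_with_index = plan(QUERY, ("CS",))
```

The query text is identical in both calls. Only the set of available access paths changed, and the optimizer switched from scanning the table to searching the new index.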
Hierarchical model

Relational model
