DBMS Full

Uploaded by

Rohit Mane
Copyright
© © All Rights Reserved

Database Management Systems

Introduction
What is a database?

 A very large, integrated collection of data.
 Models a real-world application:
– Entities (students, courses)
– Relationships (links, references)
 Usually the data is too large to fit into main memory and is often used by many users.
Database applications
 E-commerce: Amazon.com, etc.
 Airlines and travel services
 Scientific data such as biology, oceanography, etc.
 Spatial data such as maps and travel networks
 World Wide Web
 Digital libraries of artifacts of any kind
What is a DBMS?
 DBMS stands for Database Management System.
 A software package designed to store, manage and provide access to databases.
Why Study Databases?
 Shift from computation to information
 Datasets increasing in diversity and volume.
– Digital libraries, interactive video, Human Genome project, EOS project
– ... need for DBMS exploding
 DBMS encompasses most of CS:
– OS, languages, theory, AI, multimedia, logic
Terminology : Data Models

 A data model :
– is a collection of concepts for describing data.
 A schema :
– is a description of a particular collection of data,
using the given data model.
 The relational model of data
– The most widely used model today.
– Main concept: relation, basically a table with rows
and columns.
– Every relation has a schema, which describes the
columns, or fields.
Data Definition Language (DDL)
 Specification notation for defining the
database schema
 DDL compiler generates a set of tables
stored in a data dictionary
 Data dictionary contains metadata (data
about data)
 Data storage and definition language –
special type of DDL in which the storage
structure and access methods used by the
database system are specified
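As a rough illustration of DDL and the data dictionary, the sketch below uses Python's standard sqlite3 module. SQLite is only one possible DBMS, picked here as an assumption for the example; the in-memory database and the Students columns are illustrative.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")

# DDL statement: defines the schema of one relation.
conn.execute(
    "CREATE TABLE Students ("
    "sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL)"
)

# SQLite stores its metadata in a catalog table named sqlite_master,
# which plays the role of the data dictionary here.
catalog = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Students)")]

print(catalog)
print(columns)
```

The application never writes to the catalog directly; the DBMS updates it as a side effect of each DDL statement.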
Data Manipulation Language (DML)
 Language for accessing and manipulating the data organized by the appropriate data model
 Two classes of languages
– Procedural – user specifies what data is
required and how to get those data
– Nonprocedural – user specifies what data
is required without specifying how to get
those data
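The difference between the two classes can be sketched as follows, assuming SQLite (through Python's sqlite3 module) as the DBMS and a small made-up Students table. The loop spells out how to find the rows; the SQL query only states which rows are wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [("s1", "Ann", 3.8), ("s2", "Bob", 2.9), ("s3", "Cy", 3.5)],  # sample data
)

# Procedural style: we say what we want AND how to get it
# (scan every row, test it, collect the matches, sort them).
procedural = []
for sid, name, gpa in conn.execute("SELECT sid, name, gpa FROM Students"):
    if gpa > 3.0:
        procedural.append(name)
procedural.sort()

# Nonprocedural (declarative) style: we only say what we want;
# the DBMS decides how to evaluate it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM Students WHERE gpa > 3.0 ORDER BY name"
    )
]

print(procedural, declarative)
```

Both lists hold the same names; only the amount of "how" written by the user differs.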
Database schema
 A database schema is the structure of a database, described in a formal language supported by the DBMS. It refers to the organization of data and serves as a blueprint for how the database will be constructed.
DBMS Keys
 A key is an attribute, or a combination of attributes, that is used to identify records. Sometimes we have to retrieve data from more than one table; in those cases we join the tables with the help of keys. The purpose of a key is to bind data together across tables without repeating all of the data in every table.
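A minimal sketch of joining two tables through a key, assuming SQLite via Python's sqlite3 module and made-up sample rows. The student name is stored once in Students; Enrolled repeats only the key value sid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);
INSERT INTO Students VALUES ('s1', 'Ann'), ('s2', 'Bob');
INSERT INTO Enrolled VALUES
    ('s1', 'CS542', 'A'), ('s2', 'CS542', 'B'), ('s1', 'CS101', 'A');
""")

# The join condition on sid binds each enrollment row to its student row.
rows = conn.execute(
    "SELECT s.name, e.cid, e.grade "
    "FROM Students AS s JOIN Enrolled AS e ON s.sid = e.sid "
    "ORDER BY s.name, e.cid"
).fetchall()

print(rows)
```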
Levels of Abstraction
 Many views:
– Views describe how users see the data.
 Single conceptual (logical) schema:
– Conceptual schema defines logical structure.
 Single physical schema:
– Physical schema describes the files and indexes used.
[Figure: View 1, View 2 and View 3 sit above the Conceptual Schema, which sits above the Physical Schema.]
 Schemas are defined using a Data Definition Language; data is modified/queried using a Data Manipulation Language.
Example: University Database
 Conceptual schema:
– Students(sid: string, name: string, login: string, age: integer, gpa: real)
– Courses(cid: string, cname: string, credits: integer)
– Enrolled(sid: string, cid: string, grade: string)
 Physical schema:
– Relations stored as unordered files.
– Index on first column of Students.
 External Schema (View):
– Course_info(cid: string, enrollment: integer)
– CS542Students(sid: string, grade: string)
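One way the three levels could look in practice is sketched below with SQLite through Python's sqlite3 module. The table and view names follow the slide; the index name idx_students_sid, the view definition (a count of enrollments per course) and the sample rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual schema: the relations themselves.
CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Courses (cid TEXT, cname TEXT, credits INTEGER);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

-- Physical schema choice: an index on the first column of Students.
CREATE INDEX idx_students_sid ON Students (sid);

-- External schema: a view that exposes only enrollment counts.
CREATE VIEW Course_info AS
    SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;

INSERT INTO Enrolled VALUES
    ('s1', 'CS542', 'A'), ('s2', 'CS542', 'B'), ('s1', 'CS101', 'A');
""")

course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid"
).fetchall()

print(course_info)
```

A user of Course_info sees course sizes but never individual grades, which is the point of an external schema.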
Data Independence
 Applications insulated from how data is structured and stored.
 Logical data independence:
– Protection from changes in logical structure of data.
 Physical data independence:
– Protection from changes in physical structure of data.
 One of the most important benefits of using a DBMS!
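Both kinds of independence can be sketched with SQLite via Python's sqlite3 module. Everything named here (the StudentNames view, the index, the split into StudentIdentity and StudentGrades) is hypothetical and chosen only to show that the application function lookup stays unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL);
INSERT INTO Students VALUES ('s1', 'Ann', 3.8), ('s2', 'Bob', 2.9);
CREATE VIEW StudentNames AS SELECT sid, name FROM Students;
""")

def lookup(sid):
    # Application code: mentions only the view, never indexes or base tables.
    return conn.execute(
        "SELECT name FROM StudentNames WHERE sid = ?", (sid,)
    ).fetchall()

original = lookup("s2")

# Physical change: add an index. The application query is untouched.
conn.execute("CREATE INDEX idx_students_sid ON Students (sid)")
after_physical_change = lookup("s2")

# Logical change: split Students into two tables, then redefine the view.
conn.executescript("""
DROP VIEW StudentNames;
CREATE TABLE StudentIdentity (sid TEXT, name TEXT);
CREATE TABLE StudentGrades (sid TEXT, gpa REAL);
INSERT INTO StudentIdentity SELECT sid, name FROM Students;
INSERT INTO StudentGrades SELECT sid, gpa FROM Students;
DROP TABLE Students;
CREATE VIEW StudentNames AS SELECT sid, name FROM StudentIdentity;
""")
after_logical_change = lookup("s2")

print(original, after_physical_change, after_logical_change)
```

The same call returns the same answer after each change, because only the DBMS-side definitions were edited.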


Files vs. DBMS
If we were to use files, we would have to:
 Stage large datasets between main memory and secondary storage (buffering, page-oriented access)
 Write special code for different queries
 Protect data from inconsistency due to multiple concurrent users
 Manage crash recovery in some special-purpose manner
 Provide good methods for access control
Why Use a DBMS?
 Reduced application development time.
 Data independence
 Efficient data access.
 Data integrity under updates.
 Concurrent access
 Recovery from crashes.
 Security
 Uniform data administration.
Transaction Management
 A transaction is a collection of operations that
performs a single logical function in a database
application.
 Transaction-management component ensures
that the database remains in a consistent (correct)
state despite system failures (e.g. power failures
and operating system crashes) and transaction
failures.
 Concurrency-control manager controls the
interaction among the concurrent transactions, to
ensure the consistency of the database.
Concurrency Control
 Concurrent execution of user programs is
essential for good DBMS performance.
– Because disk accesses are frequent and relatively slow, it is important to keep the CPU humming by working on several user programs concurrently.
 Interleaving actions of different user programs can lead to inconsistency:
– E.g., a check is cleared while the account balance is being computed.
 DBMS ensures such data inconsistency problems don’t arise:
– E.g., users can pretend they are using a single-user system.
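The kind of inconsistency meant here can be reproduced with a tiny simulation in plain Python, with no DBMS involved. The two transactions, their amounts and the run helper are all hypothetical; the schedule is written out by hand so the result is deterministic.

```python
# Hypothetical transactions: T1 clears a check for 30, T2 deposits 50.
DELTAS = {"T1": -30, "T2": +50}

def run(schedule, balance=100):
    """Execute a list of (transaction, action) steps on one account.

    A "read" copies the stored balance into the transaction's private
    workspace; a "write" stores that private copy plus the transaction's delta.
    """
    db = {"balance": balance}
    local = {}
    for txn, action in schedule:
        if action == "read":
            local[txn] = db["balance"]
        elif action == "write":
            db["balance"] = local[txn] + DELTAS[txn]
    return db["balance"]

serial = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

print(run(serial))       # 120: both updates applied
print(run(interleaved))  # 150: T1's update is lost
```

A concurrency-control protocol such as locking would force T2 to wait until T1 has finished, so the interleaved schedule could not occur.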
Concurrency Control
Key Concepts of CC
 Key concept is the transaction, which is an atomic sequence of multiple database actions (reads/writes).
 Each transaction, executed completely, must leave the DB in a consistent state.
 Utilize locking of resources and other protocols for guaranteeing consistency.
System Crash: Ensuring Atomicity
 If the system crashes in the middle of a transaction (Xact), the DBMS ensures atomicity.
 Idea: Keep a log (history) of all actions carried out by the DBMS while executing a set of Xacts:
– Before a change is made to the database, the corresponding log entry is forced to a safe location (write-ahead logging).
– After a crash, the effects of partially executed transactions are undone using the log (rollback of transaction).
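The logging idea can be sketched with a toy in-memory store. This is not how any real DBMS implements recovery: ToyDB, its method names and the record layout are invented for illustration, and both the "database" and the "log" are ordinary Python objects rather than disk files.

```python
class ToyDB:
    """Toy key-value store with an undo log (illustration only)."""

    def __init__(self):
        self.data = {}  # stands in for the database on disk
        self.log = []   # stands in for the log kept in a safe location

    def write(self, txn, key, value):
        # Log first: record the old value before changing the data.
        # (None marks "key did not exist", so None is not a usable value here.)
        self.log.append(("update", txn, key, self.data.get(key)))
        self.data[key] = value

    def commit(self, txn):
        self.log.append(("commit", txn))

    def recover(self):
        """After a crash, undo updates of transactions that never committed."""
        committed = {rec[1] for rec in self.log if rec[0] == "commit"}
        for rec in reversed(self.log):  # undo the newest change first
            if rec[0] == "update" and rec[1] not in committed:
                _, _, key, old = rec
                if old is None:
                    self.data.pop(key, None)
                else:
                    self.data[key] = old

db = ToyDB()
db.write("T1", "A", 100)
db.commit("T1")
db.write("T2", "A", 40)  # T2 starts moving 60 from A to B ...
db.write("T2", "B", 60)
# ... and the system crashes here, before T2 commits.
db.recover()
print(db.data)  # {'A': 100}
```

T1's committed write survives, while both of T2's partial writes are rolled back, so the transfer happens entirely or not at all.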
Storage Management
 A storage manager is a program module that
provides the interface between the low-level
data stored in the database and the
application programs and queries submitted
to the system.
 The storage manager is responsible for the
following tasks:
– Interaction with the file manager
– Efficient storing, retrieving, and updating of data
Databases make these folks happy ...
 End users and DBMS vendors
 DB application programmers
– E.g., smart webmasters
 Database administrator (DBA)
– Designs logical /physical schemas
– Handles security and authorization
– Data availability, crash recovery
– Database tuning as needs evolve
Must understand how a DBMS works!
Structure of a DBMS
 A typical DBMS has a layered architecture.
 The figure does not show the concurrency control and recovery components.
 These layers must consider concurrency control and recovery.
[Figure: layers from top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB.]
KEYS
 Associative addressing is based on keys – a
column, or group of columns, used to identify rows.
 Simple key – a key formed from a single column
 Composite key – a key formed from several
columns
 The relational model has five kinds of keys
• Super
• Candidate
• Primary
• Alternate (secondary)
• Foreign
Super Key
A superkey is any set of attributes that uniquely identifies a row. A superkey differs from a candidate key in that it does not require the non-redundancy property.
Candidate Key
 A candidate key is an attribute (or set of attributes) that uniquely identifies a row. A candidate key must possess the following properties:
– Unique identification: for every row, the value of the key must uniquely identify that row.
– Non-redundancy: no attribute in the key can be discarded without destroying the property of unique identification.
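The two properties can be checked mechanically on sample data, as in the plain-Python sketch below. The function names and the rows are made up for illustration. Note the limitation: a key is a property of the schema (it must hold for every legal set of rows), so a check on one sample can only refute a key, never prove it.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_candidate_key(rows, attrs):
    """A superkey from which no attribute can be dropped (non-redundancy)."""
    if not is_superkey(rows, attrs):
        return False
    # If any smaller subset were a superkey, some subset with exactly one
    # attribute removed would be a superkey too, so checking those suffices.
    return not any(
        is_superkey(rows, subset)
        for subset in combinations(attrs, len(attrs) - 1)
    )

rows = [
    {"sid": "s1", "login": "ann@cs", "name": "Ann"},
    {"sid": "s2", "login": "bob@cs", "name": "Bob"},
    {"sid": "s3", "login": "ann@ee", "name": "Ann"},
]

print(is_superkey(rows, ("sid", "name")))       # True
print(is_candidate_key(rows, ("sid", "name")))  # False: name is redundant
print(is_candidate_key(rows, ("sid",)))         # True
print(is_superkey(rows, ("name",)))             # False: two rows share "Ann"
```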
Primary Key
A primary key is the candidate key
which is selected as the principal unique
identifier. Every relation must contain a
primary key. The primary key is usually
the key selected to identify a row when
the database is physically implemented.
For example, a part number is selected instead of a part description.
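A small sketch of the part-number example, assuming SQLite through Python's sqlite3 module. The Parts table, its column names and the rows are hypothetical. Declaring part_no as the primary key makes the DBMS reject a second row with the same part number, while descriptions may repeat freely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Parts (part_no TEXT PRIMARY KEY, description TEXT)")
conn.execute("INSERT INTO Parts VALUES ('P100', 'bolt')")
conn.execute("INSERT INTO Parts VALUES ('P200', 'bolt')")  # same description is fine

try:
    # Same part number again: violates the primary key constraint.
    conn.execute("INSERT INTO Parts VALUES ('P100', 'nut')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

part_count = conn.execute("SELECT COUNT(*) FROM Parts").fetchone()[0]
print(duplicate_rejected, part_count)
```

This also shows why a description makes a poor identifier: two different parts can legitimately share one.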
Alternate (Secondary) Key
 The candidate keys that are not selected as the primary key are known as secondary or alternate keys.
Foreign Key
A foreign key is an attribute (or set of attributes) that appears (usually) as a non-key attribute in one relation and as a primary key attribute in another relation. "Usually" because it is possible for a foreign key to also be the whole or part of a primary key.
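The sketch below shows foreign keys that are also part of a primary key, assuming SQLite via Python's sqlite3 module and made-up rows. In Enrolled, sid and cid each reference another table and together form the primary key. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript("""
CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT);
CREATE TABLE Courses (cid TEXT PRIMARY KEY, cname TEXT);
CREATE TABLE Enrolled (
    sid TEXT REFERENCES Students (sid),
    cid TEXT REFERENCES Courses (cid),
    grade TEXT,
    PRIMARY KEY (sid, cid)
);
INSERT INTO Students VALUES ('s1', 'Ann');
INSERT INTO Courses VALUES ('CS542', 'Databases');
INSERT INTO Enrolled VALUES ('s1', 'CS542', 'A');
""")

try:
    # 's9' does not exist in Students, so this row would dangle.
    conn.execute("INSERT INTO Enrolled VALUES ('s9', 'CS542', 'B')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

enrolled_count = conn.execute("SELECT COUNT(*) FROM Enrolled").fetchone()[0]
print(orphan_rejected, enrolled_count)
```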
History of Database Systems
 First generation
– Hierarchical and Network
 Second generation
– Relational
 Third generation
– Object-Relational
– Object-Oriented
Overall System Structure
[Figure: overall system structure. Users: naïve users (tellers, agents, etc.), application programmers, sophisticated users, and the database administrator. They work through application interfaces, application programs, queries, and the database scheme. Query processor: embedded DML precompiler, DML compiler, DDL interpreter, application program object code, and query evaluation engine. Storage manager: transaction manager, buffer manager, and file manager. Disk storage: data files, indices, statistical data, and data dictionary.]
CIS-552 Introduction 32
Summary
 DBMS used to maintain and query large datasets.
 Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security.
 Levels of abstraction give data independence.
 A DBMS typically has a layered architecture.
 DBAs hold rewarding jobs.
 DBMS R&D is one of the broadest, most exciting areas in CS.
