1 DDBMS Introduction

A distributed database management system (DDBMS) allows for the distributed management of logically related data that is physically stored across a computer network. Key points:
1. A distributed database consists of data fragments distributed across multiple networked sites that can be accessed through applications.
2. Each site has autonomous processing capabilities and a local DBMS to handle local applications involving only that site's data.
3. Global applications require accessing data from multiple sites in a coordinated way using communication between the local DBMSs.
4. The DDBMS provides distribution transparency so applications can access the distributed data as if it were a single centralized database.

Uploaded by

Akhil Singhal
Copyright
© Attribution Non-Commercial (BY-NC)

What is a DDBMS?
Availability of Database + Availability of Computer Network = Distributed Database

A distributed database is, in brief, an integrated
database which is built on top of a computer network
rather than on a single computer.

The data which constitute the database are stored at
the different sites of the computer network, and the
application programs which are run by the computers
access data at different sites.
Concepts
Distributed Database
A logically interrelated collection of shared data (and a
description of this data), physically distributed over a
computer network.

Distributed DBMS
Software system that permits the management of the
distributed database and makes the distribution
transparent to users.
• Collection of logically-related shared data.
• Data split into fragments.
• Fragments may be replicated.
• Fragments/replicas allocated to sites.
• Sites linked by a communications network.
• Data at each site is under control of a DBMS.
• DBMSs handle local applications autonomously.
• Each DBMS participates in at least one global application.
Distributed DBMS -
Distributed Processing
Distributed processing: a centralized database that can be accessed
over a computer network.
2 important aspects
• Distribution: data are not resident at
the same site (processor), so that we
can distinguish a distributed
database from a single centralized
database.
• Logical correlation: data have some
properties which tie them together.
• Consider a bank that has three branches
at different locations: Delhi, Bombay,
Chennai.
• At each branch, a computer controls
the teller terminals of the branch and the
account database of that branch.
• Each computer with its local account
database at one branch constitutes one
site of the distributed database;
computers are connected by a
communication network.
• During normal operations the
applications which are requested from
the terminals of a branch need only to
access the database of that branch.
• These applications are completely
executed by the computer of the branch
where they are issued and will therefore
be called local applications.
• Example of a local application: a debit or
credit operation performed on an
account stored at the same branch at
which the application is requested.
• It is important to understand the distinction
between a distributed database and a set of
local databases.
• The important aspect is the existence of
some applications which access data at
more than one branch. These
applications are called global
applications or distributed
applications.
• (Remember the logical correlation aspect
of a DDBMS.)
• A typical global application is a transfer of
funds from an account of one branch to an
account of another branch.
• This application requires updating the databases
at two different branches. This application is
something more than just performing two local
updates at two individual branches (a debit and
a credit), because it is also necessary to ensure
that both updates are performed or neither.
• Ensuring this requirement for global applications
is a difficult task.
• In the example cited, the computers are at
geographically different locations; however,
distributed databases can also be built on
a local network (higher throughput).
• Different data sites connected through a
network make up a distributed database system.
• Local application: Locality is not defined
with respect to the geographical
distribution of the computers which
execute it, but with respect to the fact
that only one computer with its own
database is involved.
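The fund-transfer requirement described above, that both updates are performed or neither, can be illustrated with a small sketch. One well-known family of techniques for this is two-phase commit; the code below is only a simplified in-memory imitation of that idea, not the method of any particular DDBMS. The class BranchSite and the function transfer are hypothetical names, and the sketch assumes that no site crashes and no message is lost between the two phases.

```python
class BranchSite:
    """Hypothetical in-memory stand-in for one branch's local database."""

    def __init__(self, name, accounts):
        self.name = name
        self.accounts = dict(accounts)  # account id -> balance
        self.pending = None             # change staged by prepare()

    def prepare(self, account, delta):
        """Phase 1: stage the change and vote on whether it can be applied."""
        if account not in self.accounts:
            return False
        if self.accounts[account] + delta < 0:
            return False  # the change would overdraw the account
        self.pending = (account, delta)
        return True

    def commit(self):
        """Phase 2: make the staged change permanent."""
        account, delta = self.pending
        self.accounts[account] += delta
        self.pending = None

    def abort(self):
        """Phase 2 (failure path): discard any staged change."""
        self.pending = None


def transfer(source, src_account, target, dst_account, amount):
    """Debit one site and credit another so that both happen or neither."""
    votes = [
        source.prepare(src_account, -amount),
        target.prepare(dst_account, +amount),
    ]
    if all(votes):
        source.commit()
        target.commit()
        return True
    # At least one site voted no: neither update is applied.
    source.abort()
    target.abort()
    return False


delhi = BranchSite("Delhi", {"A1": 500})
chennai = BranchSite("Chennai", {"B7": 100})

ok = transfer(delhi, "A1", chennai, "B7", 200)       # both sites can apply it
failed = transfer(delhi, "A1", chennai, "B7", 1000)  # Delhi votes no
```

The second call leaves both balances untouched even though the Chennai site was willing to accept the credit, which is the "both or neither" behaviour that two independent local updates would not guarantee.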
Please remember the following
definition of a distributed database
• A distributed database is a collection of
data which are distributed over different
computers of a computer network.
• Each site of the network has autonomous
processing ability and can perform local
applications.
• Each site also participates in the execution
of at least one global application,
which requires accessing data at several
sites using a communication subsystem.
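As a small illustration of the local/global distinction used in this definition, the sketch below classifies an application by the set of sites whose data it accesses. The function name classify_application and the representation of an application as a list of site names are assumptions made only for this example.

```python
def classify_application(accessed_sites):
    """Return "local" if data at exactly one site is accessed, else "global".

    Locality depends on how many sites are involved, not on how far
    apart the computers are.
    """
    distinct_sites = set(accessed_sites)
    if not distinct_sites:
        raise ValueError("an application must access data at some site")
    return "local" if len(distinct_sites) == 1 else "global"


# A debit on an account held at the requesting branch touches one site.
debit_kind = classify_application(["Delhi"])
# A transfer between branches touches two sites.
transfer_kind = classify_application(["Delhi", "Chennai"])
```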
Features of Distributed vs.
Centralized Database
• Centralized Control: In the case of a centralized DBMS, the fundamental
function of a DBA was to guarantee the safety of data; the data
itself was recognised to be an important investment of the
enterprise which required a centralized responsibility.
• In distributed databases, the idea of centralized control is much
less emphasized. In general, in distributed databases, it is
possible to identify a hierarchical control structure based on a
global database administrator who has the central responsibility
for the whole database, and on local database administrators who
have the responsibility for their respective local databases.
• Local database administrators may have a high degree of
autonomy, up to the point that a global database administrator
is completely missing and the inter-site coordination is
performed by the local administrators themselves. This
characteristic is called SITE AUTONOMY.
• Distributed databases differ very much in the degree of site
autonomy: from complete site autonomy without any centralized
database administrator to almost completely centralized control.
Features of Distributed vs.
Centralized Database
• Data Independence: Data Independence means
that the actual organization of the data is
transparent to the application programmer.
PROGRAMS are UNAFFECTED by the CHANGES
in the PHYSICAL ORGANIZATION OF DATA.
• In distributed databases, data independence
has the same importance as in traditional
databases; however a new aspect is added to
the usual notion of data independence, namely
DISTRIBUTION TRANSPARENCY.
• By Distribution Transparency we mean that
programs can be written as if the database were
not distributed. Thus the correctness of
programs is unaffected by the movement of
data from one site to another; however their
speed of execution is affected.
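The idea that a program stays correct when data moves between sites can be sketched as follows. The class DistributedAccounts and its catalog dictionary are hypothetical and exist only for this illustration; a real DDBMS keeps far richer catalog information. The point is that the calling program asks for an account by its identifier and never names a site.

```python
class DistributedAccounts:
    """Hypothetical facade that hides where each account is stored."""

    def __init__(self):
        self.sites = {}    # site name -> {account id: balance}
        self.catalog = {}  # account id -> site currently holding it

    def store(self, site, account, balance):
        self.sites.setdefault(site, {})[account] = balance
        self.catalog[account] = site

    def move(self, account, new_site):
        """Relocate an account; only the catalog entry changes for callers."""
        old_site = self.catalog[account]
        balance = self.sites[old_site].pop(account)
        self.store(new_site, account, balance)

    def balance(self, account):
        """Application-facing call: no site name appears in the request."""
        site = self.catalog[account]
        return self.sites[site][account]


db = DistributedAccounts()
db.store("Delhi", "A1", 500)

before = db.balance("A1")
db.move("A1", "Chennai")
after = db.balance("A1")  # same call, same answer, different site
```

In a real network the second call might be slower or faster depending on where the caller is, which matches the remark above that speed, not correctness, is what changes.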
Features of Distributed vs.
Centralized Database
• Reduction of redundancy: In traditional
databases, redundancy was reduced as far as
possible for 2 reasons:
– First, inconsistencies among several copies of the
same logical data are automatically avoided by having
only one copy
– Second, storage space is saved by eliminating
redundancy.
In distributed databases, there are several reasons for
considering data redundancy as a desirable feature:
1. The LOCALITY OF APPLICATION can be increased if
the data is replicated at all sites where applications
need it.
2. The AVAILABILITY of the system can be increased,
because a site failure does not stop the execution of
applications at other sites if the data is replicated.
The same reasons against redundancy which were
given for the traditional environment are still valid.
• The convenience of replicating a data
item increases with the ratio of
retrieval accesses versus update
accesses performed by applications
on it.
• If we have several copies of an item,
RETRIEVAL can be performed ON
ANY COPY, while UPDATES must be
performed consistently ON ALL
COPIES.
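The trade-off between retrievals and updates on replicated data can be made concrete with a short sketch. The class ReplicatedItem is hypothetical, and counting "copies touched" is only a rough stand-in for real access cost; the sketch also ignores failures and concurrent updates.

```python
class ReplicatedItem:
    """Hypothetical data item with one copy per site."""

    def __init__(self, sites, value):
        self.copies = {site: value for site in sites}
        self.copies_touched = 0  # rough measure of access cost

    def read(self, site):
        """A retrieval can be served by any single copy."""
        self.copies_touched += 1
        return self.copies[site]

    def write(self, value):
        """An update must reach every copy to keep them consistent."""
        for site in self.copies:
            self.copies[site] = value
            self.copies_touched += 1


item = ReplicatedItem(["Delhi", "Bombay", "Chennai"], 100)
item.read("Delhi")  # touches 1 copy
item.write(250)     # touches all 3 copies
```

With three copies, each update costs three times as much as a retrieval in this simple count, so replication pays off mainly for items that are read much more often than they are written.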
Distributed Database Design
Fragmentation
Relation may be divided into a number of sub-relations,
which are then distributed.
Allocation
Each fragment is stored at site with "optimal" distribution.
Replication
Copy of fragment may be maintained at several sites.
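As an illustration of dividing a relation into sub-relations, the sketch below splits an account relation by rows according to the branch each row belongs to (often called horizontal fragmentation). The slides do not say which fragmentation method is meant, so this choice, the sample rows, and the function names are all assumptions for the example.

```python
# Hypothetical account relation; each dict is one row.
accounts = [
    {"account": "A1", "branch": "Delhi", "balance": 500},
    {"account": "A2", "branch": "Delhi", "balance": 120},
    {"account": "B7", "branch": "Chennai", "balance": 100},
    {"account": "C3", "branch": "Bombay", "balance": 900},
]


def fragment_by_branch(rows):
    """Split the relation into one sub-relation (fragment) per branch."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row["branch"], []).append(row)
    return fragments


def reconstruct(fragments):
    """Rebuild the full relation as the union of all fragments."""
    return [row for rows in fragments.values() for row in rows]


fragments = fragment_by_branch(accounts)
rebuilt = reconstruct(fragments)
```

Each fragment can then be allocated to the branch that uses it most, and the union of the fragments gives back the original relation.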
Fragmentation
Definition and allocation of fragments carried out strategically to achieve:
• Locality of Reference
• Improved Reliability and Availability
• Improved Performance
• Balanced Storage Capacities and Costs
• Minimal Communication Costs.
Involves analyzing most important applications, based on quantitative/qualitative information.
Quantitative information may include:
• frequency with which an application is run;
• site from which an application is run;
• performance criteria for transactions and applications.
Qualitative information may include transactions that are executed by application, type of access
(read or write), and predicates of read operations.
Data Allocation
Four alternative strategies regarding placement of data:
Centralized / Partitioned (or Fragmented) / Complete Replication / Selective Replication

Centralized
Consists of single database and DBMS stored at one site with
users distributed across the network.

Partitioned
Database partitioned into disjoint fragments, each fragment
assigned to one site.

Complete Replication
Consists of maintaining complete copy of database at each site.

Selective Replication
Combination of partitioning, replication, and centralization.
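The four placement strategies can be compared by looking at which sites end up holding each fragment. The sketch below is a rough illustration: the function allocate, the round-robin assignment used for partitioning, and the choice of the first site as the central one are all assumptions, not rules taken from the slides.

```python
def allocate(strategy, fragments, sites, replicated=()):
    """Return a mapping fragment -> list of sites that store a copy of it."""
    def one_site(i):
        # Assumed placement rule: spread fragments over sites in turn.
        return [sites[i % len(sites)]]

    if strategy == "centralized":
        # Everything lives at a single site (assumed to be the first one).
        return {f: [sites[0]] for f in fragments}
    if strategy == "partitioned":
        # Each fragment is stored at exactly one site.
        return {f: one_site(i) for i, f in enumerate(fragments)}
    if strategy == "complete_replication":
        # Every site holds a copy of every fragment.
        return {f: list(sites) for f in fragments}
    if strategy == "selective_replication":
        # Only the fragments named in `replicated` are copied everywhere.
        return {
            f: list(sites) if f in replicated else one_site(i)
            for i, f in enumerate(fragments)
        }
    raise ValueError(f"unknown strategy: {strategy}")


sites = ["Delhi", "Bombay", "Chennai"]
fragments = ["F1", "F2", "F3"]

central = allocate("centralized", fragments, sites)
partitioned = allocate("partitioned", fragments, sites)
complete = allocate("complete_replication", fragments, sites)
selective = allocate("selective_replication", fragments, sites, replicated=["F2"])
```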

Comparison of Strategies
for Data Distribution
