DDBMS Design
Design the conceptual schema (a description of all the data used by the database
applications)
Design the physical database (mapping the conceptual schema to storage areas
and determining appropriate access methods)
Points to remember
Although application programs are designed after the schema, knowledge of the
application requirements influences schema design, since the schemata must be able
to support the applications efficiently.
The site from which an application is issued is called the site of origin of that
application.
Objectives of data distribution design
Processing locality – Place data as close as possible to the applications which use
them.
Availability and reliability – A high degree of availability for read-only
applications is achieved by storing multiple copies of the same information.
Reliability is also achieved by storing multiple copies of the same information,
since a crash that affects one copy can be recovered from by using another copy.
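The effect of replication on read availability can be illustrated with a small
calculation. This is only a sketch under a simplifying assumption that is not part
of the notes: each site holding a copy is up independently with the same
probability. The function name read_availability and the figures used are
hypothetical.

```python
def read_availability(p_site_up: float, copies: int) -> float:
    """Probability that at least one copy is reachable.

    Assumption (not from the notes): sites fail independently and each
    is up with probability p_site_up.
    """
    if copies < 1:
        raise ValueError("at least one copy is required")
    # A read fails only if every site holding a copy is down.
    return 1.0 - (1.0 - p_site_up) ** copies


# With sites that are up 90% of the time, each extra copy shrinks the
# chance that no copy is reachable by a factor of ten.
for n in (1, 2, 3):
    print(n, round(read_availability(0.9, n), 4))
```

Under this assumption the benefit of each additional copy diminishes quickly, which is
one reason redundancy is weighed against its update cost.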
Top-down design:
Design the global schema
Fragment the database
Allocate the fragments to the sites
Create the physical images
This approach is suitable for systems developed from scratch, since it allows the
design to proceed rationally from the global schema down to the physical images.
When the distributed database is developed as the aggregation of existing databases,
the bottom-up approach is normally followed.
Bottom-up design:
Select a common database model for describing the global schema of the database
Translate each local schema into the common data model
Integrate all the local schemata into a common global schema
Integration here means merging the common data definitions and resolving the
conflicts among different representations given to the same data.
Example:
Consider a distributed database for a company in West Bengal with three sites: North
Bengal (NB, site 1), Kolkata (site 2) and South Bengal (SB, site 3). Kolkata is located
about halfway between NB and SB. There are 30 departments, physically grouped as
follows: departments 1 to 10 are close to NB, departments 11 to 20 are close to Kolkata,
and departments over 20 are close to SB.
The suppliers of the company are all either in the city of NB or in the city of SB.
Moreover, NB is in area ‘North’ and SB is in area ‘South’. Kolkata falls on the border,
so some of its departments are in North and some are in South.
Y1: deptnum ≤ 10
Y2: (10 < deptnum ≤ 20) AND (area = “North”)
Y3: (10 < deptnum ≤ 20) AND (area = “South”)
Y4: deptnum > 20
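The four predicates can be checked mechanically. The sketch below reads the upper
bounds as inclusive (departments 1 to 10, 11 to 20, over 20), following the department
grouping described in the example, and verifies that every combination of department
number and area falls into exactly one fragment. The names PREDICATES and fragments_of
are hypothetical, and representing a tuple as just a (deptnum, area) pair is an
assumption made for illustration.

```python
# Fragmentation predicates Y1..Y4 over a (deptnum, area) pair.
PREDICATES = {
    "Y1": lambda d, a: d <= 10,
    "Y2": lambda d, a: 10 < d <= 20 and a == "North",
    "Y3": lambda d, a: 10 < d <= 20 and a == "South",
    "Y4": lambda d, a: d > 20,
}


def fragments_of(deptnum: int, area: str) -> list[str]:
    """Return the names of all fragments whose predicate the tuple satisfies."""
    return [name for name, pred in PREDICATES.items() if pred(deptnum, area)]


# The fragmentation is complete and disjoint if every tuple matches
# exactly one predicate.
for d in range(1, 31):
    for a in ("North", "South"):
        assert len(fragments_of(d, a)) == 1

print(fragments_of(15, "South"))  # ['Y3']
```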
Allocation of fragments:
Fragments corresponding to Y1 and Y4 can easily be allocated at sites 1 and 3
respectively.
The allocation of fragments Y2 and Y3 needs a trade-off between two conflicting
requirements:
o Administrative applications would like these fragments to be allocated at
sites 1 and 3 respectively.
o Regular applications would like these fragments to be allocated at site 2.
Note: In this example, fragments Y2 and Y3 are appropriate units for the allocation
problem.
Figure: join graphs between fragments of R and S: (a) Join graph (b) Partitioned join
graph (c) Simple join graph
A distributed join between relations R and S can be represented by a join graph. It is
a graph (N, E) in which the nodes N represent the fragments of R and S, and each
undirected edge in E connects a fragment of R to a fragment of S whose join is not
intrinsically empty.
Total join graph – The graph contains all possible edges between fragments of R and
fragments of S
Reduced join graph – Some of the edges between fragments of R and fragments
of S are missing
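A join graph can be built by testing every pair of fragments. The sketch below
assumes, purely for illustration, that each fragment is defined by a closed range of
the join attribute, so a join between two fragments is intrinsically empty exactly
when their ranges do not overlap. The fragment names, the ranges and the function
names are all hypothetical.

```python
from itertools import product

# Hypothetical fragments, each described by a (low, high) range of the
# join attribute.
R_FRAGS = {"R1": (1, 10), "R2": (11, 20), "R3": (21, 30)}
S_FRAGS = {"S1": (1, 10), "S2": (11, 30)}


def ranges_overlap(a, b):
    """True if two closed ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


def join_graph(r_frags, s_frags):
    """Edges (r, s) for fragment pairs whose join is not intrinsically empty."""
    return {
        (r, s)
        for r, s in product(r_frags, s_frags)
        if ranges_overlap(r_frags[r], s_frags[s])
    }


def is_total(edges, r_frags, s_frags):
    """A join graph is total when every R fragment is linked to every S fragment."""
    return len(edges) == len(r_frags) * len(s_frags)


edges = join_graph(R_FRAGS, S_FRAGS)
print(sorted(edges))
print("total" if is_total(edges, R_FRAGS, S_FRAGS) else "reduced")
```

With these ranges only three of the six possible edges exist, so the graph is reduced
and only three fragment joins need to be evaluated.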
Allocation of fragments
Decide whether to use non-redundant or redundant allocation.
Non-redundant – The best-fit approach
o A measure is associated with each possible allocation
o The site with the highest measure is selected
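The best-fit approach can be sketched as follows. The notes do not say how the measure
is computed, so this example assumes one plausible choice: for each site, sum over the
applications issued there the execution frequency multiplied by the number of
references each execution makes to the fragment. The access figures for fragment Y2
and the name best_fit_site are made up for illustration.

```python
def best_fit_site(accesses):
    """Pick the site with the highest measure for one fragment.

    accesses: iterable of (site, frequency, refs_per_run) tuples, one per
    application that uses the fragment. The measure (frequency * references,
    summed per site) is an assumed choice, not one given in the notes.
    """
    measure = {}
    for site, frequency, refs_per_run in accesses:
        measure[site] = measure.get(site, 0) + frequency * refs_per_run
    best = max(measure, key=measure.get)
    return best, measure


# Hypothetical workload for fragment Y2: an administrative application at
# site 1, regular applications at site 2, an occasional report at site 3.
y2_accesses = [(1, 5, 10), (2, 20, 4), (3, 1, 2)]
site, measure = best_fit_site(y2_accesses)
print(site, measure)  # 2 {1: 50, 2: 80, 3: 2}
```

With these figures the regular applications at site 2 outweigh the administrative
application at site 1, so the fragment would be placed at site 2. Different frequencies
could tip the trade-off the other way.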