Chapter 5 Distributed Database Design
- A relation is not a suitable unit of distribution because application views are
usually subsets of relations. Subsets of relations (fragments) are therefore more
suitable as units of distribution.
Disadvantages of fragmentation
PROJ
PNO  PNAME  BUDGET   LOC
P1   x      150 000  Montreal
P2   y      135 000  New York
P3   z      250 000  New York
Horizontal fragmentation
PROJ1
PNO  PNAME  BUDGET   LOC
P1   x      150 000  Montreal
P2   y      135 000  New York
PROJ2
PNO  PNAME  BUDGET   LOC
P3   z      250 000  New York
Vertical fragmentation
PROJ1
PNO  BUDGET
P1   150 000
P2   135 000
P3   250 000
PROJ2
PNO  PNAME  LOC
P1   x      Montreal
P2   y      New York
P3   z      New York
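The two fragmentations above can be reproduced with a selection and a projection. The following is a minimal sketch, not a definitive implementation: the helper names `select` and `project` are illustrative, and the horizontal predicate BUDGET < 200000 is an assumption that happens to match the tuples shown (the notes do not state the predicate).

```python
# PROJ relation from the example; BUDGET is stored as a plain integer.
PROJ = [
    {"PNO": "P1", "PNAME": "x", "BUDGET": 150000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "y", "BUDGET": 135000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "z", "BUDGET": 250000, "LOC": "New York"},
]

def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return [dict(t) for t in relation if predicate(t)]

def project(relation, attributes):
    """Relational projection: keep only the named attributes of each tuple."""
    return [{a: t[a] for a in attributes} for t in relation]

# Horizontal fragmentation (assumed predicate: BUDGET < 200000).
h_proj1 = select(PROJ, lambda t: t["BUDGET"] < 200000)
h_proj2 = select(PROJ, lambda t: t["BUDGET"] >= 200000)

# Vertical fragmentation: the key PNO is kept in both fragments.
v_proj1 = project(PROJ, ["PNO", "BUDGET"])
v_proj2 = project(PROJ, ["PNO", "PNAME", "LOC"])

print([t["PNO"] for t in h_proj1])  # ['P1', 'P2']
print([t["PNO"] for t in h_proj2])  # ['P3']
print(v_proj1[0])                   # {'PNO': 'P1', 'BUDGET': 150000}
```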
Completeness
If a relation R is decomposed into fragments R1, R2, …, Rn, each data item that can be
found in R can also be found in one or more Ri. For horizontal fragmentation a data
item is a tuple; for vertical fragmentation it is an attribute.
Reconstruction
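If a relation R is decomposed into fragments R1, R2, …, Rn, it should be possible to
define a relational operator Δ such that
R = Δ Ri, 1 ≤ i ≤ n
The operator is union for horizontal fragmentation and join for vertical fragmentation.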
Disjointness
If a relation R is horizontally decomposed into fragments R1, R2, …, Rn and a data
item d is in Rj, then d is not in any other fragment Rk (k ≠ j).
For vertical fragmentation, the primary key is repeated in all fragments, so
disjointness is defined only on the non-primary-key attributes.
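The three correctness rules can be checked mechanically on the PROJ example. The sketch below is one possible way to do it, assuming relations are modelled as Python sets of tuples; the helper names `project` and `join_on_key` are illustrative only.

```python
ATTRS = ("PNO", "PNAME", "BUDGET", "LOC")
PROJ = {
    ("P1", "x", 150000, "Montreal"),
    ("P2", "y", 135000, "New York"),
    ("P3", "z", 250000, "New York"),
}

# Horizontal fragments from the example: {P1, P2} and {P3}.
H1 = {t for t in PROJ if t[0] in ("P1", "P2")}
H2 = {t for t in PROJ if t[0] == "P3"}

# Vertical fragments from the example; PNO (the key) appears in both.
V1_ATTRS = ("PNO", "BUDGET")
V2_ATTRS = ("PNO", "PNAME", "LOC")

def project(relation, attrs):
    """Project a set of PROJ tuples onto the named attributes."""
    idx = [ATTRS.index(a) for a in attrs]
    return {tuple(t[i] for i in idx) for t in relation}

def join_on_key(v1, v2):
    """Join V1(PNO, BUDGET) with V2(PNO, PNAME, LOC) on PNO."""
    return {
        (pno, pname, budget, loc)
        for (pno, budget) in v1
        for (pno2, pname, loc) in v2
        if pno == pno2
    }

V1 = project(PROJ, V1_ATTRS)
V2 = project(PROJ, V2_ATTRS)

# Horizontal: data item = tuple, reconstruction operator = union.
horizontal_complete = all(t in H1 or t in H2 for t in PROJ)
horizontal_reconstructs = (H1 | H2) == PROJ
horizontal_disjoint = not (H1 & H2)

# Vertical: data item = attribute, reconstruction operator = join on the key.
vertical_complete = (set(V1_ATTRS) | set(V2_ATTRS)) == set(ATTRS)
vertical_reconstructs = join_on_key(V1, V2) == PROJ
vertical_disjoint = (set(V1_ATTRS) & set(V2_ATTRS)) == {"PNO"}  # only the key is shared
```

All six flags evaluate to True for the fragments shown in the example.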
Allocation Alternatives
- Nonreplicated
  - Only one copy of any fragment on the network
- Replicated
  - Fully replicated
  - Partially replicated
Horizontal Fragmentation
Information Requirements
1) Database Information
- Concerns the global conceptual schema
- How relations are connected to one another (ER Diagram)
2) Application Information
Qualitative
- Determine the most important predicates used in user queries
Quantitative
Minterm selectivity
- Number of tuples that would be accessed by a query specified according to a given
minterm predicate
Access frequency
- Access frequency of a query in a given period
Primary Horizontal Fragmentation
- Selection operation on the owner relations of a database schema
Ri = σ_Fi(R), 1 ≤ i ≤ w
where σ is the selection operator and Fi is the selection formula that defines fragment Ri
Complete
A set of simple predicates Pr is complete if and only if every application has an
equal probability of accessing any tuple that belongs to any minterm fragment
defined according to Pr
Minimal
A set of simple predicates Pr is minimal if all of its predicates are relevant
2) Derive the set of minterm predicates from the predicates in set Pr. These minterm
predicates determine the fragments used as candidates in the allocation step.
When more than one derived horizontal fragmentation is possible, the candidate
fragmentation is chosen according to two criteria.
Completeness
- Primary horizontal fragmentation
Fragmentation is complete if the selection predicates are complete
Reconstruction
- Reconstruction of a global relation from its fragments is performed by the union
operator for primary and derived horizontal fragmentation
Disjointness
- Primary horizontal fragmentation
Disjointness is guaranteed if the minterm predicates are mutually exclusive
2) Splitting
- Start with a relation and decide on a beneficial partitioning based on the access
behavior of applications to the attributes
- Results in non-overlapping fragments
Information Requirements of Vertical Fragmentation
- Vertical partitioning places in one fragment those attributes that are usually
accessed together
- Attribute usage values alone are not sufficient for attribute splitting and
fragmentation because they do not carry the weight of application access frequencies.
Therefore, we need to form the attribute affinity matrix
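As an illustration, the affinity between two attributes can be computed as the total access frequency of the queries that use both. In the sketch below the usage sets `use` and the frequencies `acc` are assumed inputs, chosen so that the result agrees with the affinity values that appear in the worked example that follows; they are not given in these notes.

```python
ATTRS = ["A1", "A2", "A3", "A4"]

# Assumed attribute usage: which attributes each query accesses.
use = {
    "q1": {"A1", "A3"},
    "q2": {"A2", "A3"},
    "q3": {"A2", "A4"},
    "q4": {"A3", "A4"},
}

# Assumed total access frequency of each query, summed over all sites.
acc = {"q1": 45, "q2": 5, "q3": 75, "q4": 3}

def affinity(ai, aj):
    """Sum the frequencies of the queries that access both ai and aj."""
    return sum(acc[q] for q, attrs in use.items() if ai in attrs and aj in attrs)

# Attribute affinity matrix, rows and columns in the order A1..A4.
AA = [[affinity(ai, aj) for aj in ATTRS] for ai in ATTRS]

for name, row in zip(ATTRS, AA):
    print(name, row)
# A1 [45, 0, 45, 0]
# A2 [0, 80, 5, 75]
# A3 [45, 5, 53, 3]
# A4 [0, 75, 3, 78]
```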
1) Initialization
     A1   A2
A1   45    0
A2    0   80
A3   45    5
A4    0   75
2) Iteration
cont(A1, A2, A3) = 2 bond(A1, A2) + 2 bond(A2, A3) - 2 bond(A1, A3)
                 = 2*225 + 2*890 - 2*4410 = -6590
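The bond values used in this calculation can be recomputed from the affinity matrix. The following sketch assumes `AA` is the affinity matrix of the worked example written in the natural attribute order A1..A4, and that bond(Ax, Ay) is the sum over all attributes Az of aff(Az, Ax) * aff(Az, Ay); the function names mirror the notation of the notes.

```python
ATTRS = ["A1", "A2", "A3", "A4"]

# Attribute affinity matrix of the example, rows and columns ordered A1..A4.
AA = [
    [45, 0, 45, 0],
    [0, 80, 5, 75],
    [45, 5, 53, 3],
    [0, 75, 3, 78],
]

def bond(x, y):
    """Bond between two attributes: inner product of their affinity columns."""
    i, j = ATTRS.index(x), ATTRS.index(y)
    return sum(AA[z][i] * AA[z][j] for z in range(len(ATTRS)))

def cont(left, middle, right):
    """Net contribution of placing `middle` between `left` and `right`."""
    return 2 * bond(left, middle) + 2 * bond(middle, right) - 2 * bond(left, right)

print(bond("A1", "A2"))        # 225
print(bond("A2", "A3"))        # 890
print(bond("A1", "A3"))        # 4410
print(cont("A1", "A2", "A3"))  # -6590
print(cont("A1", "A3", "A2"))  # 10150
```

The last line evaluates the ordering A1, A3, A2, which is the column order that appears in the matrix after this step.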
     A1   A3   A2
A1   45   45    0
A2    0    5   80
A3   45   53    5
A4    0    3   75
Continue with column A4
     A1   A3   A2   A4
A1   45   45    0    0
A2    0    5   80   75
A3   45   53    5    3
A4    0    3   75   78
3) Row ordering
     A1   A3   A2   A4
A1   45   45    0    0
A3   45   53    5    3
A2    0    5   80   75
A4    0    3   75   78
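The column and row reordering can be reproduced by applying the same permutation first to the columns and then to the rows of the affinity matrix. This is a small sketch that assumes the final order A1, A3, A2, A4 has already been determined; `AA` is the affinity matrix in the natural order A1..A4.

```python
ATTRS = ["A1", "A2", "A3", "A4"]
AA = [
    [45, 0, 45, 0],
    [0, 80, 5, 75],
    [45, 5, 53, 3],
    [0, 75, 3, 78],
]

order = ["A1", "A3", "A2", "A4"]
idx = [ATTRS.index(a) for a in order]

# Steps 1 and 2: place the columns in the chosen order (rows still A1..A4).
columns_reordered = [[row[j] for j in idx] for row in AA]

# Step 3: put the rows in the same order as the columns.
CA = [columns_reordered[i] for i in idx]

for name, row in zip(order, CA):
    print(name, row)
# A1 [45, 45, 0, 0]
# A3 [45, 53, 5, 3]
# A2 [0, 5, 80, 75]
# A4 [0, 3, 75, 78]
```

In the reordered matrix the large values sit in the upper-left block (A1, A3) and the lower-right block (A2, A4), while the values linking the two blocks are small.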
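Thus, with A1 = PNO, A2 = PNAME, A3 = BUDGET and A4 = LOC, the clusters {A1, A3} and
{A2, A4} give the following fragments (the key PNO is repeated in both)
PROJ1 = {PNO, BUDGET}
PROJ2 = {PNO, PNAME, LOC}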