Chapter 5 Distributed Database Design

The document discusses different types of data fragmentation in distributed databases, including horizontal, vertical, and hybrid fragmentation. Horizontal fragmentation splits a relation into subsets of its tuples, such as dividing the PROJ relation into two fragments that each hold some of the projects. Vertical fragmentation splits a relation by its attributes into different relations; an example is splitting the PROJ relation into two relations with attributes {PNO, BUDGET} and {PNO, PNAME, LOC}. Hybrid fragmentation applies both horizontal and vertical fragmentation, such as vertically fragmenting first and then horizontally fragmenting the results.

Uploaded by

Park Nie


- Design of a distributed computer system involves making decisions on the placement of data and programs across the sites of a computer network

- This course concentrates on the distribution of data

Alternative Design Strategies

- Top-Down Design Process (Refer text page 104)
- Bottom-Up Design Process (Refer text page 106)

Reasons for Fragmentation

- A relation is not a suitable unit of distribution because application views are usually subsets of relations. Subsets of relations are therefore more suitable as the unit of distribution.

- If the whole relation is the unit of distribution, there are two alternatives, both with drawbacks:
  - The relation is not replicated (high volume of remote data accesses)
  - The relation is replicated at all or some sites (unnecessary replication causes update and storage problems)

- Fragmentation increases concurrency and system throughput (a query can be divided into subqueries that operate on fragments and execute in parallel)

Disadvantages of Fragmentation

- Performance degradation: if applications have requirements that prevent decomposing the relation into mutually exclusive fragments, the applications whose views are defined on more than one fragment suffer, because they must combine data from several fragments

- Difficulty in semantic data control (integrity checking), because attributes are allocated to different sites as a result of fragmentation

Fragmentation

PROJ
PNO   PNAME   BUDGET    LOC
P1    x       150 000   Montreal
P2    y       135 000   New York
P3    z       250 000   New York

Horizontal

PROJ1
PNO   PNAME   BUDGET    LOC
P1    x       150 000   Montreal
P2    y       135 000   New York

PROJ2
PNO   PNAME   BUDGET    LOC
P3    z       250 000   New York

Vertical

PROJ1
PNO   BUDGET
P1    150 000
P2    135 000
P3    250 000

PROJ2
PNO   PNAME   LOC
P1    x       Montreal
P2    y       New York
P3    z       New York

Note: The primary key (PNO) is included in both vertical fragments
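
As a minimal sketch, the two kinds of fragments above can be produced with a selection and a projection over the PROJ tuples. The split predicate (BUDGET below 200 000) is an assumption that happens to reproduce the horizontal fragments shown; the notes do not state the predicate, and the helper names `select` and `project` are illustrative.

```python
# PROJ relation from the example above, one dict per tuple.
PROJ = [
    {"PNO": "P1", "PNAME": "x", "BUDGET": 150_000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "y", "BUDGET": 135_000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "z", "BUDGET": 250_000, "LOC": "New York"},
]


def select(relation, predicate):
    """Horizontal fragment: the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """Vertical fragment: every tuple restricted to the given attributes."""
    return [{a: t[a] for a in attributes} for t in relation]


# Assumed predicate: split the projects at a budget of 200 000.
horizontal_1 = select(PROJ, lambda t: t["BUDGET"] < 200_000)
horizontal_2 = select(PROJ, lambda t: t["BUDGET"] >= 200_000)

# Vertical fragments; the key PNO is kept in both so they can be rejoined.
vertical_1 = project(PROJ, ["PNO", "BUDGET"])
vertical_2 = project(PROJ, ["PNO", "PNAME", "LOC"])
```

Keeping PNO in both vertical fragments is what later allows the relation to be rebuilt by a join.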

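Degree of Fragmentation
The degree of fragmentation ranges from one extreme, not fragmenting at all, to the other, fragmenting down to individual tuples (horizontal) or individual attributes (vertical)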

Correctness Rules of Fragmentation

- Three rules ensure that the database does not undergo semantic change during fragmentation

Completeness
If a relation R is decomposed into fragments R1, R2, ..., Rn, each data item that can be found in R can also be found in one or more Ri. For horizontal fragmentation, item = tuple; for vertical fragmentation, item = attribute.

Reconstruction
If a relation R is decomposed into fragments R1, R2, ..., Rn, it should be possible to define a relational operator Δ such that

R = Δ Ri, taken over all fragments Ri (1 ≤ i ≤ n)

Δ is the union for horizontal fragmentation and the join for vertical fragmentation.

Disjointness
If a relation R is horizontally decomposed into fragments R1, R2, ..., Rn, and a data item d is in Rj, then d is not in any other fragment Rk (j ≠ k).
For vertical fragmentation, the primary key is repeated in all fragments, so disjointness is defined only on the non-primary-key attributes.
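
The three rules can be checked mechanically. The sketch below is one possible way to do it, assuming the PROJ data from the earlier example, a budget split at 200 000 (an assumption, not stated in the notes), and illustrative function names. For vertical fragments it takes "every fragment contains the key" as the condition that makes reconstruction by join possible.

```python
ATTRS = ("PNO", "PNAME", "BUDGET", "LOC")
PROJ = [
    ("P1", "x", 150_000, "Montreal"),
    ("P2", "y", 135_000, "New York"),
    ("P3", "z", 250_000, "New York"),
]


def horizontal_rules(relation, fragments):
    """Return (complete, reconstructs, disjoint) for horizontal fragments."""
    union = [t for fragment in fragments for t in fragment]
    complete = all(t in union for t in relation)
    reconstructs = sorted(set(union)) == sorted(set(relation))
    # A tuple stored in two fragments shows up twice in the concatenation.
    disjoint = len(union) == len(set(union))
    return complete, reconstructs, disjoint


def vertical_rules(attributes, key, fragments):
    """Return (complete, reconstructs, disjoint) for attribute-set fragments."""
    complete = set().union(*fragments) == set(attributes)
    # A join on the key can rebuild R only if every fragment carries the key.
    reconstructs = all(set(key) <= fragment for fragment in fragments)
    non_key = [fragment - set(key) for fragment in fragments]
    disjoint = sum(len(f) for f in non_key) == len(set().union(*non_key))
    return complete, reconstructs, disjoint


h1 = [t for t in PROJ if t[2] < 200_000]   # assumed predicate
h2 = [t for t in PROJ if t[2] >= 200_000]

print(horizontal_rules(PROJ, [h1, h2]))
print(vertical_rules(ATTRS, {"PNO"}, [{"PNO", "BUDGET"}, {"PNO", "PNAME", "LOC"}]))
```

Passing overlapping fragments such as `[PROJ[:2], PROJ[1:]]` keeps completeness and reconstruction but fails the disjointness check.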

Allocation Alternatives

- Nonreplicated
  - Only one copy of any fragment on the network

- Replication
  - Fully replicated
  - Partially replicated

Horizontal Fragmentation

Information Requirements

1) Database Information
- Concerns the global conceptual schema
- How relations are connected to one another (ER Diagram)

2) Application Information

Qualitative
- Determine the most important predicates used in user queries

- Simple predicates, e.g. SAL > 20 000, TITLE = "Programmer"

- Minterm predicates: conjunctions of simple predicates, each taken in its natural or negated form, e.g.

SAL > 20 000 ∧ TITLE = "Programmer"

Quantitative

Minterm selectivity
- Number of tuples of the relation that would be accessed by a query specified according to a given minterm predicate

Access frequency
- Frequency with which a query accesses data in a given period

Primary Horizontal Fragmentation
- Defined by a selection operation on the owner relations of a database schema

Ri = σFi(R), 1 ≤ i ≤ w

where Fi is the selection formula (a minterm predicate) used to obtain fragment Ri

1) Determine a set of simple predicates, Pr, that is complete and minimal

A set of simple predicates is said to be:

Complete
If and only if there is an equal probability of access by every application to any tuple belonging to any minterm fragment defined according to Pr

Minimal
If all the predicates in Pr are relevant, i.e. each predicate splits some fragment into pieces that at least one application accesses differently

2) Derive the set of minterm predicates from the predicates in Pr. These minterm predicates determine the fragments used as candidates in the allocation step.

3) Eliminate meaningless minterm fragments, i.e. those whose minterm predicates are contradictory.
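
Steps 2 and 3 can be sketched as follows. Only the two simple predicates come from the notes; the EMP tuples, attribute names and function name are hypothetical. Dropping minterms that select no tuple in the sample data is used here as a simple stand-in for eliminating meaningless minterms, which in general is done by reasoning about contradictions among the predicates rather than by inspecting data.

```python
from itertools import product

# Hypothetical employee tuples used only for illustration.
EMP = [
    {"ENO": "E1", "TITLE": "Programmer", "SAL": 24_000},
    {"ENO": "E2", "TITLE": "Programmer", "SAL": 18_000},
    {"ENO": "E3", "TITLE": "Analyst", "SAL": 30_000},
]

# Simple predicates Pr, keyed by a readable label.
SIMPLE = {
    "SAL > 20000": lambda t: t["SAL"] > 20_000,
    'TITLE = "Programmer"': lambda t: t["TITLE"] == "Programmer",
}


def minterm_fragments(relation, simple):
    """Build every minterm (each predicate natural or negated) and its fragment."""
    names = list(simple)
    fragments = {}
    for signs in product([True, False], repeat=len(names)):
        label = " AND ".join(
            name if keep else f"NOT({name})" for name, keep in zip(names, signs)
        )
        rows = [
            t for t in relation
            if all(simple[name](t) == keep for name, keep in zip(names, signs))
        ]
        if rows:  # stand-in for step 3: drop minterms that select nothing
            fragments[label] = rows
    return fragments


for label, rows in minterm_fragments(EMP, SIMPLE).items():
    print(label, [t["ENO"] for t in rows])
```

With n simple predicates there are 2^n candidate minterms, which is why a minimal Pr matters.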

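Derived Horizontal Fragmentation

- Defined on a member relation according to a selection operation specified on its owner relation

Ri = R ⋉ Si, 1 ≤ i ≤ w, where Si = σFi(S)

Here R is the member relation, S is the owner relation, ⋉ is the semijoin, and w is the number of fragments defined on S.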

Refer example 5.12
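
A small sketch of the semijoin definition, using hypothetical PAY (owner) and EMP (member) relations joined on TITLE. The data, the salary threshold of 30 000 and the function name are assumptions made for illustration and are not taken from example 5.12.

```python
# Hypothetical owner relation S and member relation R.
PAY = [
    {"TITLE": "Engineer", "SAL": 40_000},
    {"TITLE": "Analyst", "SAL": 34_000},
    {"TITLE": "Programmer", "SAL": 24_000},
]
EMP = [
    {"ENO": "E1", "TITLE": "Engineer"},
    {"ENO": "E2", "TITLE": "Analyst"},
    {"ENO": "E3", "TITLE": "Programmer"},
    {"ENO": "E4", "TITLE": "Programmer"},
]


def semijoin(member, owner_fragment, attr):
    """R ⋉ Si: member tuples whose join value appears in the owner fragment."""
    values = {s[attr] for s in owner_fragment}
    return [t for t in member if t[attr] in values]


# Primary horizontal fragmentation of the owner (assumed predicates).
PAY1 = [s for s in PAY if s["SAL"] <= 30_000]
PAY2 = [s for s in PAY if s["SAL"] > 30_000]

# Derived horizontal fragmentation of the member.
EMP1 = semijoin(EMP, PAY1, "TITLE")
EMP2 = semijoin(EMP, PAY2, "TITLE")
```

Each employee follows the fragment of the PAY tuple it joins with, so EMP tuples and their matching PAY tuples can be placed at the same site.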

When there is more than one possible derived horizontal fragmentation, the candidate fragmentation is chosen based on 2 criteria:

Refer figure 5.7

1) The fragmentation used in more applications

- Try to facilitate the accesses of heavy users to improve system performance

2) The fragmentation with better join characteristics

- Query execution is faster when the join is performed on smaller relations
- System throughput improves when the joins can be executed in parallel

Checking the Correctness Rules of Fragmentation

Completeness
- Primary horizontal fragmentation
Fragmentation is complete if the set of selection predicates is complete

- Derived horizontal fragmentation
Let R be the member relation, S the owner relation, and A the join attribute.
Then for each tuple t of R, there should be a tuple t' of S such that
t[A] = t'[A]

Reconstruction
- Reconstruction of a global relation from its fragments is performed by the union operator for both primary and derived horizontal fragmentation

Disjointness
- Primary horizontal fragmentation
Disjointness is guaranteed if the minterm predicates are mutually exclusive

- Derived horizontal fragmentation
Disjointness is guaranteed if the join graph is simple

Vertical Fragmentation
Objective
- Partition a relation into smaller relations so that many user applications run on only one fragment

- Minimize the execution time of user applications that run on the fragments: queries deal with smaller relations and therefore cause fewer page accesses

There are 2 heuristic approaches to vertical fragmentation

1) Grouping
- Start by assigning each attribute to its own fragment and, at each step, join some of the fragments until some criterion is satisfied

- Results in overlapping fragments

2) Splitting
- Start with the whole relation and decide on beneficial partitionings based on the access behavior of applications to the attributes

- Results in non-overlapping fragments

Information Requirements of Vertical Fragmentation
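- Vertical partitioning places in one fragment those attributes that are usually accessed together

- Attribute usage value:

use(qi, Aj) = 1 if attribute Aj is referenced by query qi
              0 otherwise

Refer to example 5.15

Note: The use(qi, Aj) values for all queries and attributes form the attribute usage matrix

- Attribute usage values are not sufficient as a basis for attribute splitting and fragmentation because they do not represent the weight of application frequencies. Therefore, we need the attribute affinity, which measures the bond between two attributes by summing the access frequencies of the queries that reference both of them.

Refer to example 5.16

Note: The affinity values for all pairs of attributes form the attribute affinity matrix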
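
The step from usage values to affinities can be sketched as below. The usage matrix and the per-query access frequencies (already summed over all sites) are assumed values, chosen so that the result matches the attribute affinity matrix used in the clustering section of these notes; the function name is illustrative.

```python
ATTRS = ["A1", "A2", "A3", "A4"]

# USE[q][j] = 1 if query q references attribute j (assumed values).
USE = [
    [1, 0, 1, 0],  # q1
    [0, 1, 1, 0],  # q2
    [0, 1, 0, 1],  # q3
    [0, 0, 1, 1],  # q4
]

# ACC[q] = access frequency of query q, summed over all sites (assumed values).
ACC = [45, 5, 75, 3]


def affinity_matrix(use, acc):
    """aff(Ai, Aj) = total frequency of the queries that reference both Ai and Aj."""
    n = len(use[0])
    aff = [[0] * n for _ in range(n)]
    for q, row in enumerate(use):
        for i in range(n):
            for j in range(n):
                if row[i] and row[j]:
                    aff[i][j] += acc[q]
    return aff


for name, row in zip(ATTRS, affinity_matrix(USE, ACC)):
    print(name, row)
```

The matrix is symmetric, and each diagonal entry is the total frequency of the queries that touch that attribute.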

Clustering Algorithm
- The bond energy algorithm (BEA) is used to group the attributes based on their attribute affinity values
- It takes the attribute affinity matrix as input and permutes its rows and columns to generate the clustered affinity (CA) matrix in 3 steps

1) Initialization
Place the first two columns (A1 and A2) of the attribute affinity matrix in the CA matrix

     A1   A2
A1   45    0
A2    0   80
A3   45    5
A4    0   75

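2) Iteration
Place each remaining column in the position with the largest contribution. Each candidate ordering (i-k-j) is scored here by

cont(Ai, Ak, Aj) = 2bond(Ai, Ak) + 2bond(Ak, Aj) - 2bond(Ai, Aj)

where bond(Ax, Ay) is the sum over all attributes Az of aff(Az, Ax) * aff(Az, Ay)

cont(A1, A2, A3) = 2bond(A1, A2) + 2bond(A2, A3) - 2bond(A1, A3)
= 2*225 + 2*890 - 2*4410 = -6590

cont(A1, A3, A2) = 2bond(A1, A3) + 2bond(A3, A2) - 2bond(A1, A2)
= 2*4410 + 2*890 - 2*225 = 10150

cont(A3, A1, A2) = 2bond(A3, A1) + 2bond(A1, A2) - 2bond(A3, A2)
= 2*4410 + 2*225 - 2*890 = 7490

Since the contribution of the ordering (1-3-2) is the largest, A3 is placed between A1 and A2:

     A1   A3   A2
A1   45   45    0
A2    0    5   80
A3   45   53    5
A4    0    3   75

Continue with column A4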

cont(A3, A2, A4) = 2bond(A3, A2) + 2bond(A2, A4) - 2bond(A3, A4)
= 2*890 + 2*11865 - 2*768 = 23974

cont(A3, A4, A2) = 2bond(A3, A4) + 2bond(A4, A2) - 2bond(A3, A2)
= 2*768 + 2*11865 - 2*890 = 23486

cont(A4, A3, A2) = 2bond(A4, A3) + 2bond(A3, A2) - 2bond(A4, A2)
= 2*768 + 2*890 - 2*11865 = -20414

Since the contribution of the ordering (3-2-4) is the largest, A4 is placed to the right of A2:

     A1   A3   A2   A4
A1   45   45    0    0
A2    0    5   80   75
A3   45   53    5    3
A4    0    3   75   78

3) Row ordering
Order the rows in the same way as the columns

     A1   A3   A2   A4
A1   45   45    0    0
A3   45   53    5    3
A2    0    5   80   75
A4    0    3   75   78

- Based on the clustered affinity matrix, the attributes fall into 2 clusters, {A1, A3} and {A2, A4}, which give 2 fragments

- When the partition algorithm is applied to the CA matrix obtained from relation PROJ, the result is the definition of fragments FPROJ = {PROJ1, PROJ2}, where PROJ1 = {A1, A3} and PROJ2 = {A1, A2, A4} (the key A1 is repeated in both fragments)

Thus
PROJ1 = {PNO, BUDGET}
PROJ2 = {PNO, PNAME, LOC}
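
The clustering steps can be sketched in a few lines. This is a minimal version of the bond energy algorithm in its usual formulation, where a new column may also be placed at the far left or far right and the bond with the empty boundary column is taken as 0. The scores for the end positions therefore differ from the triples worked by hand above, but the resulting column order is the same. Function names are illustrative.

```python
# Attribute affinity matrix from the notes (rows and columns A1..A4).
AA = [
    [45, 0, 45, 0],
    [0, 80, 5, 75],
    [45, 5, 53, 3],
    [0, 75, 3, 78],
]


def bond(aa, x, y):
    """Sum over rows z of aff(Az, Ax) * aff(Az, Ay); None is an empty boundary column."""
    if x is None or y is None:
        return 0
    return sum(row[x] * row[y] for row in aa)


def cont(aa, left, mid, right):
    """Net contribution of placing column mid between columns left and right."""
    return 2 * bond(aa, left, mid) + 2 * bond(aa, mid, right) - 2 * bond(aa, left, right)


def bea_order(aa):
    """Return the column order chosen by greedy best-position insertion."""
    order = [0, 1]  # initialization: the first two columns
    for k in range(2, len(aa)):
        padded = [None] + order + [None]
        best = max(
            range(len(order) + 1),
            key=lambda p: cont(aa, padded[p], k, padded[p + 1]),
        )
        order.insert(best, k)
    return order


def clustered(aa, order):
    """Apply the same permutation to rows and columns."""
    return [[aa[i][j] for j in order] for i in order]


order = bea_order(AA)
print([f"A{i + 1}" for i in order])
for row in clustered(AA, order):
    print(row)
```

Because the affinity matrix is symmetric, applying the column order to the rows as well keeps the clustered matrix symmetric, which is what makes the two blocks along the diagonal visible.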

Hybrid / Mixed / Nested Fragmentation

- Sometimes a simple horizontal or vertical fragmentation of a database schema will not be sufficient to satisfy the requirements of user applications

- We may have a vertical fragmentation followed by a horizontal fragmentation, or vice versa, which produces a tree-structured partitioning

Refer to figure 5.19

- To reconstruct the original global relation in the case of hybrid fragmentation, start at the leaves of the partitioning tree and move upward, performing joins (for vertical fragmentation) and unions (for horizontal fragmentation)

Refer to figure 5.20
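
A short sketch of bottom-up reconstruction, assuming a hybrid design in which PROJ is first split vertically into {PNO, BUDGET} and {PNO, PNAME, LOC}, and the first fragment is then split horizontally at a budget of 200 000. This particular design and the helper names are assumptions for illustration; they are not taken from figures 5.19 or 5.20.

```python
PROJ = [
    {"PNO": "P1", "PNAME": "x", "BUDGET": 150_000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "y", "BUDGET": 135_000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "z", "BUDGET": 250_000, "LOC": "New York"},
]

# Vertical split first (assumed design).
v1 = [{"PNO": t["PNO"], "BUDGET": t["BUDGET"]} for t in PROJ]
v2 = [{"PNO": t["PNO"], "PNAME": t["PNAME"], "LOC": t["LOC"]} for t in PROJ]

# Then a horizontal split of the first vertical fragment: these are the leaves.
v1a = [t for t in v1 if t["BUDGET"] < 200_000]
v1b = [t for t in v1 if t["BUDGET"] >= 200_000]


def union(*fragments):
    """Undo a horizontal split (the fragments are assumed to be disjoint)."""
    return [t for fragment in fragments for t in fragment]


def join(left, right, key):
    """Undo a vertical split by joining on the shared key attribute."""
    index = {t[key]: t for t in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


# Leaves first: union the horizontal pieces, then join the vertical pieces.
rebuilt = join(union(v1a, v1b), v2, "PNO")
```

The order of operations mirrors the tree: the union repairs the lower-level horizontal split before the join repairs the vertical split at the root.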
