Query

The document discusses distributed query processing, which maps a query on distributed data to separate queries on individual fragments. Processing proceeds through four stages: query mapping, localization, global optimization, and local optimization. Optimization aims to reduce data transfer costs by considering strategies such as executing joins at the sites where the relations reside. Semi-joins can also reduce data transfer by projecting and shipping only the subset of tuples and attributes required for the join rather than entire relations.

Uploaded by

Tirth Nisar

DISTRIBUTED QUERY PROCESSING
Distributed Query Processing
◦ There are various steps that are followed for query processing.
◦ A distributed database query is processed in stages as follows:
1. Query Mapping:
◦ The input query on distributed data is specified formally using a query language.
◦ It is then translated into an algebraic query on global relations.
◦ This translation is done by referring to the global conceptual schema.
◦ This translation is largely identical to the one performed in a centralized DBMS.
◦ It is first normalized, analyzed for semantic errors, simplified, and finally restructured into an algebraic
query.
2. Localization:
◦ This stage maps the distributed query on the global schema to separate queries on individual fragments
using data distribution and replication information.
3. Global Query Optimization:
◦ Optimization consists of selecting a strategy from a list of candidates that is closest to optimal.
◦ A list of candidate queries can be obtained by permuting the ordering of operations within a fragment
query generated by the previous stage.
◦ The total cost is a weighted combination of costs such as CPU cost, I/O costs, and communication costs.
4. Local Query Optimization:
◦ This stage is common to all sites in the DDB.
◦ The techniques are similar to those used in centralized systems.

The first three stages discussed above are performed at a central control site, while the last stage is
performed locally.
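
The weighted cost comparison described in stage 3 can be sketched as follows. This is only an illustration: the candidate names, the per-strategy cost figures, and the weights are hypothetical values, since the slides do not specify any.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrategyCost:
    """Estimated cost components of one candidate strategy (hypothetical units)."""
    name: str
    cpu: float
    io: float
    comm: float


def total_cost(s, w_cpu=1.0, w_io=1.0, w_comm=1.0):
    # Total cost is a weighted combination of CPU, I/O, and communication costs.
    return w_cpu * s.cpu + w_io * s.io + w_comm * s.comm


def choose_strategy(candidates, **weights):
    # Pick the candidate with the lowest weighted total cost.
    return min(candidates, key=lambda s: total_cost(s, **weights))


# Two made-up candidates: one ships a lot of data, one does more local work.
candidates = [
    StrategyCost("ship_both", cpu=2.0, io=10.0, comm=100.0),
    StrategyCost("local_heavy", cpu=60.0, io=70.0, comm=10.0),
]

best_equal_weights = choose_strategy(candidates)
best_slow_network = choose_strategy(candidates, w_comm=10.0)
print(best_equal_weights.name, best_slow_network.name)
```

With equal weights the data-shipping candidate wins, but once communication is weighted ten times more heavily (as it might be on a slow network) the locally intensive candidate becomes cheaper.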
Data Transfer Costs of Distributed Query Processing
◦ In a distributed system, several additional factors further complicate query processing.
◦ The first is the cost of transferring data over the network.
◦ This data includes intermediate files that are transferred to other sites for further processing, as well as
the final result files that may have to be transferred to the site where the query result is needed.
◦ These costs may not be very high if the sites are connected via a high-performance local area network,
but they become quite significant in other types of networks.
◦ DDBMS query optimization algorithms consider the goal of reducing the amount of data transfer as an
optimization criterion in choosing a distributed query execution strategy.
Example:
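◦ Suppose that the EMPLOYEE and DEPARTMENT relations are distributed at two sites: EMPLOYEE at site 1
and DEPARTMENT at site 2.
◦ As the byte counts in the strategies below imply, EMPLOYEE holds 10,000 records totaling 1,000,000
bytes, and DEPARTMENT holds 100 records totaling 3,500 bytes.
◦ Consider the query Q: for each employee, retrieve the employee name and the name of the department
for which the employee works.
◦ The result of Q contains 10,000 records; suppose that each record in the query result is 40 bytes long.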
◦ The query is submitted at a distinct site 3, which is called the result site because the query result is
needed there.
◦ Neither the EMPLOYEE nor the DEPARTMENT relations reside at site 3.
◦ There are three simple strategies for executing this distributed query:
1. First:
◦ Transfer both the EMPLOYEE and the DEPARTMENT relations to the result site, and perform the join at
site 3.
◦ In this case, a total of 1,000,000 + 3,500 = 1,003,500 bytes must be transferred.
2. Second:
◦ Transfer the EMPLOYEE relation to site 2, execute the join at site 2, and send the result to site 3.
◦ The size of the query result is 40 * 10,000 = 400,000 bytes, so 400,000 + 1,000,000 = 1,400,000 bytes
must be transferred.
3. Third:
◦ Transfer the DEPARTMENT relation to site 1, execute the join at site 1, and send the result to site 3.
◦ In this case, 400,000 + 3,500 = 403,500 bytes must be transferred.
*If minimizing the amount of data transfer is our optimization criterion, we should choose strategy 3.
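
The arithmetic behind the three strategies can be checked with a short sketch. The relation sizes and the 40-byte result record come from the example above; the function and variable names are illustrative only.

```python
# Sizes taken from the example: EMPLOYEE at site 1, DEPARTMENT at site 2.
EMPLOYEE_BYTES = 1_000_000
DEPARTMENT_BYTES = 3_500
RESULT_RECORD_BYTES = 40


def transfer_costs(result_records):
    """Bytes transferred by each of the three simple strategies."""
    result_bytes = RESULT_RECORD_BYTES * result_records
    return {
        1: EMPLOYEE_BYTES + DEPARTMENT_BYTES,  # ship both relations to site 3
        2: EMPLOYEE_BYTES + result_bytes,      # ship EMPLOYEE to site 2, result to site 3
        3: DEPARTMENT_BYTES + result_bytes,    # ship DEPARTMENT to site 1, result to site 3
    }


costs_q = transfer_costs(result_records=10_000)
best_q = min(costs_q, key=costs_q.get)
print(costs_q)  # {1: 1003500, 2: 1400000, 3: 403500}
print(best_q)   # 3
```

Changing `result_records` shows how strongly the size of the query result influences strategies 2 and 3, while strategy 1 is unaffected by it.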
◦ Now consider another query Q’:
◦ For each department, retrieve the department name and the name of the department manager.
◦ This can be stated as follows in the relational algebra:
Q’: π_{Fname, Lname, Dname}(DEPARTMENT ⋈_{Mgr_ssn=Ssn} EMPLOYEE)
◦ Suppose that the query is submitted at site 3.

◦ The same three strategies for executing query Q apply to Q’, except that the result of Q’ includes only
100 records, assuming that each department has a manager:
1. First:
◦ Transfer both the EMPLOYEE and the DEPARTMENT relations to the result site, and perform the join at
site 3.
◦ In this case, a total of 1,000,000 + 3,500 = 1,003,500 bytes must be transferred.
2. Second:
◦ Transfer the EMPLOYEE relation to site 2, execute the join at site 2, and send the result to site 3.
◦ The size of the query result is 40 * 100 = 4,000 bytes, so 4,000 + 1,000,000 = 1,004,000 bytes must be
transferred.
3. Third:
◦ Transfer the DEPARTMENT relation to site 1, execute the join at site 1, and send the result to site 3.
◦ In this case, 4,000 + 3,500 = 7,500 bytes must be transferred.
Distributed Query Processing Semi Join
◦ The idea behind distributed query processing using the semijoin operation is to reduce
the number of tuples in a relation before transferring it to another site.
◦ Intuitively, the idea is to send the joining column of one relation R to the site where the
other relation S is located; this column is then joined with S.
◦ Following that, the join attributes, along with the attributes required in the result, are
projected out and shipped back to the original site and joined with R.
◦ Hence, only the joining column of R is transferred in one direction, and a subset of S
with no extraneous tuples or attributes is transferred in the other direction.
◦ If only a small fraction of the tuples in S participate in the join, this can be quite an
efficient solution to minimizing data transfer.
◦ Consider the following strategy for executing Q or Q’:
1. Project the join attributes of DEPARTMENT at site 2, and transfer them to site 1.
For Q, we transfer F = π_{Dnumber}(DEPARTMENT), whose size is 4 * 100 = 400
bytes, whereas, for Q’, we transfer F’ = π_{Mgr_ssn}(DEPARTMENT), whose size is
9 * 100 = 900 bytes.
2. Join the transferred file with the EMPLOYEE relation at site 1, and transfer the
required attributes from the resulting file to site 2. For Q, we transfer
R = π_{Dno, Fname, Lname}(F ⋈_{Dnumber=Dno} EMPLOYEE), whose size is 34 * 10,000 = 340,000
bytes, whereas, for Q’, we transfer R’ = π_{Mgr_ssn, Fname, Lname}(F’ ⋈_{Mgr_ssn=Ssn}
EMPLOYEE), whose size is 39 * 100 = 3,900 bytes.
3. Execute the query by joining the transferred file R or R’ with DEPARTMENT,
and present the result to the user at the result site.
◦ Using this strategy, we transfer 340,400 bytes for Q and 4,800 bytes for Q’.
◦ We limited the EMPLOYEE attributes and tuples transmitted to site 2 in step 2 to
only those that will actually be joined with a DEPARTMENT tuple in step 3.
◦ For query Q, this turned out to include all EMPLOYEE tuples, so little
improvement was achieved.
◦ However, for Q’ only 100 out of the 10,000 EMPLOYEE tuples were needed.
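
The transfer totals for the semijoin strategy can be reproduced with a small sketch. All byte sizes and record counts are taken from steps 1 and 2 above, and the best simple-strategy totals (403,500 and 7,500 bytes) come from the earlier example; the function name is illustrative.

```python
DEPARTMENT_RECORDS = 100


def semijoin_transfer(join_attr_bytes, returned_record_bytes, returned_records):
    """Return (bytes sent site 2 -> site 1, bytes sent site 1 -> site 2, total)."""
    forward = join_attr_bytes * DEPARTMENT_RECORDS       # projected join column F or F'
    backward = returned_record_bytes * returned_records  # reduced EMPLOYEE file R or R'
    return forward, backward, forward + backward


# Q: join on Dnumber (4 bytes); all 10,000 EMPLOYEE tuples match, 34 bytes each.
q = semijoin_transfer(4, 34, 10_000)
# Q': join on Mgr_ssn (9 bytes); only 100 EMPLOYEE tuples match, 39 bytes each.
q_prime = semijoin_transfer(9, 39, 100)

print(q)        # (400, 340000, 340400)
print(q_prime)  # (900, 3900, 4800)

# Comparison against the best simple strategy (strategy 3) from the earlier example.
print(403_500 - q[2])      # bytes saved for Q
print(7_500 - q_prime[2])  # bytes saved for Q'
```

The reduction for Q comes only from dropping unneeded attributes, whereas for Q’ it also comes from dropping the EMPLOYEE tuples that do not join.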
◦ The semijoin operation was devised to formalize this strategy.
◦ A semijoin operation R ⋉_{A=B} S, where A and B are domain-compatible attributes
of R and S, respectively, produces the same result as the relational algebra
expression π_{R}(R ⋈_{A=B} S), that is, the tuples of R that join with at least one tuple of S.
◦ In a distributed environment where R and S reside at different sites, the semijoin is
typically implemented by first transferring F = π_{B}(S) to the site where R resides
and then joining F with R.
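
A minimal in-memory sketch of the semijoin follows, with relations modeled as lists of dictionaries. The sample rows are hypothetical, and the second function exists only to check the result against the projection of the full join on a small input.

```python
def semijoin(r, s, a, b):
    """Tuples of r whose attribute a matches attribute b of at least one tuple of s."""
    f = {row[b] for row in s}  # F = projection of S on B: the only data shipped to R's site
    return [row for row in r if row[a] in f]


def project_r_of_join(r, s, a, b):
    """Reference version: join r with s on a = b, then keep only r's attributes."""
    out = []
    for r_row in r:
        for s_row in s:
            if r_row[a] == s_row[b] and r_row not in out:
                out.append(r_row)
    return out


# Hypothetical sample data using the attribute names from the example.
employee = [
    {"Ssn": "111", "Lname": "Smith", "Dno": 5},
    {"Ssn": "222", "Lname": "Wong", "Dno": 4},
    {"Ssn": "333", "Lname": "Zelaya", "Dno": 4},
]
department = [
    {"Dnumber": 4, "Dname": "Administration", "Mgr_ssn": "333"},
    {"Dnumber": 5, "Dname": "Research", "Mgr_ssn": "222"},
]

managers = semijoin(employee, department, "Ssn", "Mgr_ssn")
print([m["Lname"] for m in managers])  # ['Wong', 'Zelaya']
```

Building `f` as a set mirrors the transfer of the projected join column: only the distinct join values from S are needed to decide which tuples of R survive.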
