Query Processing and Query Optimization

Uploaded by

Chiran Govinna

22/05/2024

CO527 Advanced Database Systems
Dr. (Eng.) Sampath Deegalla

Overview

• Query Processing
  • Steps of query processing
  • Translating SQL queries into Relational Algebra
  • Algorithms for external sorting
  • Algorithms for SELECT and JOIN operations
  • Algorithms for PROJECT and Set operations
  • Combining operations using pipelining
• Query Optimization
  • Using Heuristics in query optimization
  • Using Selectivity and Cost Estimates in query optimization

Query Processing

Translating SQL queries into Relational Algebra

• An SQL query is first translated into an equivalent extended relational algebra expression.
• It is represented as a query tree data structure (or query graph).
• The query is then optimized by choosing a suitable query execution strategy.
• SQL queries are decomposed into query blocks.
• A query block contains a single SELECT-FROM-WHERE expression.
• Therefore, nested queries should be identified as separate blocks.

Query Tree

• A query tree includes the relational algebra operations being executed and is used as a possible data structure for the internal representation of the query in an RDBMS.

Relational algebra operations

[Figure: relational algebra operations]

Example of a query tree


• For every project located in ‘Stafford’, list the project number, the
controlling department number, and the department manager’s last
name, address, and birth date.

Example of a query tree (cont.)

[Figure: query tree for the example query]

Alternative ways to evaluate a query

[Figure: alternative ways to evaluate a query]

Execution plan

• An execution plan defines exactly which algorithm is used for each operation and how the execution of the operations is coordinated.

General Approaches to Optimization

• Heuristics in query optimization
  • Given a query, perform selections and projections as early as possible.
  • Eliminate duplicate computations.
• Cost-based query optimization
  • Estimate the cost of different equivalent query expressions and choose the execution plan with the lowest cost estimate.

Using Heuristics in Query Optimization

• Process for heuristic optimization
1. The parser of a high-level query generates an initial internal representation.
2. Heuristic rules are applied to optimize the internal representation.
3. A query execution plan is generated to execute groups of operations based on the access paths available on the files involved in the query.
• The main heuristic is to apply first the operations that reduce the size of intermediate results.
  • E.g., apply SELECT and PROJECT operations before applying the JOIN or other binary operations.

• Query tree: a tree data structure that corresponds to a relational algebra expression. It represents the input relations of the query as leaf nodes of the tree, and represents the relational algebra operations as internal nodes.
• An execution of the query tree consists of executing an internal node operation whenever its operands are available and then replacing that internal node by the relation that results from executing the operation.
• Query graph: a graph data structure that corresponds to a relational calculus expression. It does not indicate an order in which to perform the operations. There is only a single graph corresponding to each query.
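The query tree definition and its bottom-up execution can be sketched in Python. This is only an illustration under assumptions, not how an RDBMS implements it: the class names (Leaf, Select, Project, Join), the execute function, and the sample rows are all invented for this example. The tree encodes the 'Stafford' query from the earlier example.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:                      # leaf node: an input relation
    name: str


@dataclass
class Select:                    # internal node: sigma
    pred: Callable[[dict], bool]
    child: "Node"


@dataclass
class Project:                   # internal node: pi (duplicates removed)
    attrs: list
    child: "Node"


@dataclass
class Join:                      # internal node: join with a condition
    pred: Callable[[dict], bool]
    left: "Node"
    right: "Node"


Node = Union[Leaf, Select, Project, Join]


def execute(node, db):
    """Execute operands first, then replace the node by its result relation."""
    if isinstance(node, Leaf):
        return db[node.name]
    if isinstance(node, Select):
        return [t for t in execute(node.child, db) if node.pred(t)]
    if isinstance(node, Project):
        seen, out = set(), []
        for t in execute(node.child, db):
            key = tuple(t[a] for a in node.attrs)
            if key not in seen:
                seen.add(key)
                out.append(dict(zip(node.attrs, key)))
        return out
    if isinstance(node, Join):
        left, right = execute(node.left, db), execute(node.right, db)
        merged = ({**l, **r} for l in left for r in right)
        return [t for t in merged if node.pred(t)]
    raise TypeError(f"unknown node type: {node!r}")


# Hypothetical sample data (not from the slides).
db = {
    "PROJECT": [
        {"PNUMBER": 10, "PLOCATION": "STAFFORD", "DNUM": 4},
        {"PNUMBER": 1, "PLOCATION": "BELLAIRE", "DNUM": 5},
    ],
    "DEPARTMENT": [
        {"DNUMBER": 4, "MGRSSN": "111"},
        {"DNUMBER": 5, "MGRSSN": "222"},
    ],
    "EMPLOYEE": [
        {"SSN": "111", "LNAME": "Perera", "ADDRESS": "Kandy", "BDATE": "1970-01-01"},
        {"SSN": "222", "LNAME": "Silva", "ADDRESS": "Galle", "BDATE": "1965-05-05"},
    ],
}

q2 = Project(
    ["PNUMBER", "DNUM", "LNAME", "ADDRESS", "BDATE"],
    Join(
        lambda t: t["MGRSSN"] == t["SSN"],
        Join(
            lambda t: t["DNUM"] == t["DNUMBER"],
            Select(lambda t: t["PLOCATION"] == "STAFFORD", Leaf("PROJECT")),
            Leaf("DEPARTMENT"),
        ),
        Leaf("EMPLOYEE"),
    ),
)

result = execute(q2, db)
print(result)
```

Each recursive call returns a fully built list, so this sketch corresponds to evaluating every internal node as soon as its operands are available and replacing it with the resulting relation.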

• Example:
For every project located in 'Stafford', retrieve the project number, the controlling department number, and the department manager's last name, address, and birth date.

Relational algebra:
π_{PNUMBER, DNUM, LNAME, ADDRESS, BDATE} (((σ_{PLOCATION='STAFFORD'} (PROJECT)) ⋈_{DNUM=DNUMBER} (DEPARTMENT)) ⋈_{MGRSSN=SSN} (EMPLOYEE))

SQL query:
Q2: SELECT P.PNUMBER, P.DNUM, E.LNAME, E.ADDRESS, E.BDATE
FROM PROJECT AS P, DEPARTMENT AS D, EMPLOYEE AS E
WHERE P.DNUM=D.DNUMBER AND D.MGRSSN=E.SSN AND
P.PLOCATION='STAFFORD';

Query Trees

[Figure: panels (a) and (b)]

Heuristic Optimization of Query Trees:
• The same query could correspond to many different relational algebra expressions, and hence many different query trees.
• The task of heuristic optimization of query trees is to find a final query tree that is efficient to execute.
• Example:
Q: SELECT LNAME
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE PNAME = 'AQUARIUS' AND PNUMBER=PNO
AND ESSN=SSN AND BDATE > '1957-12-31';

[Figure: (a) Initial query tree; (b) Moving SELECT operations down the query tree]
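To see why moving SELECT operations down the tree matters, the sketch below evaluates query Q in two ways over a tiny, made-up database and records the largest intermediate result each strategy builds. The rows, the function names initial_plan and optimized_plan, and the reduced attribute lists are assumptions for illustration only.

```python
from itertools import product

# Hypothetical sample data: 6 employees, 4 WORKS_ON rows, 3 projects.
EMPLOYEE = [
    {"SSN": i, "LNAME": f"E{i}", "BDATE": "1960-01-01" if i % 2 else "1950-01-01"}
    for i in range(1, 7)
]
WORKS_ON = [
    {"ESSN": 1, "PNO": 1},
    {"ESSN": 2, "PNO": 1},
    {"ESSN": 3, "PNO": 2},
    {"ESSN": 4, "PNO": 3},
]
PROJECT = [
    {"PNUMBER": 1, "PNAME": "AQUARIUS"},
    {"PNUMBER": 2, "PNAME": "OTHER"},
    {"PNUMBER": 3, "PNAME": "THIRD"},
]


def initial_plan():
    """Cartesian product of all three relations, then one big SELECT."""
    cross = [{**e, **w, **p} for e, w, p in product(EMPLOYEE, WORKS_ON, PROJECT)]
    selected = [
        t for t in cross
        if t["PNAME"] == "AQUARIUS"
        and t["PNUMBER"] == t["PNO"]
        and t["ESSN"] == t["SSN"]
        and t["BDATE"] > "1957-12-31"
    ]
    return [t["LNAME"] for t in selected], len(cross)


def optimized_plan():
    """SELECTs applied to the base relations first, then joins."""
    p = [t for t in PROJECT if t["PNAME"] == "AQUARIUS"]
    pw = [{**a, **b} for a in p for b in WORKS_ON if a["PNUMBER"] == b["PNO"]]
    e = [t for t in EMPLOYEE if t["BDATE"] > "1957-12-31"]
    joined = [{**a, **b} for a in pw for b in e if a["ESSN"] == b["SSN"]]
    largest = max(len(p), len(pw), len(e), len(joined))
    return [t["LNAME"] for t in joined], largest


names_a, largest_a = initial_plan()
names_b, largest_b = optimized_plan()
print(names_a, largest_a)
print(names_b, largest_b)
```

Both strategies return the same answer, but the first one materializes the full product (6 × 4 × 3 tuples here) before filtering, while the second never holds more than a few tuples at once.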

[Figure: (c) Applying the more restrictive SELECT operation first; (d) Replacing CARTESIAN PRODUCT and SELECT operations with JOIN operations; (e) Moving PROJECT operations down the query tree]

General Transformation Rules for Relational Algebra Operations:
1. Cascade of σ: A conjunctive selection condition can be broken up into a cascade (sequence) of individual σ operations:
   σ_{c1 AND c2 AND ... AND cn}(R) ≡ σ_{c1}(σ_{c2}(...(σ_{cn}(R))...))
2. Commutativity of σ: The σ operation is commutative:
   σ_{c1}(σ_{c2}(R)) ≡ σ_{c2}(σ_{c1}(R))
3. Cascade of π: In a cascade (sequence) of π operations, all but the last one can be ignored:
   π_{List1}(π_{List2}(...(π_{Listn}(R))...)) ≡ π_{List1}(R)
4. Commuting σ with π: If the selection condition c involves only the attributes A1, ..., An in the projection list, the two operations can be commuted:
   π_{A1, A2, ..., An}(σ_c(R)) ≡ σ_c(π_{A1, A2, ..., An}(R))
5. Commutativity of ⋈ (and ×): The ⋈ operation is commutative, as is the × operation:
   R ⋈_c S ≡ S ⋈_c R;  R × S ≡ S × R
6. Commuting σ with ⋈ (or ×): If all the attributes in the selection condition c involve only the attributes of one of the relations being joined, say R, the two operations can be commuted as follows:
   σ_c(R ⋈ S) ≡ (σ_c(R)) ⋈ S
   Alternatively, if the selection condition c can be written as (c1 AND c2), where condition c1 involves only the attributes of R and condition c2 involves only the attributes of S, the operations commute as follows:
   σ_c(R ⋈ S) ≡ (σ_{c1}(R)) ⋈ (σ_{c2}(S))
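Rules 1, 2, and 6 can be spot-checked on a small instance. The sketch below is an assumption-laden illustration: the relations R and S, the conditions, and the helper names (select, join, equivalent) are invented here, and agreement on one instance demonstrates the rules but does not prove them.

```python
def select(pred, rel):
    """sigma: keep the tuples that satisfy pred."""
    return [t for t in rel if pred(t)]


def join(pred, r, s):
    """Join: combine every pair of tuples, keep those satisfying pred."""
    merged = ({**a, **b} for a in r for b in s)
    return [t for t in merged if pred(t)]


def equivalent(x, y):
    """Compare two relations as sets of tuples, ignoring row order."""
    def canon(rel):
        return {tuple(sorted(t.items())) for t in rel}
    return canon(x) == canon(y)


# Hypothetical relations R(A, B) and S(C, D).
R = [{"A": 1, "B": 10}, {"A": 2, "B": 20}, {"A": 3, "B": 30}]
S = [{"C": 1, "D": "x"}, {"C": 3, "D": "y"}, {"C": 3, "D": "z"}]

c1 = lambda t: t["A"] >= 2        # uses only attributes of R
c2 = lambda t: t["B"] <= 20       # uses only attributes of R
cs = lambda t: t["D"] != "y"      # uses only attributes of S
on = lambda t: t["A"] == t["C"]   # join condition

# Rule 1: a conjunctive selection equals a cascade of selections.
rule1 = equivalent(select(lambda t: c1(t) and c2(t), R), select(c1, select(c2, R)))
# Rule 2: selections commute.
rule2 = equivalent(select(c1, select(c2, R)), select(c2, select(c1, R)))
# Rule 6: a selection on R's attributes can move below the join.
rule6a = equivalent(select(c1, join(on, R, S)), join(on, select(c1, R), S))
# Rule 6 (second form): c = c1 AND cs splits across both operands.
rule6b = equivalent(
    select(lambda t: c1(t) and cs(t), join(on, R, S)),
    join(on, select(c1, R), select(cs, S)),
)
print(rule1, rule2, rule6a, rule6b)
```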

General Transformation Rules for Relational Algebra Operations (cont.):
7. Commuting π with ⋈ (or ×): Suppose that the projection list is L = {A1, ..., An, B1, ..., Bm}, where A1, ..., An are attributes of R and B1, ..., Bm are attributes of S. If the join condition c involves only attributes in L, the two operations can be commuted as follows:
   π_L(R ⋈_c S) ≡ (π_{A1, ..., An}(R)) ⋈_c (π_{B1, ..., Bm}(S))
   If the join condition c contains additional attributes not in L, these must be added to the projection list, and a final π operation is needed.
8. Commutativity of set operations: The set operations ∪ and ∩ are commutative but − is not.
9. Associativity of ⋈, ×, ∪, and ∩: These four operations are individually associative; that is, if θ stands for any one of these four operations (throughout the expression), we have
   (R θ S) θ T ≡ R θ (S θ T)
10. Commuting σ with set operations: The σ operation commutes with ∪, ∩, and −. If θ stands for any one of these three operations, we have
   σ_c(R θ S) ≡ (σ_c(R)) θ (σ_c(S))
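The set-operation rules (8, 9, and 10) map directly onto Python sets, which makes them easy to try out. In this sketch the relations R, S, T, the condition c, and the helper select are made up for illustration; relations are modelled as sets of tuples, and the checks cover a single instance only.

```python
import operator


def select(pred, rel):
    """sigma over a relation modelled as a set of tuples."""
    return {t for t in rel if pred(t)}


# Hypothetical union-compatible relations.
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (3, "c"), (4, "d")}
T = {(3, "c"), (5, "e")}
c = lambda t: t[0] >= 3

set_ops = {
    "union": operator.or_,
    "intersection": operator.and_,
    "difference": operator.sub,
}

# Rule 10: sigma_c(R op S) == sigma_c(R) op sigma_c(S) for each set operation.
rule10 = {
    name: select(c, op(R, S)) == op(select(c, R), select(c, S))
    for name, op in set_ops.items()
}

# Rule 8: union and intersection commute, difference does not.
union_commutes = (R | S) == (S | R)
intersection_commutes = (R & S) == (S & R)
difference_commutes = (R - S) == (S - R)

# Rule 9: union and intersection are associative.
union_associative = ((R | S) | T) == (R | (S | T))
intersection_associative = ((R & S) & T) == (R & (S & T))

print(rule10, union_commutes, intersection_commutes, difference_commutes)
```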

General Transformation Rules for Relational Algebra Operations (cont.):
11. Commuting π with ∪: The π operation commutes with ∪:
   π_L(R ∪ S) ≡ (π_L(R)) ∪ (π_L(S))
12. Converting a (σ, ×) sequence into ⋈: If the condition c of a σ that follows a × corresponds to a join condition, convert the (σ, ×) sequence into a ⋈ as follows:
   (σ_c(R × S)) ≡ (R ⋈_c S)
13. Other transformations

Outline of a Heuristic Algebraic Optimization Algorithm:
1. Using rule 1, break up any select operations with conjunctive conditions into a cascade of select operations.
2. Using rules 2, 4, 6, and 10 concerning the commutativity of select with other operations, move each select operation as far down the query tree as is permitted by the attributes involved in the select condition.
3. Using rule 9 concerning associativity of binary operations, rearrange the leaf nodes of the tree so that the leaf node relations with the most restrictive select operations are executed first in the query tree representation.
4. Using rule 12, combine a Cartesian product operation with a subsequent select operation in the tree into a join operation.
5. Using rules 3, 4, 7, and 11 concerning the cascading of project and the commuting of project with other operations, break down and move lists of projection attributes down the tree as far as possible by creating new project operations as needed.
6. Identify subtrees that represent groups of operations that can be executed by a single algorithm.

Summary of Heuristics for Algebraic Optimization:
1. The main heuristic is to apply first the operations that reduce the size of intermediate results.
2. Perform select operations as early as possible to reduce the number of tuples and perform project operations as early as possible to reduce the number of attributes. (This is done by moving select and project operations as far down the tree as possible.)
3. The select and join operations that are most restrictive should be executed before other similar operations. (This is done by reordering the leaf nodes of the tree among themselves and adjusting the rest of the tree appropriately.)
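Steps 1, 2, and 4 of the outline can be sketched on a toy tree representation. This is a simplified illustration under assumptions, not a full optimizer: nodes are plain tuples, each condition carries the set of attributes it mentions, the helper names (rel, cond, attrs, push, optimize, show) are invented here, and steps 3, 5, and 6 (leaf reordering, projection push-down, grouping into algorithms) are not implemented. The input is the initial tree for the AQUARIUS query without its final projection.

```python
def rel(name, *names):
    return ("rel", name, frozenset(names))


def cond(text, *names):
    return (frozenset(names), text)       # (attributes used, display text)


def attrs(node):
    """Attributes available at the output of a node."""
    if node[0] == "rel":
        return node[2]
    if node[0] == "select":
        return attrs(node[2])
    return attrs(node[-2]) | attrs(node[-1])       # product or join


def push(c, node):
    """Place one condition as far down the tree as its attributes allow."""
    kind = node[0]
    if kind == "select":                            # rule 2: slide past a select
        return ("select", node[1], push(c, node[2]))
    if kind in ("product", "join"):
        left, right = node[-2], node[-1]
        if c[0] <= attrs(left):                     # rule 6: left operand only
            return node[:-2] + (push(c, left), right)
        if c[0] <= attrs(right):                    # rule 6: right operand only
            return node[:-2] + (left, push(c, right))
        joined = node[1] + [c] if kind == "join" else [c]
        return ("join", joined, left, right)        # rule 12: becomes a join
    return ("select", [c], node)                    # directly above a relation


def optimize(node):
    kind = node[0]
    if kind == "rel":
        return node
    if kind == "select":
        child = optimize(node[2])
        for c in node[1]:                           # rule 1: one conjunct at a time
            child = push(c, child)
        return child
    return node[:-2] + (optimize(node[-2]), optimize(node[-1]))


def show(node):
    kind = node[0]
    if kind == "rel":
        return node[1]
    if kind == "product":
        return f"({show(node[1])} × {show(node[2])})"
    label = " AND ".join(text for _, text in node[1])
    if kind == "select":
        return f"σ[{label}]({show(node[2])})"
    return f"({show(node[2])} ⋈[{label}] {show(node[3])})"


EMPLOYEE = rel("EMPLOYEE", "SSN", "LNAME", "BDATE")
WORKS_ON = rel("WORKS_ON", "ESSN", "PNO")
PROJECT = rel("PROJECT", "PNUMBER", "PNAME")

initial = (
    "select",
    [
        cond("PNAME = 'AQUARIUS'", "PNAME"),
        cond("PNUMBER = PNO", "PNUMBER", "PNO"),
        cond("ESSN = SSN", "ESSN", "SSN"),
        cond("BDATE > '1957-12-31'", "BDATE"),
    ],
    ("product", ("product", EMPLOYEE, WORKS_ON), PROJECT),
)

optimized = optimize(initial)
print(show(initial))
print(show(optimized))
```

Single-relation conditions end up directly above their base relations, and conditions that span both operands of a Cartesian product turn that product into a join.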

Query Execution Plans
• An execution plan for a relational algebra query consists of a combination of the relational algebra query tree and information about the access methods to be used for each relation as well as the methods to be used in computing the relational operators stored in the tree.
• Materialized evaluation: the result of an operation is stored as a temporary relation.
• Pipelined evaluation: as the result of an operator is produced, it is forwarded to the next operator in sequence.

Using Selectivity and Cost Estimates in Query Optimization
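Before turning to cost estimates, the two evaluation strategies named under Query Execution Plans (materialized and pipelined) can be contrasted with Python generators. This is a sketch under assumptions: the operators scan and select, the event log, and the integer "tuples" are invented for illustration.

```python
def scan(rows, log):
    """Produce tuples one at a time, recording each one in the log."""
    for r in rows:
        log.append(f"scan {r}")
        yield r


def select(pred, rows, log):
    """Forward only the tuples that satisfy pred."""
    for r in rows:
        if pred(r):
            log.append(f"select {r}")
            yield r


rows = [1, 2, 3, 4]
is_even = lambda r: r % 2 == 0

# Materialized evaluation: the scan result is stored as a temporary
# relation (a list) before the next operator starts.
log_m = []
temp = list(scan(rows, log_m))
out_m = list(select(is_even, temp, log_m))

# Pipelined evaluation: each scanned tuple is forwarded to select
# as soon as it is produced; no temporary relation is built.
log_p = []
out_p = list(select(is_even, scan(rows, log_p), log_p))

print(log_m)
print(log_p)
```

The results are identical, but the logs differ: in the materialized run all scan events come before any select event, while in the pipelined run they interleave.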

Cost-based query optimization

• Cost-based query optimization: Estimate and compare the costs of executing a query using different execution strategies and choose the strategy with the lowest cost estimate. (Compare to heuristic query optimization.)
• Issues
  • Cost function
  • Number of execution strategies to be considered

Cost components for query execution

1. Access cost to secondary storage
2. Disk storage cost
3. Computation cost
4. Memory usage cost
5. Communication cost

Note: Different database systems may focus on different cost components.
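The idea of estimating, comparing, and choosing the cheapest strategy can be sketched as a weighted sum over the five cost components. Everything concrete here is hypothetical: the weights, the strategy names, the numbers, and the functions total_cost and choose_strategy are invented for illustration, and a real optimizer derives its estimates from catalog statistics.

```python
COMPONENTS = ("access", "storage", "computation", "memory", "communication")


def total_cost(estimate, weights):
    """Weighted sum of the cost components of one execution strategy."""
    return sum(weights[c] * estimate[c] for c in COMPONENTS)


def choose_strategy(estimates, weights):
    """Return the name of the strategy with the lowest estimated cost."""
    return min(estimates, key=lambda name: total_cost(estimates[name], weights))


# Hypothetical weights: this system focuses on secondary storage access
# and, to a lesser degree, computation, and ignores the other components.
weights = {"access": 1.0, "storage": 0.0, "computation": 0.1,
           "memory": 0.0, "communication": 0.0}

# Hypothetical estimates for two equivalent strategies.
estimates = {
    "product_then_select": {"access": 5000, "storage": 0, "computation": 72000,
                            "memory": 0, "communication": 0},
    "select_then_join": {"access": 300, "storage": 0, "computation": 900,
                         "memory": 0, "communication": 0},
}

best = choose_strategy(estimates, weights)
print(best, total_cost(estimates[best], weights))
```

Changing the weights models the note above: a system that focuses on a different cost component may rank the same strategies differently.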

Catalog Information Used in Cost Functions


• Information about the size of a file
  • Number of records (tuples) (r)
  • Record size (R)
  • Number of blocks (b)
  • Blocking factor (bfr)
• Information about indexes and indexing attributes of a file
  • Number of levels (x) of each multilevel index
  • Number of first-level index blocks (bI1)
  • Number of distinct values (d) of an attribute
  • Selectivity (sl) of an attribute
  • Selection cardinality (s) of an attribute (s = sl * r)
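A short worked example shows how these catalog quantities relate. The numbers are made up, the block size B is an extra assumption not listed above, and two standard textbook estimates are assumed: bfr = floor(B / R) for unspanned records, and sl = 1 / d for an equality condition on an attribute whose values are uniformly distributed.

```python
import math


def blocking_factor(block_size, record_size):
    """bfr: whole records per block (unspanned organization assumed)."""
    return block_size // record_size


def number_of_blocks(r, bfr):
    """b: blocks needed to hold r records."""
    return math.ceil(r / bfr)


def equality_selectivity(d):
    """sl for an equality condition, assuming d uniformly distributed values."""
    return 1 / d


def selection_cardinality(sl, r):
    """s = sl * r: expected number of records satisfying the condition."""
    return sl * r


# Hypothetical file: 10,000 records of 100 bytes, 4096-byte blocks,
# and an attribute with 50 distinct values.
r, R, B, d = 10_000, 100, 4096, 50

bfr = blocking_factor(B, R)
b = number_of_blocks(r, bfr)
sl = equality_selectivity(d)
s = selection_cardinality(sl, r)
print(bfr, b, sl, s)
```

With these figures a full scan reads b blocks, while an equality selection on the attribute is expected to return about s records, which is the kind of comparison a cost function builds on.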
