Plan For The Query Optimization Topic: COMP302 Database Systems

This document provides an overview of query optimization techniques in database systems. It discusses typical steps in query processing like parsing, optimization, and execution. Query optimization techniques covered include moving selects down the tree, introducing join operators, project operations, cost-based optimization, and different join algorithms like nested loop, sort-merge, and hash joins. The goal of optimization is to find the most efficient execution plan by transforming the query tree and estimating costs based on statistics from the catalog.

Uploaded by

boka987

VICTORIA UNIVERSITY OF WELLINGTON

Te Whare Wananga o te Upoko o te Ika a Maui

COMP302
Database Systems

Plan for the Query Optimization topic

COMP302 Database Systems

Query Optimisation_04 1

What is Query Optimization and Why


Typical Steps in Query Processing


Scanning (identifying tokens): tokens are SQL keywords, attribute names, and relation names

Parsing (syntax checking of SQL keywords)

Generating query tree of logical operators

Validation: attribute and relation names are checked against the Catalog

Optimization (looking for an execution plan)

Generating query tree of physical operators

Query code generation

Query execution

Query Optimization Techniques


Translating SQL into Relational Algebra


Heuristic Optimization - an Example


Initial Query Tree


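[Figure: initial query tree. Root: project on C. Below it: a single select with condition F = f AND N1.A = N2.A AND N2.E = N3.E AND D > d, applied to the Cartesian product of the base relations r(N1), r(N2), r(N3). Legend: tree node, base relation.]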

Structure of the Initial Query Tree


A Question for You


Analysis of the Initial Query Tree


Query Tree After Moving Down Selects


[Figure: query tree after moving the selects down. The selects N2.E = N3.E and N1.A = N2.A sit above the Cartesian products; D > d is applied directly to r(N1) and F = f directly to r(N3). Leaves: r(N1), r(N2), r(N3).]

Further improvement can be achieved by replacing each Cartesian product followed by a select on a join condition with a join operator.
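The equivalence behind this step can be checked on toy data. The sketch below is illustrative only: the relation contents and the helper names `cartesian_select` and `hash_join` are assumptions, not part of the course material. It suggests that a Cartesian product followed by a select on N1.A = N2.A gives the same tuples as a join, while the join avoids materializing the full product.

```python
from itertools import product

# Hypothetical toy relations as lists of dicts (contents are illustrative).
n1 = [{"A": 1, "C": "x"}, {"A": 2, "C": "y"}, {"A": 3, "C": "z"}]
n2 = [{"A": 1, "E": 10}, {"A": 1, "E": 20}, {"A": 3, "E": 30}]


def cartesian_select(left, right, attr):
    """Cartesian product followed by a select on left.attr = right.attr."""
    pairs = list(product(left, right))  # materializes |left| * |right| pairs
    matches = [(lt, rt) for lt, rt in pairs if lt[attr] == rt[attr]]
    return matches, len(pairs)


def hash_join(left, right, attr):
    """Equi-join on attr that only ever builds the matching pairs."""
    index = {}
    for rt in right:
        index.setdefault(rt[attr], []).append(rt)
    return [(lt, rt) for lt in left for rt in index.get(lt[attr], [])]


selected, pairs_built = cartesian_select(n1, n2, "A")
joined = hash_join(n1, n2, "A")
print(pairs_built, len(selected), len(joined))
```

Both strategies return the same result here; the difference is the size of the intermediate product that the first one has to build before filtering.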

Query Tree After Introducing Joins


[Figure: query tree after introducing joins. Root: project on C. Joins: N2.E = N3.E and N1.A = N2.A. Selects: D > d on r(N1), F = f on r(N3). Leaves: r(N1), r(N2), r(N3).]

Next improvement can be achieved by switching the positions of N1 and N3, so that the very restrictive select operation F = f is applied as early as possible.
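A rough way to see why the restrictive select should come first is to estimate the intermediate result sizes of both join orders. Everything in the sketch below is an assumption made for illustration: the cardinalities, the selectivities of D > d and F = f, the single join selectivity `JS`, and the helper names.

```python
from fractions import Fraction

# All numbers below are hypothetical and chosen only for illustration.
R_N1, R_N2, R_N3 = 1000, 1000, 1000  # tuples per base relation
SEL_D = Fraction(1, 2)               # fraction of N1 kept by D > d
SEL_F = Fraction(1, 100)             # fraction of N3 kept by F = f (very restrictive)
JS = Fraction(1, 1000)               # assumed selectivity of each join


def join_size(left_rows, right_rows, js=JS):
    """Estimated join cardinality: js * |left| * |right|."""
    return js * left_rows * right_rows


def plan_sizes(first, second, third):
    """Sizes of the two intermediate results of (first JOIN second) JOIN third."""
    inner = join_size(first, second)
    return inner, join_size(inner, third)


n1_sel = SEL_D * R_N1  # tuples of N1 left after D > d
n3_sel = SEL_F * R_N3  # tuples of N3 left after F = f

n1_first = plan_sizes(n1_sel, R_N2, n3_sel)  # N1 at the bottom of the tree
n3_first = plan_sizes(n3_sel, R_N2, n1_sel)  # N3 at the bottom of the tree
print("N1 first:", [int(s) for s in n1_first])
print("N3 first:", [int(s) for s in n3_first])
```

Under these assumed numbers the final result has the same size either way, but the first intermediate relation is much smaller when N3 is joined first.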

Question for You


Query Tree After Switching Positions


[Figure: query tree after switching positions. Root: project on C. Joins: N2.A = N1.A and N3.E = N2.E. Selects: D > d on r(N1), F = f on r(N3). Leaves: r(N1), r(N2), r(N3).]

Final improvement can be achieved by keeping in intermediate relations only the attributes needed by subsequent operations. This can be accomplished by applying defined, or even introducing new undefined (but logically implied), project (π) operations as early as possible.

Query Tree After Introducing Project Ops


[Figure: query tree after introducing project operations. Projections: N1.C (root), N2.A, N3.E, N2.(A, E), N1.(A, C). Joins: N2.A = N1.A and N2.E = N3.E. Selects: F = f on r(N3), D > d on r(N1). Leaves: r(N1), r(N2), r(N3).]

Effect of Project Operations


Cost Based Optimization


Cost Components of a Query Execution


Cost Related Catalog Content


Some Assumptions


Cost Function of a Project Operation


Cost Function of a Select Operation


Cost Functions of Join Operation


Join Selectivity


Question for You


Nested - Loop Join


Nested Loop Join Three Buffers


[Figure: nested-loop join with three main-memory buffers. Buffer 1 holds one block of relation N (blocks 1 to m), buffer 2 receives the blocks 1 to p of relation M one at a time, and buffer 3 holds one block of the result relation N ⋈ M (blocks 1 to q).]

For each of the m blocks of relation N, all p blocks of relation M are transferred into main memory, so reading M takes mp block accesses.
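The access count described for the three-buffer case can be reproduced with a small simulation. This is a minimal sketch, assuming relations are given as lists of blocks and blocks as lists of join-attribute values; the data and the function name `nested_loop_join` are hypothetical.

```python
def nested_loop_join(n_blocks, m_blocks, match):
    """Block nested-loop join with one buffer for N, one for M, one for output.

    Returns the joined pairs and the number of M block reads.
    """
    result, m_reads = [], 0
    for n_block in n_blocks:          # one N block sits in buffer 1
        for m_block in m_blocks:      # every M block passes through buffer 2
            m_reads += 1
            for n_rec in n_block:
                for m_rec in m_block:
                    if match(n_rec, m_rec):
                        result.append((n_rec, m_rec))  # collected in buffer 3
    return result, m_reads


# Hypothetical data: N has m = 3 blocks, M has p = 4 blocks, 2 records per block.
N = [[1, 2], [3, 4], [5, 6]]
M = [[1, 1], [2, 9], [5, 5], [6, 7]]
pairs, m_reads = nested_loop_join(N, M, lambda a, b: a == b)
print(m_reads)     # equals m * p for this data
print(len(pairs))
```

The simulation counts only reads of M; relation N is read once (m block reads), and writing the result blocks is not counted.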

Cost of Nested - Loop Join


Single - Loop Join


Sort - Merge Join


Hash Join


Partition Hash Join (partitioning phase)


Partitioning Phase - Diagram


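[Figure: partitioning phase. Records are read block by block into the input buffer in main memory, first from N and then from M. A hash function h sends each record to one of the buffers buffer1, buffer2, ..., bufferm, which are written out as the partitions N1, N2, ..., Nm (first all Ni) and M1, M2, ..., Mm (then all Mi).]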

Probing (joining) Phase


Probing Phase - Diagram


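[Figure: probing phase, shown at iteration i = 2 out of m. The blocks of partition i occupy main-memory buffers buffer1, buffer2, ..., bufferbi; records arriving through the input buffer are probed against them, and joined tuples go to the output buffer and from there to the join result. Partitions on disk: N1, M1, N2, M2, ..., Nm, Mm.]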
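The two phases shown in the diagrams can be sketched in a few lines. This is a simplified in-memory illustration, not the course's algorithm as specified: Python's built-in `hash` stands in for the hash function h, lists stand in for disk partitions, and the choice to build the in-memory table from Ni and probe it with Mi is an assumption. All data and names are hypothetical.

```python
from collections import defaultdict


def partition(records, key, num_partitions):
    """Partitioning phase: h sends each record to one of the partitions."""
    parts = [[] for _ in range(num_partitions)]
    for rec in records:
        parts[hash(rec[key]) % num_partitions].append(rec)
    return parts


def partition_hash_join(n_records, m_records, key, num_partitions=3):
    n_parts = partition(n_records, key, num_partitions)  # first all Ni
    m_parts = partition(m_records, key, num_partitions)  # then all Mi
    result = []
    for n_part, m_part in zip(n_parts, m_parts):         # one iteration per partition
        table = defaultdict(list)                        # in-memory table built from Ni
        for n_rec in n_part:
            table[n_rec[key]].append(n_rec)
        for m_rec in m_part:                             # probe with each Mi record
            for n_rec in table.get(m_rec[key], []):
                result.append((n_rec, m_rec))
    return result


# Hypothetical relations joined on attribute B.
N = [{"B": 1, "A": "a"}, {"B": 2, "A": "b"}, {"B": 3, "A": "c"}]
M = [{"B": 1, "G": 10}, {"B": 1, "G": 11}, {"B": 3, "G": 12}, {"B": 4, "G": 13}]
result = partition_hash_join(N, M, "B")
print(len(result))
```

Because matching records always hash to the same partition number, only Ni and Mi need to be compared in iteration i.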

Question for You


Combining the Optimization Techniques


Left Deep Trees


Left Deep Tree - An Example


[Figure: left-deep tree example. Root: project on C. Joins: N2.A = N1.A above N3.E = N2.E. Selects: F = f and D > d. Leaves: N2, N1, N3.]

Nested Query Optimization


Query Tree of Physical Operators


Query Tree of Physical Operators (Example)


[Figure: query tree of physical operators. Annotations on the nodes of the logical tree:
Project on C: since DISTINCT, use Sort and Drop Duplicates
Join N2.A = N1.A: use SortMerge Join
Join N3.E = N2.E: use NestedLoop Join
Select F = f: use B-tree on F
Select D > d: use Multi-List with B-tree on D
Leaves: N2, N1, N3]

Summary


Summary (heuristic optimization)


Summary (cost based optimization)


Plan for Transaction Processing topic


Relationship Between b, r, f

$L = \sum_{i=1}^{n} l_i$
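The sum above can be tied to b, r and f with a short calculation. The sketch assumes the usual textbook reading, which the surviving slide text does not spell out: L is the record length as the sum of the attribute lengths l_i, f is the blocking factor (records per block), r is the number of records, b is the number of blocks, and records do not span blocks. The block size and attribute lengths are hypothetical.

```python
import math


def record_length(attribute_lengths):
    """L = l_1 + ... + l_n, in bytes."""
    return sum(attribute_lengths)


def blocking_factor(block_size, rec_len):
    """f: whole records that fit in one block (assumes unspanned records)."""
    return block_size // rec_len


def num_blocks(num_records, f):
    """b: blocks needed to store r records at f records per block."""
    return math.ceil(num_records / f)


# Hypothetical numbers: three attributes, 1024-byte blocks, 1000 records.
L = record_length([4, 20, 8])
f = blocking_factor(1024, L)
b = num_blocks(1000, f)
print(L, f, b)
```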

The Effect of Projection


Evaluation of DISTINCT Project


Cost of DISTINCT Project Operation


Attribute Selection Cardinality


Selection Cardinality


Cost Functions of Select Operation


Select Operation Methods


Avoiding Sorting with DISTINCT


The Order of Select and Project Ops


Join Selectivity (an Example)


$r_N = 3$, $r_M = 6$.

Since the key of relation N is B, and the referential integrity constraint $M[B] \subseteq N[B]$ is satisfied, $|N \bowtie M| = r_M = 6$.

$js = |N \bowtie M| / (r_N \cdot r_M) = r_M / (r_N \cdot r_M)$, hence $js = 1 / r_N = 0.33$.

Join Selectivity (an Example)


[Figure: relation instances M, $N1 = \sigma_{A = 0}(N)$, and $N1 \bowtie M$.]

B is still the key of N1, but the referential integrity constraint $M[B] \subseteq N1[B]$ is not satisfied, so the join selectivity is $js = 1 / d_M(B)$, where $d_M(B)$ is the number of distinct B values in M, and an even distribution of B values in M is assumed.
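Both join selectivity examples can be checked by brute force on small instances. The relation contents below are hypothetical (the original tables are not reproduced here); they are chosen so that rN = 3, rM = 6, B is the key of N, and the B values in M are evenly distributed. The helper name `join_selectivity` is also an assumption.

```python
from fractions import Fraction


def join_selectivity(n_rows, m_rows, attr):
    """js = |N join M| / (rN * rM), computed by brute force on small data."""
    joined = sum(1 for n in n_rows for m in m_rows if n[attr] == m[attr])
    return Fraction(joined, len(n_rows) * len(m_rows))


# Hypothetical instances: B is the key of N, and every M.B value occurs in N.
N = [{"A": 0, "B": 1}, {"A": 1, "B": 2}, {"A": 0, "B": 3}]        # rN = 3
M = [{"B": b, "G": g} for g, b in enumerate([1, 1, 2, 2, 3, 3])]  # rM = 6

js_full = join_selectivity(N, M, "B")
print(js_full)                      # compare with 1 / rN

# N1 = select A = 0 from N; the M tuples with B = 2 now have no partner in N1.
N1 = [n for n in N if n["A"] == 0]
d_M_B = len({m["B"] for m in M})    # number of distinct B values in M
js_n1 = join_selectivity(N1, M, "B")
print(js_n1, Fraction(1, d_M_B))    # compare with 1 / dM(B)
```

Note that the estimate 1 / dM(B) matches here because every B value of N1 also occurs in M; that is a property of this sample data, not something the formula guarantees.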

The Size of the Join Result Block


Nested Loop Join Four Buffers


[Figure: nested-loop join with four main-memory buffers. Buffers 1 and 2 hold two successive blocks of relation N (blocks 1 to m), buffer 3 receives the blocks 1 to p of relation M one at a time, and buffer 4 holds one block of the result relation N ⋈ M (blocks 1 to q).]

Each two successive blocks of relation N are transferred into main memory together, so all relation M blocks are transferred into main memory only m / 2 times (mp / 2 accesses).
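The effect of the extra buffer generalizes to any number of buffers reserved for N. The sketch below is a back-of-the-envelope cost function under the same assumptions as the diagram (one buffer for M, one for the result, the rest for N); the function name and the sample values of m and p are hypothetical, and writing the result blocks is not counted.

```python
import math


def nested_loop_block_reads(m, p, buffers_for_n=1):
    """Block reads for a nested-loop join of N (m blocks) with M (p blocks).

    N is read once; M is scanned once per group of N blocks held in memory.
    Returns (reads of N, reads of M).
    """
    scans_of_m = math.ceil(m / buffers_for_n)
    return m, scans_of_m * p


# Hypothetical sizes: m = 10 blocks of N, p = 20 blocks of M.
print(nested_loop_block_reads(10, 20, buffers_for_n=1))  # three-buffer case
print(nested_loop_block_reads(10, 20, buffers_for_n=2))  # four-buffer case
```

With one buffer for N the reads of M come to mp, and with two they drop to mp / 2 (rounded up when m is odd), which matches the two diagrams.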

Cost of Nested Loop Join (an Example)


Single Loop Join


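[Figure: single-loop join. Blocks blockN1 ... blockNn of N are read into bufferN. The value of the join attribute is looked up in the index (through bufferi), which returns the address of the matching block among blockM1 ... blockMm; that block is read into bufferM, and joined tuples pass through bufferres into the join result.]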

Index Function f (index)


Nested Loop Versus Sort/Merge


Comparing Nested and Single Loop


Sort/Merge Versus Single Loop


In Memory Partition Hash


Hybrid Hash Join


Hash-Join Versus Other Algorithms


Hash-Join Buffer Requirement


Comparing Costs of pjp and jp


Cost of the Set Theoretic Operations


Cost of aggregate functions


Left Deep Tree


Query Optimization Example


Heuristic Optimization Tree


[Figure: heuristic optimization tree. Root: project on C, G, with a SORT. Join: N2.A = N1.A. Projections below the join: A, G and A, C. Select: E < e. Leaves: N1, N2.]

Cost Optimization

Cost of a SELECT


Cost of a Project


Cost of a Join


Cost of a Sort
