Chapter 16 focuses on query optimization, which involves finding the best query execution plan (QEP) among various alternatives. It covers generating equivalent expressions, estimating statistics of expression results, and choosing evaluation plans using dynamic programming. The chapter emphasizes the importance of cost-based query optimization and provides equivalence rules for relational algebra expressions.

Dario Della Monica

Chapter 16: Query Optimization

These slides are a modified version of the slides provided with the book:

Database System Concepts, 6th Ed.

©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use

(however, chapter numeration refers to 7th Ed.)

The original version of the slides is available at: https://www.db-book.com/

Chapter 16: Query Optimization
Introduction
Generating Equivalent Expressions
  Equivalence rules
  How to generate (all) equivalent expressions
Estimating Statistics of Expression Results
  The Catalog
  Size estimation
    Selection
    Join
    Other operations (projection, aggregation, set operations, outer join)
  Estimation of number of distinct values
Choice of Evaluation Plans
  Dynamic Programming for Choosing Evaluation Plans

Database System Concepts - 7th Edition 16.2 ©Silberschatz, Korth and Sudarshan
Introduction
Query optimization: finding the “best” query execution plan (QEP) among the many possible ones
  The user is not expected to write queries efficiently (the DBMS optimizer takes care of that)
Alternative ways to execute a given query exist at 2 levels
  Equivalent relational algebra expressions
  Different implementation choices for each relational algebra operation
    Algorithms, indices, coordination between successive operations, …

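Schema:
  INSTR(i_id, name, dept_name, ...)
  COURSE(c_id, title, ...)
  TEACHES(i_id, c_id, ...)

Query: the names of all instructors in the Music department, together with the titles of all the courses they teach

SELECT I.name, C.title
FROM INSTR I, COURSE C, TEACHES T
WHERE I.i_id = T.i_id
  AND T.c_id = C.c_id
  AND I.dept_name = 'Music'

Two equivalent relational algebra expressions for this query:

$\Pi_{name,\,title}(\sigma_{dept\_name = \text{'Music'}}(INSTR \bowtie (TEACHES \bowtie COURSE)))$

$\Pi_{name,\,title}(\sigma_{dept\_name = \text{'Music'}}(INSTR) \bowtie (TEACHES \bowtie COURSE))$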
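To make the difference between the two expressions concrete, here is a minimal Python sketch that evaluates both over tiny in-memory relations. Only the schema comes from the slide; the sample tuples and the helper names (`select`, `project`, `natural_join`) are assumptions made for illustration. Both plans return the same answer, but the second one joins a much smaller input.

```python
# Hypothetical sample data; only the schema (INSTR, COURSE, TEACHES) is from the slides.
INSTR = [
    {"i_id": 1, "name": "Mozart", "dept_name": "Music"},
    {"i_id": 2, "name": "Einstein", "dept_name": "Physics"},
    {"i_id": 3, "name": "Wu", "dept_name": "Finance"},
]
COURSE = [
    {"c_id": "MU-199", "title": "Music Video Production"},
    {"c_id": "PHY-101", "title": "Physical Principles"},
    {"c_id": "FIN-201", "title": "Investment Banking"},
]
TEACHES = [
    {"i_id": 1, "c_id": "MU-199"},
    {"i_id": 2, "c_id": "PHY-101"},
    {"i_id": 3, "c_id": "FIN-201"},
]


def select(pred, rel):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]


def project(attrs, rel):
    """Projection with set semantics (duplicates are eliminated)."""
    return {tuple(t[a] for a in attrs) for t in rel}


def natural_join(r, s):
    """Natural join: combine tuples that agree on all common attributes."""
    out = []
    for t in r:
        for u in s:
            if all(t[a] == u[a] for a in t.keys() & u.keys()):
                out.append({**t, **u})
    return out


def is_music(t):
    return t["dept_name"] == "Music"


# Plan 1: join all three relations, then apply the selection.
joined_all = natural_join(INSTR, natural_join(TEACHES, COURSE))
plan1 = project(["name", "title"], select(is_music, joined_all))

# Plan 2: apply the selection to INSTR first, then join.
music_only = select(is_music, INSTR)
joined_music = natural_join(music_only, natural_join(TEACHES, COURSE))
plan2 = project(["name", "title"], joined_music)

print(plan1 == plan2, len(joined_all), len(joined_music))
```

On real data the gap between the two intermediate results can be far larger than in this toy instance, which is why an optimizer would usually prefer the second form.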

Introduction (Cont.)
A query evaluation plan (QEP) defines exactly which algorithm is used for each operation, and how the execution of the operations is coordinated

Find out how to view query execution plans on your favorite database
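As one possible starting point, the sketch below uses SQLite through Python's standard `sqlite3` module and its `EXPLAIN QUERY PLAN` statement. The table definitions are assumptions based on the schema of the running example (column types and keys are not given on the slides). Other systems expose plans differently, for instance through an `EXPLAIN` statement in PostgreSQL and MySQL.

```python
import sqlite3

# In-memory database; column types and keys are assumptions, not from the slides.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE INSTR(i_id INTEGER PRIMARY KEY, name TEXT, dept_name TEXT);
    CREATE TABLE COURSE(c_id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE TEACHES(i_id INTEGER, c_id TEXT);
    """
)

query = """
    SELECT I.name, C.title
    FROM INSTR I, COURSE C, TEACHES T
    WHERE I.i_id = T.i_id
      AND T.c_id = C.c_id
      AND I.dept_name = 'Music'
"""

# Each returned row describes one step of the plan chosen by SQLite;
# the last column holds the human-readable description.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for row in plan_rows:
    print(row[-1])
```

The exact wording of the output depends on the SQLite version, so no expected output is shown here.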

Introduction (Cont.)

The cost difference between query evaluation plans can be enormous
  E.g., seconds vs. days in some cases
  It is worth spending time on finding the “best” QEP
Steps in cost-based query optimization
  1. Generate logically equivalent expressions using equivalence rules
  2. Annotate the resulting expressions in all possible ways to get alternative QEPs
  3. Evaluate/estimate the cost (execution time) of each QEP
  4. Choose the cheapest QEP based on estimated cost
Estimation of QEP cost is based on:
  Statistical information about relations (stored in the Catalog)
    E.g., number of tuples, number of distinct values for an attribute
  Statistics estimation for intermediate results
    Needed to compute the cost of complex expressions
  Cost formulae for algorithms, computed using statistics
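The four steps above can be sketched in a few lines of Python. This is a toy model under loud assumptions, not the method of any real optimizer: the alternatives are just the left-deep join orders of three relations, the tuple counts and join selectivities are invented, the "cost" is the total estimated size of the intermediate results, and step 2 (choosing an algorithm for each operation) is not modelled at all.

```python
from itertools import permutations

# Hypothetical catalog statistics: number of tuples per relation.
n_tuples = {"INSTR": 100, "TEACHES": 1_000, "COURSE": 200}

# Hypothetical join selectivities (fraction of the Cartesian product that survives).
selectivity = {
    frozenset({"INSTR", "TEACHES"}): 1 / 100,
    frozenset({"TEACHES", "COURSE"}): 1 / 200,
    frozenset({"INSTR", "COURSE"}): 1.0,  # no join condition: Cartesian product
}


def estimate_cost(order):
    """Toy cost: sum of estimated intermediate result sizes for a left-deep join order."""
    joined = [order[0]]
    size = float(n_tuples[order[0]])
    cost = 0.0
    for rel in order[1:]:
        size *= n_tuples[rel]
        # Apply every join condition between the new relation and those already joined.
        for prev in joined:
            size *= selectivity[frozenset({prev, rel})]
        joined.append(rel)
        cost += size
    return cost


candidates = list(permutations(n_tuples))                      # steps 1-2: enumerate alternatives
costs = {order: estimate_cost(order) for order in candidates}  # step 3: estimate each cost
best = min(costs, key=costs.get)                               # step 4: pick the cheapest

for order, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(" ⋈ ".join(order), round(cost))
```

Under these assumed statistics, the orders that begin by joining INSTR with COURSE are far more expensive, because they start with a Cartesian product.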
Generating Equivalent Expressions
Equivalence rules
How to generate (all) equivalent expressions


Transformation of Relational Expressions

Two relational algebra expressions are said to be equivalent if the two expressions generate the same set of tuples on every legal database instance
  Note: the order of tuples is irrelevant (and so is the order of attributes)
  We don’t care if they generate different results on databases that violate integrity constraints (e.g., uniqueness of keys)
In SQL, inputs and outputs are multisets of tuples
  Two expressions in the multiset version of the relational algebra are said to be equivalent if the two expressions generate the same multiset of tuples on every legal database instance
  We focus on relational algebra and treat relations as sets
An equivalence rule states that expressions of two forms are equivalent
  One can replace an expression of the first form by one of the second form, or vice versa

Equivalence Rules
1. Conjunctive selection operations can be deconstructed into a sequence of individual selections.
   $\sigma_{\theta_1 \wedge \theta_2}(E) = \sigma_{\theta_1}(\sigma_{\theta_2}(E))$
2. Selection operations are commutative.
   $\sigma_{\theta_1}(\sigma_{\theta_2}(E)) = \sigma_{\theta_2}(\sigma_{\theta_1}(E))$
3. Only the last in a sequence of projection operations is needed; the others can be omitted.
   $\Pi_{L_1}(\Pi_{L_2}(\ldots(\Pi_{L_n}(E))\ldots)) = \Pi_{L_1}(E)$
   where $L_1 \subseteq L_2 \subseteq \ldots \subseteq L_n$
4. Selections can be combined with Cartesian products and theta joins.
   a. $\sigma_{\theta}(E_1 \times E_2) = E_1 \bowtie_{\theta} E_2$
   b. $\sigma_{\theta_1}(E_1 \bowtie_{\theta_2} E_2) = E_1 \bowtie_{\theta_1 \wedge \theta_2} E_2$
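A quick way to build intuition for rules 1, 2 and 4(a) is to check them on a small instance. The Python sketch below does so; the relations, the predicates and the helper functions are all assumptions made for illustration, and a check on one instance is of course not a proof of equivalence.

```python
from itertools import product

# Hypothetical relations, represented as lists of attribute-to-value dicts.
E1 = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 30}]
E2 = [{"c": 1, "d": "x"}, {"c": 3, "d": "y"}]


def select(pred, rel):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]


def cartesian(r, s):
    """Cartesian product (attribute names are assumed to be disjoint)."""
    return [{**t, **u} for t, u in product(r, s)]


def theta_join(pred, r, s):
    """Theta join implemented directly as a nested loop with a join condition."""
    return [{**t, **u} for t in r for u in s if pred({**t, **u})]


def as_set(rel):
    """Order-independent view of a relation, used to compare results."""
    return {tuple(sorted(t.items())) for t in rel}


def theta1(t):
    return t["a"] >= 2


def theta2(t):
    return t["b"] <= 20


def theta(t):
    return t["a"] == t["c"]


# Rule 1: a conjunctive selection equals a cascade of selections.
rule1_lhs = select(lambda t: theta1(t) and theta2(t), E1)
rule1_rhs = select(theta1, select(theta2, E1))

# Rule 2: the order of the selections does not matter.
rule2_rhs = select(theta2, select(theta1, E1))

# Rule 4(a): selection over a Cartesian product equals a theta join.
rule4a_lhs = select(theta, cartesian(E1, E2))
rule4a_rhs = theta_join(theta, E1, E2)

print(as_set(rule1_lhs) == as_set(rule1_rhs) == as_set(rule2_rhs))
print(as_set(rule4a_lhs) == as_set(rule4a_rhs))
```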

Equivalence Rules (Cont.)
5. Theta-join (and thus natural joins) operations are commutative.
   $E_1 \bowtie_{\theta} E_2 = E_2 \bowtie_{\theta} E_1$
   (but the order is important for efficiency)
6. (a) Natural join operations are associative:
   $(E_1 \bowtie E_2) \bowtie E_3 = E_1 \bowtie (E_2 \bowtie E_3)$
   (again, the order is important for efficiency)
   (b) Theta joins are associative in the following manner:
   $(E_1 \bowtie_{\theta_1} E_2) \bowtie_{\theta_2 \wedge \theta_3} E_3 = E_1 \bowtie_{\theta_1 \wedge \theta_3} (E_2 \bowtie_{\theta_2} E_3)$
   where $\theta_1$ involves attributes from only $E_1$ and $E_2$
   and $\theta_2$ involves attributes from only $E_2$ and $E_3$
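The remark that join order matters for efficiency, even though the result is the same, can be illustrated with a small Python sketch. The three relations below are invented for the purpose, and `natural_join` is a naive nested-loop helper assumed for illustration: both parenthesizations give the same final result, but they build intermediate results of different sizes.

```python
def natural_join(r, s):
    """Naive nested-loop natural join on all common attributes."""
    out = []
    for t in r:
        for u in s:
            if all(t[k] == u[k] for k in t.keys() & u.keys()):
                out.append({**t, **u})
    return out


def as_set(rel):
    """Order-independent view of a relation, used to compare results."""
    return {tuple(sorted(t.items())) for t in rel}


# Hypothetical relations E1(a, b), E2(b, c), E3(c, d).
E1 = [{"a": i, "b": i % 2} for i in range(6)]
E2 = [{"b": 0, "c": 0}, {"b": 1, "c": 1}, {"b": 1, "c": 2}]
E3 = [{"c": 0, "d": "x"}]  # matches only one value of c

left_first = natural_join(E1, E2)   # intermediate result of (E1 ⋈ E2) ⋈ E3
right_first = natural_join(E2, E3)  # intermediate result of E1 ⋈ (E2 ⋈ E3)

result_a = natural_join(left_first, E3)
result_b = natural_join(E1, right_first)

print(as_set(result_a) == as_set(result_b))  # rule 6(a) on this instance
print(len(left_first), len(right_first))     # intermediate sizes differ
print(as_set(natural_join(E1, E2)) == as_set(natural_join(E2, E1)))  # rule 5
```

Joining with the most selective relation first keeps the intermediate result small here, which is the kind of difference a cost-based optimizer tries to exploit.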

Equivalence Rules (Cont.)

7. (a) Selection distributes over theta join in the following manner:
   $\sigma_{\theta_1}(E_1 \bowtie_{\theta} E_2) = (\sigma_{\theta_1}(E_1)) \bowtie_{\theta} E_2$
   where $\theta_1$ involves attributes from only $E_1$
   (b) Complex selection distributes over theta join in the following manner:
   $\sigma_{\theta_1 \wedge \theta_2}(E_1 \bowtie_{\theta} E_2) = (\sigma_{\theta_1}(E_1)) \bowtie_{\theta} (\sigma_{\theta_2}(E_2))$
   where $\theta_1$ involves attributes from only $E_1$
   and $\theta_2$ involves attributes from only $E_2$
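As a worked instance, rule 7(a) appears to be what justifies pushing the selection down in the Music query from the introduction. The step below treats the natural join as a special case of theta join, takes $E_1 = INSTR$ and $E_2 = TEACHES \bowtie COURSE$, and assumes that dept_name occurs only in INSTR (the schema on the slide elides some attributes, so this is an assumption):

```latex
\sigma_{dept\_name = \text{'Music'}}\bigl(INSTR \bowtie (TEACHES \bowtie COURSE)\bigr)
  = \sigma_{dept\_name = \text{'Music'}}(INSTR) \bowtie (TEACHES \bowtie COURSE)
```

The right-hand side filters INSTR before the join, so the join processes only the instructors of the Music department.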

More equivalences in Section 16.2 of the book ⋆

⋆ Silberschatz, Korth, and Sudarshan, Database System Concepts, 7th ed.

Pictorial Depiction of Equivalence Rules

Exercise
Disprove the equivalence

(R S) T = R (S T)

Definition (left outer join): the result of a left outer join T = R ⟕ S is a superset of the result of the join T’ = R ⋈ S, in that all tuples in T’ appear in T. In addition, T preserves the tuples of R that are lost in the join, by adding tuples to T in which the attributes coming from S are filled with null values

STUD
stud_id  name     surname
1        gino     bianchi
2        filippo  neri
3        mario    rossi

TAKES
stud_id  course  grade
1        Math    30
2        DB      22
2        Logic   30

STUD ⟕ TAKES
stud_id  name     surname  course  grade
1        gino     bianchi  Math    30
2        filippo  neri     DB      22
2        filippo  neri     Logic   30
3        mario    rossi    null    null
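The definition of left outer join can be replayed on the STUD and TAKES relations from the slide with a short Python sketch. The data is taken from the slide; the helper functions are assumptions made for illustration, with `None` standing in for null and the schema of the right-hand relation read off its first tuple.

```python
# Data from the slide.
STUD = [
    {"stud_id": 1, "name": "gino", "surname": "bianchi"},
    {"stud_id": 2, "name": "filippo", "surname": "neri"},
    {"stud_id": 3, "name": "mario", "surname": "rossi"},
]
TAKES = [
    {"stud_id": 1, "course": "Math", "grade": 30},
    {"stud_id": 2, "course": "DB", "grade": 22},
    {"stud_id": 2, "course": "Logic", "grade": 30},
]


def natural_join(r, s):
    """Naive nested-loop natural join on all common attributes."""
    out = []
    for t in r:
        for u in s:
            if all(t[k] == u[k] for k in t.keys() & u.keys()):
                out.append({**t, **u})
    return out


def left_outer_join(r, s):
    """Natural left outer join; assumes s is non-empty and all its tuples share one schema."""
    s_only = [a for a in s[0] if a not in r[0]]  # attributes that come from s alone
    out = []
    for t in r:
        matches = natural_join([t], s)
        if matches:
            out.extend(matches)
        else:
            # Preserve the unmatched tuple of r, padding the attributes of s with nulls.
            out.append({**t, **{a: None for a in s_only}})
    return out


inner = natural_join(STUD, TAKES)
outer = left_outer_join(STUD, TAKES)

for row in outer:
    print(row)
```

Every tuple of the plain join also appears in the outer join, and the only extra tuple is the one for student 3, who takes no course.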

Solution
Disprove the equivalence ( R S) T = R (S T)
