DBMS UNIT 4 Part 1
Query Processing: Overview, Measures of Query cost, Selection operation, sorting, Join Operation,
other operations, Evaluation of Expressions.
Query optimization: Overview, Transformation of Relational Expressions, Estimating statistics of
Expression results, Choice of Evaluation Plans, Materialized views, Advanced Topics in Query
Optimization.
*****END******
Sorting:
Why sorting?
SQL queries can specify that the output be sorted.
Several of the relational operations, such as joins, can be implemented efficiently if
the input relations are first sorted.
We may build an index on the relation, and then use the index to read the relation in
sorted order. May lead to one disk block access for each tuple.
For relations that fit in memory, techniques like quick sort can be used.
For relations that don’t fit in memory, external sort-merge is a good choice.
External Sort-Merge Algorithm:
Sorting of relations that do not fit in memory is called external sorting. The most
commonly used technique for external sorting is the external sort–merge algorithm.
Let M denote the number of blocks in the main memory buffer available for sorting.
1. In the first stage, a number of sorted runs are created; each run is sorted but contains only
some of the records of the relation.
i = 0;
repeat
read M blocks of the relation into memory
sort the in-memory part of the relation;
write the sorted data to run file Ri;
i = i + 1;
until the end of the relation
2. In the second stage, the runs are merged. Suppose, for now, that the total number of
runs, N, is less than M, so that we can allocate one block to each run and have space left
to hold one block of output. The merge stage operates as follows:
read one block of each of the N files Ri into a buffer block in memory;
repeat
o choose the first tuple (in sort order) among all buffer blocks;
o write the tuple to the output, and delete it from the buffer block;
o if the buffer block of any run Ri is empty and not end-of-file(Ri)
then read the next block of Ri into the buffer block;
until all input buffer blocks are empty
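The two stages above can be sketched in Python. This is a minimal in-memory sketch, assuming one record per "block"; `heapq.merge` stands in for the buffered N-way merge, and all names are illustrative:

```python
import heapq

def external_sort_merge(records, M):
    """External sort-merge sketch: M is the number of 'blocks' (here,
    individual records) that fit in the main-memory buffer."""
    # Stage 1: create sorted runs, each holding at most M records.
    runs, buf = [], []
    for rec in records:
        buf.append(rec)
        if len(buf) == M:
            runs.append(sorted(buf))
            buf = []
    if buf:
        runs.append(sorted(buf))
    # Stage 2: merge the runs (assumes the number of runs N < M),
    # repeatedly emitting the smallest remaining tuple among all runs.
    return list(heapq.merge(*runs))
```

If N >= M, the merge stage itself must be applied in multiple passes, merging M - 1 runs at a time.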
Example: External Sorting Using Sort-Merge (M = 3, one tuple per block):
initial relation: g 24, a 19, d 31, c 33, b 14, e 16, r 16, d 21, m 3, p 2, d 7, a 14
create runs: run 1: a 19, d 31, g 24 | run 2: b 14, c 33, e 16 | run 3: d 21, m 3, r 16 | run 4: a 14, d 7, p 2
merge pass-1: run 1: a 19, b 14, c 33, d 31, e 16, g 24 | run 2: a 14, d 7, d 21, m 3, p 2, r 16
merge pass-2 (sorted output): a 14, a 19, b 14, c 33, d 7, d 21, d 31, e 16, g 24, m 3, p 2, r 16
Cost of External Merge Sort (Block Transfers):
The number of merge passes required is ⌈log M−1 (br / M)⌉, since each pass merges M − 1 runs at a time.
The total number of block transfers is br (2 ⌈log M−1 (br / M)⌉ + 1): run creation reads and writes every block once, each merge pass reads and writes every block, and the final result need not be written back to disk.
*****END*****
Join Operation:
Several different algorithms to implement joins
o Nested-loop join
o Block nested-loop join
o Indexed nested-loop join
o Merge-join
o Hash-join
Nested-Loop Join:
Algorithm:
for each tuple tr in r do
    for each tuple ts in s do
        test pair (tr, ts) to see if they satisfy the join condition θ
        if they do, add tr · ts to the result
    end
end
r is called the outer relation and s the inner relation of the join.
Requires no indices and can be used with any kind of join condition.
Expensive since it examines every pair of tuples in the two relations. If the smaller
relation fits entirely in main memory, use that relation as the inner relation.
Cost for Nested-Loop Join:
The cost of the nested-loop join algorithm.
o The number of pairs of tuples to be considered is nr * ns where nr denotes the number
of tuples in r, and ns denotes the number of tuples in s.
For each record in r, complete scan on s is performed.
In the worst case, if there is enough memory only to hold one block of each relation, the
estimated cost is nr * bs + br disk accesses.
In the best case, the smaller relation fits entirely in memory; use that relation as the
inner relation. This reduces the cost estimate to br + bs disk accesses.
Example:
Number of records of student: nstudent = 5000.
Number of blocks of student: bstudent = 100.
Number of records of takes: ntakes = 10, 000.
Number of blocks of takes: btakes = 400.
o student as the outer relation and takes as the inner relation
o Assuming the worst-case memory availability scenario, the cost estimate is
5000 * 400 + 100 = 2,000,100 disk accesses
o If the smaller relation (student) fits entirely in memory, the cost estimate is
400 + 100 = 500 disk accesses.
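The algorithm above can be sketched in Python. This is an illustrative in-memory sketch, not the on-disk implementation; `theta` is the join condition:

```python
def nested_loop_join(r, s, theta):
    """Naive nested-loop join: test every pair (tr, ts) against the
    join condition theta; needs no indices and no sorting."""
    result = []
    for tr in r:            # outer relation
        for ts in s:        # inner relation: full scan per outer tuple
            if theta(tr, ts):
                result.append(tr + ts)  # concatenate matching tuples
    return result
```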
Block Nested-Loop Join:
Algorithm:
The primary difference between Nested-Loop join and Block Nested-Loop join is, each
block in the inner relation S is read only once for each block in the outer relation (instead
of once for each tuple in the outer relation)
Worst case estimate Block Nested-Loop join is: br * bs + br block accesses.
Best case for Block Nested-Loop join is: br + bs block accesses.
Example:
Number of records of student: nstudent = 5000.
Number of blocks of student: bstudent = 100.
Number of records of takes: ntakes = 10, 000.
Number of blocks of takes: btakes = 400.
o student as the outer relation and takes as the inner relation
o Assuming the worst-case memory availability scenario, the cost estimate is
100 * 400 + 100 = 40,100 disk accesses
o If the smaller relation (student) fits entirely in memory, the cost estimate is
400 + 100 = 500 disk accesses.
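As a sketch of the difference, again in-memory and with illustrative names: `block_size` plays the role of the blocking factor, and the inner relation is scanned once per outer block rather than once per outer tuple:

```python
def block_nested_loop_join(r, s, theta, block_size):
    """Block nested-loop join sketch: s is scanned once per *block*
    of the outer relation r, not once per tuple of r."""
    result = []
    for i in range(0, len(r), block_size):      # one outer block at a time
        r_block = r[i:i + block_size]
        for ts in s:                            # single pass over s per block
            for tr in r_block:
                if theta(tr, ts):
                    result.append(tr + ts)
    return result
```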
Indexed nested-loop join:
In a nested-loop join, if an index is available on the inner loop’s join attribute, index
lookups can replace file scans.
For each tuple tr in the outer relation r, the index is used to look up tuples in s that will
satisfy the join condition with tuple tr.
It can be used with existing indices, as well as with temporary indices created for the sole
purpose of evaluating the join.
For each tuple in outer relation r, a lookup is performed on the index s and relevant
tuples are retrieved.
Looking up tuples in s that will satisfy the join conditions with a given tuple tr is
essentially a selection on s
Worst case: buffer has space for only one page of r and one page of the index.
o br disk accesses are needed to read relation r , and, for each tuple in r , we
perform an index lookup on s.
o Cost of the join: br + nr * c, where c is the cost of a single selection on s using
the join condition.
If indices are available on both r and s, use the one with fewer tuples as the outer relation.
Example:
Number of records of student: nstudent = 5000.
Number of blocks of student: bstudent = 100.
Number of records of takes: ntakes = 10, 000.
Number of blocks of takes: btakes = 400.
o student as the outer relation and takes as the inner relation
o Assuming the worst-case memory availability scenario, and taking c = 5 disk
accesses per index lookup on takes, the cost estimate is
100 + 5000 * 5 = 25,100 disk accesses
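A sketch of indexed nested-loop join, using a Python dict as a stand-in for the index on s's join attribute (the key-extractor arguments are illustrative):

```python
from collections import defaultdict

def indexed_nested_loop_join(r, s, r_key, s_key):
    """Indexed nested-loop join sketch: a hash index on s's join
    attribute replaces the full scan of s for each outer tuple."""
    index = defaultdict(list)      # stand-in for a B+-tree / hash index on s
    for ts in s:
        index[s_key(ts)].append(ts)
    result = []
    for tr in r:                   # one index lookup per outer tuple
        for ts in index.get(r_key(tr), []):
            result.append(tr + ts)
    return result
```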
Merge-join
The merge-join algorithm (also called the sort-merge-join algorithm) can be used to compute
natural joins and equi-joins. Let r(R) and s(S) be the relations whose natural join is to be
computed, and let R ∩ S denote their common attributes.
1. First sort both relations on their join attribute (if not already sorted on the join attributes).
2. Merge the sorted relations to join them
o Join step is similar to the merge stage of the sort-merge algorithm.
o Main difference is handling of duplicate values in join attribute — every pair with same
value on join attribute must be matched
Cost Analysis:
Each block needs to be read only once (assuming all tuples for any given value of the join
attributes fit in memory).
Thus the cost of merge join is:
br + bs block transfers + ⌈br / bb⌉ + ⌈bs / bb⌉ seeks, plus the cost of sorting if the
relations are unsorted.
nR = number of tuples of R
bR = number of blocks containing tuples of R
bb = number of buffer blocks allocated for each relation
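The merge step, including the duplicate handling noted above, can be sketched as follows. This is an in-memory sketch over relations already sorted on the join key; it ignores blocking and buffer management:

```python
def merge_join(r, s, key):
    """Merge-join sketch: r and s must be sorted on the join key.
    Every pair of tuples with equal key values must be matched, so
    the group of duplicates in s is replayed for each matching r-tuple."""
    result = []
    i = j = 0
    while i < len(r) and j < len(s):
        kr, ks = key(r[i]), key(s[j])
        if kr < ks:
            i += 1
        elif kr > ks:
            j += 1
        else:
            # find the group of s-tuples sharing this key value
            j_end = j
            while j_end < len(s) and key(s[j_end]) == kr:
                j_end += 1
            # match every r-tuple with this key against the whole group
            while i < len(r) and key(r[i]) == kr:
                for ts in s[j:j_end]:
                    result.append(r[i] + ts)
                i += 1
            j = j_end
    return result
```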
hybrid merge-join:
1. hybrid merge-join: If one relation is sorted, and the other has a secondary B+-tree index
on the join attribute
o Merge the sorted relation with the leaf entries of the B+-tree.
o Sort the result on the addresses of the unsorted relation’s tuples
o Scan the unsorted relation in physical address order and merge with previous
result, to replace addresses by the actual tuples
Hash-join
A hash function h is used to partition tuples of both relations into sets that have the same
hash value on the join attributes, as follows:
Hash–Join algorithm
The hash-join of r and s is computed as follows.
1. Partition the relation s using the hash function h. When partitioning a relation, one block
of memory is reserved as the output buffer for each partition.
2. Partition r similarly.
3. For each i :
a) Load Hsi into memory and build an in-memory hash index on it using the join
attribute. This hash index uses a different hash function than the earlier one h.
b) Read the tuples in Hri from disk one by one. For each tuple tr locate each matching
tuple ts in Hsi using the in-memory hash index. Output the concatenation of their
attributes.
Relation s is called the build input and r is called the probe input.
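A minimal in-memory sketch of the two phases (partitioning both relations with h, then build-and-probe per partition); `n_parts` and the helper names are illustrative:

```python
from collections import defaultdict

def hash_join(r, s, r_key, s_key, n_parts=4):
    """Hash-join sketch: partition both relations with hash function h,
    then build an in-memory index (a second hash function, here a dict)
    on each partition of the build input s and probe it with r."""
    h = lambda k: hash(k) % n_parts
    r_parts, s_parts = defaultdict(list), defaultdict(list)
    for ts in s:                      # partition the build input s
        s_parts[h(s_key(ts))].append(ts)
    for tr in r:                      # partition the probe input r the same way
        r_parts[h(r_key(tr))].append(tr)
    result = []
    for i in range(n_parts):
        index = defaultdict(list)     # in-memory hash index on partition Hsi
        for ts in s_parts[i]:
            index[s_key(ts)].append(ts)
        for tr in r_parts[i]:         # probe with the tuples of Hri
            for ts in index.get(r_key(tr), []):
                result.append(tr + ts)
    return result
```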
Recursive partitioning:
Recursive partitioning required if number of partitions n is greater than number of
pages M of memory.
Instead of partitioning n ways, use M – 1 partitions for s
Further partition the M – 1 partitions using a different hash function
Use same partitioning method on r
Cost of Hash–Join
If recursive partitioning is not required, the cost of hash join is 3(br + bs) + 2 * nh
block transfers, where nh is the number of partitions (the extra 2 * nh accounts for
partially filled output blocks).
If recursive partitioning is required, the number of passes required for partitioning s is
⌈log M−1 (bs) − 1⌉. This is because each final partition of s should fit in memory.
The number of partitions of probe relation r is the same as that for build relation s;
******END*******
Other Operations
Duplicate elimination:
Duplicate elimination can be implemented via hashing or sorting.
On sorting duplicates will come adjacent to each other, and all but one of a set of
duplicates can be deleted.
Optimization: duplicates can be deleted during run generation as well as at intermediate
merge steps in external sort-merge.
Hashing is similar – duplicates will come into the same bucket.
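A sketch of sort-based duplicate elimination: after sorting, duplicates are adjacent, so a tuple is kept only if it differs from its predecessor (in-memory sketch; `sorted` stands in for external sort-merge):

```python
def sort_based_distinct(tuples):
    """Duplicate elimination via sorting: duplicates become adjacent,
    so all but one of each group can be dropped in a single pass."""
    out = []
    for t in sorted(tuples):
        if not out or t != out[-1]:   # differs from predecessor -> keep
            out.append(t)
    return out
```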
Projection:
Projection is implemented by performing projection on each tuple followed by duplicate
elimination.
Aggregation:
Aggregation can be implemented in a manner similar to duplicate elimination.
o Sorting or hashing can be used to bring tuples in the same group together, and
then the aggregate functions can be applied on each group.
o Optimization: combine tuples in the same group during run generation and
intermediate merges, by computing partial aggregate values.
Set operations:
Set operations (∪, ∩ and −): can either use a variant of merge-join after sorting, or a
variant of hash-join.
Outer Join:
An outer join can be computed either by computing the corresponding (inner) join and
then adding the tuples of the preserved relation that did not match, padded with nulls,
or by modifying the join algorithms themselves (e.g., in merge join, output non-matching
tuples padded with nulls during the merge).
****END****
Evaluation of Expressions
The obvious way to evaluate an expression is simply to evaluate one operation at a time,
in an appropriate order.
To evaluate an expression, two approaches are used: 1. Materialization and 2. Pipelining.
Materialization:
The result of each evaluation is materialized in a temporary relation for subsequent use.
A disadvantage of this approach is the need to construct the temporary relations, which
(unless they are small) must be written to disk.
Pipelining:
An alternative approach is to evaluate several operations simultaneously in a pipeline,
with the results of one operation passed on to the next, without the need to store a
temporary relation.
Pipelined evaluation : evaluate several operations simultaneously, passing the results of
one operation on to the next.
E.g., in the previous expression tree, don't store the result of the selection;
o instead, pass tuples directly to the join. Similarly, don't store the result of the
join; pass tuples directly to the projection.
Much cheaper than materialization: no need to store a temporary relation to disk.
Pipelining may not always be possible – e.g., sort, hash-join.
For pipelining to be effective, use evaluation algorithms that generate output tuples even
as tuples are received for inputs to the operation.
Pipelines can be executed in two ways: demand driven and producer driven
In demand-driven or lazy evaluation:
the system repeatedly requests the next tuple from the top-level operation
Each operation requests the next tuple from its children operations as required, in order
to output its next tuple
In between calls, operation has to maintain “state” so it knows what to return next
Implementation of demand-driven pipelining
Each operation is implemented as an iterator implementing the following operations
open()
o E.g. file scan: initialize file scan
state: pointer to beginning of file
o E.g.merge join: sort relations;
state: pointers to beginning of sorted relations
next()
o E.g. for file scan: Output next tuple, and advance and store file pointer
o E.g. for merge join: continue with merge from earlier state till next output tuple is
found. Save pointers as iterator state.
close()
o E.g. for file scan: close the file and discard the scan state
o E.g. for merge join: release the buffers and any temporary files used for sorting
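The iterator interface above can be sketched in Python. `FileScan` and `Select` are illustrative operators (not from the notes), each maintaining its state between next() calls:

```python
class FileScan:
    """Leaf iterator for demand-driven pipelining: open() initializes
    state, next() returns one tuple at a time, close() releases state."""
    def __init__(self, tuples):
        self.tuples = tuples
    def open(self):
        self.pos = 0                  # state: cursor into the "file"
    def next(self):
        if self.pos >= len(self.tuples):
            return None               # end of input
        t = self.tuples[self.pos]
        self.pos += 1
        return t
    def close(self):
        self.pos = None

class Select:
    """Selection operator: pulls tuples from its child on demand and
    passes along only those satisfying the predicate."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def open(self):
        self.child.open()
    def next(self):
        t = self.child.next()
        while t is not None and not self.pred(t):
            t = self.child.next()
        return t
    def close(self):
        self.child.close()
```

The top-level operation drives the whole pipeline simply by calling next() repeatedly until it returns None.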