Query Processing, Optimization, and Indexing Techniques

The document discusses techniques for efficiently processing database queries, including query optimization and indexing. It covers how a relational database management system translates an SQL query into an efficient query evaluation plan by parsing, optimizing, and evaluating the query. Key topics include logical and physical query plans, equivalence rules and transformations during optimization, and indexing structures like B-trees that help optimize queries.

Uploaded by

Kalaichelvi Mathivanan

Query Processing, Optimization, and Indexing Techniques

What’s this tutorial about?

From here:
SELECT C.name AS Course, count(S.students) AS Cnt
FROM courses C, subscription S
WHERE C.lecturer = 'Calders'
AND C.courseID = S.courseID
GROUP BY C.name
To there:
Course                              Cnt
"Advanced Databases"                67
"Data mining en kennissystemen"     19

What’s in between?
How does a relational DBMS get there efficiently?

Based upon slides for: Database System Concepts - 5th Edition, Aug 27, 2005.

Physical Reality

Cost of query evaluation is generally measured as the total elapsed time for answering the query.
Many factors contribute to time cost:
    disk accesses, CPU, or even network communication
Typically disk access is the predominant cost, and it is also relatively easy to estimate. It is measured by taking into account:
    Number of seeks * average-seek-cost
    Number of blocks read * average-block-read-cost
    Number of blocks written * average-block-write-cost
Cost to write a block is greater than cost to read a block:
    data is read back after being written to ensure that the write was successful
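
The three terms above can be added up into a single estimate. The following is a minimal sketch; the function name and all timing values are made up for illustration and do not come from the slides.

```python
def disk_cost(seeks, blocks_read, blocks_written,
              avg_seek_cost, avg_block_read_cost, avg_block_write_cost):
    """Sum the three disk-access terms listed on the slide."""
    return (seeks * avg_seek_cost
            + blocks_read * avg_block_read_cost
            + blocks_written * avg_block_write_cost)

# Hypothetical per-operation costs in milliseconds (assumed values).
cost_ms = disk_cost(seeks=2, blocks_read=100, blocks_written=10,
                    avg_seek_cost=4.0,
                    avg_block_read_cost=0.1,
                    avg_block_write_cost=0.2)
print(cost_ms)
```

The write cost is set higher than the read cost here only to mirror the remark that a written block is read back for verification.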


What’s this tutorial about?

Factors that influence the efficiency:

How is the data stored?
    Primary and secondary indices
    B-trees
    Composite search keys
    Hashing

How is the query processed?
    Relational algebra
    Query evaluation plan

We start with the second part …

Basic Steps in Query Processing

1. Parsing and translation

2. Optimization
3. Evaluation


Logical Query Plan

The SQL query is translated into a relational algebra expression
    the expression can be seen as a tree
    different equivalent expressions are possible for the same query

Pictorial Depiction of Equivalence Rules


Multiple Transformations (Cont.)

Left Deep Join Trees

In left-deep join trees, the right-hand-side input for each join is a relation, not the result of an intermediate join.


Physical Query Plan

For every relational algebra expression, different implementations are possible.

The best choice depends heavily on:
    Number of tuples
    Presence or absence of indices (way of storage)
    Selectivity of predicates (statistics!)
    Pipelined or materialized evaluation
    Etc.

Physical query plan = logical query plan + choice of implementation

Physical Query Plan


Optimization

Query optimization: amongst all equivalent evaluation plans, choose the one with the lowest cost.
Cost is estimated
    using statistical information from the database catalog:
        number of tuples in each relation,
        size of tuples,
    based on the way the data is stored:
        ordered w.r.t. the primary key or not,
        secondary indices

Indexing Structures


Indexing and Hashing

Basic Concepts
Ordered Indices
B+-Tree Index Files
B-Tree Index Files
Multiple-Key Access

Basic Concepts

Indexing mechanisms are used to speed up access to desired data.
    E.g., author catalog in library
Search key: an attribute or set of attributes used to look up records in a file.
An index file consists of records (called index entries) of the form
    (search-key, pointer)
Index files are typically much smaller than the original file.
Two basic kinds of indices:
    Ordered indices: search keys are stored in sorted order
    Hash indices: search keys are distributed uniformly across “buckets” using a “hash function”.
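
To make the two kinds of index concrete, here is a small in-memory sketch. The account records, key format, and function names are made up for illustration; a sorted list stands in for an ordered index and a Python dict stands in for a hash index.

```python
import bisect

# Hypothetical file of (account_number, balance) records.
records = [("A-101", 500), ("A-215", 700), ("A-217", 750), ("A-305", 350)]

# Ordered index: (search-key, pointer) entries kept sorted on the search key.
ordered_index = sorted((key, pos) for pos, (key, _) in enumerate(records))
keys = [k for k, _ in ordered_index]

def range_lookup(lo, hi):
    """Return all records with lo <= search key <= hi using the ordered index."""
    start = bisect.bisect_left(keys, lo)
    end = bisect.bisect_right(keys, hi)
    return [records[pos] for _, pos in ordered_index[start:end]]

# Hash index: search keys are distributed over buckets by a hash function.
hash_index = {key: pos for pos, (key, _) in enumerate(records)}

def point_lookup(key):
    """Return the record with exactly this search key, or None."""
    pos = hash_index.get(key)
    return None if pos is None else records[pos]

print(range_lookup("A-200", "A-300"))
print(point_lookup("A-305"))
```

The ordered index answers range queries naturally, while the hash index only helps with equality lookups.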


Index Evaluation Metrics

Access types supported efficiently. E.g.,
    records with a specified value in the attribute
    or records with an attribute value falling in a specified range of values.
Access time
Insertion time
Deletion time
Space overhead

Ordered Indices

In an ordered index, index entries are stored sorted on the search key
value. E.g., author catalog in library.
Primary index: in a sequentially ordered file, the index whose search
key specifies the sequential order of the file.
Also called clustering index
The search key of a primary index is usually but not necessarily the
primary key.
Secondary index: an index whose search key specifies an order
different from the sequential order of the file. Also called
non-clustering index.
Index-sequential file: ordered sequential file with a primary index.


Dense Index Files

Dense index: an index record appears for every search-key value in the file.

Sparse Index Files

Sparse index: contains index records for only some search-key values.
    Applicable when records are sequentially ordered on the search key
To locate a record with search-key value K we:
    Find the index record with the largest search-key value ≤ K
    Search the file sequentially starting at the record to which the index record points
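
The lookup procedure can be sketched as follows, assuming one index entry per block that holds the least search-key value in that block. The block contents and the function name are illustrative only.

```python
import bisect

# Hypothetical sequential file, sorted on the search key and split into blocks.
blocks = [
    ["Brighton", "Downtown"],
    ["Mianus", "Perryridge"],
    ["Redwood", "Round Hill"],
]
# Sparse index: one entry per block, the least search-key value in the block.
sparse_keys = [block[0] for block in blocks]

def sparse_lookup(k):
    """Return (block_no, position) of key k, or None if k is not in the file."""
    # Index entry with the largest search-key value <= k.
    i = bisect.bisect_right(sparse_keys, k) - 1
    if i < 0:
        return None  # k is smaller than every key in the file
    # Sequential scan starting at the block the index entry points to.
    for block_no in range(i, len(blocks)):
        for pos, key in enumerate(blocks[block_no]):
            if key == k:
                return (block_no, pos)
            if key > k:
                return None  # passed the place where k would be
    return None

print(sparse_lookup("Perryridge"))
```

Looking up "Mianus", which is itself an index entry, only works because the comparison is "less than or equal".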


Sparse Index Files (Cont.)

Compared to dense indices:

Less space and less maintenance overhead for insertions and
deletions.
Generally slower than dense index for locating records.
Good tradeoff: sparse index with an index entry for every block in file,
corresponding to least search-key value in the block.

Multilevel Index
If primary index does not fit in memory, access becomes
expensive.
Solution: treat primary index kept on disk as a sequential file
and construct a sparse index on it.
outer index – a sparse index of primary index
inner index – the primary index file
If even outer index is too large to fit in main memory, yet
another level of index can be created, and so on.
Indices at all levels must be updated on insertion or deletion
from the file.


Multilevel Index (Cont.)

Secondary Indices

Frequently, one wants to find all the records whose values in a certain field (which is not the search-key of the primary index) satisfy some condition.
    Example 1: In the account relation stored sequentially by account number, we may want to find all accounts in a particular branch
    Example 2: as above, but where we want to find all accounts with a specified balance or range of balances
We can have a secondary index with an index record for each search-key value


Secondary Indices Example

Secondary index on balance field of account
Index record points to a bucket that contains pointers to all the actual records with that particular search-key value.
Secondary indices have to be dense

Primary and Secondary Indices

Indices offer substantial benefits when searching for records.
BUT: updating indices imposes overhead on database modification: when a file is modified, every index on the file must be updated.
Sequential scan using primary index is efficient, but a sequential scan using a secondary index is expensive
    Each record access may fetch a new block from disk
    Block fetch requires about 5 to 10 milliseconds, versus about 100 nanoseconds for memory access


B+-Tree Index Files

B+-tree indices are an alternative to indexed-sequential files.
Disadvantage of indexed-sequential files:
    performance degrades as the file grows, since many overflow blocks get created.
    Periodic reorganization of the entire file is required.
Advantage of B+-tree index files:
    automatically reorganizes itself with small, local changes in the face of insertions and deletions.
    Reorganization of the entire file is not required to maintain performance.
(Minor) disadvantage of B+-trees:
    extra insertion and deletion overhead, space overhead.
Advantages of B+-trees outweigh disadvantages
    B+-trees are used extensively

B+-Tree Index Files (Cont.)

A B+-tree is a rooted tree satisfying the following properties:
    All paths from root to leaf are of the same length
    Each node that is not a root or a leaf has between ⌈n/2⌉ and n children.
    A leaf node has between ⌈(n–1)/2⌉ and n–1 values
    Special cases:
        If the root is not a leaf, it has at least 2 children.
        If the root is a leaf (that is, there are no other nodes in the tree), it can have between 0 and (n–1) values.


B+-Tree Node Structure

Typical node: P1 | K1 | P2 | … | Pn–1 | Kn–1 | Pn
    Ki are the search-key values
    Pi are pointers to children (for non-leaf nodes) or pointers to records or buckets of records (for leaf nodes).
The search-keys in a node are ordered
    K1 < K2 < K3 < . . . < Kn–1

Leaf Nodes in B+-Trees

Properties of a leaf node:
    For i = 1, 2, . . ., n–1, pointer Pi either points to a file record with search-key value Ki, or to a bucket of pointers to file records, each record having search-key value Ki. Only need bucket structure if search-key does not form a primary key.
    If Li, Lj are leaf nodes and i < j, Li’s search-key values are less than Lj’s search-key values
    Pn points to next leaf node in search-key order


Non-Leaf Nodes in B+-Trees

Non-leaf nodes form a multi-level sparse index on the leaf nodes. For a non-leaf node with m pointers:
    All the search-keys in the subtree to which P1 points are less than K1
    For 2 ≤ i ≤ m – 1, all the search-keys in the subtree to which Pi points have values greater than or equal to Ki–1 and less than Ki
    All the search-keys in the subtree to which Pm points have values greater than or equal to Km–1

Example of a B+-tree

B+-tree for account file (n = 3)


Example of B+-tree

B+-tree for account file (n = 5)

Leaf nodes must have between 2 and 4 values (⌈(n–1)/2⌉ and n–1, with n = 5).
Non-leaf nodes other than root must have between 3 and 5 children (⌈n/2⌉ and n, with n = 5).
Root must have at least 2 children.
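
The occupancy limits for any n follow directly from the B+-tree properties. A short sketch (the function name and dictionary keys are made up) that reproduces the numbers for n = 5 and n = 3:

```python
import math

def bplus_bounds(n):
    """Occupancy limits for a B+-tree whose nodes hold up to n pointers."""
    return {
        # A leaf holds between ceil((n-1)/2) and n-1 values.
        "leaf_values": (math.ceil((n - 1) / 2), n - 1),
        # A non-leaf, non-root node has between ceil(n/2) and n children.
        "inner_children": (math.ceil(n / 2), n),
    }

print(bplus_bounds(5))
print(bplus_bounds(3))
```

For n = 3 a leaf may hold as little as one value, which is why a single deletion can empty a leaf in the n = 3 examples.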

Queries on B+-Trees
Find all records with a search-key value of k.
1. N = root
2. Repeat
    1. Examine N for the smallest search-key value > k.
    2. If such a value exists, assume it is Ki. Then set N = Pi
    3. Otherwise k ≥ Kn–1. Set N = Pn
   Until N is a leaf node
3. If for some i, key Ki = k follow pointer Pi to the desired record or bucket.
4. Else no record with search-key value k exists.
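
The lookup steps can be sketched in Python as follows. The Node class, the two-level example tree, and the record placeholders are assumptions made for illustration; the leaf-to-leaf pointer Pn is left out because a point lookup does not need it.

```python
from dataclasses import dataclass

@dataclass
class Node:
    keys: list               # K1 .. K(m-1), in sorted order
    pointers: list           # children (non-leaf) or record pointers (leaf)
    is_leaf: bool = False

def find(root, k):
    """Return the record pointer for search key k, or None if k is absent."""
    node = root
    while not node.is_leaf:
        # Look for the smallest search-key value greater than k.
        for i, key in enumerate(node.keys):
            if key > k:
                node = node.pointers[i]
                break
        else:
            # No such key: k >= last key, so follow the last pointer.
            node = node.pointers[len(node.keys)]
    for key, ptr in zip(node.keys, node.pointers):
        if key == k:
            return ptr
    return None

# Hypothetical two-level tree.
leaf1 = Node(["Brighton", "Downtown"], ["rec-B", "rec-D"], is_leaf=True)
leaf2 = Node(["Mianus", "Perryridge"], ["rec-M", "rec-P"], is_leaf=True)
root = Node(["Mianus"], [leaf1, leaf2])

print(find(root, "Perryridge"))
```

Each loop iteration corresponds to one node access, so the number of iterations is bounded by the height of the tree.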


Queries on B+-Trees (Cont.)

If there are K search-key values in the file, the height of the tree is no more than ⌈log⌈n/2⌉(K)⌉.
A node is generally the same size as a disk block, typically 4 kilobytes
    and n is typically around 100 (40 bytes per index entry).
With 1 million search key values and n = 100
    at most ⌈log50(1,000,000)⌉ = 4 nodes are accessed in a lookup.
Contrast this with a balanced binary tree with 1 million search key values: around 20 nodes are accessed in a lookup
    The above difference is significant since every node access may need a disk I/O, costing around 20 milliseconds
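
The two node counts can be checked with a few lines of arithmetic. This is only a sanity check of the bound; the function name is made up.

```python
import math

def max_lookup_nodes(num_keys, fanout):
    """Upper bound on nodes visited: ceil(log_fanout(num_keys))."""
    return math.ceil(math.log(num_keys) / math.log(fanout))

# n = 100, so every non-root internal node has at least ceil(n/2) = 50 children.
bplus_nodes = max_lookup_nodes(1_000_000, 50)
# A balanced binary tree has a fanout of 2.
binary_nodes = max_lookup_nodes(1_000_000, 2)

print(bplus_nodes, binary_nodes)
```

log50(1,000,000) is roughly 3.5, which rounds up to the 4 node accesses quoted on the slide.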

Updates on B+-Trees: Insertion (Cont.)

B+-Tree before and after insertion of “Clearview”


Examples of B+-Tree Deletion

Before and after deleting “Downtown”

Deleting “Downtown” causes merging of under-full leaves
leaf node can become empty only for n=3!
Examples of B+-Tree Deletion (Cont.)

Deletion of “Perryridge” from result of previous example

Leaf with “Perryridge” becomes underfull (actually empty, in this special case) and is merged with its sibling.
As a result, the “Perryridge” node’s parent becomes underfull, and is merged with its sibling
    Value separating two nodes (at parent) moves into merged node
    Entry deleted from parent
Root node then has only one child, and is deleted

Example of B+-tree Deletion (Cont.)

Before and after deletion of “Perryridge” from earlier example

Parent of leaf containing Perryridge became underfull, and borrowed a pointer from its left sibling
Search-key value in the parent’s parent changes as a result
B+-Tree File Organization

Index file degradation problem is solved by using B+-Tree indices.
Data file degradation problem is solved by using B+-Tree File Organization.
The leaf nodes in a B+-tree file organization store records, instead of pointers.
Leaf nodes are still required to be half full
    Since records are larger than pointers, the maximum number of records that can be stored in a leaf node is less than the number of pointers in a nonleaf node.
Insertion and deletion are handled in the same way as insertion and deletion of entries in a B+-tree index.


B+-Tree File Organization (Cont.)

Example of B+-tree File Organization

Good space utilization is important since records use more space than pointers.
To improve space utilization, involve more sibling nodes in redistribution during splits and merges
    Involving 2 siblings in redistribution (to avoid split / merge where possible) results in each node having at least ⌊2n/3⌋ entries

B-Tree Index Files

Similar to B+-tree, but B-tree allows search-key values to appear only once; eliminates redundant storage of search keys.
Search keys in nonleaf nodes appear nowhere else in the B-tree; an additional pointer field for each search key in a nonleaf node must be included.
Generalized B-tree leaf node
Nonleaf node: pointers Bi are the bucket or file record pointers.


B-Tree Index File Example

B-tree (above) and B+-tree (below) on same data

Multiple-Key Access

Use multiple indices for certain types of queries.
Example:
    select account_number
    from account
    where branch_name = 'Perryridge' and balance = 1000
Possible strategies for processing query using indices on single attributes:
    1. Use index on branch_name to find accounts with branch name Perryridge; test balance = 1000
    2. Use index on balance to find accounts with balances of $1000; test branch_name = 'Perryridge'.
    3. Use branch_name index to find pointers to all records pertaining to the Perryridge branch. Similarly use index on balance. Take intersection of both sets of pointers obtained.
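
Strategy 3 can be sketched with two single-attribute indices that map a value to a set of record pointers. The account rows are made up, and list positions stand in for record pointers.

```python
from collections import defaultdict

# Hypothetical account relation: (account_number, branch_name, balance).
account = [
    ("A-101", "Downtown", 500),
    ("A-102", "Perryridge", 1000),
    ("A-201", "Perryridge", 900),
    ("A-305", "Round Hill", 1000),
]

# Two single-attribute indices, each mapping a value to a set of record ids.
branch_index = defaultdict(set)
balance_index = defaultdict(set)
for rid, (_, branch_name, balance) in enumerate(account):
    branch_index[branch_name].add(rid)
    balance_index[balance].add(rid)

# Intersect the pointer sets, then fetch only the records in the intersection.
matching_rids = branch_index["Perryridge"] & balance_index[1000]
result = [account[rid][0] for rid in sorted(matching_rids)]

print(result)
```

Only records that satisfy both conditions are fetched, which matters when each condition alone matches many records.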


Indices on Multiple Keys

Composite search keys are search keys containing more than one
attribute
E.g. (branch_name, balance)
Lexicographic ordering: (a1, a2) < (b1, b2) if either
a1 < b1, or
a1=b1 and a2 < b2
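
The lexicographic rule is the same ordering Python applies to tuples, which gives a quick way to check it. The function name and index entries are illustrative.

```python
def lex_less(a, b):
    """(a1, a2) < (b1, b2) under the lexicographic ordering defined above."""
    (a1, a2), (b1, b2) = a, b
    return a1 < b1 or (a1 == b1 and a2 < b2)

# Hypothetical composite search keys on (branch_name, balance).
entries = [("Perryridge", 1000), ("Downtown", 500), ("Perryridge", 400)]

# Python compares tuples lexicographically, so sorting yields index order.
in_index_order = sorted(entries)

print(in_index_order)
print(lex_less(("Perryridge", 400), ("Perryridge", 1000)))
```

Entries for the same branch end up adjacent and ordered by balance, so such an index can serve equality on branch_name combined with a condition on balance.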

Implementations of Relational Algebra Expressions


Selection Operation

File scan: search algorithms that locate and retrieve records that fulfill a selection condition.
Algorithm A1 (linear search). Scan each file block and test all records to see whether they satisfy the selection condition.
A2 (binary search). Applicable if selection is an equality comparison on the attribute on which file is ordered.
A3 (primary index on candidate key, equality). Retrieve a single record that satisfies the corresponding equality condition.
A4 (primary index on nonkey, equality). Retrieve multiple records.
A5 (equality on search-key of secondary index).
A6 (primary index, comparison). (Relation is sorted on A)
A7 (secondary index, comparison).
…

Sorting

We may build an index on the relation, and then use the index to read the relation in sorted order. May lead to one disk block access for each tuple.
For relations that fit in memory, techniques like quicksort can be used.
For relations that don’t fit in memory, external sort-merge is a good choice.
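
External sort-merge can be sketched in two phases: create sorted runs that each fit in memory, then merge the runs. In this simplified illustration in-memory lists stand in for run files on disk, a single merge pass is assumed to be enough, and the function and parameter names are made up.

```python
import heapq

def external_sort(tuples, memory_capacity):
    """Sort tuples while never sorting more than memory_capacity at once."""
    # Phase 1: run generation. Sort each memory-sized chunk on its own.
    runs = [sorted(tuples[i:i + memory_capacity])
            for i in range(0, len(tuples), memory_capacity)]
    # Phase 2: N-way merge, consuming the runs one tuple at a time.
    return list(heapq.merge(*runs))

data = [24, 19, 31, 33, 14, 16, 21, 3, 2, 7, 4]
print(external_sort(data, memory_capacity=3))
```

When there are more runs than available buffer blocks, a real implementation merges in several passes; that case is not covered here.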


Example: External Sorting Using Sort-Merge

Join Operation

Several different algorithms to implement joins
    Nested-loop join
    Block nested-loop join
    Indexed nested-loop join
    Merge-join
    Hash-join
Choice based on cost estimate
Examples use the following information
    Number of records of customer: 10,000   depositor: 5,000
    Number of blocks of customer: 400   depositor: 100


Nested-Loop Join

To compute the theta join r ⋈θ s

for each tuple tr in r do begin
    for each tuple ts in s do begin
        test pair (tr,ts) to see if they satisfy the join condition θ
        if they do, add tr • ts to the result.
    end
end
r is called the outer relation and s the inner relation of the join.
Requires no indices and can be used with any kind of join condition.
Expensive since it examines every pair of tuples in the two relations.
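
A direct Python rendering of the nested-loop pseudocode, with tuples modelled as Python tuples and the join condition θ passed in as a function. The sample customer and depositor rows are made up.

```python
def nested_loop_join(r, s, theta):
    """Theta join of r (outer relation) and s (inner relation)."""
    result = []
    for tr in r:                 # for each tuple of the outer relation
        for ts in s:             # scan the whole inner relation
            if theta(tr, ts):
                result.append(tr + ts)   # tr • ts: concatenate the tuples
    return result

# Hypothetical data: customer(name, street), depositor(name, account_number).
customer = [("Hayes", "Harrison"), ("Jones", "Main")]
depositor = [("Hayes", "A-102"), ("Jones", "A-217"), ("Hayes", "A-305")]

joined = nested_loop_join(customer, depositor, lambda c, d: c[0] == d[0])
print(joined)
```

Because theta is an arbitrary predicate, the same function also handles non-equality conditions, at the price of testing every pair.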

Block Nested-Loop Join

Variant of nested-loop join in which every block of inner relation is paired with every block of outer relation.
for each block Br of r do begin
    for each block Bs of s do begin
        for each tuple tr in Br do begin
            for each tuple ts in Bs do begin
                Check if (tr,ts) satisfy the join condition
                if they do, add tr • ts to the result.
            end
        end
    end
end
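
To see why pairing blocks instead of tuples matters, the worst-case block transfers of both variants can be compared using the customer and depositor sizes given for the join examples. The two formulas are the standard worst-case estimates for the case where only one block of each relation fits in memory; they are an assumption here, since these slides do not state them, and the function names are made up.

```python
def nested_loop_block_transfers(n_outer, b_outer, b_inner):
    """Worst case: the inner relation is scanned once per outer tuple."""
    return n_outer * b_inner + b_outer

def block_nested_loop_block_transfers(b_outer, b_inner):
    """Worst case: the inner relation is scanned once per outer block."""
    return b_outer * b_inner + b_outer

# depositor as outer relation (5,000 tuples, 100 blocks),
# customer as inner relation (400 blocks).
nlj = nested_loop_block_transfers(n_outer=5000, b_outer=100, b_inner=400)
bnlj = block_nested_loop_block_transfers(b_outer=100, b_inner=400)

print(nlj, bnlj)
```

Under these assumptions the block variant needs about fifty times fewer block transfers for the same join.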


Indexed Nested-Loop Join

Index lookups can replace file scans if
    join is an equi-join or natural join and
    an index is available on the inner relation’s join attribute
        Can construct an index just to compute a join.
For each tuple tr in the outer relation r, use the index to look up tuples in s that satisfy the join condition with tuple tr.
Worst case: buffer has space for only one page of r, and, for each tuple in r, we perform an index lookup on s.
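
A sketch of an indexed nested-loop equi-join in which a temporary in-memory index on the inner relation is built just for the join. A Python dict stands in for the index; the function name, attribute positions, and sample rows are made up.

```python
from collections import defaultdict

def indexed_nested_loop_join(r, s, r_attr, s_attr):
    """Equi-join of outer r and inner s on r[r_attr] == s[s_attr]."""
    # Temporary index on the inner relation's join attribute.
    index = defaultdict(list)
    for ts in s:
        index[ts[s_attr]].append(ts)
    result = []
    for tr in r:
        # One index lookup per outer tuple replaces a scan of s.
        for ts in index.get(tr[r_attr], []):
            result.append(tr + ts)
    return result

# Hypothetical data: depositor(name, account_number), customer(name, street).
depositor = [("Hayes", "A-102"), ("Jones", "A-217"), ("Hayes", "A-305")]
customer = [("Hayes", "Harrison"), ("Jones", "Main")]

joined = indexed_nested_loop_join(depositor, customer, 0, 0)
print(joined)
```

Unlike the plain nested-loop join, this only works for equality conditions, matching the equi-join requirement above.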

Merge-Join

1. Sort both relations on their join attribute (if not already sorted on the join attributes).
2. Merge the sorted relations to join them
    1. Join step is similar to the merge stage of the sort-merge algorithm.
    2. Main difference is handling of duplicate values in join attribute: every pair with same value on join attribute must be matched
    3. Detailed algorithm in book
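
The merge step, including the handling of duplicate join values, can be sketched as below. This is an illustrative in-memory version rather than the detailed algorithm from the book; the function name and attribute positions are assumptions.

```python
def merge_join(r, s, r_attr, s_attr):
    """Equi-join by sorting both inputs and merging them."""
    r = sorted(r, key=lambda t: t[r_attr])
    s = sorted(s, key=lambda t: t[s_attr])
    result, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        rv, sv = r[i][r_attr], s[j][s_attr]
        if rv < sv:
            i += 1
        elif rv > sv:
            j += 1
        else:
            # Find the group of s tuples sharing this join value.
            j_end = j
            while j_end < len(s) and s[j_end][s_attr] == rv:
                j_end += 1
            # Pair every r tuple with this value with every s tuple in the group.
            while i < len(r) and r[i][r_attr] == rv:
                for ts in s[j:j_end]:
                    result.append(r[i] + ts)
                i += 1
            j = j_end
    return result

print(merge_join([(1, "a"), (1, "b")], [(1, "x"), (1, "y")], 0, 0))
```

With two tuples on each side sharing the value 1, all four pairs are produced, which is the duplicate handling that sets the join apart from a plain merge.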


Hash-Join (Cont.)

Hash-Join Algorithm

The hash-join of r and s is computed as follows.
1. Partition the relation s using hashing function h. When partitioning a relation, one block of memory is reserved as the output buffer for each partition.
2. Partition r similarly.
3. For each i:
    (a) Load si into memory and build an in-memory hash index on it using the join attribute. This hash index uses a different hash function than the earlier one h.
    (b) Read the tuples in ri from the disk one by one. For each tuple tr locate each matching tuple ts in si using the in-memory hash index. Output the concatenation of their attributes.

Relation s is called the build input and r is called the probe input.
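
The three steps can be sketched in memory as follows. Lists stand in for the on-disk partitions, Python's built-in hash plays the role of h, and a dict serves as the in-memory hash index. The function name, partition count, and sample rows are made up.

```python
from collections import defaultdict

def hash_join(r, s, r_attr, s_attr, num_partitions=4):
    """Equi-join of probe input r and build input s."""
    def h(value):
        # Partitioning hash function h.
        return hash(value) % num_partitions

    # Steps 1 and 2: partition s and r with the same function h.
    s_parts = [[] for _ in range(num_partitions)]
    r_parts = [[] for _ in range(num_partitions)]
    for ts in s:
        s_parts[h(ts[s_attr])].append(ts)
    for tr in r:
        r_parts[h(tr[r_attr])].append(tr)

    result = []
    # Step 3: join each pair of partitions (r_i, s_i) separately.
    for s_i, r_i in zip(s_parts, r_parts):
        # (a) Build an in-memory hash index on s_i (the dict hashes internally).
        index = defaultdict(list)
        for ts in s_i:
            index[ts[s_attr]].append(ts)
        # (b) Probe the index with each tuple of r_i.
        for tr in r_i:
            for ts in index.get(tr[r_attr], []):
                result.append(tr + ts)
    return result

depositor = [("Hayes", "A-102"), ("Jones", "A-217"), ("Hayes", "A-305")]
customer = [("Hayes", "Harrison"), ("Jones", "Main")]
joined = hash_join(depositor, customer, 0, 0)
print(sorted(joined))
```

Tuples with equal join values always land in partitions with the same number, so matching pairs are never split across partitions. The output order depends on the hash values, hence the sort before printing.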


Evaluation of Expressions

So far: we have seen algorithms for individual operations
Alternatives for evaluating an entire expression tree
    Materialization: generate results of an expression whose inputs are relations or are already computed, materialize (store) it on disk. Repeat.
    Pipelining: pass on tuples to parent operations even as an operation is being executed

Materialization

Materialized evaluation: evaluate one operation at a time, starting at the lowest level. Use intermediate results materialized into temporary relations to evaluate next-level operations.
E.g., in figure below, compute and store
    σ balance < 2500 (account)
then compute and store its join with customer, and finally compute the projection on customer-name.


Pipelining

Pipelined evaluation: evaluate several operations simultaneously, passing the results of one operation on to the next.
E.g., in previous expression tree, don’t store result of
    σ balance < 2500 (account)
instead, pass tuples directly to the join. Similarly, don’t store result of join, pass tuples directly to projection.
Much cheaper than materialization: no need to store a temporary relation to disk.
Pipelining may not always be possible, e.g., sort, hash-join.
For pipelining to be effective, use evaluation algorithms that generate output tuples even as tuples are received for inputs to the operation.
Pipelines can be executed in two ways: demand driven and producer driven
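
A demand-driven pipeline maps naturally onto Python generators: each operator pulls tuples from its input only when its own consumer asks for the next one, and no intermediate relation is stored. The schema, the dict used as a stand-in for the join with customer, and all names below are assumptions for illustration.

```python
def select_balance_below(account, limit):
    """Selection: pass on each account tuple with balance < limit."""
    for acct in account:
        if acct["balance"] < limit:
            yield acct

def join_with_customer(accounts, customer_by_account):
    """Join: look up the customers for each incoming account tuple."""
    for acct in accounts:
        for name in customer_by_account.get(acct["account_number"], []):
            yield {**acct, "customer_name": name}

def project_customer_name(tuples):
    """Projection: keep only the customer name."""
    for t in tuples:
        yield t["customer_name"]

# Hypothetical data.
account = [
    {"account_number": "A-101", "balance": 500},
    {"account_number": "A-215", "balance": 3000},
    {"account_number": "A-217", "balance": 750},
]
customer_by_account = {"A-101": ["Hayes"], "A-215": ["Smith"], "A-217": ["Jones"]}

# Nothing is evaluated until the consumer starts pulling tuples.
pipeline = project_customer_name(
    join_with_customer(select_balance_below(account, 2500), customer_by_account))
names = list(pipeline)
print(names)
```

A producer-driven pipeline would instead have each operator push its output into a buffer for the next operator; that variant is not shown here.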
