Query Processing and
Optimization
Introduction
• In this chapter we shall discuss the techniques
used by a DBMS to process, optimize and execute
high-level queries.
• We also cover the techniques used to split complex queries
into multiple simple operations, and the methods of
implementing these low-level operations.
• Query optimization techniques are used to
choose an efficient execution plan that
minimizes runtime and the use of other
resources, such as the number of disk I/Os and CPU
time.
Query Processing
• Query processing is the procedure of transforming a high-level
query (such as SQL) into a correct and efficient execution plan
expressed in a low-level language.
• When a database system receives a query for update or
retrieval of information, the query goes through a series of
compilation steps that produce an execution plan.
• Query processing goes through various phases:
• In the first phase, called the syntax checking phase, the system
parses the query and checks whether it follows the syntax rules.
• It then matches the objects named in the query with the views,
tables and columns listed in the system tables.
• In the second phase the SQL query is translated into an algebraic
expression using various rules.
• The process of transforming a high-level SQL query into a
relational algebraic form is called Query Decomposition.
• The relational algebraic expression is then passed to the query
optimizer.
• In the third phase optimization is performed by substituting
equivalent expressions, depending on factors such as the existence
of certain database structures, whether or not a given file is
sorted, the presence of different indexes and so on.
• The query optimization module works in tandem with the join
manager module to improve the order in which joins are performed.
• At this stage the cost model and several other
estimation formulas are used to rewrite the query.
• The modified query is written to use system
resources so as to give the best performance.
• The query optimizer then generates an action plan, also
called an execution plan.
• These action plans are converted into query code that
is finally executed by a run-time database processor.
• The run-time database processor estimates the cost of
each action plan and chooses the optimal one for
execution.
Query Analyzer
• The syntax analyzer takes the query from the
users, parses it into tokens and analyses the
tokens and their order to make sure they
follow the rules of the language grammar.
• If an error is found in the query submitted by
the user, the query is rejected and an error code,
together with an explanation of why the query
was rejected, is returned to the user.
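The accept/reject behaviour described above can be observed with a small sketch. It assumes Python's standard `sqlite3` module as a stand-in DBMS (the slides name no particular system), and `check_syntax` is a hypothetical helper written for this illustration.

```python
import sqlite3

def check_syntax(sql):
    """Hypothetical helper: ask SQLite to compile a statement.

    Returns (True, "ok") if the statement compiles, otherwise
    (False, <error message reported by the DBMS>).
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE EMPLOYEE (emp_id INTEGER, emp_nm TEXT, emp_desg TEXT)"
        )
        # EXPLAIN makes SQLite parse and compile the statement
        # instead of returning the query's own result rows.
        conn.execute("EXPLAIN " + sql)
        return True, "ok"
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()

ok_good, msg_good = check_syntax("SELECT emp_nm FROM EMPLOYEE")
ok_bad, msg_bad = check_syntax("SELEC emp_nm FROM EMPLOYEE")  # misspelled keyword
print(ok_good, msg_good)
print(ok_bad, msg_bad)
```

The second call is rejected before any data is touched, and the message returned by the DBMS plays the role of the explanation sent back to the user.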
Query Decomposition
• In query decomposition the aims are to transform
the high-level query into a relational algebra query and to check
whether that query is syntactically and semantically correct.
• Thus query decomposition starts with a high-level query and
transforms it into a query graph of low-level operations that satisfy
the query.
• The SQL query is decomposed into query blocks (low-level
operations), which form the basic units.
• Hence nested queries within a query are identified as separate
query blocks.
• The query decomposer goes through five stages of processing for
decomposition into low-level operations and translation into
algebraic expressions.
Query Analysis
• During the query analysis phase, the query is
syntactically analyzed using programming
language compiler techniques (a parser).
• A syntactically legal query is then validated,
using the system catalog, to ensure that all
data objects (relations and attributes) referred
to by the query are defined in the database.
• The type specification of the query qualifiers
and result is also checked at this stage.
• Example:
SELECT emp_nm FROM EMPLOYEE WHERE emp_desg > 100
This query will be rejected because the comparison
“> 100” is incompatible with the data type of
emp_desg, which is a variable-length character string.
• At the end of the query analysis phase, the high-level query
(SQL) is transformed into an internal representation
that is more suitable for processing.
• This internal representation is typically a kind of query
tree.
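The catalog and type checks described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: `CATALOG` is a hypothetical in-memory system catalog, `validate` is a hypothetical helper, and the query is passed in already split into its parts rather than parsed from SQL text.

```python
# Hypothetical system catalog: relation name -> {attribute name: Python type}.
CATALOG = {"EMPLOYEE": {"emp_id": int, "emp_nm": str, "emp_desg": str}}

def validate(relation, select_attrs, where_attr, literal):
    """Return a list of error messages; an empty list means the query is valid."""
    if relation not in CATALOG:
        return [f"unknown relation {relation}"]
    attrs = CATALOG[relation]
    errors = []
    # Every referenced attribute must be defined for the relation.
    for name in list(select_attrs) + [where_attr]:
        if name not in attrs:
            errors.append(f"unknown attribute {name}")
    # The literal in the comparison must match the attribute's declared type.
    if where_attr in attrs and not isinstance(literal, attrs[where_attr]):
        errors.append(
            f"type mismatch: {where_attr} is {attrs[where_attr].__name__}, "
            f"literal is {type(literal).__name__}"
        )
    return errors

# SELECT emp_nm FROM EMPLOYEE WHERE emp_desg > 100
rejected = validate("EMPLOYEE", ["emp_nm"], "emp_desg", 100)
# The same query with a string literal passes the type check.
accepted = validate("EMPLOYEE", ["emp_nm"], "emp_desg", "Manager")
print(rejected)
print(accepted)
```

The first call reports the same incompatibility as the example above: a character-string attribute compared with a number.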
• A Query Tree is a tree data structure that
corresponds to a relational algebra expression.
• A Query Tree is also called a relational algebra
tree.
– Leaf nodes of the tree represent the base
input relations of the query.
– Internal nodes represent the result of applying an
operation in the algebra.
– The root of the tree represents the result of the query.
• Example query:
SELECT P.proj_no, P.dept_no, E.name, E.add, E.dob
FROM PROJECT P, DEPARTMENT D, EMPLOYEE E
WHERE P.dept_no = D.d_no AND D.mgr_id = E.emp_id
AND P.proj_loc = 'Mumbai';
• The three relations PROJECT, DEPARTMENT and
EMPLOYEE are represented by the leaf nodes P, D
and E, while the relational algebra operations
of the expression are represented by internal tree nodes.
• The same SQL query can have many different
relational algebra expressions and hence
many different query trees.
• The query parser typically generates a
standard initial (canonical) query tree.
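As a rough illustration of the canonical query tree for the PROJECT, DEPARTMENT, EMPLOYEE query, the sketch below builds the tree as a Cartesian product of the leaf relations, followed by one selection and one projection. The `Node` class and the helpers `to_algebra` and `leaves` are hypothetical names chosen for this example, not part of any DBMS.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    op: str                       # "relation", "product", "select" or "project"
    arg: str = ""                 # relation name, condition or attribute list
    children: List["Node"] = field(default_factory=list)

def to_algebra(node):
    """Render a query tree as a relational algebra string."""
    if node.op == "relation":
        return node.arg
    if node.op == "product":
        return " x ".join(to_algebra(c) for c in node.children)
    symbol = {"select": "σ", "project": "π"}[node.op]
    return f"{symbol}[{node.arg}]({to_algebra(node.children[0])})"

def leaves(node):
    """Collect the base relations at the leaves, left to right."""
    if node.op == "relation":
        return [node.arg]
    return [name for c in node.children for name in leaves(c)]

# Leaves: the three base relations, combined by a Cartesian product.
base = Node("product", children=[
    Node("relation", r) for r in ("PROJECT", "DEPARTMENT", "EMPLOYEE")
])
condition = "P.dept_no = D.d_no AND D.mgr_id = E.emp_id AND P.proj_loc = 'Mumbai'"
# Root: the projection that produces the query result.
tree = Node("project", "P.proj_no, P.dept_no, E.name, E.add, E.dob",
            [Node("select", condition, [base])])

expression = to_algebra(tree)
print(expression)
print(leaves(tree))
```

This canonical form is simple to generate but inefficient to execute as written, which is why later phases restructure it.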
Query Normalization
• The primary goal of the normalization phase is to avoid
redundancy.
• The normalization phase converts the query into a
normalized form that can be more easily manipulated.
• In the normalization phase, a set of equivalency rules
is applied so that the projection and selection
operations included in the query are simplified to
avoid redundancy.
• The projection operation corresponds to the SELECT
clause of the SQL query, and the selection operation
corresponds to the predicate found in the WHERE clause.
• The equivalency transformation rules that are applied.
Semantic Analyzer
• The objective of this phase of query processing is to reduce the number of
predicates.
• The semantic analyzer rejects normalized queries that are incorrectly
formulated or contradictory.
• A query is incorrectly formulated if some of its components do not contribute to the
generation of the result.
• This happens when a join specification is missing.
• A query is contradictory if its predicate cannot be satisfied by any tuple in the relation.
• The semantic analyzer examines the relational calculus query (SQL) to make sure it
contains only data objects (tables, columns, views, indexes) that are defined in
the database catalog.
• It makes sure that each object in the query is referenced correctly according to its
data type.
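One common way to detect a missing join specification is to build a graph with one node per relation and one edge per join predicate, then test whether the graph is connected. The sketch below assumes that approach; the slides do not say which method a given DBMS uses, and `is_connected` is a hypothetical helper.

```python
def is_connected(relations, join_predicates):
    """True if every relation is reachable from the first via join predicates."""
    adjacency = {r: set() for r in relations}
    for left, right in join_predicates:
        adjacency[left].add(right)
        adjacency[right].add(left)
    # Depth-first traversal starting from the first relation.
    seen, stack = set(), [relations[0]]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(adjacency[current] - seen)
    return seen == set(relations)

relations = ["P", "D", "E"]
# P.dept_no = D.d_no AND D.mgr_id = E.emp_id: every relation is linked.
complete = is_connected(relations, [("P", "D"), ("D", "E")])
# Without D.mgr_id = E.emp_id, EMPLOYEE is left unconnected.
missing_join = is_connected(relations, [("P", "D")])
print(complete, missing_join)
```

A disconnected graph means at least one relation does not contribute to the result except through a Cartesian product, so the query is flagged as incorrectly formulated.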
Query Simplifier
• The objectives of this phase are to detect redundant
qualifications, eliminate common sub-expressions and
transform sub-graphs into semantically equivalent but
more easily and efficiently computed forms.
• Why simplify?
– Integrity constraints, view definitions and
access restrictions are commonly introduced into the graph at this
stage of analysis, so the query must be simplified as
much as possible.
– Integrity constraints define conditions that must hold
for all states of the database, so any query that contradicts an
integrity constraint must be void and can be rejected
without accessing the database.
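To illustrate rejecting a query that contradicts an integrity constraint, the sketch below treats comparisons on a single integer attribute as intervals and checks whether their conjunction can be satisfied. The constraint (salary <= 100000) and the helpers `interval` and `satisfiable` are assumptions made for this example only.

```python
import math

def interval(op, value):
    """Map a comparison on an integer attribute to a closed interval."""
    if op == ">":
        return (value + 1, math.inf)
    if op == ">=":
        return (value, math.inf)
    if op == "<":
        return (-math.inf, value - 1)
    if op == "<=":
        return (-math.inf, value)
    if op == "=":
        return (value, value)
    raise ValueError(f"unsupported operator {op}")

def satisfiable(predicates):
    """True if some integer satisfies the conjunction of all (op, value) pairs."""
    low, high = -math.inf, math.inf
    for op, value in predicates:
        lo, hi = interval(op, value)
        low, high = max(low, lo), min(high, hi)
    return low <= high

# Hypothetical integrity constraint on EMPLOYEE: salary <= 100000.
constraint = ("<=", 100000)

# WHERE salary > 200000 contradicts the constraint: reject without data access.
void_query = satisfiable([constraint, (">", 200000)])
# WHERE salary > 50000 is compatible with the constraint.
valid_query = satisfiable([constraint, (">", 50000)])
print(void_query, valid_query)
```

The decision is made from the predicates and the constraint alone, which is the point of the bullet above: no tuple needs to be read.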
Query Restructuring
• In the final stage of the query decomposition,
the query can be restructured to give a more
efficient implementation.
• Transformation rules are used to convert one
relational algebra expression into an
equivalent form that is more efficient.
• The query can now be regarded as a relational
algebra program, consisting of a series of
operations on relations.
Query Optimization
• The primary goal of query optimization is to
choose an efficient execution strategy for
processing a query.
• The query optimizer attempts to minimize the
use of certain resources (mainly the number of
I/Os and CPU time) by selecting the best execution
plan (access plan).
• Query optimization starts during the validation
phase, when the system checks that the user has
the appropriate privileges.
• An action plan is then generated to perform the
query.
Block Diagram of Query Optimization
• The inputs to the query optimizer are:
– The relational algebra query tree generated by the query
simplifier module of the query decomposer.
– Estimation formulas used to determine the cardinality
of the intermediate result tables.
– A cost model.
– Statistical data from the database catalogue.
• The output of the query optimizer is the execution plan
in the form of an optimized relational algebra query.
• A query typically has many possible execution
strategies, and the process of choosing a suitable one
for processing a query is known as Query Optimization.
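The estimation formulas mentioned above are not given in the slides. As a hedged illustration, the sketch below uses two widely taught textbook estimates that assume uniformly distributed values: an equality selection returns about n / V tuples, and an equi-join returns about n_R * n_S / max(V_R, V_S) tuples, where V is the number of distinct values of the attribute. All statistics shown are made up.

```python
def selection_cardinality(n_tuples, n_distinct):
    """Estimated tuples returned by an equality selection on one attribute."""
    return n_tuples / n_distinct

def join_cardinality(n_r, n_s, distinct_r, distinct_s):
    """Estimated tuples of an equi-join of R and S on a common attribute."""
    return n_r * n_s / max(distinct_r, distinct_s)

# Hypothetical catalog statistics.
project_tuples = 1000        # tuples in PROJECT
project_locations = 50       # distinct values of proj_loc
department_tuples = 100      # tuples in DEPARTMENT
distinct_dept_no = 100       # distinct dept_no / d_no values in both relations

# sigma proj_loc = 'Mumbai' (PROJECT)
mumbai_projects = selection_cardinality(project_tuples, project_locations)
# PROJECT joined with DEPARTMENT on dept_no = d_no
project_department = join_cardinality(
    project_tuples, department_tuples, distinct_dept_no, distinct_dept_no
)
print(mumbai_projects, project_department)
```

Estimates like these feed the cost model, which compares candidate plans by the size of their intermediate results.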
The basic issues in Query Optimization
• How to use available indexes?
• How to use memory to accumulate
information and perform intermediate steps
such as sorting?
• How to determine the order in which joins
should be performed?
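The first question, how to use available indexes, can be observed in a real optimizer. The sketch below assumes SQLite through Python's standard `sqlite3` module and uses its `EXPLAIN QUERY PLAN` statement; the table contents and the index name `idx_proj_loc` are invented for the example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PROJECT (proj_no INTEGER, dept_no INTEGER, proj_loc TEXT)")
conn.executemany(
    "INSERT INTO PROJECT VALUES (?, ?, ?)",
    [(i, i % 10, "Mumbai" if i % 5 == 0 else "Pune") for i in range(100)],
)

def plan(sql):
    """Return the plan steps SQLite reports for a query (detail column only)."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT * FROM PROJECT WHERE proj_loc = 'Mumbai'"

before = plan(query)     # no index exists yet, so the table must be scanned
conn.execute("CREATE INDEX idx_proj_loc ON PROJECT (proj_loc)")
after = plan(query)      # the optimizer can now choose the new index

print(before)
print(after)
```

The query text is identical in both calls; only the available access paths changed, and the optimizer picked a different plan accordingly.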
Objective of query optimization
• The term query optimization does not mean
always producing an optimal (best) strategy as the
execution plan.
• The result is just a reasonably efficient strategy for
executing the query.
• The decomposed query block of SQL is
translated into an equivalent extended
relational algebra expression and then
optimized.
Techniques for Query Optimization
• The first technique is based on Heuristic Rules
for ordering the operations in a query
execution strategy.
• The second technique involves the systematic
estimation of the cost of the different
execution strategies and choosing the
execution plan with the lowest cost.
• Semantic query optimization is used in
combination with the heuristic query
transformation rules.
• It uses constraints specified on the database
schema such as unique attributes and other
more complex constraints, in order to modify
one query into another query that is more
efficient to execute.
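The second, cost-based technique can be sketched as an exhaustive search over join orders. This is a toy model under stated assumptions: the cardinalities and selectivities are invented, only left-deep orders are considered, and the cost of a plan is taken to be the sum of its estimated intermediate result sizes. Real optimizers use richer cost models and prune the search space.

```python
from itertools import permutations

# Hypothetical statistics: tuples per relation and selectivity per join predicate.
CARD = {"P": 1000, "D": 100, "E": 5000}
SELECTIVITY = {frozenset(("P", "D")): 1 / 100, frozenset(("D", "E")): 1 / 5000}

def plan_cost(order):
    """Sum of estimated intermediate result sizes for a left-deep join order."""
    joined = {order[0]}
    size = CARD[order[0]]
    cost = 0.0
    for rel in order[1:]:
        # Apply every join predicate linking the new relation to those already
        # joined; with none, the step is a Cartesian product (factor 1).
        factor = 1.0
        for other in joined:
            factor *= SELECTIVITY.get(frozenset((rel, other)), 1.0)
        size = size * CARD[rel] * factor
        cost += size
        joined.add(rel)
    return cost

costs = {order: plan_cost(order) for order in permutations(CARD)}
best = min(costs, key=costs.get)

for order, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(" -> ".join(order), round(cost))
```

Under these assumed statistics the cheapest orders join D and E first, while any order that pairs P with E first pays for a large Cartesian product.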
Heuristic Rules
• Heuristic rules are used as an optimization
technique to modify the internal representation
of a query.
• Usually, heuristic rules are applied to a
query tree or query graph data structure to
improve its performance.
• One of the main heuristic rules is to apply SELECT
operations before applying JOIN or other
binary operations.
• This is because the size of the file resulting from a
binary operation such as JOIN is usually a
multiplicative function of the sizes of the input files.
• The SELECT and PROJECT operations reduce the size of
a file and hence should be applied before
a JOIN or other binary operation.
• A heuristic query optimizer transforms the
initial (canonical) query tree into a final query
tree using equivalence transformation rules.
• This final query tree is efficient to execute.
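The effect of applying SELECT before a binary operation can be checked on toy data. The sketch below uses invented PROJECT and DEPARTMENT rows and hypothetical helpers `select` and `cross`; it compares the size of the intermediate result when the selection is applied after the product with the size when it is pushed below it.

```python
from itertools import product

# Hypothetical sample data: 6 projects (2 in Mumbai) and 3 departments.
PROJECT = [
    {"proj_no": i, "dept_no": i % 3, "proj_loc": "Mumbai" if i < 2 else "Pune"}
    for i in range(6)
]
DEPARTMENT = [{"d_no": d, "mgr_id": 100 + d} for d in range(3)]

def select(rows, pred):
    """Relational selection: keep the rows that satisfy the predicate."""
    return [r for r in rows if pred(r)]

def cross(left, right):
    """Cartesian product of two relations."""
    return [{**l, **r} for l, r in product(left, right)]

def in_mumbai(row):
    return row["proj_loc"] == "Mumbai"

def same_dept(row):
    return row["dept_no"] == row["d_no"]

# Canonical order: product first, then both selections.
late_intermediate = cross(PROJECT, DEPARTMENT)
late = select(select(late_intermediate, same_dept), in_mumbai)

# Heuristic order: the selection on PROJECT is pushed below the product.
early_intermediate = cross(select(PROJECT, in_mumbai), DEPARTMENT)
early = select(early_intermediate, same_dept)

print(len(late_intermediate), len(early_intermediate), early == late)
```

Both orders return the same rows, but the pushed-down version builds a much smaller intermediate relation, and the gap grows multiplicatively with the input sizes.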
General Transformation Rules
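• Cascade of σ :-
$\sigma_{c_1 \text{ AND } c_2 \text{ AND } \dots \text{ AND } c_n}(R) = \sigma_{c_1}(\sigma_{c_2}(\dots(\sigma_{c_n}(R))\dots))$
• Commutativity of σ :-
$\sigma_{c_1}(\sigma_{c_2}(R)) = \sigma_{c_2}(\sigma_{c_1}(R))$
• Cascade of π :-
$\pi_{List_1}(\pi_{List_2}(\dots(\pi_{List_n}(R))\dots)) = \pi_{List_1}(R)$
• Commuting σ with π :-
If the selection condition c involves only the attributes A1, ..., An of the projection list, then:
$\pi_{A_1, A_2, \dots, A_n}(\sigma_c(R)) = \sigma_c(\pi_{A_1, A_2, \dots, A_n}(R))$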
• Commutativity of ⋈ and × :-
$R \bowtie_c S = S \bowtie_c R$
$R \times S = S \times R$
• Commuting σ with ⋈ or × :-
If all the attributes in the selection condition c belong to only one of the relations
being joined (say R), then:
$\sigma_c(R \bowtie S) = (\sigma_c(R)) \bowtie S$
Alternatively, if the selection condition c can be written as (c1 AND c2), where condition c1
involves only attributes of R and condition c2 involves only attributes of S, then:
$\sigma_c(R \bowtie S) = (\sigma_{c_1}(R)) \bowtie (\sigma_{c_2}(S))$
• Commuting π with ⋈ or × :-
Let the projection list be L = {A1, ..., An, B1, ..., Bm}, where
A1, ..., An are attributes of R and B1, ..., Bm are attributes of S.
If the join condition c involves only attributes in L, then:
$\pi_L(R \bowtie_c S) = (\pi_{A_1, \dots, A_n}(R)) \bowtie_c (\pi_{B_1, \dots, B_m}(S))$
• Commutativity of set operations :-
$R \cup S = S \cup R$
$R \cap S = S \cap R$
Set difference (R − S) is not commutative.
• Associativity of ⋈, ×, ∩ and ∪ :-
If θ stands for any one of these operations throughout the expression, then:
$(R \,\theta\, S) \,\theta\, T = R \,\theta\, (S \,\theta\, T)$
• Commuting σ with set operations :-
If θ stands for any one of the three operations (∪, ∩ and −), then:
$\sigma_c(R \,\theta\, S) = (\sigma_c(R)) \,\theta\, (\sigma_c(S))$
• The π operation commutes with ∪ (but not, in general, with ∩ or −) :-
$\pi_L(R \cup S) = (\pi_L(R)) \cup (\pi_L(S))$
• Converting a (σ, ×) sequence into ⋈ :-
If the condition c of a σ that follows a × corresponds to a join condition, then:
$\sigma_c(R \times S) = R \bowtie_c S$
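Several of the transformation rules can be sanity-checked on small relations. The sketch below is only a spot check on invented data, not a proof; `sigma`, `pi`, `times`, `join` and `same` are hypothetical helpers that model relations as lists of dictionaries.

```python
from itertools import product

# Hypothetical relations R(a, b) and S(c).
R = [{"a": a, "b": b} for a in range(4) for b in range(3)]
S = [{"c": c} for c in range(4)]

def sigma(pred, rows):
    """Selection."""
    return [r for r in rows if pred(r)]

def pi(attrs, rows):
    """Projection with duplicate elimination, as in the relational algebra."""
    out = []
    for r in rows:
        t = {k: r[k] for k in attrs}
        if t not in out:
            out.append(t)
    return out

def times(left, right):
    """Cartesian product."""
    return [{**l, **r} for l, r in product(left, right)]

def join(pred, left, right):
    """Theta join, defined as a selection over the Cartesian product."""
    return sigma(pred, times(left, right))

def same(x, y):
    """Relations are sets, so compare them ignoring row order."""
    key = lambda r: sorted(r.items())
    return sorted(x, key=key) == sorted(y, key=key)

c1 = lambda r: r["a"] > 1
c2 = lambda r: r["b"] == 0
jc = lambda r: r["a"] == r["c"]

checks = {
    "cascade of sigma": same(sigma(lambda r: c1(r) and c2(r), R),
                             sigma(c1, sigma(c2, R))),
    "commutativity of sigma": same(sigma(c1, sigma(c2, R)),
                                   sigma(c2, sigma(c1, R))),
    "cascade of pi": same(pi(["a"], pi(["a", "b"], R)), pi(["a"], R)),
    "commutativity of join": same(join(jc, R, S), join(jc, S, R)),
    "sigma pushed into join": same(sigma(c1, join(jc, R, S)),
                                   join(jc, sigma(c1, R), S)),
}
for rule, holds in checks.items():
    print(rule, holds)
```

Each check compares the two sides of one rule on the same inputs; a heuristic optimizer relies on these equalities to rewrite the query tree without changing the result.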
THANK YOU
