Scientific Journal Impact Factor (SJIF): 1.711
International Journal of Modern Trends in Engineering
and Research
www.ijmter.com
@IJMTER-2014, All rights Reserved 78
e-ISSN: 2349-9745
p-ISSN: 2393-8161
Issues in Query Processing and Optimization
Vineet Mehan1, Kaushik Adhikary2, Amandeep Singh Bhatia3
1 CSE, MAIT, Maharaja Agrasen University, H.P.
2 CSE, MAIT, Maharaja Agrasen University, H.P.
3 CSE, MAIT, Maharaja Agrasen University, H.P.
Abstract- This paper identifies the issues that arise in query processing and optimization when
choosing the best execution plan for a database query. Unlike earlier query optimization techniques,
which rely on a single approach to identify the best query plan, our approach takes into account the
phases of query processing and optimization, heuristic estimation techniques, and cost functions.
The paper reviews the phases of query processing, the goals of the optimizer, the rules for heuristic
optimization, and the cost components involved.
Keywords- Query processing; Optimization; Heuristic estimation
I. INTRODUCTION
Database users typically know little about how the database works internally. The burden of
choosing the best way to execute a query should therefore fall on the DBMS/RDBMS and not on the
user. This is the role of query processing and query optimization [1]. A query can usually be run in
many different ways, which raises the question of which way is best. To choose the best way, we
must know the following:
• How a query is processed by the DBMS/RDBMS.
• What the different ways, or plans, are in which a query can be executed.
• Which plan is the best among all the candidate plans.
II. PHASES IN QUERY PROCESSING
DDL commands are not processed by the query optimizer; only DML commands are [2]. The
steps involved in query processing are:
• Search query
• Parsing and validating
• Optimization
• Code generation
• Query execution
• Search results
The search query is any SQL query that is to be optimized. The parser is a tool that transforms the
query into a structured form. It checks the query for correct syntax and validates it (e.g., table
names, attribute names, data types). It resolves names and references and converts the query into a
parse tree (query tree). To simplify translation, the query is broken into blocks [3]. Each query block
consists of a SELECT, FROM, WHERE block, together with any AND conditions and GROUP BY and
HAVING clauses. The parser may also check whether the user is authorized to execute the query. The
parsed query is then sent to the next step, query optimization.
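The parsing and validating step described above can be illustrated with a minimal sketch. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in DBMS; the employee table and the validate helper are hypothetical names introduced only for this illustration and are not part of the paper.

```python
import sqlite3

def validate(conn, sql):
    """Return (True, None) if the statement compiles, else (False, error text).

    Hypothetical helper: prefixing EXPLAIN makes SQLite compile the statement
    and return its bytecode listing instead of running the query itself, so
    syntax errors and unresolved names surface without fetching any data.
    """
    try:
        conn.execute("EXPLAIN " + sql)
        return True, None
    except sqlite3.Error as exc:
        return False, str(exc)

conn = sqlite3.connect(":memory:")
# Assumed example schema, not taken from the paper
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)"
)

checks = {
    "valid": validate(conn, "SELECT name FROM employee WHERE dept_id = 3"),
    "syntax error": validate(conn, "SELEC name FROM employee"),
    "unknown table": validate(conn, "SELECT name FROM staff"),
    "unknown column": validate(conn, "SELECT salary FROM employee"),
}
for label, (ok, message) in checks.items():
    print(label, ok, message)
```

Only the first statement passes; the other three are rejected before any optimization or execution takes place, which is the point at which a parser is expected to stop an invalid query.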
The output of the parser is the input to the query optimizer. The goal of the optimizer can be any
one or all of the following:
Goal 1: Minimize Processing Time
Goal 2: Minimize Response Time
Goal 3: Minimize Memory Used
Goal 4: Minimize Network Time
Once the query optimizer has determined the execution plan (the specific ordering of access
routines), the code generator writes out the actual access routines to be executed. The query code
may be interpreted and passed directly to the runtime database processor for execution, or the access
routines may be compiled and stored for later execution [4]. At this point the query has been
scanned, parsed, optimized, and (possibly) compiled. The runtime database processor then executes
the access routines against the database. The results, along with any runtime errors, are returned to
the application that issued the query and displayed to the user.
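As a rough illustration of the optimization and execution steps, the sketch below asks SQLite for the plan it selected and then runs the query. SQLite, the employee table, and the index name idx_employee_dept are assumptions made for this example; other systems expose their plans through different commands, and the wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed example schema and data, not taken from the paper
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)"
)
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept_id)")
conn.executemany(
    "INSERT INTO employee (name, dept_id) VALUES (?, ?)",
    [(f"emp{i}", i % 10) for i in range(1000)],
)

query = "SELECT name FROM employee WHERE dept_id = ?"

# Optimization: ask which access routine the optimizer chose.
# The last column of each EXPLAIN QUERY PLAN row is the readable plan step.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, (3,))]

# Execution: run the same query and return the results to the caller.
rows = conn.execute(query, (3,)).fetchall()

print(plan)
print(len(rows))
```

The plan step names the index rather than a full table scan, which shows the optimizer choosing one access routine among several that would return the same rows.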
III. HEURISTIC OPTIMIZATION TECHNIQUES
Heuristics are experience-based techniques for problem solving, learning, and discovery; the
solution they produce may or may not be optimal. Heuristic optimization identifies the techniques
that make queries cheaper to execute [5]. A common heuristic technique is the rule of thumb. The
term is said to have originated with carpenters who measured with the width of their thumbs rather
than with measuring scales. Such measurements rest on experience: an experienced carpenter trusts
the measurement to be right. The rules identified include:
a) Carry out selections as early as possible.
b) Carry out projections as early as possible.
c) Cascade selections (S) and projections (P): when S and P apply to the same operand,
the operations may be carried out together, which saves the cost of scanning a table
more than once.
d) Order joins optimally: joins should be ordered so that intermediate results are small
rather than large.
e) Where possible, use a combination of projections and selections instead of a join.
f) If there is more than one projection on the same table, carry out the projections
simultaneously.
g) Defer sorting as long as possible.
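Rule (a) can be made concrete with a small sketch that counts how many row pairs a naive join compares when the selection is applied after the join versus before it. The in-memory tables, column names, and the counting join are assumptions introduced for illustration; a real optimizer rewrites the query tree rather than Python lists.

```python
def counting_join(left, right, key):
    """Join two lists of dict rows on `key` and count the row pairs compared."""
    comparisons = 0
    result = []
    for left_row in left:
        for right_row in right:
            comparisons += 1
            if left_row[key] == right_row[key]:
                result.append({**left_row, **right_row})
    return result, comparisons

# Assumed example data: 40 employees spread over 5 departments
employees = [
    {"emp": f"e{i}", "dept_id": i % 5, "salary": 1000 * (i % 4 + 1)}
    for i in range(40)
]
departments = [{"dept_id": d, "dept": f"d{d}"} for d in range(5)]

# Plan A: join first, then apply the selection salary = 4000
joined, cost_a = counting_join(employees, departments, "dept_id")
plan_a = [row for row in joined if row["salary"] == 4000]

# Plan B: apply the selection first, then join the smaller input
selected = [row for row in employees if row["salary"] == 4000]
plan_b, cost_b = counting_join(selected, departments, "dept_id")

print(cost_a, cost_b, plan_a == plan_b)
```

Both plans return the same rows, but the plan that selects first feeds far fewer rows into the join, which is the saving the rule is after.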
IV. COST FUNCTIONS
The cost components include:
• Access cost to secondary storage (hard disk).
• Storage cost for intermediate result sets.
• Computation cost: CPU, memory transfers, etc., for performing in-memory operations.
• Communication cost to ship data across a network, e.g., in a distributed or client/server
database.
Access cost to secondary storage is measured in terms of the number of rows accessed and the
memory involved [6]. Because execution takes place step by step, intermediate results are stored in
temporary files or tables, and accessing these tables and memory has a cost of its own; this accounts
for the storage cost for intermediate result sets. Every storage access also involves some
computation cost. Just as running an application such as Microsoft PowerPoint uses the CPU,
accessing the database to answer a query incurs computation cost. If the system is client/server
based, communication cost is also involved, since bandwidth is required to fetch large volumes of
data. For example, a website backed by an Oracle database may hang or become slow on days when
traffic is heavy. For discussion, let us consider the cost function for the SELECT statement and the
cost function for JOIN.
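One simple way to combine the four cost components is a weighted sum, sketched below. The PlanCost fields, the weights, and the two example plans are all hypothetical values chosen for illustration; the paper does not give a formula, and real systems calibrate such weights for their own hardware.

```python
from dataclasses import dataclass

@dataclass
class PlanCost:
    """Resource estimates for one candidate plan (hypothetical units)."""
    disk_blocks_read: int      # access cost to secondary storage
    temp_blocks_written: int   # storage cost for intermediate results
    cpu_rows_processed: int    # computation cost
    network_bytes_sent: int    # communication cost

# Assumed weights: cost units per block, per row, and per byte
WEIGHTS = {
    "disk_blocks_read": 1.0,
    "temp_blocks_written": 1.0,
    "cpu_rows_processed": 0.01,
    "network_bytes_sent": 0.001,
}

def total_cost(plan: PlanCost) -> float:
    """Weighted sum of the four cost components."""
    return sum(weight * getattr(plan, field) for field, weight in WEIGHTS.items())

# Two made-up candidate plans for the same query
candidates = {
    "full table scan": PlanCost(1000, 0, 100_000, 50_000),
    "index lookup": PlanCost(120, 10, 5_000, 50_000),
}
best_plan = min(candidates, key=lambda name: total_cost(candidates[name]))
print(best_plan)
```

A weighted sum keeps the components comparable in one unit; which component dominates depends entirely on the weights, so in a distributed setting the network weight would typically be raised.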
The cost function for SELECT here means the cost function for the selection operation. In relational
algebra, selection filters rows by conditions, so its cost function depends on the conditions in the
WHERE clause, as shown in Table 1.
Table 1. Conditions in cost function for selection.
S. No.   Condition in WHERE clause
1        Attribute A = value v
2        Attribute A > value v
3        Attribute A BETWEEN value v1 AND value v2
4        Attribute A IN (list of values)
5        Attribute A IN subquery
6        Attribute A condition C1 OR condition C2
7        Attribute A condition C1 AND condition C2
8        Attribute A IS NOT NULL
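The dependence of selection cost on the condition can be sketched with textbook selectivity estimates, where selectivity is the fraction of rows expected to satisfy the condition. The formulas below assume uniformly distributed values and independent conditions; these assumptions, the ColumnStats fields, and the function names are introduced here for illustration and do not come from the paper. The IN-subquery case is omitted because it depends on the subquery's own result size.

```python
from dataclasses import dataclass

@dataclass
class ColumnStats:
    """Hypothetical catalog statistics for attribute A."""
    distinct_values: int
    min_value: float
    max_value: float
    null_fraction: float = 0.0

def clamp(x: float) -> float:
    """Keep an estimate inside the valid range [0, 1]."""
    return max(0.0, min(1.0, x))

def sel_equals(stats: ColumnStats) -> float:            # A = v
    return 1.0 / stats.distinct_values

def sel_greater_than(stats: ColumnStats, v: float) -> float:   # A > v
    return clamp((stats.max_value - v) / (stats.max_value - stats.min_value))

def sel_between(stats: ColumnStats, v1: float, v2: float) -> float:  # A BETWEEN v1 AND v2
    return clamp((v2 - v1) / (stats.max_value - stats.min_value))

def sel_in_list(stats: ColumnStats, values) -> float:    # A IN (list of values)
    return clamp(len(set(values)) / stats.distinct_values)

def sel_or(s1: float, s2: float) -> float:               # C1 OR C2
    return s1 + s2 - s1 * s2

def sel_and(s1: float, s2: float) -> float:              # C1 AND C2
    return s1 * s2

def sel_not_null(stats: ColumnStats) -> float:           # A IS NOT NULL
    return 1.0 - stats.null_fraction

def estimated_rows(table_rows: int, selectivity: float) -> int:
    """Expected number of rows that pass the condition."""
    return round(table_rows * selectivity)

stats = ColumnStats(distinct_values=100, min_value=0, max_value=1000, null_fraction=0.1)
print(estimated_rows(10_000, sel_greater_than(stats, 750)))
```

The estimated row count feeds the cost components: a condition with low selectivity makes an index lookup attractive, while a condition that keeps most rows usually favors a full scan.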
Cost functions for joins cover the following methods: nested-loop join, nested index join, sort-merge
join (sort-scan join), and hash join. In a nested-loop join, the optimizer chooses one of the tables as
the outer table; the other is called the inner table. For each row in the outer table, the database finds
all rows in the inner table that satisfy the join condition. The database compiler can perform a
sort-merge join only for an equijoin. A sort-merge join requires the following steps: each row source
to be joined is sorted on the values of the columns used in the join condition, and the compiler then
merges the two sorted sources.
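The nested-loop and sort-merge joins described above can be sketched over in-memory lists as follows. This is a minimal illustration under assumed names (the orders and customers rows and the cust_id join column are made up); a database engine works on disk pages and buffers rather than Python lists.

```python
def nested_loop_join(outer, inner, key):
    """For each outer row, scan the whole inner table for matching rows."""
    result = []
    for outer_row in outer:
        for inner_row in inner:
            if outer_row[key] == inner_row[key]:
                result.append((outer_row, inner_row))
    return result

def sort_merge_join(left, right, key):
    """Equijoin: sort both inputs on the join column, then merge them."""
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][key], right[j][key]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of right rows that share this key value
            run_end = j
            while run_end < len(right) and right[run_end][key] == left_key:
                run_end += 1
            # Pair every left row holding this key with the whole right run
            while i < len(left) and left[i][key] == left_key:
                for right_row in right[j:run_end]:
                    result.append((left[i], right_row))
                i += 1
            j = run_end
    return result

# Assumed example data
customers = [
    {"cust_id": 1, "name": "Ann"},
    {"cust_id": 2, "name": "Bob"},
    {"cust_id": 3, "name": "Cy"},
]
orders = [
    {"order_id": 10, "cust_id": 2},
    {"order_id": 11, "cust_id": 1},
    {"order_id": 12, "cust_id": 2},
    {"order_id": 13, "cust_id": 4},
]

def as_pairs(joined):
    """Reduce joined rows to comparable (order_id, name) pairs."""
    return sorted((o["order_id"], c["name"]) for o, c in joined)

nl_pairs = as_pairs(nested_loop_join(orders, customers, "cust_id"))
sm_pairs = as_pairs(sort_merge_join(orders, customers, "cust_id"))
print(nl_pairs == sm_pairs)
```

The nested-loop version compares every outer row with every inner row, while the sort-merge version pays for two sorts and then makes a single pass over each input; which is cheaper depends on the table sizes and on whether the inputs are already sorted.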
V. CONCLUSION
Query processing and optimization involve much more than merely choosing the
best query plan. Designing an effective and correct plan requires considering a number of factors,
ranging from the initial phases of processing to the cost functions involved. Although considerable
work has been done in the area of query processing and optimization, significant open problems
remain. Nevertheless, an understanding of the existing engineering framework is essential for
making effective contributions to the area of query optimization.
REFERENCES
[1] S. Rahimi, F. Haug, “Distributed Database Management Systems: A Practical Approach”, pp. 111-181, 2010.
[2] S. Bottcher, D. Bokermann, R. Hartel, “Generalizing and Improving SQL/XML Query Evaluation”, Eighth
International Conference on Signal Image Technology and Internet Based Systems (SITIS), pp. 441-449, 2012.
[3] P. Doshi, V. Raisinghani, “Review of dynamic query optimization strategies in distributed database”, 3rd
International Conference on Electronics Computer Technology (ICECT), vol. 6, pp. 145-149, 2011.
[4] T. V. V. Kumar, V. Singh, A. K. Verma, “Generating Distributed Query Processing Plans Using Genetic
Algorithm”, International Conference on Data Storage and Data Engineering (DSDE), pp. 173-177, 2010.
[5] L. Antova, T. Jansen, C. Koch, D. Olteanu, “Fast and Simple Relational Processing of Uncertain Data”, 24th
International Conference on Data Engineering, pp. 983-992, 2008.
[6] C. Binnig, D. Kossmann, E. Lo, “Reverse Query Processing”, 23rd International Conference on Data Engineering,
pp. 506-515, 2007.