Injibara University
College of Engineering and Technology
Advanced Database
Chapter Two
Query Processing and Optimization
01/25/2025 Advanced Database -1-
Query Processing
• Query processing is the range of activities involved in extracting data from a database.
• In a DBMS, it includes selecting the best strategy, or plan, for responding to a database request.
• It covers converting the user's query into a standard form and then into a plan that can be executed to produce the response.
• Query processing is sometimes loosely referred to as query optimization in the database literature.
Cont…
• A database query is the vehicle for instructing a DBMS to update or retrieve specific data to or from the physical storage medium.
• The actual updating and retrieval of data is performed through various "low-level" operations such as project, join, select, and Cartesian product.
Cont…
• The component of the database management system responsible for generating this strategy is known as the query processor.
• This is a stepwise process.
• The first step transforms the query into a standard form.
• The next step generates a number of strategies, known as access plans, for evaluating the transformed query.
Cont…
• The main goal of creating a database is to keep related data in one place and to retrieve, access, and manipulate it when required.
• The database and its users are separate systems.
• A user may phrase a request in any preferred language, but the database management system has its own language, which is SQL.
• In an RDBMS (Relational Database Management System), data is stored in the form of tables (rows and columns).
Cont…
• Users can select, insert, update, delete, and otherwise manipulate the data without violating the constraints.
• Suppose we want the list of employees who have a salary of more than 10,000. The database management system cannot understand this request as stated in plain English, so it is written in Structured Query Language:
» SELECT EMP_NAME
» FROM EMPLOYEE WHERE SALARY > 10000;
» Structured Query Language is a high-level language that serves as a bridge between the users and the database management system.
Cont…
• Normally, any query written in Structured Query Language (SQL) is translated into a low-level language that the system can understand, such as relational algebra.
• Writing queries directly in relational algebra would be difficult for most users.
• Therefore the DBMS asks its users to write queries in SQL.
Cont…
• The DBMS verifies the query and converts it into a low-level language.
• It then chooses the best execution path to execute the query.
• This whole process is known as query processing in a database management system.
Block diagram of Query Processing
A query processor in DBMS performs this task.
• The diagram shows the processing of a query in the database.
• When a query is submitted, it is received by the compiler, which scans the query and divides it into tokens.
• The parser then verifies the tokens for correctness.
• The verified tokens are then transformed into relational trees or graphs.
• The query optimizer then generates different execution plans and picks the best one.
• The command processor uses this plan to retrieve the data and return the result.
Techniques used in Query Processing
• Parsing and Translation
• Query optimization
• Evaluation
• Execution Engine
Parsing and Translation
• Parsing and translation is the first step to be performed.
• In this step the user writes the request in SQL and the DBMS converts it into a machine-understandable low-level language.
• The query is first picked up by the query processor, which scans it and parses it into individual tokens.
• It then checks the correctness of the query and converts the tokens into trees, graphs, and relational expressions.
• The following checks are performed in the parsing phase:
session using shared pool check the database skips the next two steps, i.e. optimization and row source generation; this is known as soft parsing.
• If the query is not found in the pool of already processed queries, it is known as hard parsing.
Cont…
• Example: Suppose a user wants the details of employees who are working in PROCESS_1. A request such as "Retrieve employee details that are in PROCESS_1" cannot be understood by the DBMS. Hence the DBMS provides a language, SQL, that both the user and the DBMS understand. The request could be written as:
» SELECT EMP_NAME, ADDRESS, DOB FROM EMPLOYEE E, PROCESS P WHERE E.EMP_ID = P.EMP_ID AND PROCESS_NAME = 'PROCESS_1';
• The DBMS reads the query and converts it into a form suitable for further processing. This phase is known as parsing and translation. The query processor scans the query and divides it into tokens. In our example: "SELECT EMP_NAME, ADDRESS, DOB FROM", "EMPLOYEE E", "PROCESS P", "WHERE", "E.EMP_ID = P.EMP_ID", "AND", "PROCESS_NAME = 'PROCESS_1'"
1. Parsing and Translation
• A query expressed in a high-level query language such as SQL must first be scanned, parsed, and validated.
» Scanner: identifies the language tokens, such as SQL keywords, attribute names, and relation names, that appear in the text of the query.
» Parser: checks the query syntax to determine whether it is formulated according to the syntax rules (rules of grammar) of the query language.
» Validation: checks that all attribute and relation names are valid and semantically meaningful names in the schema of the particular database being queried.
Parsing and Translation
• An internal representation of the query is then created, usually as a tree data structure called a query tree or a graph data structure called a query graph.
• The DBMS must then devise an execution strategy, or query plan, for retrieving the results of the query from the database files, depending on the statistics available.
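The scanner step can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the keyword set, the token categories, and the function name scan are hypothetical and far smaller than what a real SQL scanner handles. Unlike the coarse grouping in the slide, it emits one token per keyword, name, literal, or symbol.

```python
import re

# Hypothetical keyword set, just large enough for the example query.
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND"}

# One alternative per token kind; leading whitespace is skipped.
TOKEN_PATTERN = re.compile(
    r"\s*(?:(?P<string>'[^']*')"
    r"|(?P<number>\d+)"
    r"|(?P<name>[A-Za-z_][A-Za-z_0-9]*(?:\.[A-Za-z_][A-Za-z_0-9]*)?)"
    r"|(?P<symbol>[=<>,;*()]))"
)


def scan(query):
    """Split a query string into (kind, text) tokens."""
    tokens = []
    query = query.rstrip()
    pos = 0
    while pos < len(query):
        match = TOKEN_PATTERN.match(query, pos)
        if match is None:
            raise ValueError(f"cannot tokenize input at position {pos}")
        kind = match.lastgroup          # name of the alternative that matched
        text = match.group(kind)
        if kind == "name" and text.upper() in KEYWORDS:
            kind = "keyword"            # reserved words are reported separately
        tokens.append((kind, text))
        pos = match.end()
    return tokens


example = (
    "SELECT EMP_NAME, ADDRESS, DOB FROM EMPLOYEE E, PROCESS P "
    "WHERE E.EMP_ID = P.EMP_ID AND PROCESS_NAME = 'PROCESS_1';"
)
for kind, text in scan(example):
    print(kind, text)
```

A parser would then check the order of these tokens against the grammar, and validation would look up each name in the schema.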
Optimization
• The query processor applies rules to the internal data structures of the query to transform them into equivalent but more efficient representations.
• The rules can be based on mathematical models of the relational algebra expression and tree (heuristics), on cost estimates of the different algorithms applied to operations, or on the semantics of the query and the relations it involves.
Optimization
• Selecting the proper rules to apply, when to apply them, and how to apply them is the function of the query optimization engine.
• A query typically has many possible execution strategies, and the process of choosing a suitable one for processing a query is known as query optimization.
• This step analyses SQL queries and determines effective execution mechanisms.
• The optimizer uses statistical data stored in the data dictionary, i.e. information about the length of records, the size of the table, the indexes created on the table, etc.
• The query optimizer produces one or more query plans, and the most efficient one is selected and used to execute the query.
• The next step is row source generation.
• The row source generator receives the optimal plan from the optimizer and outputs the execution plan for the SQL query.
• An execution plan is a collection of row sources structured in the form of a tree.
• A row source processes a set of rows one at a time, in an iterative manner.
• It is an iterative control structure that produces a row set.
• The above example can be represented in relational structures such as trees or graphs, as below:
Evaluation
• Query optimization yields many execution plans.
• Although they give the same output, they differ in space and time consumption.
• Evaluation helps us choose an effective, lower-cost execution plan to produce the final result by accessing the data in the database. It can also be written as
• πEMP_NAME (σSALARY>90 (EMPLOYEE))
• The selection on SALARY is applied before the projection, because projecting onto EMP_NAME alone discards the SALARY attribute that the selection needs.
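A minimal Python sketch of the selection and projection operators used above, showing two plans that produce the same output. The EMPLOYEE rows and the helper names select and project are assumptions made up for illustration. The sketch does not remove duplicates, which a strict relational projection would do.

```python
# Hypothetical EMPLOYEE rows; names and salaries are made up for illustration.
EMPLOYEE = [
    {"EMP_NAME": "Abebe", "SALARY": 120},
    {"EMP_NAME": "Sara", "SALARY": 80},
    {"EMP_NAME": "Kebede", "SALARY": 95},
]


def select(rows, predicate):
    """Selection (sigma): keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """Projection (pi): keep only the listed columns of every row."""
    return [{col: row[col] for col in columns} for row in rows]


def high_salary(row):
    return row["SALARY"] > 90


# Plan A: select first, then project onto the name.
plan_a = project(select(EMPLOYEE, high_salary), ["EMP_NAME"])

# Plan B: project onto both needed columns, select, then drop SALARY.
plan_b = project(
    select(project(EMPLOYEE, ["EMP_NAME", "SALARY"]), high_salary),
    ["EMP_NAME"],
)

print(plan_a == plan_b)
```

Projecting onto EMP_NAME alone before selecting would fail in this sketch with a KeyError, because the SALARY column would no longer be present when the predicate runs.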
Cont…
• In SQL, some of the factors the optimizer considers when calculating the cost of an evaluation plan are:
» CPU time
» Number of operations
» Number of tuples to be scanned
» Disk access time
Execution Engine
• The execution engine is responsible for producing the output of the given query.
• It executes the query execution plan chosen in the previous step, i.e. evaluation, and the output is finally displayed to the user.
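One way to picture how an execution engine runs a plan built from row sources is with Python generators, where each row source pulls rows one at a time from the source below it. This is a sketch under stated assumptions, not how any particular DBMS implements it: the function names and the EMPLOYEE rows are hypothetical.

```python
# Hypothetical stored rows for the EMPLOYEE table.
EMPLOYEE = [
    {"EMP_NAME": "Abebe", "SALARY": 12000},
    {"EMP_NAME": "Sara", "SALARY": 8000},
    {"EMP_NAME": "Kebede", "SALARY": 15000},
]


def table_scan(rows):
    """Leaf row source: yields the stored rows one at a time."""
    for row in rows:
        yield row


def filter_rows(source, predicate):
    """Row source that passes on only the rows satisfying the predicate."""
    for row in source:
        if predicate(row):
            yield row


def project_rows(source, columns):
    """Row source that keeps only the requested columns, as a tuple."""
    for row in source:
        yield tuple(row[col] for col in columns)


# Plan for: SELECT EMP_NAME FROM EMPLOYEE WHERE SALARY > 10000;
plan = project_rows(
    filter_rows(table_scan(EMPLOYEE), lambda row: row["SALARY"] > 10000),
    ["EMP_NAME"],
)

# Nothing is read until the consumer asks for rows.
result = list(plan)
print(result)
```

Because every stage is lazy, no intermediate table is materialized; each row flows from the scan through the filter to the projection before the next row is read.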
Summary
• The stepwise process of translating high-level queries into low-level expressions is known as query processing.
• It spans query optimization and actual execution at the physical level (file system), using steps such as parsing, translation, optimization, and evaluation.
• In summary, query processing involves two phases:
» Compile time: parsing and translation, optimization, and query generation
» Runtime: evaluation and execution
Steps in Processing a Query
Query processing
Data Statistics
Database Statistics
• Database statistics refers to a number of measurements about database objects, such as the number of processors used, processor speed, and temporary space available.
• The DBMS uses these statistics to make critical decisions about improving query processing efficiency.
• Database statistics can be gathered manually by the DBA or automatically by the DBMS.
• If the object statistics exist, the DBMS will use them in query processing.
QUERY PROCESSING
• Translating SQL Queries into Relational Algebra
• 1. SQL queries are decomposed into query blocks (a query block is a basic unit that can be translated into algebraic operators and optimized).
• 2. A query block contains a single SELECT-FROM-WHERE expression, as well as GROUP BY and HAVING clauses, aggregate functions, etc.
• 3. Nested (inner) queries within a query are identified as separate query blocks.
Cont…
• We first concentrate on the first step: finding efficient relational algebra expressions.
• For the second step, we need to know how the data is stored and how it is accessed.
Cont…
• A query expressed in a high-level query language such as SQL must be scanned, parsed, and validated.
• Scanner: identifies the language tokens.
• Parser: checks the query syntax.
• Validation: checks that all attribute and relation names are valid.
Conceptual Evaluation Strategy
SELECT S.name, E.name FROM Student S, Enrolls E WHERE S.sid=E.sid AND E.term=10-2;
• The semantics of an SQL query are defined in terms of the following conceptual evaluation strategy:
1. Compute the cross-product of the from-list.
2. Discard resulting tuples that fail the qualifications.
3. Delete attributes that are not in the select-list.
4. If DISTINCT is specified, eliminate duplicate rows.
Cont…
• This strategy is probably the least efficient way to compute a query!
• An "optimizer" will find more efficient strategies to compute the same answers.
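The four steps of the conceptual evaluation strategy can be followed literally in a short Python sketch. The sample rows are made up, the function name conceptual_eval is hypothetical, and the term value is treated as the string '10-2', which is an assumption about the intended meaning of the query.

```python
from itertools import product

# Hypothetical sample data; E.name is taken to be the course name.
Student = [
    {"sid": 1, "name": "Hana"},
    {"sid": 2, "name": "Dawit"},
]
Enrolls = [
    {"sid": 1, "name": "Databases", "term": "10-2"},
    {"sid": 2, "name": "Databases", "term": "10-2"},
    {"sid": 1, "name": "Networks", "term": "10-1"},
]


def conceptual_eval(students, enrolls, distinct=False):
    """Evaluate the example query exactly as the four conceptual steps say."""
    # 1. Compute the cross-product of the from-list.
    pairs = product(students, enrolls)
    # 2. Discard tuples that fail the qualifications.
    qualified = [
        (s, e) for s, e in pairs
        if s["sid"] == e["sid"] and e["term"] == "10-2"
    ]
    # 3. Delete attributes that are not in the select-list.
    rows = [(s["name"], e["name"]) for s, e in qualified]
    # 4. If DISTINCT is specified, eliminate duplicate rows (order preserved).
    if distinct:
        rows = list(dict.fromkeys(rows))
    return rows


print(conceptual_eval(Student, Enrolls))
```

With two students and three enrollments the cross-product already has six pairs, of which only two survive step 2, which is why this strategy defines the meaning of the query rather than a practical way to run it.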
Intermediate Query Form
• An SQL query is translated into an equivalent:
– Relational Algebra Expression
• Represented as a:
– Query Tree
Notation for Query Tree
• Query tree: a data structure that corresponds to a relational algebra expression.
• The input relations of the query are represented as leaf nodes.
• The relational algebra operations are represented as internal nodes.
• An execution of the query tree consists of executing the internal node operations.
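A query tree can be sketched as a small Python data structure with relations at the leaves and operators at the internal nodes. The Node class, the execute function, and the STUDENT and COURSE rows are all hypothetical. In particular, the sketch assumes that STUDENT carries a CID attribute and COURSE carries CNO and CNAME.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    op: str                               # "relation", "select", "project" or "join"
    children: tuple = ()                  # input nodes (empty for a leaf)
    rows: Optional[list] = None           # stored rows, leaf nodes only
    predicate: Optional[Callable] = None  # condition for select and join
    columns: tuple = ()                   # attribute list for project


def execute(node):
    """Execute a query tree bottom-up and return the resulting rows."""
    if node.op == "relation":
        return list(node.rows)
    inputs = [execute(child) for child in node.children]
    if node.op == "select":
        return [row for row in inputs[0] if node.predicate(row)]
    if node.op == "project":
        return [{col: row[col] for col in node.columns} for row in inputs[0]]
    if node.op == "join":
        combined = ({**left, **right} for left in inputs[0] for right in inputs[1])
        return [row for row in combined if node.predicate(row)]
    raise ValueError(f"unknown operator: {node.op}")


# Hypothetical relations and rows.
STUDENT = Node("relation", rows=[{"NAME": "Hana", "CID": 10}, {"NAME": "Dawit", "CID": 20}])
COURSE = Node("relation", rows=[{"CNO": 10, "CNAME": "Databases"}, {"CNO": 30, "CNAME": "Networks"}])

# Tree for: project NAME, CNAME over (STUDENT join COURSE on CID = CNO)
tree = Node(
    "project",
    children=(
        Node("join", children=(STUDENT, COURSE), predicate=lambda r: r["CID"] == r["CNO"]),
    ),
    columns=("NAME", "CNAME"),
)

print(execute(tree))
```

Each internal node runs only after its children have produced their rows, which mirrors the statement that executing the tree consists of executing the internal node operations.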
Notation for Query Graph
• Query graph: used to represent a relational calculus expression.
• Relations in the query are represented by relation nodes, which are displayed as single circles.
• Constant values are represented by constant nodes, which are displayed as double circles or ovals.
• Selection and join conditions are represented by the graph edges.
• Attributes to be retrieved from each relation are displayed in square brackets above each relation.
Query Tree & Query Graph
Draw Query Tree
• 1. List the names of students and the courses taken by them.
» SELECT NAME, CNAME FROM STUDENT, COURSE WHERE CID = CNO;
• 2. List the names of students and the courses for students who are taking a course given by Department 1.
» SELECT NAME, CNAME FROM STUDENT, COURSE WHERE CID = CNO AND DNO = 1;
Cont…
• 3. List the names of students, the courses taken by them, and the name of the department giving each course, for all students whose CGPA is greater than 3.
» SELECT NAME, CNAME, DNAME FROM STUDENT, COURSE, DEPARTMENT WHERE CID = CNO AND DID = DNO AND CGPA > 3;
• 4. List the names of students and the course names for students who are taking courses provided by CS&IT.
» SELECT NAME, CNAME FROM STUDENT, COURSE, DEPARTMENT WHERE CID = CNO AND DID = DNO AND DNAME = 'CS&IT';