unit 3_DBMS

Notes of chapter DBMS

Uploaded by

nimjetejas2003
Copyright
© © All Rights Reserved
Query Processing & Optimization

4.0 Introduction :
- This unit mainly focuses on query processing and query optimization.
- It explains query processing along with the various steps involved in it.
- It describes measures of query cost estimation.
- It explains various join operations such as nested loop join, block nested loop join, index nested loop join, merge join and hash join.
- It also describes the concepts of materialization and pipelining.
- It discusses problems based on join operations.
- It describes the concept of query optimization.
- It describes the various equivalence rules used for transformation of relational expressions.
- It also explains techniques of query optimization.
- At the end the concept of materialized view is explained.

4.1 Query Processing :
Q. Describe the steps involved in query processing. Explain the functioning of each step.
Q. List and explain the basic steps involved in query processing.
Q. Explain query processing in DBMS with neat sketch. [S-12, 14]
- Query processing is the process of transforming a high level query written in SQL into a correct and efficient execution plan, expressed in a low level language, that performs the required retrieval and manipulation of data in the database.
- It involves a set of steps in order to access the database.
- When a database system receives a query for update or retrieval of information, it goes through a series of query compilation steps, given below.
(1) Parsing and Translation
(2) Optimization
(3) Evaluation

Fig. Query Processing Steps

(1) Parsing and Translation
- It mainly involves conversion of the SQL query into a relational algebra query. The user writes the query in SQL, which is a high level language, but the database engine works on relational algebra, so the query must be converted from SQL to relational algebra.
- This process is similar to the work performed by the parser of a compiler.
- The parser checks the syntax of the user query and verifies that each relation name appearing in the query is the name of a relation in the database.
- It also generates a parse tree representation of the query, which it then translates into a relational algebra expression.

(2) Optimization
- It is the process of selecting one efficient query plan out of many, with the aim of reducing the time taken to execute the query.
- There are a number of ways to execute a single query.
- Each SQL query can itself be translated into a relational algebra expression in several ways.
- Consider the SQL query
  Select balance
  From account
  Where balance < 2500
  This query can be translated into either of the following relational algebra expressions.
  (1) $\sigma_{balance<2500}(\Pi_{balance}(account))$
  (2) $\Pi_{balance}(\sigma_{balance<2500}(account))$
- A query plan is a set of steps to obtain the desired output.
- For the above relational algebra expressions, the query plans below can be generated.

Fig. Query Evaluation Plan

- In order to select one query plan, the optimizer must know the cost of each plan, which is its selection parameter.
- Different evaluation plans for a given query can have different costs.
- The exact cost is hard to compute since it depends on many parameters, such as the actual memory available to the operation.
- It is possible to get a rough estimate of the execution cost for each operation.

(3) Evaluation
- Once a query plan is chosen, the query is evaluated with that plan by the evaluation engine, and the output of the query is sent to the user.

4.2 Measures of Query Cost Estimation :
Q. Explain the measures of query cost estimation.
- The cost of query evaluation can be measured in terms of a number of different resources, including disk accesses and the CPU time to execute a query.
- In database systems disk access is the most important cost, since disk access is slow compared to memory operations.
- Estimating the CPU time is relatively hard compared to estimating the disk access cost.
- We use the number of block transfers from disk as the measure of actual cost. To simplify our computation of disk access cost,
we assume all block transfers have the same cost.
- This assumption ignores the variance arising from rotational latency and seek time.
- To get more precise numbers, we need to distinguish between sequential I/O, where the blocks read are contiguous on disk, and random I/O, where the blocks are non-contiguous.
- We also need to distinguish between reads and writes of blocks, since it takes more time to write a block to disk than to read a block from disk.
- A more accurate measure would estimate
  (1) The number of seek operations performed
  (2) The number of blocks read
  (3) The number of blocks written

4.3 Selection Operation
- In query processing, the file scan is the lowest level operator to access data. File scans are search algorithms that locate and retrieve records that fulfill a selection condition.

4.3.1 Basic Algorithms
Q. Explain two basic algorithms to implement the selection operations.
- Consider a selection operation on a relation whose tuples are stored together in one file. The two scan algorithms used to implement the selection operation are
(a) Linear Search :
- In linear search the system scans each file block and tests all records to see whether they satisfy the selection condition.
- For a selection on a key attribute, the system can terminate the scan when the required record is found, without looking at the other records of the relation.
- The cost of linear search in terms of number of I/O operations is br, where br denotes the number of blocks in the file.
(b) Binary Search :
- If the file is ordered on an attribute and the selection condition is an equality comparison on that attribute, we can use a binary search to locate the records that satisfy the selection.
- The system performs the binary search on the blocks of the file.
- The number of blocks that need to be examined to find a block containing the required records is $\lceil \log_2(br) \rceil$, where br denotes the number of blocks in the file.

4.3.2 Selection Using Indices
- Index structures are referred to as access paths, since they provide a path through which data can be located and accessed.
- A primary index is an index that allows the records of a file to be read in an order that corresponds to the physical order in the file.
- Ordered indices such as B+ trees also permit access to tuples in sorted order, which is useful for implementing range queries.
- Indices can provide fast, direct and ordered access.

4.4 Sorting
Q. Explain external sort merge algorithm.
- Sorting of data plays an important role in database systems for two reasons.
- First, SQL queries can specify that the output be sorted. Second, several of the relational operations, such as joins, can be implemented efficiently if the input relations are first sorted.
- We can sort a relation by building an index on the sort key and then using that index to read the relation in sorted order. However, such a process orders the relation only logically, through an index, rather than physically.
- For relations that fit in memory, techniques like quick sort can be used, but for relations that do not fit in memory external sort merge is a good choice.

Let M denote the memory size (in pages).

Creation of Runs :
  Let i = 0;
  Repeat
    Read M blocks of the relation into memory
    Sort the in-memory part of the relation
    Write the sorted data to run Ri
    i = i + 1
  Until end of relation

Merging of Runs :
  We assume that N (total number of runs) < M (memory size).
  Use N blocks of memory to buffer input runs and 1 block to buffer output.
  Read the first block of each run into its buffer page.
  Repeat
    Select the first record (in sort order) among all buffer pages.
    Write the record to the output buffer. If the output buffer is full, write it to disk.
    Delete the record from its input buffer page. If the buffer page becomes empty, then read the next block of the run into the buffer.
  Until all input buffer pages are empty
If N ≥ M then several merge passes are required.
- The figure below shows external sort merge on an example relation. For illustration we assume that only one tuple fits in a block and that memory holds at most 3 page frames. During the merge stage 2 page frames are used for input and 1 for output.

Fig. Merging Process

4.5 Join Operation
- This section covers several algorithms for computing the join of relations and analyzes their cost.
- It involves the following join operations
  (1) Nested loop join
  (2) Block nested loop join
  (3) Index nested loop join
  (4) Merge join
  (5) Hash join

4.5.1 Nested loop join
- This algorithm is called nested loop join as it basically consists of a pair of nested for loops.
- This algorithm is used to compute the theta join $r \bowtie_{\theta} s$ of two relations r and s.
- The relation r is called the outer relation and s is called the inner relation.
- The nested loop join algorithm is expensive, as it examines every pair of tuples in the two relations.

Algorithm :
  For each tuple tr in r do begin
    For each tuple ts in s do begin
      test pair (tr, ts) to see if they satisfy the join condition
      if they do, add tr.ts to the result
    end
  end

The computation cost of the above algorithm can be calculated as follows.
In the worst case the buffer can hold only one block of each relation, and a total of nr * bs + br block accesses would be required.
  Total block access = nr * bs + br
  Here nr = number of tuples in relation r
       bs = number of blocks for relation s
       br = number of blocks for relation r
In the best case there is enough space for both relations to fit in memory, so each block has to be read only once, hence
  Total block access = br + bs

4.5.2 Block Nested Loop Join
- If the buffer is too small to hold either relation entirely in memory, we can still obtain a major saving in block accesses if we process the relations on a per block basis rather than on a per tuple basis.
- Block nested loop join is a variation of nested loop join where every block of the inner relation is paired with every block of the outer relation.
Algorithm :
  For each block Br of r do begin
    For each block Bs of s do begin
      For each tuple tr in Br do begin
        For each tuple ts in Bs do begin
          test pair (tr, ts) to see if they satisfy the join condition
          if they do, add tr.ts to the result
        end
      end
    end
  end

Computation cost of the above algorithm :
- Worst case
  Total block access = br * bs + br
  where br and bs denote the number of blocks containing records of r and s respectively.
- Best case
  Total block access = br + bs

4.5.3 Indexed Nested Loop Join
- In indexed nested loop join, an index on the inner relation is used to look up the tuples that match each outer tuple tr.

4.5.4 Merge Join
Best case
  Total block access = br1 + br2
Worst case
  Total block access required for r2 = br2 * [2 log2 (br2/3)] + 1
  Total block access required for r1 = br1 * [2 log2 (br1/3)] + 1
  Total block access required = (block access for r2) + (block access for r1)

4.5.5 Hash Join
- Like the merge join algorithm, the hash join algorithm can be used to implement natural joins and equi joins.
- In the hash join algorithm, a hash function h is used to partition the tuples of both relations.
- The basic idea is to partition the tuples of each of the relations into sets that have the same hash value on the join attributes.
In the best case :
  Total block access = 3 (br + bs) or 3 (br1 + br2)

4.6 Materialization :
Q. What is materialization? Explain it with the help of example.
- To evaluate a query containing multiple operations, one way is to evaluate one operation at a time in a specific order, with the result of each evaluation materialized in a temporary relation for subsequent use.
- Materialization is the process of storing the output of an operation in a temporary relation for processing by the next operation.
- The materialization process starts from the lowest level operations in the expression, which are at the bottom of the query tree.
- Consider the relational algebra query
  $\Pi_{cname}(\sigma_{bal<2000}(account) \bowtie customer)$
- The pictorial representation of the above query is shown below.
Fig. Pictorial representation of query

- In order to evaluate the above query we apply materialization, which starts from the lowest level operation, that is, the selection operation on the account relation.
- The result obtained by the selection operation is stored in a temporary relation, which is used as input to the operation at the next level in the tree.
- Here the next level operation is the join, which takes as input the temporary relation created before and the customer relation.
- The join operation can then be evaluated. It produces another temporary relation, which is given as input to the projection operation at the root of the tree, and hence the complete query is evaluated.
- This type of query evaluation is called materialized evaluation, since the results of each intermediate operation are created and then used for evaluation of the next level operation.
- The main problem with this approach is the need to construct temporary relations, which must be written to disk.

4.7 Pipelining
Q. Explain pipelining process.
- Pipelining is the process of passing the result of one relational operator to another operator directly, without storing it in a temporary relation.
- Pipelining is used to improve the performance of queries. Without it, the results of intermediate algebra operations are stored temporarily on secondary storage.
- This process of temporarily writing the results of intermediate algebra operations is called materialization.
- The materialization process starts from the lowest level operation on the input relations at the bottom of the query tree, and the output is stored in a temporary relation, which is then used to execute the operation at the next level of the tree. The same process is repeated until the operation at the root of the tree is evaluated and we get the final result.
- The efficiency of query evaluation can be improved by reducing the number of temporary files produced.
So several relational operations are combined into a pipeline of operations, in which the result of one operation is pipelined to another operation without storing it in a temporary relation.
- A pipeline is implemented as a separate process within the DBMS.
- Each pipeline takes a stream of tuples from its inputs and generates a stream of tuples as its output.
- A buffer is created for each pair of adjacent operations to hold the tuples being passed from the first operation to the second one.
- Pipelined operation eliminates the cost of reading and writing temporary relations.

Example of Pipelining
  $\sigma_{bal<200}(\Pi_{bal}(acc))$
In this query pipelining can be used: the projection operation is performed first, and its output is given directly as input to the selection operation instead of being stored in a temporary relation.

A pipeline can be executed in either of two ways
  (1) Demand Driven
  (2) Producer Driven

4.7.1 Demand Driven :
- In demand driven pipelining the system makes repeated requests for tuples from the operation at the top of the pipeline.
- Each time an operation receives a request for tuples, it computes the next tuple to be returned and then returns that tuple.
- The system keeps track of what has been returned so far.
- If the inputs of the operation are pipelined, the operation also makes requests for tuples from its pipelined inputs. Using the tuples received from its pipelined inputs, the operation computes tuples for its output and passes them up to its parent.
- Demand driven pipelining involves pulling data up from the operation at the top.

4.7.2 Producer driven pipelining :
- In producer driven pipelining, operations do not wait for requests to produce tuples; they generate tuples on their own.
- Each operation at the bottom of a pipeline continually generates output tuples and puts them in its output buffer until the buffer is full.
- An operation at any other level of the pipeline generates output tuples when it gets input tuples from the operation below it in the pipeline, until its output buffer is full.
- Once an operation uses a tuple from a pipelined input, it removes the tuple from its input buffer.
- Once the output buffer is full, the operation waits until its parent operation removes tuples from the buffer, so that the buffer has space for more tuples.
- At this point, the operation generates more tuples, until the buffer is full again.
- The operation repeats this process until all the output tuples have been generated.
- Producer driven pipelining involves pushing data up from below.

Problem Based On Join Operation
Q. Let relations r1 (A, B, C) and r2 (C, D, E) have the following properties: r1 has 20,000 tuples and r2 has 45,000 tuples; 25 tuples of r1 fit on one block and 30 tuples of r2 fit on one block. Estimate the number of block accesses required using each of the following join strategies for $r1 \bowtie r2$.
(a) Nested loop join
(b) Block nested loop join
(c) Merge join
(d) Hash join [S-10, 11, 13]

Number of blocks for r1 (br1) = 20,000 / 25 = 800 blocks
Number of blocks for r2 (br2) = 45,000 / 30 = 1500 blocks

(a) For Nested Loop Join :
Assume r1 is the outer relation and r2 is the inner relation.
Worst Case :
  Number of disk accesses = nr1 * br2 + br1
  Here nr1 = number of tuples in r1; br1 and br2 = number of blocks for r1 and r2
  Total block access = 20,000 * 1500 + 800 = 30,000,800
Best Case :
  Total block access = br1 + br2 = 800 + 1500 = 2300

(b) For Block Nested Loop Join :
Worst Case :
  Total block access = br1 * br2 + br1 = 800 * 1500 + 800 = 1,200,800
Best Case :
  Total block access = br1 + br2 = 800 + 1500 = 2300

(c) For Merge Join :
Best Case :
  Total block access = br1 + br2 = 800 + 1500 = 2300
Worst Case :
  Total block access for r2 = br2 * [2 log2 (br2/3)] + 1 = 1500 * [2 log2 (1500/3)] + 1 = 26898.35
  Total block access for r1 = br1 * [2 log2 (br1/3)] + 1 = 800 * [2 log2 (800/3)] + 1 = 12895.22
  Total block access required = 26898.35 + 12895.22 = 39793.57 ≈ 39794

(d) For Hash Join :
Best Case :
  Total block access = 3 (br1 + br2) = 3 (800 + 1500) = 6900

Q. Let relations r1 (A, B, C) and r2 (C, D, E) have the following properties: r1 has 10,000 tuples and r2 has 5000 tuples; 25 tuples of r1 fit on one block and 50 tuples of r2 fit on one block. Estimate the number of block accesses required using each of the following join strategies for $r1 \bowtie r2$.
(a) Nested loop join
(b) Block nested loop join
(c) Merge join
(d) Hash join

- Calculate the number of blocks required for each relation using the formula
  Number of blocks required for a relation = (Number of tuples in relation) / (Number of tuples in 1 block)
  br1 = 10,000 / 25 = 400
  br2 = 5000 / 50 = 100

(a) For Nested Loop Join :
Worst Case :
  Total block access = nr1 * br2 + br1 = 10,000 * 100 + 400 = 1,000,400
Best Case :
  Total block access = br1 + br2 = 400 + 100 = 500

(b) For Block Nested Loop Join :
Worst Case :
  Total block access = br1 * br2 + br1 = 400 * 100 + 400 = 40,400
Best Case :
  Total block access = br1 + br2 = 400 + 100 = 500

(c) For Merge Join :
Best Case :
  Total block access = br1 + br2 = 400 + 100 = 500
Worst Case :
  Total block access for r2 = br2 * [2 log2 (br2/3)] + 1 = 100 * [2 log2 (100/3)] + 1 = 1012.77
  Total block access for r1 = br1 * [2 log2 (br1/3)] + 1 = 400 * [2 log2 (400/3)] + 1 = 5648.11
  Total block access required = 1012.77 + 5648.11 = 6660.88 ≈ 6661

(d) For Hash Join :
Best Case :
  Total block access = 3 (br1 + br2) = 3 (400 + 100) = 1500

4.8 Query Optimization :
Q. Describe query optimization process in detail. [S-12]
Q. Explain query optimization. [W-12]
- It is the process of selecting the most efficient query evaluation plan from the many strategies available for processing a given query, especially when the query is complex.
- Generally we cannot expect users to write their queries so that they can be processed efficiently; instead we expect the system to construct a query evaluation plan that minimizes the cost of query evaluation.
- Query optimization occurs at the relational algebra level, where the system attempts to find an expression that is equivalent to the given expression but more efficient to execute.
- The query optimizer takes a relational algebra expression as input and produces an efficient query plan as output.
- For example, consider the relational algebra expression
  $\Pi_{cname}(\sigma_{Bcity = \text{"Nagpur"}}(branch \bowtie (account \bowtie depositor)))$
- Here we are interested only in those tuples that relate to branches in Nagpur, so only some of the tuples of the branch relation are needed by the expression.
- The above expression has the disadvantage that the whole branch relation is combined with the account and depositor tables, and only then is the selection operation applied.
- As there are many equivalent transformations of the same query, the role of the optimizer is to choose the one that minimizes resource usage.
- Given a relational algebra expression, it is the role of the query optimizer to come up with a query evaluation plan that computes the same result as the given expression at the least cost.
- Computing the precise cost of evaluating a plan is not possible without actually evaluating the plan. Optimizers instead use statistical information about the relations to estimate the cost.
- Disk access is slow compared to memory access, and it usually dominates the cost of processing a query.
- Using these statistics with cost formulae allows us to estimate the cost of each individual operation.
- The individual costs are combined to determine the estimated cost of evaluating a given relational algebra expression.
- Generation of query evaluation plans for an expression involves
  + Generating expressions that are logically equivalent to the given expression.
  + Annotating the resultant expressions to get alternative query plans.

4.9 Transformation of Relational Expressions
- Two relational algebra expressions are said to be equivalent if, on every legal database instance, the two expressions generate the same set of tuples.
- The order of the tuples is irrelevant: two expressions may generate tuples in different orders but are considered equivalent as long as the set of tuples is the same.
- A set of equivalence rules can be used to generate another expression equivalent to the previous one.

4.9.1 Equivalence Rules :
Q. Explain various equivalence rules of transformation of relational expression and give its pictorial representation.
- According to an equivalence rule, expressions of two forms are equivalent. This means we can replace an expression of the first form by an expression of the second form or vice versa, since the two expressions would generate the same result on any valid database.
- The optimizer uses equivalence rules to transform expressions into other logically equivalent expressions.
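The definition of equivalence can be checked on a small instance. The Python sketch below evaluates the two relational algebra forms of the balance < 2500 query given earlier under Optimization and compares the results as sets, so tuple order plays no part. The sample rows and the helper names (select, project, as_set) are assumptions made for illustration, and agreement on one instance only illustrates equivalence, it does not prove it.

```python
# Toy instance of the account relation (hypothetical rows).
account = [
    {"accno": "A-101", "bname": "Nagpur", "balance": 500},
    {"accno": "A-102", "bname": "Pune", "balance": 3000},
    {"accno": "A-103", "bname": "Nagpur", "balance": 1200},
    {"accno": "A-104", "bname": "Mumbai", "balance": 500},
]


def select(rows, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, attrs):
    """Projection with set semantics: keep attrs, drop duplicate tuples."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def as_set(rows):
    """Order-independent form of a relation, used for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


def cheap(row):
    return row["balance"] < 2500


# Form (1): selection applied on top of the projection.
plan_a = select(project(account, ["balance"]), cheap)
# Form (2): projection applied on top of the selection.
plan_b = project(select(account, cheap), ["balance"])

same = as_set(plan_a) == as_set(plan_b)
print(same)
```

Comparing through as_set rather than comparing the lists directly mirrors the point that the order in which tuples are produced is irrelevant to equivalence.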
- In the equivalence rules, θ, θ0, θ1, θ2 and θ3 denote predicates, L1, L2, L3 and L4 denote lists of attributes, and E, E1, E2 and E3 denote relational algebra expressions. The equivalence rules are as follows.

1. Conjunctive selection operations can be deconstructed into a sequence of individual selections.
   $\sigma_{\theta_1 \wedge \theta_2}(E) = \sigma_{\theta_1}(\sigma_{\theta_2}(E))$
2. Selection operations are commutative.
   $\sigma_{\theta_1}(\sigma_{\theta_2}(E)) = \sigma_{\theta_2}(\sigma_{\theta_1}(E))$
3. Only the last in a sequence of projection operations is needed, the others can be omitted.
   $\Pi_{L_1}(\Pi_{L_2}(\ldots(\Pi_{L_n}(E))\ldots)) = \Pi_{L_1}(E)$
4. Selections can be combined with cartesian products and theta joins.
   a. $\sigma_{\theta}(E_1 \times E_2) = E_1 \bowtie_{\theta} E_2$
      This expression is just the definition of theta join.
   b. $\sigma_{\theta_1}(E_1 \bowtie_{\theta_2} E_2) = E_1 \bowtie_{\theta_1 \wedge \theta_2} E_2$
5. Theta join operations are commutative.
   $E_1 \bowtie_{\theta} E_2 = E_2 \bowtie_{\theta} E_1$
6. a. Natural join operations are associative.
      $(E_1 \bowtie E_2) \bowtie E_3 = E_1 \bowtie (E_2 \bowtie E_3)$
   b. Theta joins are associative in the following manner
      $(E_1 \bowtie_{\theta_1} E_2) \bowtie_{\theta_2 \wedge \theta_3} E_3 = E_1 \bowtie_{\theta_1 \wedge \theta_3} (E_2 \bowtie_{\theta_2} E_3)$
      Here θ2 involves attributes from only E2 and E3.
7. The selection operation distributes over the theta join operation under the following two conditions
   a. When all attributes in selection condition θ0 involve only the attributes of one of the expressions (E1) being joined
      $\sigma_{\theta_0}(E_1 \bowtie_{\theta} E_2) = (\sigma_{\theta_0}(E_1)) \bowtie_{\theta} E_2$
   b. When selection condition θ1 involves only the attributes of E1 and θ2 involves only the attributes of E2
      $\sigma_{\theta_1 \wedge \theta_2}(E_1 \bowtie_{\theta} E_2) = (\sigma_{\theta_1}(E_1)) \bowtie_{\theta} (\sigma_{\theta_2}(E_2))$
8. The projection operation distributes over the theta join operation as follows.
   a. If the join condition θ involves only attributes from L1 ∪ L2, then
      $\Pi_{L_1 \cup L_2}(E_1 \bowtie_{\theta} E_2) = (\Pi_{L_1}(E_1)) \bowtie_{\theta} (\Pi_{L_2}(E_2))$
   b. Consider a join $E_1 \bowtie_{\theta} E_2$. Let L1 and L2 be sets of attributes from E1 and E2 respectively. Let L3 be the attributes of E1 that are involved in join condition θ but are not in L1 ∪ L2, and let L4 be the attributes of E2 that are involved in join condition θ but are not in L1 ∪ L2, then
      $\Pi_{L_1 \cup L_2}(E_1 \bowtie_{\theta} E_2) = \Pi_{L_1 \cup L_2}((\Pi_{L_1 \cup L_3}(E_1)) \bowtie_{\theta} (\Pi_{L_2 \cup L_4}(E_2)))$
9. The set operations union and intersection are commutative.
   $E_1 \cup E_2 = E_2 \cup E_1$
   $E_1 \cap E_2 = E_2 \cap E_1$
10. Set union and intersection are associative.
   $(E_1 \cup E_2) \cup E_3 = E_1 \cup (E_2 \cup E_3)$
   $(E_1 \cap E_2) \cap E_3 = E_1 \cap (E_2 \cap E_3)$
11. The selection operation distributes over ∪, ∩ and −.
   $\sigma_{\theta}(E_1 - E_2) = \sigma_{\theta}(E_1) - \sigma_{\theta}(E_2)$, and similarly for ∪ and ∩ in place of −. Also
   $\sigma_{\theta}(E_1 - E_2) = \sigma_{\theta}(E_1) - E_2$, and similarly for ∩ in place of −, but not for ∪.
12. The projection operation distributes over union.
   $\Pi_{L}(E_1 \cup E_2) = (\Pi_{L}(E_1)) \cup (\Pi_{L}(E_2))$

Fig. Pictorial representation of equivalence rules

Example of transformation using equivalence rules :
Consider the schema
  Branch (Bname, Bcity)
  Account (accno, Bname, bal)
  Depositor (cname, accno)
- Based on the above schema, suppose the query asked is
  $\Pi_{cname}(\sigma_{Bcity = \text{"Nagpur"}}(branch \bowtie (account \bowtie depositor)))$
- The above relational algebra expression can be transformed into an equivalent expression using rule 7a.
  $\Pi_{cname}((\sigma_{Bcity = \text{"Nagpur"}}(branch)) \bowtie (account \bowtie depositor))$
- This expression generates smaller intermediate relations compared to the original expression.

4.10 Estimating statistics of Expression Result
- The cost of an operation depends on the size and other statistics of its inputs. Given an expression such as $a \bowtie (b \bowtie c)$, to estimate the cost of joining a with $(b \bowtie c)$ we need estimates of statistics such as the size of $b \bowtie c$.

4.10.1 (a) Catalog Information :
The DBMS catalog stores the following statistical information about database relations :
  nr -> The number of tuples in relation r.
  br -> The number of blocks containing tuples of relation r.
  lr -> The size of a tuple of relation r in bytes.
  fr -> The blocking factor of relation r, that is, the number of tuples of relation r that fit into one block.
  V(A, r) -> The number of distinct values that appear in relation r for attribute A.

4.10.2 (b) Join Size Estimation
- The cartesian product r x s contains nr * ns tuples. Each tuple of r x s occupies lr + ls bytes, from which we can calculate the size of the cartesian product.
- Estimating the size of a natural join is somewhat more complicated than estimating the size of a selection or of a cartesian product.
- By considering all tuples in r, we estimate that there are nr * ns / V(A, s) tuples in $r \bowtie s$. If we reverse the roles of r and s in the preceding estimate, we obtain an estimate of nr * ns / V(A, r) tuples in $r \bowtie s$.
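The two estimates can be computed side by side, with the lower one taken as the join size estimate. The Python sketch below is a minimal illustration; the function name estimate_join_size and its parameter names are assumptions, and the sample figures are the depositor and customer statistics used in the example that follows.

```python
def estimate_join_size(n_r, n_s, v_a_r, v_a_s):
    """Estimate the number of tuples in r join s on a common attribute A.

    n_r, n_s     : number of tuples in r and s
    v_a_r, v_a_s : V(A, r) and V(A, s), the distinct values of A in r and s
    """
    estimate_from_r = n_r * n_s / v_a_s  # every tuple of r matches n_s / V(A, s) tuples of s
    estimate_from_s = n_r * n_s / v_a_r  # roles of r and s reversed
    return min(estimate_from_r, estimate_from_s)


# r = depositor (5000 tuples, 2500 distinct customer names),
# s = customer (10,000 tuples, 10,000 distinct customer names).
size = estimate_join_size(n_r=5000, n_s=10000, v_a_r=2500, v_a_s=10000)
print(size)
```

With these figures the two candidate estimates are 5000 and 20,000, and the function returns the lower one.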
Example :
To illustrate the way of estimating join size, consider the expression $depositor \bowtie customer$ with the following catalog information about the 2 relations.
  n_customer = 10,000 and f_customer = 25, so b_customer = 10,000 / 25 = 400
  n_depositor = 5000 and f_depositor = 50, so b_depositor = 5000 / 50 = 100
  V(custname, depositor) = 2500, which implies that on average each customer has two accounts.
- Let us compute the size estimates for $depositor \bowtie customer$ without using information about foreign keys.
- As V(custname, depositor) = 2500 and V(custname, customer) = 10,000, the two estimates we get by using the formula are
  5000 * 10,000 / 2500 = 20,000 and
  5000 * 10,000 / 10,000 = 5000
  and we choose the lower one.

4.11 Choice of Evaluation Plans :
- Generation of expressions is only part of the query optimization process, since each operation in the expression can be implemented with different algorithms.
- Therefore an evaluation plan defines exactly what algorithm is used for each operation and how the execution of the operations is coordinated.
- Consider a relational algebra query over branch, account and depositor. One possible evaluation plan for such an expression is shown in the figure below.
- In the figure, the edges from the selection operations to the merge join operation are marked as pipelined. Pipelining is feasible if the selection operations generate their output sorted on the join attributes. They would do so if the indices on branch and account store records with equal values for the index attributes sorted by branch name.

Fig. An evaluation plan (merge join and hash join over branch, account and depositor)

Classification of Query Evaluation Plan
The query evaluation plans may be classified into the following :
- Left deep tree query evaluation plan
- Right deep tree query evaluation plan
- Linear tree query evaluation plan
- Bushy (non linear) tree query evaluation plan

(1) Left deep tree query evaluation plan
- In this type of query evaluation plan, execution starts from a relation at the bottom of the tree and proceeds forward, with the left hand side input to each join operation being an intermediate result and the right hand side input a stored relation. Its structure is shown below.
Fig. Left deep execution plan

(2) Right deep tree query evaluation plan
- In this type of evaluation plan execution again starts from a relation at the bottom of the tree and proceeds forward, with the right hand side input to each join operation being an intermediate result and the left hand side input a stored relation. Its structure is shown below.

Fig. Right deep execution plan

(3) Linear tree query evaluation plan
- It is a combination of left deep and right deep trees, where the relation on one side of each operator is always a base relation. Its structure is shown below.

(4) Bushy tree query evaluation plan
- This is the most general type of plan, which allows both inputs into a binary operation to be intermediate results. Its structure is shown below.

Fig. Bushy execution plan

4.12 Evaluation Techniques / Optimization Techniques
Q. Explain different query optimization techniques.
- One way to select an evaluation plan is simply to choose the cheapest algorithm for each operation.
- We can choose any ordering of the operations that ensures operations lower in the tree are evaluated before operations higher in the tree.
- Choosing the cheapest algorithm for each operation independently is not necessarily a good idea. Although a merge join at a given level may be costlier than a hash join, it may provide a sorted output that makes evaluating a later operation cheaper.
- Two broad approaches can be used for choosing an evaluation plan.
  (1) Cost based optimization
  - It estimates the cost of each evaluation plan using statistics. The optimizer searches all the plans and chooses the best plan in a cost based fashion.
  (2) Heuristic based optimization
  - It uses heuristics to choose a plan.

4.12.1 Cost based optimization
Q. Explain cost based optimization technique in detail. [S-12]
Q. Write note on cost based optimization. [S-11]
- A cost based optimizer generates a range of query evaluation plans from the given query by using the equivalence rules and chooses the one with the least cost.
- For a complex query, the number of different query plans that are equivalent to a given plan can be large.
- Cost based optimization can be done by using the steps below
  (1) Generate logically equivalent expressions using the equivalence rules.
  (2) Annotate the resultant expressions to get alternative query plans.
  (3) Choose the cheapest plan based on estimated cost.
- The cost of a plan can be estimated by using the following information.
  + Statistical information about relations, such as the number of tuples and the number of distinct values for an attribute.
  + Statistics estimation for intermediate results, to compute the cost of complex expressions.
- It is important to find the optimal join order to have an efficient query plan.
- Consider the expression $r1 \bowtie r2 \bowtie r3$. For this expression 12 different join orderings are possible, such as $r1 \bowtie (r2 \bowtie r3)$, $r3 \bowtie (r1 \bowtie r2)$ and so on.
- In general, with n relations there are $(2(n-1))! / (n-1)!$ different join orders.
- We use a dynamic programming algorithm to find the optimal join order. Dynamic programming algorithms store the results of computations and reuse them.
- The figure below shows the dynamic programming algorithm for join order optimization.

Dynamic programming algorithm for join order optimization
  Procedure FindBestPlan (S)
    if (bestplan[S].cost ≠ ∞)    // bestplan[S] already computed
      return bestplan[S]
    if (S contains only 1 relation)
      set bestplan[S].plan and bestplan[S].cost based on best way of accessing S
    else for each non-empty subset S1 of S such that S1 ≠ S
      P1 = FindBestPlan (S1)
      P2 = FindBestPlan (S - S1)
      A = best algorithm for joining results of P1 and P2
      cost = P1.cost + P2.cost + cost of A
      if cost < bestplan[S].cost
        bestplan[S].cost = cost
        bestplan[S].plan = "execute P1.plan; execute P2.plan; join results of P1 and P2 using A"
    return bestplan[S]

Fig. Dynamic Programming Algorithm

- This algorithm stores evaluation plans in an associative array called bestplan.
- Each element of the associative array contains two components: the cost of the best plan for S and the plan itself. The value of bestplan[S].cost is assumed to be initialized to ∞ if bestplan[S] has not yet been computed.
- The procedure first checks whether the best plan for computing the join of the given set of relations S has been computed already; if so, it returns the already computed plan.
- Otherwise the procedure tries every way of dividing S into 2 disjoint subsets. For each division, the procedure recursively finds the best plan for each of the two subsets and then computes the cost of the overall plan by using that division. The procedure picks the cheapest plan from among all alternatives for dividing S into 2 sets.
- The cheapest plan and its cost are stored in the array bestplan and returned by the procedure.
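The procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a real optimizer: the cardinalities in CARD, the fixed REDUCTION factor standing in for join selectivity, and the cost model (the cost of a join is taken to be the size of its result, and reading a base relation costs nothing) are all hypothetical. Memoization through functools.lru_cache plays the role of the bestplan array.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical statistics: tuples per relation, and a fixed factor by which
# every join is assumed to shrink the cartesian product.
CARD = {"r1": 1000, "r2": 200, "r3": 50}
REDUCTION = 100


def result_size(relations):
    """Assumed size of the join of a set of relations."""
    size = 1
    for name in relations:
        size *= CARD[name]
    return size // REDUCTION ** (len(relations) - 1)


@lru_cache(maxsize=None)  # the cache stands in for the bestplan array
def find_best_plan(relations):
    """Return (cost, plan) for joining a frozenset of relation names."""
    if len(relations) == 1:
        (name,) = relations
        return 0, name  # assumed: accessing a base relation is free
    best_cost, best_plan = float("inf"), None
    members = sorted(relations)
    # Try every division of the set into two non-empty disjoint subsets.
    for k in range(1, len(members)):
        for left in combinations(members, k):
            s1 = frozenset(left)
            s2 = relations - s1
            cost1, plan1 = find_best_plan(s1)
            cost2, plan2 = find_best_plan(s2)
            cost = cost1 + cost2 + result_size(relations)
            if cost < best_cost:
                best_cost, best_plan = cost, (plan1, plan2)
    return best_cost, best_plan


cost, plan = find_best_plan(frozenset(CARD))
print(cost, plan)
```

Under these assumed figures the cheapest order joins the two smallest relations first, since that keeps the intermediate result small.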
- The complexity of the procedure can be shown to be $O(3^n)$.

4.12.2 Heuristic Optimization : [S-10] [W-10, 11] [W-12] [S-11]
Q. Explain heuristic optimization.
Q. Write note on heuristic optimization.
Q. What do you mean by heuristic optimization? Discuss main heuristics applied during it. [S-13]
- Cost based optimization is expensive, even with dynamic programming.
- Systems may use heuristics to reduce the number of choices that must be made in a cost based fashion.
- Heuristic optimization transforms the query tree by using a set of rules that typically, but not in all cases, improve execution performance.
- Some of the heuristic rules are
  (1) Perform selection as early as possible.
  - It is usually better to perform selections earlier than projections, since selections have the potential to reduce the size of relations greatly.
  (2) Perform projection early.
  - The projection operation, like the selection, reduces the size of relations. Whenever we need to generate a temporary relation, it is advantageous to apply immediately any projections that are possible.
  (3) Perform the most restrictive selection and join operations before other similar operations.
  (4) Some systems use only heuristics, others combine heuristics with partial cost based optimization.

Steps in Heuristic Optimization
1) Deconstruct conjunctive selections into a sequence of single selection operations. It is based on equivalence rule 1. It facilitates moving selection operations down the query tree.
2) Move selection operations down the query tree for the earliest possible execution. It is based on equivalence rules 2, 7a, 7b and 11.
3) Execute first those selection and join operations that will produce the smallest relations. It is based on equivalence rule 6.
4) Replace cartesian product operations that are followed by a selection condition by join operations. It is based on equivalence rule 4a. The cartesian product is expensive to implement.
5) Deconstruct and move as far down the tree as possible lists of projection attributes, creating new projections where needed. It is based on equivalence rules 3, 8a, 8b and 12.
6) Identify those subtrees whose operations can be pipelined, and execute them using pipelining.
- The heuristics above reorder an initial query tree representation so that the operations that reduce the size of intermediate results are applied first.

Advantages :
1. The access plan selection phase of the heuristic optimizer chooses the strategy for each operation.
2. Heuristics restructure the tree so that the system performs the most restrictive selection and join operations before other similar operations.
3. Alternative sequences of operations are considered to produce a set of candidate evaluation plans.

4.13 Materialized View
Q What is materialized view? Explain it with suitable example.
Q Explain the concept of materialized view along with its maintenance.
Ans.
- A materialized view is a sub table created for an original table. It stores the result obtained by firing a query on the original table.
- An ordinary view is virtual in nature, but a materialized view is stored on disk.
- It is often used in data warehousing and business intelligence applications, where querying large tables with thousands of rows takes more time.
- A materialized view helps us by storing frequently accessed data of a large table.
- A materialized view also helps in providing faster access to data and hence enhances query performance.
- Undoubtedly, materializing all possible views gives the best query performance, but at a high cost.
- Leaving all the views virtual gives the lowest view maintenance cost but poor query performance. Hence, to maintain a balance, we may materialize some of the views while leaving the others virtual, so as to minimize both query response time and the cost of view maintenance in the data warehouse.

Syntax for creating materialized view :
Create materialized view [materialized view name] as select att name from table name;

- For example, suppose we created a table Bank with attributes as Bank (accno, cname, adds, DOB, Bal, contno).
- Bank needs to access only cname and accno frequently rather than adds, DOB, Bal and contno.
- So in order to save time we store only accno and cname in a separate sub table called a materialized view.
- Accessing accno and cname from the original table will take more time as compared to accessing them from the materialized view.
- Here we create a materialized view called V1 as shown below.
Create materialized view V1 as select accno, cname from Bank;
- In creation of a materialized view we face two problems :
(1) View selection problem
(2) View maintenance problem

View selection problem :
- To materialize all the views created on the original table is not possible because of storage space constraints.

View maintenance problem :
- A problem with materialized views is that they must be kept up to date when data used in the base table changes.
- For example, suppose a particular customer name (cname) changes in the Bank table. The materialized view then becomes inconsistent and it should be updated.
- The task of keeping a materialized view up to date with the underlying data is known as view maintenance.
- A view can be maintained by manually written code, that is, every piece of code that updates cname in Bank also updates the materialized view V1.
- Another option for maintaining a materialized view is to define triggers on insert, delete and update of each relation in the view definition.
- The triggers must modify the contents of the materialized view to take into account the change that caused the trigger to fire.
- A simple way of doing so is to completely re-compute the materialized view on every update.
- A better option is to modify only the affected parts of the materialized view, which is known as incremental view maintenance.
- Modern database systems provide more direct support for incremental view maintenance.
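The difference between completely re-computing the materialized view V1 and maintaining it incrementally can be sketched in Python as follows. This is only an illustration of the idea: the in-memory dictionaries stand in for the Bank table and the view, the sample rows are made up, and the on_insert, on_update and on_delete functions are hypothetical stand-ins for database triggers rather than any real database API.

```python
# Base table Bank: accno -> remaining columns (sample data, made up).
bank = {
    101: {"cname": "Asha", "adds": "Pune", "bal": 5000},
    102: {"cname": "Ravi", "adds": "Nagpur", "bal": 1200},
}


def recompute_view():
    """Simple maintenance: rebuild V1 (accno, cname) from scratch."""
    return {accno: row["cname"] for accno, row in bank.items()}


# Materialized view V1, stored separately from the base table.
v1 = recompute_view()


def on_insert(accno, row):
    """Trigger-like hook: add only the new row to V1."""
    bank[accno] = row
    v1[accno] = row["cname"]


def on_update(accno, **changes):
    """Touch V1 only when a column stored in the view has changed."""
    bank[accno].update(changes)
    if "cname" in changes:
        v1[accno] = changes["cname"]


def on_delete(accno):
    """Remove the row from both the base table and V1."""
    del bank[accno]
    del v1[accno]


on_insert(103, {"cname": "Meera", "adds": "Mumbai", "bal": 700})
on_update(101, cname="Asha K")   # changes V1
on_update(102, bal=999)          # does not touch V1
on_delete(103)
print(v1 == recompute_view())    # incremental result matches a full rebuild
```

Each hook changes only the affected entry of V1, whereas recompute_view scans the whole base table. An update to a column that is not stored in the view, such as the balance, leaves V1 untouched.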
