
3 Query Processing and Optimization-1

This document discusses query processing and optimization in databases, detailing the steps involved such as parsing, optimization, and evaluation. It emphasizes the importance of energy efficiency and cost measurement in query execution, including factors like disk access and CPU time. Additionally, it covers various selection operations and sorting techniques, particularly focusing on external sort-merge algorithms for handling large datasets.

Uploaded by

neupanepratik1
Copyright
© All Rights Reserved
QUERY PROCESSING AND OPTIMIZATION

After a comprehensive study of this chapter, you will be able to understand:
* Concept of Query Processing
* Query Trees and Heuristics for Query Optimization
* Choice of Query Execution Plans
* Cost-Based Optimization

OVERVIEW OF QUERY PROCESSING

Energy efficiency is an important feature in designing and executing databases. The aims of query processing are to transform a query written in a high-level language, typically SQL, into a correct and efficient execution strategy expressed in a low-level language (implementing relational algebra), and to execute the strategy to retrieve the required data. Thus, query processing comprises the activities involved in parsing, validating, optimizing, and executing a query.

The steps involved in processing a query are shown in Figure 3.1. They are:
1. Parsing and translation
2. Optimization
3. Evaluation

Figure 3.1: Steps in query processing (boxes shown: query in high-level language, query optimizer, query evaluation engine, query output)

Parsing and Translating the Query

The main work of a query processor is to convert a query string into query objects, i.e., to convert the query submitted by the user into a form understood by the query processing engine. It converts the search string into definite instructions. The query parser must analyze the query language, i.e., recognize and interpret operators (AND, OR, NOT, +, -, etc.), place the operators into groups, and so on. The basic job of the parser is to extract the tokens (e.g., keywords, operators, operands, literal strings, etc.) into their corresponding internal data elements (i.e., relational algebra operations and operands) and structures (i.e., query tree, query graph). The parser also verifies the validity and syntax of the query string.

Optimizing the Query

In this stage, the query optimizer [...] along with the implementation methods to be employed for each relational operator.
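The token-extraction job of the parser described above can be illustrated with a minimal sketch. It is not how any particular DBMS parses SQL: the keyword set, the token categories, and the function name `tokenize` are assumptions made for this illustration, and the sample query is hypothetical.

```python
import re

# Hypothetical, deliberately small keyword set; a real SQL parser knows many more.
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR", "NOT"}

# One named group per token category (the category names are assumptions).
TOKEN_PATTERN = re.compile(
    r"""
    (?P<literal>'[^']*')                   # single-quoted string literal
  | (?P<number>\d+(?:\.\d+)?)              # integer or decimal number
  | (?P<word>[A-Za-z_][A-Za-z_0-9]*)       # keyword or identifier
  | (?P<operator><=|>=|<>|[=<>+\-*/])      # comparison and arithmetic operators
  | (?P<punct>[(),;.])                     # punctuation
  | (?P<space>\s+)                         # whitespace, skipped below
    """,
    re.VERBOSE,
)


def tokenize(query):
    """Split a query string into (category, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(query):
        match = TOKEN_PATTERN.match(query, pos)
        if match is None:
            # Syntax check: reject characters that fit no token category.
            raise ValueError(f"unexpected character {query[pos]!r} at position {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "space":
            continue
        if kind == "word":
            kind = "keyword" if text.upper() in KEYWORDS else "identifier"
        tokens.append((kind, text))
    return tokens


# Hypothetical query used only to exercise the tokenizer.
for token in tokenize("SELECT Stu_name FROM Student WHERE Stu_id = 10"):
    print(token)
```

A later stage would map these tokens onto relational algebra operations and operands and build the query tree; that step is left out here.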
Example 3.1: Consider the following SQL query:

SELECT Stu_name, Stu_address
FROM Student
WHERE age [...]

[...] $\ge v$, and scan the relation sequentially from there.
* For $\sigma_{A \le v}(r)$, just scan the relation sequentially till the first tuple $> v$, without using any index.

A7 (secondary index, comparison): We can use a secondary ordered index to guide retrieval for comparison conditions.
* For $\sigma_{A \ge v}(r)$, use the index to find the first index entry $\ge v$ and scan the index sequentially from there to find pointers to records.
* For $\sigma_{A \le v}(r)$, just scan the leaf pages of the index, finding pointers to records, till the first entry $> v$.

The secondary index provides pointers to the records, but to get the actual records we have to fetch them by using the pointers. This step may require an I/O operation for each record fetched, since consecutive records may be on different disk blocks; as before, each I/O operation requires a disk seek and a block transfer. If the number of retrieved records is large, using the secondary index may be even more expensive than using linear search. Therefore, the secondary index should be used only if very few records are selected.

Implementation of Complex Selections

So far, we have considered only simple selection conditions of the form A op B, where op is an equality or comparison operation. We now consider more complex selection predicates.

* Conjunction: A conjunctive selection is a selection of the form $\sigma_{\theta_1 \wedge \theta_2 \wedge \dots \wedge \theta_n}(r)$.
* Disjunction: A disjunctive selection is a selection of the form $\sigma_{\theta_1 \vee \theta_2 \vee \dots \vee \theta_n}(r)$. A disjunctive condition is satisfied by the union of all records satisfying the individual, simple conditions $\theta_i$.
* Negation: The result of a selection $\sigma_{\neg\theta}(r)$ is the set of tuples of $r$ for which the condition $\theta$ evaluates to false. In the absence of nulls, this set is simply the set of tuples in $r$ that are not in $\sigma_{\theta}(r)$.

* A8 (conjunctive selection using one index): First check whether an access path is available for an attribute in one of the simple conditions. If one is, the records satisfying that condition are retrieved through it, and the remaining conditions are tested on each retrieved record in the memory buffer. To reduce the cost, we choose a $\theta_i$ and one of algorithms A1 through A7 for which the combination results in the least cost for $\sigma_{\theta_i}(r)$. The cost of algorithm A8 is given by the cost of the chosen algorithm.
* A9 (conjunctive selection using composite index): An appropriate composite (multiple-key) index may be available for some conjunctive selections. If a composite index exists on the combined attribute fields, then the index can be searched directly.
* A10 (conjunctive selection by intersection of identifiers): This algorithm requires indices with record pointers on the fields involved in the individual conditions. The algorithm uses the corresponding index for each condition and takes the intersection of all the obtained sets of record pointers. It then fetches the records from the file; if some conditions do not have appropriate indices, they are tested in memory.
* A11 (disjunctive selection by union of identifiers): Indices can be used only if there is an index for all conditions; otherwise, a linear scan of the relation has to be performed anyway. The algorithm uses the corresponding index for each condition and takes the union of all the obtained sets of record pointers, then fetches the records from the file.

SORTING

Sorting in a database system is important for two reasons:
1. A query may specify that the output should be sorted.
2. The processing of some relational query operations, e.g., the join operation, can be implemented more efficiently on sorted relations.

For relations that fit in memory, techniques like quick-sort can be used; for relations that do not fit in memory, an external sort-merge algorithm can be used.

External Sort-Merge Algorithm

Sorting of relations that do not fit in memory is called external sorting. The most commonly used technique for external sorting is the external sort-merge algorithm. Let M denote the memory size (in pages).

1. Create sorted runs. Initialize $i = 0$.
   Repeat the following till the end of the relation:
   a) Read M blocks of the relation into memory
   b) Sort the in-memory blocks
   c) Write the sorted data to run $R_i$
   d) Increment $i$
   Let the final value of $i$ be N.
2. Merge the runs (N-way merge). We assume that $N < M$. [...]

If $N \ge M$, several merge passes are required. In each pass, contiguous groups of $M - 1$ runs are merged. A pass reduces the number of runs by a factor of $M - 1$ and creates runs longer by the same factor. Repeated passes are performed till all runs have been merged into one.

[...] m.salary;

This query retrieves the names of employees who earn more than their supervisors. Suppose that we had a constraint on the database schema stating that no employee can earn more than his or her direct supervisor. If the semantic query optimizer checks for the existence of this constraint, it does not need to execute the query at all, because it knows that the result of the query will be empty. This may save considerable time if the constraint checking can be done efficiently. However, searching through many constraints to find those that are applicable to a given query and that may semantically optimize it can also be quite time-consuming. With the inclusion of active rules and additional metadata in database systems, semantic query optimization techniques are being gradually incorporated into the DBMS.

[...] transform the expression and tree into equivalent [...]

Example 3.5: Consider the following SQL query:

SELECT Stu_name, Marks_Obtained
FROM Student, Marks
[...]

The corresponding relational algebra expression is:

$\Pi_{Stu\_name,\ Marks\_Obtained}(\sigma_{Stu\_id=10}(\sigma_{Sub\_id=20}(Student \times Marks)))$

Figure 3.7: Initial expression tree

Suppose the Student and Marks relations have 100 records each and the number of records with Stu_id=10 is 50. Note that the Cartesian product, which results in 10,000 records, can be reduced by 50% if the $\sigma_{Stu\_id=10}$ operation is performed first.
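The record counts quoted in this example can be checked with a small sketch. The data below is made up to match the stated sizes; in particular, placing the 50 tuples with Stu_id = 10 in Student is an assumption, and the function names are hypothetical.

```python
from itertools import product


def build_relations():
    """Hypothetical data matching the sizes quoted in the example."""
    # 100 Student tuples; the first 50 carry Stu_id = 10 (assumed placement).
    student = [{"Stu_id": 10 if i < 50 else 1000 + i, "Stu_name": f"s{i}"}
               for i in range(100)]
    # 100 Marks tuples with arbitrary values.
    marks = [{"Sub_id": i, "Marks_Obtained": i % 101} for i in range(100)]
    return student, marks


def product_then_select(student, marks):
    """Plan 1: build the full Cartesian product, then filter on Stu_id."""
    pairs = list(product(student, marks))
    result = [(s, m) for s, m in pairs if s["Stu_id"] == 10]
    return len(pairs), result


def select_then_product(student, marks):
    """Plan 2: filter Student first, then build the Cartesian product."""
    matching = [s for s in student if s["Stu_id"] == 10]
    pairs = list(product(matching, marks))
    return len(pairs), pairs


student, marks = build_relations()
size_late, result_late = product_then_select(student, marks)
size_early, result_early = select_then_product(student, marks)

print(size_late, size_early)         # 10000 5000
print(result_late == result_early)   # True
```

Both plans return the same tuples, but the second one materializes an intermediate result of 50 x 100 = 5,000 pairs instead of 100 x 100 = 10,000.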
We can also combine the $\sigma_{Sub\_id=20}$ and Cartesian product operations into a more efficient join operation, and eliminate any unneeded columns before the expensive join is performed. The diagram below shows this better, "optimized" version of the tree.

Figure: Tree after transformation (optimized tree of Figure 3.7)

In relational algebra, there are several definitions and theorems that the query optimizer can use to transform the query. For instance, the definition of equivalent relations states that the set of attributes (domain) of each relation must be the same; because they are sets, the order does not matter. Here is a partial list of relational algebra theorems:

1. Cascade of $\sigma$: A select with conjunctive conditions on the attribute list is equivalent to a cascade of selects upon selects:
   $\sigma_{A_1 \wedge A_2 \wedge \dots \wedge A_n}(r) = \sigma_{A_1}(\sigma_{A_2}(\dots(\sigma_{A_n}(r))\dots))$
2. Commutativity of $\sigma$: The select operation is commutative:
   $\sigma_{A_1}(\sigma_{A_2}(r)) = \sigma_{A_2}(\sigma_{A_1}(r))$
3. Cascade of $\Pi$: A cascade of project operations is equivalent to the last project operation of the cascade:
   $\Pi_{List_1}(\Pi_{List_2}(\dots(\Pi_{List_n}(r))\dots)) = \Pi_{List_1}(r)$
4. Commuting $\sigma$ with $\Pi$: If the condition $c$ of the $\sigma$ involves only the attributes $A_1, A_2, \dots, A_n$ of the $\Pi$, the two operations can be commuted:
   $\Pi_{A_1, A_2, \dots, A_n}(\sigma_c(r)) = \sigma_c(\Pi_{A_1, A_2, \dots, A_n}(r))$
5. Commutativity of $\bowtie$ (or $\times$): The join and Cartesian product operations are commutative:
   $r \bowtie s = s \bowtie r$ and $r \times s = s \times r$
6. Commuting $\sigma$ with $\bowtie$ (or $\times$): Select can be commuted with join (or Cartesian product) as follows:
   a. If all of the attributes in the select's condition $c$ are in relation $r$, then
      $\sigma_c(r \bowtie s) = (\sigma_c(r)) \bowtie s$
   b. Given a select condition $c$ composed of conditions $c_1$ and $c_2$, where $c_1$ contains only attributes from $r$ and $c_2$ contains only attributes from $s$, then
      $\sigma_c(r \bowtie s) = (\sigma_{c_1}(r)) \bowtie (\sigma_{c_2}(s))$
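The select-related equivalences in the list above can be checked on a small instance. The sketch below is only a sanity check on made-up data, not a proof; the relation contents and the helper names `select`, `cross`, and `as_set` are assumptions, and the Cartesian product stands in for the join.

```python
def select(pred, rel):
    """sigma: keep the tuples of rel that satisfy pred."""
    return [t for t in rel if pred(t)]


def cross(r, s):
    """Cartesian product; assumes r and s share no attribute names."""
    return [{**a, **b} for a in r for b in s]


def as_set(rel):
    """Order-independent view of a relation, so results can be compared as sets."""
    return {frozenset(t.items()) for t in rel}


# Hypothetical sample data.
student = [{"Stu_id": 10, "Stu_name": "Asha"},
           {"Stu_id": 11, "Stu_name": "Bibek"}]
marks = [{"Sub_id": 20, "Marks_Obtained": 75},
         {"Sub_id": 21, "Marks_Obtained": 60}]

c1 = lambda t: t["Stu_id"] == 10      # uses attributes of student only
c2 = lambda t: t["Sub_id"] == 20      # uses attributes of marks only
both = lambda t: c1(t) and c2(t)      # conjunctive condition c1 AND c2

combined = cross(student, marks)

# Rule 1: a conjunctive select equals a cascade of selects.
cascade_holds = as_set(select(both, combined)) == as_set(select(c1, select(c2, combined)))

# Rule 2: selects commute.
commute_holds = as_set(select(c1, select(c2, combined))) == as_set(select(c2, select(c1, combined)))

# Rule 6a: a select that uses only student attributes can be applied before the product.
push_one_holds = as_set(select(c1, combined)) == as_set(cross(select(c1, student), marks))

# Rule 6b: each part of the condition can be pushed to its own relation.
push_both_holds = as_set(select(both, combined)) == as_set(
    cross(select(c1, student), select(c2, marks)))

print(cascade_holds, commute_holds, push_one_holds, push_both_holds)
```

Relations are compared as sets of tuples because, as noted above, order does not matter in relational algebra.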
