Query-Processing

The document covers query processing and optimizations in database management systems, focusing on the implementation of relational operators and the importance of execution plans. It discusses concepts such as indexing, external merge sort, relational algebra, and file operations, emphasizing the cost of various operations and the impact of query optimization techniques. The course is instructed by Dr. Shirshendu Das at IIT Hyderabad, with recommended readings from the textbook 'Database Management System' by Raghu Ramakrishnan.

CS3563: Introduction to DBMS II

Query Processing and Optimizations

Course Instructor
Dr. Shirshendu Das
IIT Hyderabad

Most of the content of these slides is based on the textbook:

“Database Management Systems”, by Raghu Ramakrishnan.

Read Chapters 12 and 13 from this textbook.

• The relational operators serve as the building blocks for query evaluation.
• Queries, written in a language such as SQL, are presented to a query optimizer, which uses information about how the data is stored to produce an efficient execution plan for evaluating the query.

• Finding a good execution plan for a query consists of more than just choosing an implementation for each of the relational operators that appear in the query.
• For example, the order in which operators are applied can influence the cost.

• This chapter considers the implementation of individual relational operators. It covers:
  • an introduction to query processing, highlighting some common themes that recur throughout this chapter
  • how tuples are retrieved from relations while evaluating various relational operators
  • implementation alternatives for the various operators
Concepts you have to know to understand this chapter:

• Indexing:
  • Hash-based indexing
  • B+ tree-based indexing
  • Remember that both kinds of index can be either clustered or unclustered.

• External Merge Sort:
  • You have to know the concept of runs and of page availability in main memory.

• Relational algebra:
  • Select, Project, Join
  • Selection condition

• File operations:
  • Scan, equality search, range search, etc.


An index matches a selection condition if the index can be used to retrieve just the tuples that satisfy the condition.
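
The definition above can be made concrete with a small sketch. It assumes the usual textbook rule for conjunctive selections: a hash index matches when there is an equality term for every attribute of its search key, and a tree index matches when there are terms for some prefix of its search key. The `Term` class and both function names are hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """One conjunct of the form `attr op value` (the value is irrelevant here)."""
    attr: str
    op: str  # "=", "<", "<=", ">", or ">="


def hash_index_matches(search_key, conjuncts):
    """A hash index needs an equality term for every search-key attribute."""
    eq_attrs = {t.attr for t in conjuncts if t.op == "="}
    return all(attr in eq_attrs for attr in search_key)


def tree_index_matched_prefix(search_key, conjuncts):
    """Length of the search-key prefix covered by the conjuncts.

    A tree index matches the selection when this length is at least 1.
    """
    attrs = {t.attr for t in conjuncts}
    length = 0
    for attr in search_key:
        if attr not in attrs:
            break
        length += 1
    return length


key = ("rname", "bid", "sid")
full = [Term("rname", "="), Term("bid", "="), Term("sid", "=")]
partial = [Term("rname", "="), Term("bid", "=")]
suffix = [Term("bid", "="), Term("sid", "=")]

print(hash_index_matches(key, full))            # equality on every key attribute
print(hash_index_matches(key, partial))         # sid is missing
print(tree_index_matched_prefix(key, partial))  # prefix <rname, bid>
print(tree_index_matched_prefix(key, suffix))   # no term on rname
```

With a search key of <rname, bid, sid>, a hash index is usable only for the full equality condition, while a tree index is still usable when only a prefix of the key is constrained.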

We can now retrieve the necessary pages of Reserves to retrieve tuples, and check bid=5.
• Let us consider the case where one of the conjuncts in the selection condition is a disjunction of terms.
• If even one of these terms requires a file scan because suitable indexes or sort orders are unavailable, testing this conjunct by itself requires a file scan.

• For example, suppose that the only available indexes are a hash index on rname and a hash index on sid.

Now, if the selection condition contains just the conjunct

(day < 8/9/94 ∨ rname=‘Joe’)

we can retrieve tuples satisfying the condition rname=‘Joe’ by using the index on rname.

However, day < 8/9/94 requires a file scan, so testing the whole conjunct requires a file scan.

If the selection condition is (day < 8/9/94 ∨ rname=‘Joe’) ∧ sid=3, the index on sid matches the conjunct sid=3. We can use this index to find qualifying tuples and apply day < 8/9/94 ∨ rname=‘Joe’ to just these tuples.
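
As a rough illustration of the last case, the sketch below uses a dictionary as a stand-in for the hash index on sid and applies the disjunction only to the tuples the index returns. The sample tuples, the field order (sid, bid, day, rname), and the reading of 8/9/94 as August 9, 1994 are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical Reserves tuples: (sid, bid, day, rname)
reserves = [
    (3, 101, date(1994, 7, 1), "Joe"),
    (3, 102, date(1994, 9, 1), "Ann"),
    (3, 103, date(1994, 10, 5), "Joe"),
    (7, 101, date(1994, 6, 1), "Bob"),
]

# Stand-in for the hash index on sid: sid -> positions of matching tuples.
sid_index = defaultdict(list)
for pos, row in enumerate(reserves):
    sid_index[row[0]].append(pos)


def select(sid, cutoff, rname):
    """Evaluate (day < cutoff OR rname = rname) AND sid = sid.

    The index on sid finds the candidates; the disjunction is then
    checked only on those candidates, so no file scan is needed.
    """
    result = []
    for pos in sid_index.get(sid, []):
        _, _, day, name = reserves[pos]
        if day < cutoff or name == rname:
            result.append(reserves[pos])
    return result


matches = select(3, date(1994, 8, 9), "Joe")
print([row[1] for row in matches])
```

Only the three tuples with sid=3 are examined; the tuple with sid=7 is never touched even though it satisfies day < 8/9/94.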

If we use temporary relations at each step, the first step costs:

1. M I/Os to scan R, where M is the number of pages of R, and
2. T I/Os to write the temporary relation, where T is the number of pages of the temporary relation.

The first and third steps are straightforward and relatively inexpensive.


1. We can scan Reserves at a cost of 1,000 I/Os.

2. If we assume that each tuple in the temporary relation created in the first step is 10 bytes long, the cost of writing this temporary relation is 250 I/Os.

3. Suppose that we have 20 buffer pages.

4. We can sort the temporary relation in two passes at a cost of 2 ∗ 2 ∗ 250 = 1,000 I/Os.

5. The scan required in the third step costs an additional 250 I/Os.

6. The total cost is 2,500 I/Os.
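
The arithmetic in this example can be checked with a short sketch. It assumes a plain external merge sort in which Pass 0 produces runs of one buffer-load each and every later pass merges (buffers - 1) runs at a time, with each pass reading and writing the whole file. The function names are hypothetical.

```python
import math


def external_sort_passes(pages, buffers):
    """Number of passes of a simple external merge sort."""
    runs = math.ceil(pages / buffers)  # Pass 0: one run per buffer-load
    passes = 1
    while runs > 1:
        runs = math.ceil(runs / (buffers - 1))  # (buffers - 1)-way merge
        passes += 1
    return passes


def sort_based_projection_cost(input_pages, temp_pages, buffers):
    """I/O cost of projection using a temporary relation at each step."""
    scan_input = input_pages            # step 1: scan the relation
    write_temp = temp_pages             # step 1: write the projected tuples
    sort_temp = 2 * external_sort_passes(temp_pages, buffers) * temp_pages
    final_scan = temp_pages             # step 3: scan and drop duplicates
    return scan_input + write_temp + sort_temp + final_scan


print(external_sort_passes(250, 20))
print(sort_based_projection_cost(1000, 250, 20))
```

Under these assumptions the 250-page temporary relation sorts in two passes, and the four components add up to the 2,500 I/Os stated above.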
This approach can be improved on by modifying the sorting algorithm to do projection with duplicate elimination.

Two important modifications to the sorting algorithm adapt it for projection:

1. We can project out unwanted attributes during the first pass (Pass 0) of sorting.
2. We can eliminate duplicates during the merging passes.

In fact, the second modification reduces the cost of the merging passes, since fewer tuples are written out in each pass.

Let us consider our example again:

1. In the first pass we scan Reserves, at a cost of 1,000 I/Os, and write out 250 pages.
2. With 20 buffer pages, the 250 pages are written out as seven internally sorted runs, each (except the last) about 40 pages long.
3. In the second pass we read the runs, at a cost of 250 I/Os, and merge them.
4. The total cost is 1,500 I/Os, which is much lower than the cost of the first approach used to implement projection.
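
The two modifications can be sketched in a few lines. This is only an in-memory illustration, not the slides' implementation: Python lists stand in for the sorted runs that a real system would write to disk, `run_size` is counted in tuples rather than pages, and the function name and sample data are hypothetical.

```python
import heapq
from itertools import groupby, islice


def project_sorted_distinct(tuples, keep, run_size):
    """Projection with duplicate elimination folded into sorting."""
    source = iter(tuples)
    runs = []
    # Pass 0: project out unwanted attributes while building sorted runs.
    while True:
        chunk = list(islice(source, run_size))
        if not chunk:
            break
        projected = [tuple(row[i] for i in keep) for row in chunk]
        runs.append(sorted(projected))
    # Merging pass: duplicates arrive next to each other in the merged
    # stream, so each group of equal tuples is emitted only once.
    return [key for key, _ in groupby(heapq.merge(*runs))]


# Hypothetical Reserves tuples: (sid, bid, rname)
sample = [
    (22, 101, "a"),
    (58, 103, "b"),
    (22, 102, "c"),
    (31, 101, "d"),
    (58, 101, "e"),
]
print(project_sorted_distinct(sample, keep=(0,), run_size=2))
```

Because duplicates are dropped while merging, no separate final scan of a fully sorted temporary relation is needed, which is where the saving in the example comes from.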

Partitioning:

Duplicate Elimination:

If each I/O costs about 10ms on current hardware, then this join will take about 140 hours!
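
The slide that defines this join did not survive extraction, but the figure is consistent with a tuple-at-a-time simple nested loops join. The sketch below reproduces it under assumptions taken from the textbook's running example rather than from the text here: Reserves is the outer relation with 1,000 pages and 100 tuples per page, and Sailors is the inner relation with 500 pages.

```python
def simple_nested_loops_ios(outer_pages, outer_tuples_per_page, inner_pages):
    """Scan the outer once, and scan the whole inner once per outer tuple."""
    return outer_pages + outer_pages * outer_tuples_per_page * inner_pages


# Assumed sizes (textbook running example, not stated on this slide).
ios = simple_nested_loops_ios(
    outer_pages=1000, outer_tuples_per_page=100, inner_pages=500
)
seconds = ios * 0.010  # 10 ms per I/O
hours = seconds / 3600

print(ios)
print(round(hours, 1))
```

Roughly 50 million I/Os at 10 ms each comes to a little under 140 hours, which matches the slide's estimate.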
Block Nested Loop Join With In-Memory Hashing:
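
The slide body for this heading is a figure, so the following is only a sketch of the general technique: read the outer relation one block at a time, build an in-memory hash table on the block, and scan the inner relation once per block, probing the table. Block size is counted in tuples here for simplicity, and all names and sample data are hypothetical.

```python
from collections import defaultdict


def block_nested_loops_join(outer, inner, block_size, outer_key, inner_key):
    """Equality join with a hash table built on each outer block."""
    result = []
    for start in range(0, len(outer), block_size):
        block = outer[start:start + block_size]
        # Hash the current block so each inner tuple finds its matches
        # without comparing against every tuple in the block.
        table = defaultdict(list)
        for r in block:
            table[outer_key(r)].append(r)
        # One full scan of the inner relation per outer block.
        for s in inner:
            for r in table.get(inner_key(s), []):
                result.append(r + s)
    return result


# Hypothetical data: Reserves (sid, bid) and Sailors (sid, sname).
reserves = [(22, 101), (58, 103), (22, 102)]
sailors = [(22, "Dustin"), (31, "Lubber"), (58, "Rusty")]

joined = block_nested_loops_join(
    reserves, sailors, block_size=2,
    outer_key=lambda r: r[0], inner_key=lambda s: s[0],
)
print(joined)
```

The hash table does not change the number of I/Os; it reduces the CPU cost of finding matches within a block.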
Cost of sorting Reserves

Cost of sorting Sailors


• Hash Join

[Figure: Main memory and Relation R]

h = Bucket 0 if the most significant digit of ID is even; otherwise Bucket 1.

Note: This is an example of how partitioning happens in a hash-based join. The relation (table) considered is not from the book (RM).
• Hash Join

Partitions of both R and S using the hash function h.

[Figure: Partitions of R and partitions of S]

Note that the hash function must distribute the tuples uniformly throughout the partitions. In this example the hash function does not do so; I have chosen it just to explain the procedure. It is not a good hash function in practice, because it does not distribute the tuples uniformly.
• Hash Join

[Figure: Main memory, with partitions of R and S]

Once a match is found between r and s, where r is from R and s is from S, write <r, s> into the output buffer.

I consider 5 pages of main memory here, but in the previous step (partitioning) I considered only 3. Actually, both must be the same.

If 5 pages are available, then each relation can be divided into 4 partitions (see the previous slide).

I used only two partitions to make the example simple.

If r is a tuple from R and s is a tuple from S, then <r, s> is a tuple in the natural join of R and S, provided the condition of the natural join is satisfied by these two tuples.
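
The partitioning and probing phases can be sketched together as follows. The function `h` follows the slides' toy hash function (most significant digit of the ID even or odd); everything else, including the assumption that IDs are integers, the function names, and the sample tuples, is hypothetical. Lists stand in for partitions that a real system would write to disk.

```python
from collections import defaultdict


def h(record_id):
    """Toy partitioning function: bucket 0 if the most significant digit
    of the ID is even, otherwise bucket 1."""
    return 0 if int(str(abs(record_id))[0]) % 2 == 0 else 1


def partition(relation, key, hash_fn, num_partitions):
    """Partitioning phase: route each tuple to the partition chosen by hash_fn."""
    parts = [[] for _ in range(num_partitions)]
    for row in relation:
        parts[hash_fn(key(row))].append(row)
    return parts


def hash_join(R, S, key_r, key_s, hash_fn=h, num_partitions=2):
    r_parts = partition(R, key_r, hash_fn, num_partitions)
    s_parts = partition(S, key_s, hash_fn, num_partitions)
    output = []
    # Probing phase: only partitions with the same number can contain matches.
    for r_part, s_part in zip(r_parts, s_parts):
        table = defaultdict(list)  # in-memory hash table on the R partition
        for r in r_part:
            table[key_r(r)].append(r)
        for s in s_part:
            for r in table.get(key_s(s), []):
                output.append((r, s))  # write <r, s> to the output
    return output


# Hypothetical relations keyed on an integer ID in the first field.
R = [(21, "a"), (35, "b"), (48, "c")]
S = [(21, "x"), (48, "y"), (48, "z"), (90, "w")]

pairs = hash_join(R, S, key_r=lambda r: r[0], key_s=lambda s: s[0])
print(pairs)
```

Tuples with IDs 21 and 48 land in partition 0 of both relations and tuples with IDs 35 and 90 in partition 1, so each S tuple is compared only against R tuples from its own partition.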
Another Example of Pipelining


This slide is based on our textbook “Database System Concepts”, but you do not need to read the book for this. Understanding the contents of the slide is enough.
Assume:
the number of pages in T1 is 10,
the number of pages in T2 is 250,
the number of buffer pages is 5.

Cost:
1. Applying sort-merge join to T1 and T2: 4,060 page I/Os
2. Applying block nested loops join to T1 and T2: 2,770 page I/Os
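
A sketch of how these two totals can be reproduced is given below. The join costs alone come to 2,300 and 1,010 I/Os; the stated totals appear to include the cost of producing T1 and T2 as well. That extra 1,760 I/Os is an assumption based on the textbook's running example (scan 1,000 pages of Reserves and write 10 pages for T1, scan 500 pages of Sailors and write 250 pages for T2), not something stated on the slide.

```python
import math


def sort_cost(pages, buffers):
    """External merge sort: each pass reads and writes every page."""
    runs = math.ceil(pages / buffers)
    passes = 1
    while runs > 1:
        runs = math.ceil(runs / (buffers - 1))
        passes += 1
    return 2 * passes * pages


def sort_merge_join_cost(m, n, buffers):
    """Sort both inputs, then read each once in the merging phase."""
    return sort_cost(m, buffers) + sort_cost(n, buffers) + m + n


def block_nested_loops_cost(outer, inner, buffers):
    """Outer blocks of (buffers - 2) pages; one inner scan per block."""
    blocks = math.ceil(outer / (buffers - 2))
    return outer + blocks * inner


# Assumed cost of materializing T1 and T2 (textbook example sizes).
SELECTION_COST = (1000 + 10) + (500 + 250)

smj_join = sort_merge_join_cost(10, 250, 5)
bnl_join = block_nested_loops_cost(10, 250, 5)

print(smj_join, SELECTION_COST + smj_join)
print(bnl_join, SELECTION_COST + bnl_join)
```

With 5 buffer pages, T1 sorts in 2 passes and T2 in 4 passes, so sort-merge join pays 40 + 2,000 + 260 I/Os, while block nested loops scans T2 four times for a total of 10 + 4 ∗ 250 I/Os.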

Suppose that we have the following two indexes available:
1. a clustered static hash index on the bid field of Reserves
2. a hash index on the sid field of Sailors.


Sort (on sid) and materialize the result of bid=100 (T1)

End of Query Processing and
Optimisation
