Lec04 SQL Aggregation Grouping
Lec04 SQL Aggregation Grouping
1
Aggregation in SQL
>sqlite3 lecture04
2
Comment about SQLite
• One cannot load NULL values such that they
are actually loaded as null values
select sum(quantity)
from Purchase
where quantity is not null; 5
Aggregates and NULL Values
Null values are not used in aggregates
insert into Purchase
values(12, 'gadget', NULL, NULL, 'april')
Let’s try the following
select count(*) from Purchase
-- NULL is counted in count(*)
select count(quantity) from Purchase
-- NULL is ignored in count(quantity)
select sum(quantity)
from Purchase
where quantity is not null;
-- “is not null” is redundant
6
Counting Duplicates
COUNT applies to duplicates, unless otherwise stated:
SELECT Sum(price *
quantity) FROM Purchase
What do
SELECT Sum(price * they mean ?
quantity) FROM Purchase
WHERE product = ‘bagel’
8
Simple Aggregations
Purchase Product Price Quantity
Bagel 3 20
Bagel 1.50 20
Banana 0.5 50
Banana 2 10
4 10
Banana
SELECT Sum(price * quantity)
FROM Purchase 90 (= 60+30)
WHERE product = ‘Bagel’
9
Simple Aggregations
Purchase Product Price Quantity
Bagel 3 20
Bagel 1.50 20
Banana 0.5 50
Banana 2 10
4 10
Banana
SELECT Sum(price * quantity)
FROM Purchase 90 (= 60+30)
WHERE product = ‘Bagel’
10
More Examples
How can we find the average revenue per sale?
11
More Examples
SELECT sum(price * quantity) /
count(*) FROM Purchase
WHERE product = ‘bagel’
12
Grouping and Aggregation
Purchase(product, price, quantity)
13
Grouping and Aggregation
Purchase(product, price, quantity)
FWGS
15
1&2. FROM-WHERE-GROUPBY
Other Examples
Compare these
two queries:
SELECT product,
sum(quantity) AS SumQuantity,
How about
max(price) AS MaxPrice this one?
FROM Purchase
GROUP BY product
18
Need to be Careful…
SELECT product, max(quantity) Product Price Quantity
FROM Purchase Bagel 3 20
GROUP BY product
Bagel 1.50 20
SELECT product, quantity Banana 0.5 50
FROM Purchase
GROUP BY product Banana 2 10
4 10
sqlite allows this
Banana
query to be executed
with strange Better DBMS (e.g., SQL
behavior. Server) gives an error
19
Purchase(pid, product, price, quantity, month)
Ordering Results
FWGOS
20
Purchase(pid, product, price, quantity, month)
Ordering Results
FWGOS
HAVING Clause
Same query as earlier, except that we consider only products
that had at least 30 sales.
Exercise
Compute the total income per month
Show only months with less than 10 items sold
Order by quantity sold and display as “TotalSold”
23
WHERE vs. HAVING
• WHERE condition is applied to individual rows
– The rows may or may not contribute to the aggregate
– No aggregates allowed here
24
Purchase(pid, product, price, quantity, month)
Mystery Query
What do they compute?
26
Purchase(pid, product, price, quantity, month)
Product(pid, pname, manufacturer)
Aggregate + Join
Example
Let’s figure out
what these
SELECT manufacturer, count(*) mean…
FROM Product, Purchase
WHERE pname = product
GROUP BY manufacturer
for x1 in R1:
for x2 in R2:
... Nested loop
for xm in semantics
Rm:
if
Cond(
x1,
28
x2…):
output(
Semantics for SFWGH
SELECT S
FROM R1,…,Rn
WHERE C1
GROUP BY a1,…,ak Why ?
HAVING C2
S = may contain attributes a1,…,ak and/or any
aggregates, but NO OTHER ATTRIBUTES
C1 = is any condition on the attributes in R1,…,Rn
C2 = is any condition on aggregate expressions
and on attributes a1,…,ak
29
Semantics for SFWGH
SELECT S
FROM R1,…,Rn
WHERE C1
GROUP BY a1,…,ak
HAVING C2
Evaluation steps:
1. Evaluate FROM-
WHERE using
Nested Loop
Semantics
2. Group by the
attributes a1,…,ak
Semantics for SFWGH
SELECT S
Execution order:
FROM R1,…,Rn
WHERE C1 FWGHOS
GROUP BY a1,…,ak
HAVING C2
Evaluation steps:
1. Evaluate FROM-
WHERE using
Nested Loop
Semantics
2. Group by the
attributes a1,…,ak
Purchase(pid, product, price, quantity, month)
Product(pid, pname, manufacturer)
Aggregate + Join
Example
What do these
SELECT manufacturer, count(*) queries mean?
FROM Product, Purchase
WHERE pname = product
GROUP BY manufacturer
34
Purchase(pid, product, price, quantity, month)
Product(pid, pname, manufacturer)
Exercise:
35
Purchase(pid, product, price, quantity, month)
Product(pid, pname, manufacturer)
Exercise:
Exercise:
37
Purchase(pid, product, price, quantity, month)
Product(pid, pname, manufacturer)
Exercise:
38