Window Function Revision
Window Function Revision
that are related to the current row, known as the "window." However, unlike aggregate functions
(SUM(), AVG(), COUNT()) that collapse rows into a single value, window functions retain each
individual row and add an additional calculated value to it.
Window functions are especially useful for tasks like ranking, cumulative sums, moving
averages, calculating running totals, and accessing values from previous or subsequent rows.
Syntax:
● PARTITION BY: Divides the rows into subsets for calculation. (Optional)
● ORDER BY: Specifies the order of rows within each partition. (Optional but essential for
ranking functions.)
SELECT
column1,
column2,
window_function() OVER (PARTITION BY column1 ORDER BY column2)
FROM table_name;
1. Ranking Functions
SELECT
player_id,
ROW_NUMBER() OVER (ORDER BY score DESC) AS row_num
FROM players;
2. RANK():
Assigns a rank to each row within an ordered partition. Ties receive the same rank, and the next
rank is skipped.
Example:
SELECT
player_id,
RANK() OVER (ORDER BY score DESC) AS rank
FROM players;
3. DENSE_RANK():
Similar to RANK(), but without gaps in ranking numbers when there are ties.
SELECT
player_id,
DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rank
FROM players;
4. NTILE(n):
Divides rows into n groups (or buckets) with approximately equal numbers of rows.
Let's use NTILE(2) to split the data into two groups ordered by score descending.
SELECT
player_id,
NTILE(3) OVER (ORDER BY score DESC) AS bucket
FROM players;
2. Aggregate Functions as Window Functions
Window functions can calculate cumulative or running totals without collapsing rows like regular
aggregate functions.
1. SUM():
Calculates a cumulative sum over the window.
SELECT
player_id,
team_id,
score,
SUM(score) OVER (PARTITION BY team_id ORDER BY player_id) AS cumulative_score
FROM players;
2. AVG():
SELECT
player_id,
AVG(score) OVER (ORDER BY player_id) AS running_avg
FROM players;
3. COUNT():
Counts the number of rows in the window.
SELECT
player_id,
COUNT(*) OVER (PARTITION BY team_id) AS team_count
FROM players;
3. Value Functions
These functions allow access to values in other rows relative to the current row.
1. LAG():
Returns the value from the previous row in the window.
SELECT
player_id,
score,
LAG(score) OVER (ORDER BY score DESC) AS prev_score
FROM players;
2. LEAD():
Returns the value from the next row in the window.
SELECT
player_id,
score,
LEAD(score) OVER (ORDER BY score DESC) AS next_score
FROM players;
3. FIRST_VALUE():
Returns the first value in the window.
SELECT
player_id,
score,
FIRST_VALUE(score) OVER (ORDER BY score DESC) AS highest_score
FROM players;
4. LAST_VALUE():
Returns the last value in the window.
SELECT
player_id,
score,
LAST_VALUE(score) OVER (ORDER BY score DESC) AS lowest_score
FROM players;
SELECT
player_id,
SUM(score) OVER (ORDER BY player_id) AS cumulative_sum
FROM players;
2. Moving Average:
A running average of scores.
SELECT
player_id,
AVG(score) OVER (ORDER BY player_id ROWS BETWEEN 2 PRECEDING AND
CURRENT ROW) AS moving_avg
FROM players;
● Retain Detail: Unlike aggregate functions, window functions preserve individual rows
while adding additional calculated values.
● Simplify Complex Queries: They reduce the need for self-joins and subqueries.
● Flexible Calculations: They offer various options for calculating ranks, running totals,
moving averages, and more.
● Performance: Can be slow for very large datasets due to the need for ordering.
● No Filtering: You cannot use window functions in the WHERE clause. They are typically
used in the SELECT list or ORDER BY clause.