Web Development · Database
SQL Window Functions: The Analytics Queries Your ORM Won't Write for You
Running totals, rankings, moving averages, lag and lead comparisons: window functions handle the analytics queries that would otherwise require application-side loops or multiple round-trips. Here's how they work.
Anurag Verma
8 min read
Sponsored
There’s a category of analytics query that most developers solve the wrong way: fetch a bunch of rows, then loop over them in application code to compute rankings, running totals, or period-over-period comparisons. The query is simple. The application logic is messy. The performance is worse than it needs to be.
Window functions exist precisely for this case. They let you compute values across a set of rows related to the current row, without collapsing the result into a single aggregate. You get both the row’s original columns and the computed value in the same query.
What Window Functions Are (and Aren’t)
A regular aggregate function like SUM() or COUNT() collapses a group of rows into one. GROUP BY department, SUM(salary) gives you one row per department.
A window function computes across a group but keeps every row. You can see each employee’s salary alongside the sum for their department:
SELECT
name,
department,
salary,
SUM(salary) OVER (PARTITION BY department) AS dept_total
FROM employees;
Output:
name | department | salary | dept_total
-----------|------------|--------|----------
Alice | Eng | 120000 | 350000
Bob | Eng | 115000 | 350000
Carol | Eng | 115000 | 350000
Dave | Sales | 95000 | 190000
Eve | Sales | 95000 | 190000
Each row keeps its own data. The dept_total is computed over the window of rows in the same department.
The OVER Clause
The OVER clause defines the window (which rows the function looks at). It has three optional parts:
function() OVER (
PARTITION BY <columns> -- divide rows into groups
ORDER BY <columns> -- order within each group
ROWS BETWEEN ... -- which rows in the group to include
)
An empty OVER () means the window is the entire result set.
PARTITION BY
PARTITION BY divides rows into groups, similar to GROUP BY, but without collapsing. Each partition is an independent window.
-- Rank employees by salary within each department
SELECT
name,
department,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank
FROM employees;
Without PARTITION BY, the ranking would be global across all employees. With it, each department gets its own rank sequence starting at 1.
Ranking Functions
Three ranking functions handle the same problem differently when there are ties:
SELECT
name,
score,
ROW_NUMBER() OVER (ORDER BY score DESC) AS row_num, -- unique, no gaps
RANK() OVER (ORDER BY score DESC) AS rank, -- ties share rank, gap after
DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rank -- ties share rank, no gap
FROM leaderboard;
name | score | row_num | rank | dense_rank
-------|-------|---------|------|----------
Alice | 95 | 1 | 1 | 1
Bob | 90 | 2 | 2 | 2
Carol | 90 | 3 | 2 | 2
Dave | 85 | 4 | 4 | 3
RANK() skips 3 after the tie at position 2. DENSE_RANK() does not. ROW_NUMBER() gives each row a unique number regardless of ties.
Use ROW_NUMBER() when you need exactly one row per position (like picking the top result per category). Use RANK() when ties should be visible and gaps are acceptable. Use DENSE_RANK() when ties should be visible but you don’t want the gaps.
Running Totals
SUM() with ORDER BY in the OVER clause computes a cumulative sum:
SELECT
order_date,
amount,
SUM(amount) OVER (ORDER BY order_date) AS running_total
FROM orders
ORDER BY order_date;
By default, ORDER BY in a window function uses ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which means it sums from the first row in the partition up to and including the current row.
To compute a running total per customer:
SELECT
customer_id,
order_date,
amount,
SUM(amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
) AS customer_running_total
FROM orders;
Moving Averages
The ROWS BETWEEN clause controls which rows in the window the function sees. For a 7-day moving average:
SELECT
date,
daily_revenue,
AVG(daily_revenue) OVER (
ORDER BY date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS moving_avg_7d
FROM daily_metrics;
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW includes the current row plus the 6 rows before it (a 7-row window). For dates with fewer than 7 preceding rows, PostgreSQL uses whatever rows are available, so the first few values are averages over smaller windows.
If you need a centered moving average (3 rows before and 3 after):
AVG(value) OVER (
ORDER BY date
ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
)
LAG and LEAD
LAG() and LEAD() let you access values from other rows without a self-join. This is the right tool for period-over-period comparisons.
SELECT
month,
revenue,
LAG(revenue, 1) OVER (ORDER BY month) AS prev_month_revenue,
revenue - LAG(revenue, 1) OVER (ORDER BY month) AS mom_change,
ROUND(
100.0 * (revenue - LAG(revenue, 1) OVER (ORDER BY month))
/ NULLIF(LAG(revenue, 1) OVER (ORDER BY month), 0),
1
) AS mom_pct_change
FROM monthly_revenue
ORDER BY month;
The second argument to LAG() is the offset (default 1). LAG(revenue, 3) returns the value from 3 rows earlier. LEAD() works the same way but looks forward.
NULLIF(..., 0) prevents division by zero when the previous month had zero revenue.
For per-group comparisons (e.g., month-over-month per product):
SELECT
product_id,
month,
revenue,
LAG(revenue) OVER (PARTITION BY product_id ORDER BY month) AS prev_month
FROM product_revenue;
FIRST_VALUE and LAST_VALUE
FIRST_VALUE() and LAST_VALUE() return the first or last value in the window:
SELECT
name,
department,
salary,
FIRST_VALUE(salary) OVER (
PARTITION BY department
ORDER BY salary DESC
) AS dept_max_salary,
salary / FIRST_VALUE(salary) OVER (
PARTITION BY department
ORDER BY salary DESC
)::numeric AS pct_of_max
FROM employees;
A common trap: LAST_VALUE() with the default window frame returns the current row’s value, not the last row in the partition. That’s because the default frame is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. To get the true last value:
LAST_VALUE(salary) OVER (
PARTITION BY department
ORDER BY salary DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
NTILE
NTILE(n) divides rows into n equal-sized buckets and assigns a bucket number to each row. Useful for percentile groupings:
SELECT
customer_id,
total_spend,
NTILE(4) OVER (ORDER BY total_spend DESC) AS spend_quartile
FROM customer_totals;
Customers in quartile 1 are the top 25% spenders. This is cleaner than computing percentile boundaries manually.
Named Windows
If you’re applying multiple window functions with the same frame, name the window to avoid repetition:
SELECT
date,
value,
SUM(value) OVER w AS running_sum,
AVG(value) OVER w AS running_avg,
MIN(value) OVER w AS running_min,
MAX(value) OVER w AS running_max
FROM metrics
WINDOW w AS (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
ORDER BY date;
Performance Considerations
Window functions run after WHERE, GROUP BY, and HAVING filtering. The window operates on already-filtered rows, which is often what you want.
They can be expensive because they often require sorting. If your window has an ORDER BY on a column without an index, PostgreSQL will sort the entire result in memory (or to disk if it’s large). An index on the window’s ORDER BY column can eliminate that sort.
For large tables, filtering before the window function matters:
-- Run the window on only the last 90 days, not the full table
WITH recent AS (
SELECT * FROM events WHERE event_date >= NOW() - INTERVAL '90 days'
)
SELECT
event_date,
count,
SUM(count) OVER (ORDER BY event_date) AS running_total
FROM recent;
Window functions can’t be used directly in WHERE or HAVING clauses because they run after those filters. Wrap them in a subquery or CTE:
-- Find employees ranked 1-5 in each department
WITH ranked AS (
SELECT
name,
department,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank
FROM employees
)
SELECT * FROM ranked WHERE dept_rank <= 5;
What This Replaces
Before window functions became widely understood, the same logic required self-joins or application-side loops. Running total without window functions:
-- The old way: O(n²) self-join
SELECT
a.order_date,
a.amount,
SUM(b.amount) AS running_total
FROM orders a
JOIN orders b ON b.order_date <= a.order_date
GROUP BY a.order_date, a.amount;
With a window function, PostgreSQL sorts once and passes through the rows linearly. The self-join approach scales quadratically with row count.
The first time you replace a 50-line Python loop computing period-over-period comparisons with a single SQL query using LAG(), the pattern becomes intuitive quickly. Window functions show up constantly in analytics work and are supported by every major database: PostgreSQL, MySQL 8+, SQLite 3.25+, BigQuery, Snowflake, and DuckDB.
Sponsored
More from this category
More from Web Development
CSS Anchor Positioning: Tooltips and Popovers Without JavaScript
gRPC in 2026: When to Use It Instead of REST or GraphQL
k6 Load Testing: Performance Testing Your APIs Before Users Find the Problems
Sponsored
The dispatch
Working notes from
the studio.
A short letter twice a month — what we shipped, what broke, and the AI tools earning their keep.
Discussion
Join the conversation.
Comments are powered by GitHub Discussions. Sign in with your GitHub account to leave a comment.
Sponsored