Window_spec:
[window name] [partition clause] [order clause] [frame clause]
That looks simple. And them come terms like UNBOUNDED PRECEDING that could put a knot in your gut. The manual is not exactly written to help novices in this area get up to speed. But don't panic. If you work through the examples that follow (and please do the preceding part of this series before trying these examples) you will have a better appreciation of what is going on with window function.
The Frame Clause
So the frame clause is optional in the window function. A frame is considered a subset of the current partition and defines that subset. Frames are determined with respect to the current row. This allows grouping of data within a partition depending on the current row in the partition. If the frame is defined as all rows from the start of the end of the partition you can computer running totals for each row. Or the frame can be defined as extending a desired number of rows either side of the current row which lets you compute rolling averages.
The first example in this series had the window defined as W OVER() which does not have a frame clause and computes over the entire column.
mysql> select x,
sum(x) over() from x;
+------+---------------+
| x | sum(x) over() |
+------+---------------+
| 1 | 55 |
| 2 | 55 |
| 3 | 55 |
| 4 | 55 |
| 5 | 55 |
| 6 | 55 |
| 7 | 55 |
| 8 | 55 |
| 9 | 55 |
| 10 | 55 |
+------+---------------+
10 rows in set (0.00 sec)
Unbounded
There is usually more than one way to do things. OVER() can also be written as WINDOW w AS (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) or WINDOW w AS (RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING).Uh, wha, huh?
Besides the confusion of windows/frames and range/rows, the wording gets tricky. UNBOUNDED should be taken to mean as 'everything' so UNBOUNDED BEFORE means everything before and UNBOUNDED AFTER means everything after.
mysql> SELECT x,
sum(x) over() as 'over',
sum(x) OVER a as 'row',
sum(x) OVER b AS 'range'
FROM x
window
a AS (rows between unbounded preceding and unbounded following), b AS (range between unbounded preceding and unbounded following);
+------+------+------+-------+
| x | over | row | range |
+------+------+------+-------+
| 1 | 55 | 55 | 55 |
| 2 | 55 | 55 | 55 |
| 3 | 55 | 55 | 55 |
| 4 | 55 | 55 | 55 |
| 5 | 55 | 55 | 55 |
| 6 | 55 | 55 | 55 |
| 7 | 55 | 55 | 55 |
| 8 | 55 | 55 | 55 |
| 9 | 55 | 55 | 55 |
| 10 | 55 | 55 | 55 |
+------+------+------+-------+
10 rows in set (0.01 sec)
ROW vs RANGE
Right now you are probably wondering what the difference is between RANGE and ROW. Let's modify the last example slightly. So we replace between current row and unbounded following as the core of the frame specification.
mysql> SELECT x,
sum(x) over() as 'over',
sum(x) OVER a as 'row',
sum(x) OVER b AS 'range'
FROM x
window a AS (rows between current row and unbounded following),
b AS (range between current row and unbounded following);
+------+------+------+-------+
| x | over | row | range |
+------+------+------+-------+
| 1 | 55 | 55 | 55 |
| 2 | 55 | 54 | 55 |
| 3 | 55 | 52 | 55 |
| 4 | 55 | 49 | 55 |
| 5 | 55 | 45 | 55 |
| 6 | 55 | 40 | 55 |
| 7 | 55 | 34 | 55 |
| 8 | 55 | 27 | 55 |
| 9 | 55 | 19 | 55 |
| 10 | 55 | 10 | 55 |
+------+------+------+-------+
10 rows in set (0.00 sec)
For range, the frame is defined by rows within a value range of an ORDER BY clause. But where is the ORDER BY?? Well in this case it is implied to be the entire partition so it takes all the column values.
Range gets a little more differentiated when you use it a little more creatively.
mysql> SELECT x,
sum(x) over() as 'over',
sum(x) OVER a as 'row',
sum(x) OVER b AS 'range'
FROM x
window
a AS (rows between current row and 1 following),
b AS (order by x
range between current row and 1 following);
+------+------+------+-------+
| x | over | row | range |
+------+------+------+-------+
| 1 | 55 | 3 | 3 |
| 2 | 55 | 5 | 5 |
| 3 | 55 | 7 | 7 |
| 4 | 55 | 9 | 9 |
| 5 | 55 | 11 | 11 |
| 6 | 55 | 13 | 13 |
| 7 | 55 | 15 | 15 |
| 8 | 55 | 17 | 17 |
| 9 | 55 | 19 | 19 |
| 10 | 55 | 10 | 10 |
+------+------+------+-------+
10 rows in set (0.06 sec)
It is possible to specify bigger ranges.
mysql> SELECT x,
sum(x) over() as 'over',
sum(x) OVER a as 'row',
sum(x) OVER b AS 'range'
FROM x
window a AS (rows between 1 preceding and 3 following),
b AS (order by x range between 1 preceding and 3 following);
+------+------+------+-------+
| x | over | row | range |
+------+------+------+-------+
| 1 | 55 | 10 | 10 |
| 2 | 55 | 15 | 15 |
| 3 | 55 | 20 | 20 |
| 4 | 55 | 25 | 25 |
| 5 | 55 | 30 | 30 |
| 6 | 55 | 35 | 35 |
| 7 | 55 | 40 | 40 |
| 8 | 55 | 34 | 34 |
| 9 | 55 | 27 | 27 |
| 10 | 55 | 19 | 19 |
+------+------+------+-------+
10 rows in set (0.00 sec)
Hopefully this takes a little bit of the mystery out of windowing functions for novices.
Next time I hope to look into some time series data and windowing functions.