OVER clause is used in window functions to define the window on
which the function operates. Window functions are permitted only in the
ORDER BY clause of queries.
For general information about and examples of Window functions in Splice Machine, see the Using Window Functions topic.
expression OVER( [partitionClause] [orderClause] [frameClause] );
Any value expression that does not itself contain window function calls.
Optional. Specifies how the window function is broken down over groups,
in the same way that
GROUP BY specifies groupings for regular
aggregate functions. If you omit this clause, there is one partition
that contains all rows.
The syntax for this clause is essentially the same as for the
GROUP BY clause for queries; To recap:
PARTITION BY expression [, ...]
A list of expressions that define the partitioning.
Optional. Controls the ordering. It is important for ranking functions,
since it specifies by which variables ranking is performed. It is also
needed for cumulative functions. The syntax for this clause is
essentially the same as for the
SQL Reference. To recap:
ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS FIRST | NULLS LAST ] [, ...]
The default ordering is ascending (
ASC). For ascending order,
NULL values are returned last unless you specify
NULLS FIRST; for
NULL values are returned first unless you specify
Optional. Defines which of the rows (which frame) that are passed to the window function should be included in the computation. The frameClause provides two offsets that determine the start and end of the frame.
The syntax for the frame clause is:
[RANGE | ROWS] frameStart | [RANGE | ROWS] BETWEEN frameStart AND frameEnd
The syntax for both frameStart and frameEnd is:
UNBOUNDED PRECEDING | <n> PRECEDING | CURRENT ROW | <n> FOLLOWING | UNBOUNDED FOLLOWING
A a non-negative integer value.
Because window functions are only allowed in
HAVING clauses, you sometimes need to use
subqueries with window functions to accomplish what seems like it could
be done in a simpler query.
For example, because you cannot use an
OVER clause in a
WHERE clause, a query like the following is not possible:
SELECT * FROM Batting WHERE rank() OVER (PARTITION BY "playerID" ORDER BY "G") = 1;
HAVING are computed before the windowing
functions, this won’t work either:
SELECT *, rank() OVER (PARTITION BY "playerID" ORDER BY "G") as rank FROM Batting WHERE rank = 1;
Instead, you need to use a subquery:
SELECT * FROM ( SELECT *, rank() OVER (PARTITION BY "playerID" ORDER BY "G") as rank FROM Batting ) tmp WHERE rank = 1;
And note that the above subquery will add a rank column to the original columns,
Simple Window Function Examples
The examples in this section are fairly simple because they don’t use the frame clause.
--- Rank each year within a player by the number of home runs hit by that player RANK() OVER (PARTITION BY playerID ORDER BY desc(H)); --- Compute the change in number of games played from one year to the next: G - LAG(G) OVER (PARTITION G playerID ORDER BY yearID);
Examples with Frame Clauses
The frame clause can be confusing, given all of the options that it presents. There are three commonly used frame clauses:
|Frame Clause Type||Example|
BETWEEN UNBOUNDED PRECEEDING AND UNBOUNDED FOLLOWING
BETWEEN UNBOUNDED PRECEEDING AND CURRENT ROW
Here are some examples of window functions using frame clauses:
--- Compute the running sum of G for each player: SUM(G) OVER (PARTITION BY playerID ORDER BY yearID BETWEEN UNBOUNDED PRECEEDING AND CURRENT ROW); --- Compute the career year: YearID - min(YEARID) OVER (PARTITION BY playerID BETWEEN UNBOUNDED PRECEEDING AND UNBOUNDED FOLLOWING) + 1; --- Compute a rolling average of games by player: MEAN(G) OVER (PARTITION BY playerID ORDER BY yearID BETWEEN 2 PRECEEDING AND 2 FOLLOWING);