SQL Execution Order
Most developers writing SQL seldom think about the execution order of their SQL. Most of the time it becomes intuitive that such deep understanding may not be required. But understanding the execution order can help with writing better and performant SQL.
Below is the logical execution order followed by SQL Frames which matches with the execution order of most popular and standards compliant databases.
Execution Order | Operation | Description |
---|---|---|
1 | FROM | selecting from the source tables |
2 | ON | filtering before joining |
3 | JOIN | join multiple dataset rows |
4 | WHERE | filtering pre-aggregate rows |
5 | GROUP BY | aggregate functions |
6 | CUBE / ROLLUP | rollup of aggregate functions |
7 | HAVING | filtering aggregate rows |
8 | WINDOW | analytic functions |
9 | QUALIFY | filtering based on analytic functions |
10 | SELECT | selecting the rows |
11 | DISTINCT | selecting distinct rows |
12 | UNION / INTERSECT / EXCEPT | set operations on rows |
13 | ORDER BY | ordering of the rows |
14 | LIMIT | taking only a subset of selected rows |
Note that this is just a logical order and not how a particular database might actually implement the physical plan. There are many factors such as indexes, types of indexes, clustering, partitioning and other concepts that influence how a query is actually executed. But in order to guarantee the same output no matter which database it is, understanding this logical order of execution is important.
QUALIFY
is currently not supported by SQL Frames. However, the same functionality can
be achieved by introducing an fdf()
call to filter using the analytic fields from the parent df
.
Hence, QUALIFY
is mostly a syntactic sugar and there is no loss of functionality without the support for it.