SQL Execution Order

Most developers writing SQL seldom think about the execution order of their SQL. Most of the time it becomes intuitive that such deep understanding may not be required. But understanding the execution order can help with writing better and performant SQL.

Below is the logical execution order followed by SQL Frames which matches with the execution order of most popular and standards compliant databases.

Execution Order	Operation	Description
1	`FROM`	selecting from the source tables
2	`ON`	filtering before joining
3	`JOIN`	join multiple dataset rows
4	`WHERE`	filtering pre-aggregate rows
5	`GROUP BY`	aggregate functions
6	`CUBE` / `ROLLUP`	rollup of aggregate functions
7	`HAVING`	filtering aggregate rows
8	`WINDOW`	analytic functions
9	`QUALIFY`	filtering based on analytic functions
10	`SELECT`	selecting the rows
11	`DISTINCT`	selecting distinct rows
12	`UNION` / `INTERSECT` / `EXCEPT`	set operations on rows
13	`ORDER BY`	ordering of the rows
14	`LIMIT`	taking only a subset of selected rows

Note that this is just a logical order and not how a particular database might actually implement the physical plan. There are many factors such as indexes, types of indexes, clustering, partitioning and other concepts that influence how a query is actually executed. But in order to guarantee the same output no matter which database it is, understanding this logical order of execution is important.

note

QUALIFY is currently not supported by SQL Frames. However, the same functionality can be achieved by introducing an fdf() call to filter using the analytic fields from the parent df. Hence, QUALIFY is mostly a syntactic sugar and there is no loss of functionality without the support for it.