Skip to main content

SQL Execution Order

Most developers writing SQL seldom think about the execution order of their SQL. Most of the time it becomes intuitive that such deep understanding may not be required. But understanding the execution order can help with writing better and performant SQL.

Below is the logical execution order followed by SQL Frames which matches with the execution order of most popular and standards compliant databases.

Execution OrderOperationDescription
1FROMselecting from the source tables
2ONfiltering before joining
3JOINjoin multiple dataset rows
4WHEREfiltering pre-aggregate rows
5GROUP BYaggregate functions
6CUBE / ROLLUProllup of aggregate functions
7HAVINGfiltering aggregate rows
8WINDOWanalytic functions
9QUALIFYfiltering based on analytic functions
10SELECTselecting the rows
11DISTINCTselecting distinct rows
12UNION / INTERSECT / EXCEPTset operations on rows
13ORDER BYordering of the rows
14LIMITtaking only a subset of selected rows

Note that this is just a logical order and not how a particular database might actually implement the physical plan. There are many factors such as indexes, types of indexes, clustering, partitioning and other concepts that influence how a query is actually executed. But in order to guarantee the same output no matter which database it is, understanding this logical order of execution is important.

note

QUALIFY is currently not supported by SQL Frames. However, the same functionality can be achieved by introducing an fdf() call to filter using the analytic fields from the parent df. Hence, QUALIFY is mostly a syntactic sugar and there is no loss of functionality without the support for it.