Skip to main content

Aliases

When projecting columns, it is possible to provide expressions that can be aliased. For example 100*ratio AS percentage.

Nested Aliases

In standard SQL it is not possible for one alias to refer to another within the same select context. For example, the following SQL is not valid as y can't be used to define z.

select x, x+1 as y, 2*y as z
from table1

Hence, expressions either have to be duplicated or need to resort to nested queries so the common expressions are in the sub-query and they are referred in the main query.

Authoring complex calculations in SQL should be as simple as doing the same in a spreadsheet. That is why SQL Frames allows referencing the aliases within the same DataFrame as long as there is no circular dependency (which is caught by the engine). This convenience feature makes it easy to author complex calculations and this is transpiled to SQL by flattening the expressions so there is no referencing of aliases.

Loading...
Nested/Lateral Alias Reference

This feature is also referred to as lateral alias reference by some databases such as Amazon Redshift.

Parent context has precedence

Note that if a field is redefined from parent context, other references to the alias still refer to the field from the parent context. Below, the value of y is not 10*x+1 (as x is redefined to be 10*x) but simply x+1 which is consistent with standard SQL.

Loading...

Aggregate and non-aggregate contexts

However, there is one difference to the above rule of not being able to access a redefined alias in the same context. First let's see a failing SQL.

SELECT  a,SUM(x) AS x,MIN(x) AS minx,2*x AS x2
FROM (SELECT 10*x as x,a
FROM (SELECT * FROM (VALUES (42, 'n')
,(1, 'd')
,(2, 'd')
,(2, 'd')) AS table1(x, a)) table1) table1
GROUP BY a
ORDER BY a ASC;

It is not possible to define x2 as 2*x in the standard SQL as x still refers to the non-aggregate parent context.

In the below code, x is redefined with an aggregate definition. Since the context switches to an aggregate context with a GROUP BY, only aggregates refer to the parent context while other calculations refer to the redefined aggregate.

Loading...

Analytic Functions

Analytic Functions are evaluated after computing the aggregates. Hence, these expressions have access to the aggregate context and not the non-aggregate parent context.

In the following code, sum(sum()) over(...) is achieved by referring to the inner sum via the sumx aggregate field.

Loading...
note

At present it is not possible to directly create an analytic function on top of an aggregate without selecting it. If only analytic field is desired in the final output, the aggregate can be defined to support the analytic aggregate and can then be projected out.

The local variables experimental feature may be used to work around this.