Skip to main content

UNION

The UNION clause allows concatenating the rows of two or more tables into a single table. It is required that the names and order of the fields should be exactly same in all the merged tables.

UNION ALL

With UNION ALL duplicate rows are copied over to the resulting DataFrame as separate rows. SQL Frames supports the UNION ALL operation using df.unionall(...dfs) API which takes one or more DataFrames and produces a concatenated DataFrame.

In the below example, we first create temporary DataFrames by filtering the data and then concatenate them (this is just an example, as the same can be done with a more advanced filtering).

Loading...

UNION

UNION without the ALL modifier ensures there are no duplicates. However, this is more expensive to compute. SQL Frames supports UNION with the df.union(...dfs) API allowing two more more DataFrames to be concatenated.

Loading...

Row Set ID

The UNION ALL DataFrame has a special field called Row Set ID that tracks the underlying data set to which the row belongs to. Coupled with Sort Prefix this field allows implicitly maintaining the order of the sets of DataFrames that are being combined.

Loading...
SORTING

Notice how sorting on different columns sorts them within the respective sets of rows identified via the Row Set ID.