FROM
The FROM clause allows joining multiple tables to get unified and enriched view of the data. SQL Frames supports the from clause using multiple related APIs to support the various types of joins.
Without JOINs
When working with a single DataFrame, there is no need to combine the data with other DataFrames. In this case calling any of the SELECT clause APIs on a DataFrame automatically establishes the FROM context.
With JOINs
Sometimes data from multiple DataFrames need to be combined for a unified view. This is accomplished by joining the DataFrames.
JOIN types
Below are the list of join types and the corresponding APIs provided by SQL Frames.
SQL Join Type | API |
---|---|
INNER JOIN | inner(df2,'df2-alias') |
LEFT OUTER JOIN | leftouter(df2,'df2-alias') |
RIGHT OUTER JOIN | rightotuer(df2,'df2-alias') |
FULL OUTER JOIN | fullouter(df2,'df2-alias') |
CROSS | cross(df2,'df2-alias') |
The SQL.join()
API can be used to specify the first DataFrame in a series of joins that can be chained together.
More than two DataFrames can be joined by chaining the above join APIs.
In the following example, the penguin measurements data provides the name of the island and a separate islands data contains details about the island such as the latitude and longitude. By combining these two it is possible for example to visualize the data on a map.
USING clause
When both tables are joined on the same column name(s), the join can be simply stated with the USING clause.
While standard SQL expects the using
clause to be used only when the join columns on both sides are the same,
SQL Frames provides additional convinience of leveraging using
when the columns are not the same. This
is done as using('col',['coll','colr'])
. Note how the second join condition uses different column names (namely
coll
and colr
).
See the WHERE clause to learn more about the fdf()
API in the
above example.
ON clause
The join condition can be specified using the ON clause when the join keys are not the same or the join criteria is more complex. Further, the join tables can be given alias names which is useful if the same table needs to be referred multiple times.
SELECT alias.*
By default all the fields from all the joined tables are selected. It is possible to project different sets of fields.