
Auto SQL Generation

One of the key features of SQL Frames is transpiling the logic of a composed DataFrame into an equivalent SQL statement, with support for multiple databases. As discussed in the SQL Frames introduction, this has several advantages:

  1. Ability to verify the logic against an existing database.
  2. Ability to switch between in-memory calculations using the built-in SQL Frames analytics engine and executing the equivalent SQL query against a big data backend.
  3. Ability to write the logic declaratively once and get equivalent SQL for different databases.

The generated SQL query is formatted and displayed with syntax highlighting, which makes it easy to read the composed logic as SQL.

Schema Name

A SQL Frames DataFrame can be created directly from CSV or JSON files. However, for the purpose of SQL generation, each DataFrame can be given a schema name. This schema name is used while generating the SQL.

Below is an example of generating SQL from the given DataFrame.
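
To illustrate the idea independently of the SQL Frames API, here is a minimal Python sketch of a frame that carries a schema name and uses it when generating SQL. The `ToyFrame` class, its `where` and `to_sql` methods, and the `orders` table are hypothetical names invented for this illustration; they are not part of SQL Frames.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFrame:
    """Hypothetical stand-in for a DataFrame; NOT the SQL Frames API."""

    schema_name: str  # name that appears in the FROM clause of generated SQL
    columns: list[str]
    filters: list[str] = field(default_factory=list)

    def where(self, predicate: str) -> "ToyFrame":
        # Return a new frame so that composition stays declarative.
        return ToyFrame(self.schema_name, self.columns, self.filters + [predicate])

    def to_sql(self) -> str:
        # The schema name, not the source file, identifies the data in SQL.
        sql = f"SELECT {', '.join(self.columns)}\nFROM {self.schema_name}"
        if self.filters:
            sql += "\nWHERE " + " AND ".join(self.filters)
        return sql


# The data might have been loaded from a CSV file; only the schema name
# ("orders", an assumed example) shows up in the generated query.
orders = ToyFrame("orders", ["order_id", "amount"])
sql = orders.where("amount > 100").to_sql()
print(sql)
```

The point of the sketch is only that the name used in the generated query comes from the frame's schema name rather than from wherever the data was loaded.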


SQL Simplified

SQL is a great declarative programming language. However, it lacks certain basic features such as nested aliases and nested analytic functions, which forces nested queries that can be hard to write and read. SQL Frames is designed to simplify authoring DataFrames with several convenient features that are not straightforward in SQL. While the API makes it easier to compose complex DataFrame logic, SQL Frames does the heavy lifting of transpiling the composed logic into valid SQL. See the rest of the SQL Generation documentation for more details.
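
As a rough illustration of why nested aliases force nested queries, the following Python sketch wraps each layer of derived columns in its own subquery so that a later layer can refer to aliases defined by an earlier one. The function `nest_aliases`, the `sales` table, and its columns are assumptions made for this example and do not reflect how SQL Frames implements transpilation.

```python
def nest_aliases(table: str, base_cols: list[str], layers: list[dict[str, str]]) -> str:
    """Build a nested query, one subquery level per layer of aliases.

    Each layer maps alias -> expression. Expressions in a layer may use
    aliases from earlier layers, because those are already columns of the
    enclosing subquery by the time they are referenced.
    """
    query = f"SELECT {', '.join(base_cols)} FROM {table}"
    for depth, layer in enumerate(layers):
        derived = ", ".join(f"{expr} AS {alias}" for alias, expr in layer.items())
        # Wrap the query built so far and add this layer's derived columns.
        query = f"SELECT t{depth}.*, {derived} FROM ({query}) t{depth}"
    return query


# "tax" depends on the alias "total", so it has to live one level further out.
sql = nest_aliases(
    "sales",
    ["price", "qty"],
    [{"total": "price * qty"}, {"tax": "total * 0.1"}],
)
print(sql)
```

Written by hand, every additional dependent alias adds another level of nesting, which is the kind of bookkeeping a transpiler can take over.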

SQL Generation

Not all databases have all the capabilities, so the generated SQL may not be valid for a given database. However, for some of the ease-of-use capabilities provided by SQL Frames, care is taken to generate equivalent SQL that is valid for a fully standards-compliant database.
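
Row limiting is a familiar case where the same logic needs different SQL depending on the database. The Python sketch below renders "first n rows" for a few dialects. The function `limit_rows` and the dialect labels are hypothetical names chosen for this example, not SQL Frames options.

```python
def limit_rows(select_sql: str, n: int, dialect: str) -> str:
    """Return select_sql restricted to its first n rows for the given dialect."""
    if not select_sql.startswith("SELECT "):
        raise ValueError("expected a query starting with 'SELECT '")
    if dialect in ("postgresql", "mysql", "sqlite"):
        return f"{select_sql} LIMIT {n}"
    if dialect == "sqlserver":
        # T-SQL places TOP directly after the SELECT keyword.
        return f"SELECT TOP ({n}) " + select_sql[len("SELECT "):]
    if dialect == "standard":
        # Row-limiting syntax from the SQL:2008 standard.
        return f"{select_sql} FETCH FIRST {n} ROWS ONLY"
    raise ValueError(f"unsupported dialect: {dialect}")


for name in ("postgresql", "sqlserver", "standard"):
    print(name, "->", limit_rows("SELECT name FROM users", 5, name))
```

A query produced for one of these targets will generally be rejected by the others, which is why the target database matters when generating SQL.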