Skip to main content

SQL Frames V1

· One min read
Siva Dirisala
Creator of SQL Frames

After multiple years of work, SQL Frames has finally reached version 1. We are celebrating this big milestone by releasing a free dashboard for the codeforces community.

The Codeforces Submissions Analyzer is available for anyone to analyze their codeforces data.

Some of the features of SQL Frames used in this dashboard are

  1. Dataframe Slicers (ability to slice data with ease)
  2. Array Fields (for tags of problems)
  3. Right-click to view data (to immediately get to the problems from the charts)

Unnesting Arrays

· 4 min read
Siva Dirisala
Creator of SQL Frames

Any non-trivial data model has relationships and databases and application platforms may or may not support the notion of storing the relationships inline, that is, denormalizing the relationships into the main record. To give a real example, a user may have a set of roles in the application and this user-role relationship can be either stored in a separate table or directly within the user table. However, since the user-role relationship is 1-to-many, storing this directly within the user data model requires storing the relationship as a multi-valued column and arrays are perfect for that.

The challenge that does come with storing the 1-to-many relationsips as arrays is that it then makes it difficult to do certain SQL queries that are easy to do had the relationship been stored explicitly. That is why some databases offer a special operator to deal with such array fields.

Nulls in SQL and JavaScript

· 3 min read
Siva Dirisala
Creator of SQL Frames

The semantics of null values in SQL and JavaScript are quite different. For example null+42 in SQL is null while null+42 in JavaScript is 42. This may or may not be a concern depending on how the application is written or what it is trying to do.

SQL Frames provides scripted fields which are compiled and then evaluated against each row. SQL Frames also provides the ability to generate SQL based on the data transformation logic used to create the DataFrame. If one were to freely move the computation of expressions between server and client as desired end users would expect the results to be same. That is why the null semantics matters.

Frequent Item Sets

· 5 min read
Siva Dirisala
Creator of SQL Frames

Finding items that are frequently shopped together is an interesting piece of information. For brick-and-mortar companies this can help to decide which items to offer promotions to attract more customers without comprimising the overall profitability. For online stores, it can help with being able to appropriately cross-sell items.

Frequent itemsets is a well studied data mining technique. Finding frequent pairs is a more specific but important variation since higher cardinality sets can be easily identified using lower cardinality sets. Depending on the size of data, finding the frequent pairs may have to be run on a cluster of computers to just as a single database query! When writing it as a SQL query, care must be taken to avoid performance issues. Below we investigate three different SQL queries all providing the same results.

Opening very large CSV files in the browser

· 4 min read
Siva Dirisala
Creator of SQL Frames

Last week I talked about opening a very large CSV file within ElectronJS based app. That post talked about how to work around the 500MB limit of a string in JavaScript for Chromium browser. I decided to push the boundary so that even larger CSV files can be opened. It turns out it is possible to open even larger files and the same techniques can be used in both ElectronJS (or NodeJS) but also within the browser.

Eventually, SQL Frames managed to load more than 1GB file with 10 million records first in ElectronJS and then within the standard Chrome browser itself. Here is how it was done.