
Incremental Data Load

An operational dashboard for a frequently changing process needs its data refreshed at a matching frequency. One option is to reload the entire dataset on every refresh, which can be expensive. The other is to load only the new or changed data and merge it with the existing data. SQL Frames supports this by incrementally loading data into a source DataFrame using the df.merge() API.
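To make the idea of merging concrete, here is a minimal sketch of a keyed upsert in plain JavaScript. It is not the SQL Frames implementation and does not show the df.merge() signature. The mergeRows helper and the id key are assumptions made purely for illustration.

```javascript
// Illustrative only: a plain-JavaScript keyed upsert, not the SQL Frames API.
function mergeRows(existing, incoming, key) {
  // Index the current rows by key so each incoming row is matched directly.
  const byKey = new Map(existing.map((row) => [row[key], row]));
  for (const row of incoming) {
    // A known key gets its fields overwritten; a new key is inserted.
    byKey.set(row[key], { ...byKey.get(row[key]), ...row });
  }
  return [...byKey.values()];
}

const current = [
  { id: 1, status: "open" },
  { id: 2, status: "open" },
];
const delta = [
  { id: 2, status: "closed" }, // update to an existing record
  { id: 3, status: "open" },   // new record
];
const merged = mergeRows(current, delta, "id");
```

Only the two rows in delta travel over the network, yet merged reflects the full, current dataset.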


Tracking Merged At

The df.trackMergedAt() API adds a datetime field to the DataFrame that records when each record was inserted or updated during a merge.
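The effect of such a field can be sketched in plain JavaScript. This is an assumption-laden illustration rather than what df.trackMergedAt() does internally: the mergedAt field name and the mergeWithTimestamp helper are hypothetical.

```javascript
// Illustrative only: stamps each inserted or updated row with the merge time.
// The field name "mergedAt" is an assumption, not the name SQL Frames uses.
function mergeWithTimestamp(existing, incoming, key, now = new Date()) {
  const byKey = new Map(existing.map((row) => [row[key], row]));
  for (const row of incoming) {
    // Untouched rows keep their earlier timestamp; touched rows get `now`.
    byKey.set(row[key], { ...byKey.get(row[key]), ...row, mergedAt: now });
  }
  return [...byKey.values()];
}

const t0 = new Date("2024-01-01T00:00:00Z");
const t1 = new Date("2024-01-01T00:05:00Z");

const firstLoad = mergeWithTimestamp([], [{ id: 1, qty: 5 }, { id: 2, qty: 7 }], "id", t0);
const secondLoad = mergeWithTimestamp(firstLoad, [{ id: 2, qty: 9 }], "id", t1);

// Rows changed by the latest merge can be picked out by their timestamp.
const changed = secondLoad.filter((row) => row.mergedAt.getTime() === t1.getTime());
```

A timestamp like this makes it possible to highlight or filter the records affected by the most recent merge.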

Response Headers

When using an incremental load strategy, the server may need to indicate a time (or a version number) that marks how current the returned data is, so that the client can ask only for changes after that point. This can be achieved by returning a custom header (such as X-Dataset-Time) with each incremental data response and passing its value as a parameter to the next incremental request.

To access the header, pass a callback option in the options object when creating a DataFrame. The callback function takes a single argument, the response from the fetch() call, which provides access to the headers and other response information. Note that response.body will not be available, as it is consumed to load the DataFrame.
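A callback of this kind might look like the sketch below. It relies only on the standard fetch() Response and URL interfaces. The createHeaderTracker helper, the since query parameter, and the example URL are hypothetical, and the DataFrame creation call itself is omitted because its exact signature is not shown here.

```javascript
// Hypothetical helper: remembers the latest value of a response header.
function createHeaderTracker(headerName) {
  let lastValue = null;
  return {
    // Intended to be passed as the callback; it receives the fetch() Response.
    callback(response) {
      // Only headers are read here; the body has already been consumed.
      const value = response.headers.get(headerName);
      if (value !== null) lastValue = value;
    },
    // Builds the URL for the next incremental request.
    nextUrl(baseUrl, paramName = "since") {
      const url = new URL(baseUrl);
      if (lastValue !== null) url.searchParams.set(paramName, lastValue);
      return url.toString();
    },
  };
}

const tracker = createHeaderTracker("X-Dataset-Time");

// Simulate the response that the callback would receive after a load.
tracker.callback(
  new Response(null, { headers: { "X-Dataset-Time": "2024-01-01T00:05:00Z" } })
);

const next = tracker.nextUrl("https://example.com/api/orders");
```

Keeping the last header value in a closure lets each refresh request only the data that changed since the previous response.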


If the data is served from a different origin (a CORS-enabled environment), the server must list the custom header in the Access-Control-Expose-Headers response header. Otherwise the browser will not expose it to the callback.
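On the server side, the response headers could be assembled roughly as follows. This is a framework-agnostic sketch: the corsHeaders helper and the dashboard origin are assumptions, and a real server would attach these headers using its own API.

```javascript
// Hypothetical server-side helper that builds the CORS response headers.
function corsHeaders(allowedOrigin, exposedHeaders) {
  return {
    "Access-Control-Allow-Origin": allowedOrigin,
    // Without this entry, cross-origin scripts cannot read custom headers.
    "Access-Control-Expose-Headers": exposedHeaders.join(", "),
  };
}

const responseHeaders = {
  ...corsHeaders("https://dashboard.example.com", ["X-Dataset-Time"]),
  // The custom header carrying the dataset time for this response.
  "X-Dataset-Time": new Date().toISOString(),
};
```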


SQL Frames is reactive: any change to a DataFrame, including data merged in by an incremental load, is propagated to the DataFrames downstream in the transformation pipeline.