Incremental Data Load
When building an operational dashboard for a frequently changing process, the data must
be refreshed at the desired frequency. One option is to reload the entire dataset on every refresh,
which can be expensive. The other option is to load only the changed data and merge it with
the existing data. SQL Frames provides an option to incrementally load data into a
source DataFrame using the
Tracking Merged At
df.trackMergedAt() adds a datetime field to the DataFrame that records when each record
was inserted or updated during a merge.
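To make the idea concrete, here is a minimal plain-JavaScript sketch of what a merge with merged-at tracking does conceptually. It is not the SQL Frames implementation and does not use its API; the `mergeRows` helper, the `id` key, and the `mergedAt` field name are all assumptions made for illustration.

```javascript
// Conceptual sketch only: merge incoming rows into existing rows by key,
// stamping each inserted or updated row with the merge time.
// `mergeRows`, the `id` key, and `mergedAt` are hypothetical names.
function mergeRows(existing, incoming, key, mergedAt) {
  const byKey = new Map(existing.map((row) => [row[key], row]));
  for (const row of incoming) {
    // Insert a new row or overwrite the old one, recording when it happened.
    byKey.set(row[key], { ...byKey.get(row[key]), ...row, mergedAt });
  }
  return [...byKey.values()];
}

const existingRows = [
  { id: 1, status: "open", mergedAt: new Date("2024-01-01T00:00:00Z") },
  { id: 2, status: "open", mergedAt: new Date("2024-01-01T00:00:00Z") },
];
const deltaRows = [
  { id: 2, status: "closed" }, // update to an existing record
  { id: 3, status: "open" },   // newly inserted record
];
const mergedRows = mergeRows(
  existingRows,
  deltaRows,
  "id",
  new Date("2024-01-02T00:00:00Z")
);
```

Rows untouched by the incremental load keep their earlier timestamp, so the merged-at field can be used to tell which records changed in the latest merge.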
When using an incremental load strategy, the server usually needs to know the time (or version number)
after which changes should be sent to the client. One way to achieve this is for the server
to return a custom header (such as
X-Dataset-Time) as part of each incremental data response;
the client then passes this value as a parameter to the next incremental request.
To access the header, pass a
callback option in the options object when
creating a DataFrame. The
callback function takes a single argument, which is the response from the
fetch() call. This provides access to the header and other response information. Note that
response.body will not be available, because it has already been consumed to load the DataFrame.
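A minimal sketch of such a callback follows. It relies only on the standard `Response` and `URL` APIs. The `X-Dataset-Time` header name comes from the text above, while `makeTimeTracker`, the `since` query parameter, and the example URL are assumptions. How the options object is handed to the DataFrame is left out, since that depends on the SQL Frames API being used.

```javascript
// Hypothetical helper: remembers the X-Dataset-Time header from each response
// and builds the URL for the next incremental request.
function makeTimeTracker(baseUrl, paramName = "since") {
  let lastTime = null;
  return {
    // Intended to be passed as the `callback` option; receives the fetch() Response.
    callback(response) {
      const value = response.headers.get("X-Dataset-Time");
      if (value !== null) lastTime = value;
    },
    // URL for the next request (no parameter if no time has been seen yet).
    nextUrl() {
      const url = new URL(baseUrl);
      if (lastTime !== null) url.searchParams.set(paramName, lastTime);
      return url.toString();
    },
  };
}

const tracker = makeTimeTracker("https://example.com/api/orders");
// Simulate the response that would be handed to the callback after a load.
tracker.callback(
  new Response(null, { headers: { "X-Dataset-Time": "1700000000" } })
);
const nextRequestUrl = tracker.nextUrl();
```

The callback reads only headers, so it is unaffected by the response body having already been consumed.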
If the solution is deployed in a cross-origin (CORS) environment, the server must list the custom header in the
Access-Control-Expose-Headers response header; otherwise the browser will not expose it to client code.
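As a sketch, the headers a server attaches to an incremental data response might look like the following. The helper name `incrementalResponseHeaders`, the JSON content type, and the example origin are assumptions; `Access-Control-Expose-Headers` and `Access-Control-Allow-Origin` are standard CORS response headers.

```javascript
// Hypothetical server-side helper: headers for an incremental data response.
function incrementalResponseHeaders(datasetTime, allowedOrigin) {
  return {
    "Content-Type": "application/json", // assumed payload format
    "Access-Control-Allow-Origin": allowedOrigin,
    // Without this, browser code cannot read X-Dataset-Time cross-origin.
    "Access-Control-Expose-Headers": "X-Dataset-Time",
    "X-Dataset-Time": String(datasetTime),
  };
}

const exampleHeaders = incrementalResponseHeaders(
  1700000000,
  "https://dashboard.example.com"
);
```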
SQL Frames is reactive: any change to a DataFrame is propagated to the DataFrames downstream of it in the transformation pipeline.