Incremental Data Load
When building an operational dashboard for a frequently changing process, the data needs
to be refreshed at the desired frequency. One option is to refresh the entire dataset,
which can be expensive. The other option is to load data incrementally and merge it with
the existing data. SQL Frames provides an option to incrementally load data into a
source DataFrame using the df.merge() API.
Tracking Merged At
The df.trackMergedAt() API adds a datetime field to the DataFrame that records the time
at which each record was inserted or updated during a merge.
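The idea behind merging and merge-time tracking can be sketched in plain TypeScript. This is a generic illustration only, not the SQL Frames implementation of df.merge() or df.trackMergedAt(); the Row type, the mergeRows function, and the id key are hypothetical names chosen for the example.

```typescript
// Generic illustration only; this is not how SQL Frames implements merging.
type Row = { id: number; status: string; mergedAt?: Date };

// Upsert `incoming` rows into `existing` by key, stamping each touched row
// with the time of the merge.
function mergeRows(existing: Map<number, Row>, incoming: Row[], now: Date): Map<number, Row> {
  for (const row of incoming) {
    // Insert a new record or overwrite the old one, recording when it happened.
    existing.set(row.id, { ...row, mergedAt: now });
  }
  return existing;
}

const table = new Map<number, Row>();

// Initial load.
mergeRows(
  table,
  [{ id: 1, status: "open" }, { id: 2, status: "open" }],
  new Date("2024-01-01T00:00:00Z"),
);

// Incremental load: row 2 is updated, row 3 is inserted, row 1 is untouched.
mergeRows(
  table,
  [{ id: 2, status: "closed" }, { id: 3, status: "open" }],
  new Date("2024-01-01T00:05:00Z"),
);
```

After the second call, rows 2 and 3 carry the later merge time while row 1 keeps the time of the initial load, which is the kind of information a merged-at field makes available.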
Response Headers
When using an incremental load strategy, it may be necessary to indicate a time (or a version number)
after which data needs to be made available to the client. This can be achieved
by returning a custom header (such as X-Dataset-Time) as part of the incremental data response
and using its value as a parameter to the next incremental request.
The header can be accessed by passing a callback option in the options object when
creating a DataFrame. The callback function takes a single argument, which is the response from
a fetch() call. This provides access to the headers and other information. Note that
response.body will not be available because it is consumed to load the DataFrame.
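A callback of this kind might look like the following sketch. It only uses the standard fetch() Response and URL APIs; the onResponse and nextUrl names, the since query parameter, and the example URL are assumptions made for illustration, and the exact way the function is supplied as the callback option should be taken from the SQL Frames API documentation.

```typescript
// Header name taken from the text above; everything else here is hypothetical.
const HEADER = "X-Dataset-Time";
let lastDatasetTime: string | null = null;

// Callback that receives the fetch() Response. Only the headers are read,
// because the body is consumed to load the DataFrame.
function onResponse(response: Response): void {
  const value = response.headers.get(HEADER);
  if (value !== null) {
    lastDatasetTime = value;
  }
}

// Build the URL for the next incremental request from the stored header value.
// The "since" parameter name is an assumption about the server's API.
function nextUrl(base: string): string {
  const url = new URL(base);
  if (lastDatasetTime !== null) {
    url.searchParams.set("since", lastDatasetTime);
  }
  return url.toString();
}
```

Keeping the last header value outside the callback lets the next request be built without re-reading the previous response.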
If the solution is deployed in a CORS-enabled environment, the server must list the custom
header in the Access-Control-Expose-Headers response header so that client code can read it.
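On the server side, the response headers for an incremental request could be assembled roughly as below. This is a framework-neutral sketch: the incrementalResponseHeaders function and the dashboard origin are hypothetical, and only the header names come from the text above or from the CORS specification.

```typescript
// Build the headers for an incremental data response.
// The allowed origin is a placeholder; use the origin that serves the dashboard.
function incrementalResponseHeaders(datasetTime: string): Record<string, string> {
  return {
    "Content-Type": "application/json",
    // Custom header carrying the time (or version) of this dataset snapshot.
    "X-Dataset-Time": datasetTime,
    "Access-Control-Allow-Origin": "https://dashboard.example.com",
    // Without this entry, cross-origin client code cannot read X-Dataset-Time.
    "Access-Control-Expose-Headers": "X-Dataset-Time",
  };
}

const headers = incrementalResponseHeaders("2024-01-01T00:05:00Z");
```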
Reactivity
SQL Frames is reactive: any change to a DataFrame is propagated to the DataFrames downstream in the transformation pipeline.