
5 posts tagged with "dataframes"


· 4 min read

Last week I talked about opening a very large CSV file within an ElectronJS-based app. That post described how to work around the 500MB limit on the size of a JavaScript string in Chromium. I decided to push the boundary so that even larger CSV files can be opened. It turns out that this is possible, and the same techniques work both in ElectronJS (or NodeJS) and within the browser.

Eventually, SQL Frames managed to load a file larger than 1GB with 10 million records, first in ElectronJS and then within the standard Chrome browser itself. Here is how it was done.

· 4 min read

CSV is one of the most common formats used to share data. Yet opening very large CSV files can be challenging for many applications. Since CSV files are commonly opened with Excel on Windows and Numbers on Mac, one would expect these applications to open them quickly. Surprisingly, they are extremely slow. There are many utility apps that provide various levels of functionality for managing large CSV files.

CSV is one of the common file formats for SQL Frames as well. However, SQL Frames is written in TypeScript (which compiles to JavaScript), and JavaScript limits the maximum size of a string, so dealing with large CSV files can be challenging.
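As a rough illustration of the general idea (not necessarily how SQL Frames does it), one common way to stay under a string size limit is to process the file as a series of smaller text chunks and emit complete lines as they become available, so the whole file never has to exist as a single string. The sketch below assumes the chunks have already been decoded to text; `LineSplitter` and `countRows` are hypothetical names introduced only for this example, and quoted CSV fields that contain line breaks are not handled.

```typescript
// Hypothetical helper (not part of SQL Frames): turns a sequence of text
// chunks into complete lines without ever joining the chunks into one string.
class LineSplitter {
  // Text after the last newline seen so far; it may be an incomplete line.
  private remainder = "";

  constructor(private onLine: (line: string) => void) {}

  // Feed the next chunk of decoded text.
  push(chunk: string): void {
    const parts = (this.remainder + chunk).split("\n");
    // The last piece has no terminating newline yet, so hold it back.
    this.remainder = parts.pop() || "";
    for (const part of parts) {
      this.onLine(stripCarriageReturn(part));
    }
  }

  // Call once after the final chunk to flush a last line without a newline.
  end(): void {
    if (this.remainder.length > 0) {
      this.onLine(stripCarriageReturn(this.remainder));
      this.remainder = "";
    }
  }
}

// Removes a trailing "\r" so that Windows-style "\r\n" line endings work.
function stripCarriageReturn(line: string): string {
  return line.charAt(line.length - 1) === "\r" ? line.slice(0, -1) : line;
}

// Example use: count the rows in a file that arrives as chunks.
function countRows(chunks: string[]): number {
  let rows = 0;
  const splitter = new LineSplitter(() => {
    rows += 1;
  });
  for (const chunk of chunks) {
    splitter.push(chunk);
  }
  splitter.end();
  return rows;
}
```

In practice the chunks would come from a file or network stream rather than an array, and each emitted line would be parsed into a record instead of just being counted. The largest string held at any time is one chunk plus one partial line.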

· 3 min read

CKAN is a popular open source data management system. It is used by Data.gov and other public data initiatives throughout the world. These systems provide a way to collect and catalog data in various formats, including CSV, one of the most common data formats. CKAN has extensions that provide data visualization. However, it appears that the visualizations are not widely deployed, perhaps due to the additional server-side compute required to support them.

SQL Frames provides a client-side data, visualization and intelligence platform. It is ideal for working with remote data directly within the browser. Hence, we are making this technology available for previewing datasets from Data.gov.

· 5 min read

One of the best things about JAMSTACK is that an entire site can be searched with all of the search processing done on the client. This is a big deal: there is no need for expensive servers to drive the search and, more importantly, users get millisecond latency, so they tend to stay on the website longer and explore.

Text search is an important part of data analysis, and projects like lunr make it possible to do it entirely on the client. SQL Frames has integrated text search (thank you lunr). Use cases with examples are discussed below.

· 3 min read

In a world that is increasingly generating and consuming data, it should be easy for anyone to tinker with the data they have access to and gain better insights. Yet most existing data analytics tools are hard to use, require complex multi-tier technologies that only the tech-savvy understand, or are simply too expensive. All of this creates an information divide between those who know how to use complex and/or expensive tools to extract value from data and those who don't.