Bar Chart Racing
Bar chart racing has been a popular visualization technique that we all have seen in the last couple of years. In this post I will discuss about how bar chart racing works, it's benefits and drawbacks, a more generalized concept of bar chart racing provided by SQL Frames along with how to overcome some of the mentioned drawbacks.
The simplest form of data analysis and visualization is possible by aggregating the data along a single dimension and plotting the metric as a bar graph. Now if the data also has a time component, then it is possible to think of first partitioning the data by the time and then plotting the bar chart for each time data. That provides multiple bar charts and putting all these frames together forms a nice video. While that alone would clearly show how the metric is varying by time, bar chart racing actually takes this concept one step further by actually "tweening" the data between two frames and providing that nice animation.
If the categorical dimension used to create the bar chart has too many values, then plotting a bar chart with such a large data set makes the bar chart unreadable and adding time dimension and animating it further makes it confusing. So, one of the important requirements for bar chart is to limit the number of data points within the bar chart to a reasonably small number. But once the data is being limited, the question would be what criteria to use to narrow down to that small set? The natural choice there is to use a measure and pick the "top N" with respect to that measure.
Below is an example of bar chart racing using the gdp data.
How it works?
Now, let's look at what happens under the hood. Below example shows the same bar raching chart but along with the associated DataFrames so it becomes clear what is happening.
Disadvantages
While no doubt the racing bar charts help tell a story, they only tell the story about the top N. It becomes a popularity context. But when there are 180+ countries, you could be the citizen or a well-wisher of any country and you want to analyze how your country is performing over time. And if your country happens to be not in the top N, tough luck visualizing.
Focus, the solution
This is where SQL Frames provides the ability to focus on a data point and contiously shows the data associated with that data point. So, select any country in the DataFrame and see how the associated metrics are continuously displayed as an overlay on the chart. Note also that the corresponding row is continuously selected within the DataFrame frame after frame as the underlying Data is changing by year.
For the tech savvy, the ability to support focus required all the following pieces of the puzzle
- Ability to retain data selection across compuation of the DataFrame.
- Ability to chart only a subset of data while the underlying DataFrame has more data.
- A VDOM solution like React, which allows user interaction while the UI is still changing.
Generalization
Most bar chart racing examples out there are always w.r.t to time. However, it doesn't have to be time. Below example shows how the revenue is changing by Item Type by Region.
Unlike the above charts, the animation is not started by default in the chart below. You can enable it by using the toggle button.
Conclusion
Every piece of data has a story to tell, you just need to figure out how. SQL Frames mission is to help everyone craft their data story with ease. The data stories can boost employee morale, engage them better, help you understand your topline and bottomline trends and optimize them. The possiblities are endless and only limited by you and your employees.