Top N and Others Sets

Top N Sets provide quick access to the top N items, but reports and charts are often more useful when they show the top N together with all remaining data lumped into a single Other bucket.

Top N and Others

single dimension

The simplest way to get the other slice is by calling the withOthers() API on an existing data set.
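
The idea behind an Other slice can be sketched independently of the library. The following is plain Python, not the withOthers() API itself; the function name, its parameters, and the sample rows are all hypothetical and serve only to illustrate the concept.

```python
from collections import defaultdict

def top_n_with_other(rows, dimension, metric, n, other_label="Other"):
    """Sum `metric` per `dimension` value, keep the n largest, lump the rest."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[metric]
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    top, rest = ranked[:n], ranked[n:]
    if rest:  # only add the Other slice when something is left over
        top.append((other_label, sum(value for _, value in rest)))
    return top

# Hypothetical sample data
sales = [
    {"country": "A", "revenue": 50.0},
    {"country": "B", "revenue": 30.0},
    {"country": "A", "revenue": 20.0},
    {"country": "C", "revenue": 10.0},
    {"country": "D", "revenue": 5.0},
]
print(top_n_with_other(sales, "country", "revenue", 2))
# [('A', 70.0), ('B', 30.0), ('Other', 15.0)]
```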

multiple dimensions

nested

It is possible to find the top N members of each group specified by one or more dimensions. The following example displays the top 5 countries within each region, along with an Other slice per region, as opposed to the top 5 countries globally as shown above.
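
Conceptually, a nested top N ranks members separately inside each group and gives every group its own Other slice. The sketch below is plain Python rather than the library's API; the function name, parameters, and data are hypothetical.

```python
from collections import defaultdict

def nested_top_n_with_other(rows, group, member, metric, n, other_label="Other"):
    """Rank `member` values within each `group` value; lump the rest per group."""
    totals = defaultdict(lambda: defaultdict(float))
    for row in rows:
        totals[row[group]][row[member]] += row[metric]
    result = {}
    for group_value, members in totals.items():
        ranked = sorted(members.items(), key=lambda kv: kv[1], reverse=True)
        top, rest = ranked[:n], ranked[n:]
        if rest:
            top.append((other_label, sum(value for _, value in rest)))
        result[group_value] = top
    return result

# Hypothetical sample data: the top countries differ from region to region.
rows = [
    {"region": "Europe", "country": "X", "revenue": 10.0},
    {"region": "Europe", "country": "Y", "revenue": 8.0},
    {"region": "Europe", "country": "Z", "revenue": 1.0},
    {"region": "Asia", "country": "Z", "revenue": 9.0},
    {"region": "Asia", "country": "X", "revenue": 2.0},
    {"region": "Asia", "country": "Y", "revenue": 1.0},
]
nested = nested_top_n_with_other(rows, "region", "country", "revenue", 2)
print(nested["Europe"])  # [('X', 10.0), ('Y', 8.0), ('Other', 1.0)]
print(nested["Asia"])    # [('Z', 9.0), ('X', 2.0), ('Other', 1.0)]
```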

multiple metrics

The withOthers() API is flexible enough to build complex top N and Other reports. It is similar to the gdf() API: it accepts a groupBy (see the Top N trends below for how to use it) and a set of aggregates, including calculated fields. All of these are computed for both the set and the Other slice.

Tip

It is possible to use one metric for top N and a different metric for the final set & other data set. For example, top N can be based on the average revenue while the display focuses on the total revenue.
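
To make the distinction concrete, here is a minimal plain-Python sketch (not the library's API; all names and data are hypothetical) that ranks by average revenue but reports total revenue for the set and the Other slice.

```python
from collections import defaultdict

def top_n_by_avg_show_total(rows, dimension, metric, n, other_label="Other"):
    """Pick the top n by average of `metric`, then report totals of `metric`."""
    values = defaultdict(list)
    for row in rows:
        values[row[dimension]].append(row[metric])
    # Ranking metric: average per member.
    by_avg = sorted(values, key=lambda k: sum(values[k]) / len(values[k]), reverse=True)
    # Display metric: total per member, plus one total for everything else.
    result = {key: sum(values[key]) for key in by_avg[:n]}
    rest = [v for key in by_avg[n:] for v in values[key]]
    if rest:
        result[other_label] = sum(rest)
    return result

# Hypothetical data: B has many small orders, so a low average but a high total.
orders = (
    [{"item": "A", "revenue": 100}]
    + [{"item": "B", "revenue": 10} for _ in range(12)]
    + [{"item": "C", "revenue": 40} for _ in range(2)]
)
print(top_n_by_avg_show_total(orders, "item", "revenue", 2))
# {'A': 100, 'C': 80, 'Other': 120}
```

In this sample the Other slice has the largest total even though its only member ranks last by average, which is exactly the situation where the choice of ranking metric matters.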

Top N and Others Charts

pie chart

Data Set as a DataFrame

The Data Set interface provides a getter called ds which returns a DataFrame. Hence all the features of a DataFrame such as the ability to create a chart are readily available.

bar chart

With the ability to specify multiple metrics (as shown above) it is possible to create a chart with cumulative percentage of the contribution of the top N.
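
The cumulative percentage itself is a running total divided by the grand total. A minimal plain-Python sketch, assuming the slices are already ordered as they will be displayed (the function name and data are hypothetical, not part of the library):

```python
def with_cumulative_share(slices):
    """Append the cumulative percentage of the grand total to each (label, value)."""
    grand_total = sum(value for _, value in slices)
    running = 0
    out = []
    for label, value in slices:
        running += value
        out.append((label, value, round(100 * running / grand_total, 1)))
    return out

# Hypothetical top 2 plus Other, in display order
print(with_cumulative_share([("A", 50), ("B", 30), ("Other", 20)]))
# [('A', 50, 50.0), ('B', 30, 80.0), ('Other', 20, 100.0)]
```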

Best Practice

If the data is distributed relatively uniformly rather than being top-heavy, the Other bar can dwarf the top N bars, which makes the bar chart hard to read. One option is to increase N.

nested chart

By making use of the dynamic grid charts, it is possible to display the Nested Top N and Other charts.

Top N and Others Trends

The ds.withOthers API takes an optional parameter which provides a groupBy. The groupBy should include all the dimensions used for the set and may include additional dimensions for a more granular level of detail. This makes it possible, for example, to plot a trend by adding the time dimension.

overall top N

Note how data is explicitly ordered by Year and Item Type to ensure correct rendering of the trend lines.
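
The logic of a top N trend can be sketched in plain Python: pick the top N over the whole period, then re-aggregate by the set dimension plus the time dimension, and sort so that each line's points arrive in order. This is an illustration only, not the ds.withOthers API; the function name, parameters, and data are hypothetical.

```python
from collections import defaultdict

def top_n_trend(rows, dimension, time_dim, metric, n, other_label="Other"):
    """Top n over all rows, then totals per (time, label) sorted by time and label."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[metric]
    top = set(sorted(totals, key=totals.get, reverse=True)[:n])
    trend = defaultdict(float)
    for row in rows:
        label = row[dimension] if row[dimension] in top else other_label
        trend[(row[time_dim], label)] += row[metric]
    # Explicit ordering by time, then label, so trend lines render correctly.
    return sorted((t, label, value) for (t, label), value in trend.items())

# Hypothetical sample data
rows = [
    {"year": 2023, "item_type": "Fruits", "revenue": 12.0},
    {"year": 2022, "item_type": "Snacks", "revenue": 6.0},
    {"year": 2022, "item_type": "Fruits", "revenue": 10.0},
    {"year": 2023, "item_type": "Meat", "revenue": 4.0},
    {"year": 2022, "item_type": "Meat", "revenue": 3.0},
    {"year": 2023, "item_type": "Snacks", "revenue": 5.0},
    {"year": 2022, "item_type": "Toys", "revenue": 1.0},
    {"year": 2023, "item_type": "Toys", "revenue": 2.0},
]
for point in top_n_trend(rows, "item_type", "year", "revenue", 2):
    print(point)
# (2022, 'Fruits', 10.0)
# (2022, 'Other', 4.0)
# (2022, 'Snacks', 6.0)
# (2023, 'Fruits', 12.0)
# (2023, 'Other', 6.0)
# (2023, 'Snacks', 5.0)
```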

sliced top N

The top N data set does not have to be based on the same DataFrame as the final set & other data set. For example, it is possible to first get the top 3 item categories for the current year and then use them to draw the trend for the last 4 years. The example below in fact uses the DataFrame Slicer to pick the top N based on arbitrary criteria and then shows the trend of those categories.

The API df.sets.setNother takes a Data Set such as Top N set and creates a set and other data set.
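
As a rough illustration of the idea (plain Python, not the df.sets.setNother API; every name and value here is hypothetical), the set members are chosen from one slice of the data and then applied to all of it:

```python
from collections import defaultdict

def set_and_other(rows, dimension, members, group_by, metric, other_label="Other"):
    """Total `metric` per group_by key, labelling non-members as Other."""
    out = defaultdict(float)
    for row in rows:
        label = row[dimension] if row[dimension] in members else other_label
        key = tuple(row[g] for g in group_by) + (label,)
        out[key] += row[metric]
    return dict(out)

# Hypothetical sample data
rows = [
    {"year": 2022, "category": "A", "revenue": 10.0},
    {"year": 2022, "category": "B", "revenue": 1.0},
    {"year": 2022, "category": "C", "revenue": 5.0},
    {"year": 2023, "category": "A", "revenue": 2.0},
    {"year": 2023, "category": "B", "revenue": 9.0},
    {"year": 2023, "category": "C", "revenue": 3.0},
]

# Step 1: choose the set from the current-year slice only.
current = [r for r in rows if r["year"] == 2023]
top_current = {max(current, key=lambda r: r["revenue"])["category"]}  # top 1

# Step 2: apply that set to every year to get the trend.
trend = set_and_other(rows, "category", top_current, ["year"], "revenue")
print(trend)
# {(2022, 'B'): 1.0, (2022, 'Other'): 15.0, (2023, 'B'): 9.0, (2023, 'Other'): 5.0}
```

Category B is tracked across both years even though A has the larger overall total, because the set was chosen from the 2023 slice.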

Top N and Others Pivot Tables

overall top N

Advanced group by clauses such as CUBE and ROLLUP can be combined with the top N API to get subtotals and grand totals.
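
For readers unfamiliar with ROLLUP, the following plain-Python sketch emulates its semantics on rows that have already been labelled with top N and Other values. It is not the library's implementation; the function name and data are hypothetical, and None marks a rolled-up (subtotal) level.

```python
from collections import defaultdict

def rollup(rows, dims, metric):
    """Emulate GROUP BY ROLLUP(dims): one total for every prefix of dims."""
    out = defaultdict(float)
    for row in rows:
        for depth in range(len(dims), -1, -1):
            prefix = tuple(row[d] for d in dims[:depth])
            key = prefix + (None,) * (len(dims) - depth)
            out[key] += row[metric]
    return dict(out)

# Hypothetical rows after top N and Other labelling
labelled = [
    {"region": "East", "item_type": "A", "revenue": 5.0},
    {"region": "East", "item_type": "Other", "revenue": 2.0},
    {"region": "West", "item_type": "A", "revenue": 3.0},
]
totals = rollup(labelled, ["region", "item_type"], "revenue")
print(totals[("East", None)])  # 7.0  (subtotal for East)
print(totals[(None, None)])    # 10.0 (grand total)
```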

nested top N

It is possible to get the top 3 Item Types within each Region and pivot. In this case the Item Type dimension most likely has more than 3 Item Type values because each Region may have its own set of top 3 Item Types.

None vs Not in Top N

A blank cell is ambiguous: it is not possible to tell whether there is no underlying data for that Item Type and Region combination or whether that Item Type is simply not among the top N within that Region.

Top M and Others within Top N and Others

Suppose the requirement is to show the top 5 Regions and, for each, the top 3 Item Types based on Revenue. There are two types of Others buckets:

  1. one within each Region for all Item Types not in the top 3 for that Region
  2. one for all Regions not in the top 5.

This may seem daunting at first, but the sets.topN, ds.withOthers and sets.setNother APIs can be combined to meet such a requirement. Along the way, we will make use of

  1. non-aggregate topN (we aggregate, then treat the result as if it were non-aggregated data)
  2. double-aggregation (aggregating and then aggregating the aggregates)
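
The two techniques above can be illustrated with a plain-Python sketch. It does not use sets.topN, ds.withOthers, or sets.setNother; the function name, parameters, and data are hypothetical, and the library may well arrive at the result differently.

```python
from collections import defaultdict

def top_m_within_top_n(rows, outer, inner, metric, n, m, other="Other"):
    """Top m `inner` members inside each of the top n `outer` members."""
    # First aggregation: one total per (outer, inner) pair. From here on
    # these totals are treated like ordinary, non-aggregated rows.
    pair_totals = defaultdict(float)
    for row in rows:
        pair_totals[(row[outer], row[inner])] += row[metric]
    # Double aggregation: total the pair totals per outer member.
    outer_totals = defaultdict(float)
    by_outer = defaultdict(list)
    for (o, i), value in pair_totals.items():
        outer_totals[o] += value
        by_outer[o].append((i, value))
    top_outer = set(sorted(outer_totals, key=outer_totals.get, reverse=True)[:n])

    result = defaultdict(float)
    for o, members in by_outer.items():
        if o not in top_outer:
            # Others bucket 2: every outer member outside the top n.
            result[(other, other)] += sum(value for _, value in members)
            continue
        ranked = sorted(members, key=lambda kv: kv[1], reverse=True)
        for i, value in ranked[:m]:
            result[(o, i)] += value
        for _, value in ranked[m:]:
            # Others bucket 1: inner members outside the top m for this outer member.
            result[(o, other)] += value
    return dict(result)

# Hypothetical sample data, using top 2 Regions and top 1 Item Type for brevity
rows = [
    {"region": "R1", "item_type": "a", "revenue": 10.0},
    {"region": "R1", "item_type": "b", "revenue": 4.0},
    {"region": "R1", "item_type": "c", "revenue": 1.0},
    {"region": "R2", "item_type": "a", "revenue": 2.0},
    {"region": "R2", "item_type": "b", "revenue": 8.0},
    {"region": "R3", "item_type": "a", "revenue": 3.0},
    {"region": "R3", "item_type": "c", "revenue": 1.0},
]
report = top_m_within_top_n(rows, "region", "item_type", "revenue", n=2, m=1)
print(report)
# {('R1', 'a'): 10.0, ('R1', 'Other'): 5.0, ('R2', 'b'): 8.0,
#  ('R2', 'Other'): 2.0, ('Other', 'Other'): 4.0}
```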

This report provides the top 3 Item Types for each of the top 5 Regions.
