Dimension Hierarchies
There are several ways to visualize data hierarchically, as shown in the Hierarchical Tables Zoo. A set of fields can be organized into a dimension hierarchy, and a single DataFrame can have several dimension hierarchies.
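To make the idea concrete, here is a minimal, library-independent sketch in plain Python. It does not use this library's API. It assumes a dimension hierarchy can be modelled as an ordered list of fields, from coarsest to finest. The rows, field names, and the `build_tree` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows; the field names are illustrative only.
rows = [
    {"country": "US", "state": "CA", "city": "San Diego", "sales": 10},
    {"country": "US", "state": "CA", "city": "Fresno", "sales": 5},
    {"country": "US", "state": "NY", "city": "Albany", "sales": 7},
    {"country": "CA", "state": "ON", "city": "Ottawa", "sales": 3},
]

# Assumption: a dimension hierarchy is an ordered list of fields,
# from the coarsest level to the finest.
geo_hierarchy = ["country", "state", "city"]


def build_tree(rows, hierarchy):
    """Nest rows level by level, following the order of the hierarchy fields."""
    if not hierarchy:
        return list(rows)  # leaf: the rows that share every level value
    head, rest = hierarchy[0], hierarchy[1:]
    groups = defaultdict(list)
    for row in rows:
        groups[row[head]].append(row)
    return {key: build_tree(members, rest) for key, members in groups.items()}


tree = build_tree(rows, geo_hierarchy)
```

Each key of `tree` is a value of the first field, each nested dictionary corresponds to the next level, and the leaves hold the original rows. A second hierarchy (for example over date fields) would be a second, independent field list applied to the same rows.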
API
Dimension hierarchies are created using the SQL.hierarchy() API and, in the case of Date and Datetime fields, using the Time.hierarchy() API.
Non-Aggregate Data
Dimension hierarchy only
Dimension hierarchy with other fields
Multiple dimension hierarchies
When multiple dimension hierarchies are used, each dimension hierarchy is displayed in a separate column, and each of those columns is itself displayed hierarchically.
Aggregate Data
Dimension hierarchy only
When creating the hdf() from an aggregate DataFrame, by default the hierarchy is built based on the grouping fields. Hence the GROUP BY operation has to be performed using the dimension hierarchy (which acts as a ROLLUP).
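The ROLLUP behaviour can be illustrated without the library. The sketch below is plain Python, not this library's API. It assumes the standard SQL meaning of `GROUP BY ROLLUP(a, b)`: one subtotal for every prefix of the hierarchy levels, down to the grand total. The sample rows and the `rollup_sum` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (country, state, sales).
rows = [
    ("US", "CA", 10),
    ("US", "CA", 5),
    ("US", "NY", 7),
    ("CA", "ON", 3),
]


def rollup_sum(rows, n_levels):
    """Sum the last element of each row for every prefix of the hierarchy levels.

    This mirrors SQL's GROUP BY ROLLUP(level_1, ..., level_n): one subtotal
    per prefix, including the empty prefix (the grand total).
    """
    totals = defaultdict(int)
    for row in rows:
        levels, value = row[:n_levels], row[-1]
        for depth in range(n_levels + 1):
            totals[levels[:depth]] += value
    return dict(totals)


totals = rollup_sum(rows, n_levels=2)
```

Here `totals[()]` is the grand total, `totals[("US",)]` is the subtotal for one country, and `totals[("US", "CA")]` is the finest-level aggregate. Every parent node of the hierarchy gets its own aggregate value, which is why the grouping has to follow the dimension hierarchy.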
Multiple dimension hierarchies
Pivoted Data
Dimension hierarchies are supported in pivot tables as well.
Multiple dimension hierarchies
Each dimension hierarchy results in multiple dimensions, and pivoting adds further dimensions for the pivoted columns. Hence, while multiple dimension hierarchies can be present as row fields of a pivot table, a large number of overall dimensions can become expensive in terms of both memory and CPU. The cost grows further when some of the dimensions have very high cardinality.
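As a rough way to reason about this cost, the sketch below computes a worst-case upper bound on the number of distinct dimension combinations, as the product of the per-field cardinalities. It is only a back-of-the-envelope estimate under that assumption, not a description of how the library allocates memory. The field names, the cardinalities, and the `worst_case_cells` helper are hypothetical.

```python
from math import prod


def worst_case_cells(*dimension_groups):
    """Upper bound on distinct combinations: product of all field cardinalities."""
    cardinalities = [c for group in dimension_groups for c in group.values()]
    return prod(cardinalities)


# Hypothetical distinct-value counts per field.
geo = {"country": 5, "state": 40, "city": 300}   # first row hierarchy
time = {"year": 3, "quarter": 4, "month": 12}    # second row hierarchy
pivot = {"product": 25}                          # pivoted column field

without_pivot = worst_case_cells(geo, time)
with_pivot = worst_case_cells(geo, time, pivot)
```

Real data is usually far sparser than this bound, since a city belongs to only one state. The multiplicative growth still shows why a single high-cardinality field, such as `city` here, dominates the total.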