SQL in Excel
Spreadsheet software made it possible for non-technical users to do programming without even realizing it. Just typing a simple formula and having all the rows automatically compute using it is nothing but magic for a non-programmer. But did you know Excel also makes non-technical users use powerful analytic SQL queries without knowing anything about SQL? Yes, and we will delve into that aspect of SQL in Excel in this page. Several Excel functions are analyzed from SQL perspective.
SQL Frames provides both Aggregate and Analytic Functions that are SQL compatible. These functions makes it easy to solve most requirements. Most of the Excel Show Values As options can be mapped to these Aggregate and Analytic functions.
The plugin is available using df.excel
. Below are the list of API followed by examples.
Name | API | Applicable To |
---|---|---|
% of Grand Total | excel.pogt | |
% of Parent Total | excel.popt | |
% of | excel.po | |
% of Row Total | excel.port | Pivot |
% of Column Total | excel.poct | Pivot |
% of Parent Row Total | excel.poprt | Pivot |
% of Parent Column Total | excel.popct | Pivot |
Running Total In | excel.rti | |
% of Running Total In | excel.porti | |
Difference From | excel.differenceFrom | |
% Difference From | excel.percentDifferenceFrom | |
Rank Smallest to Largest | excel.rankAsc | |
Rank Largest to Smallest | excel.rankDesc | |
Index | excel.index | Pivot |
In order for these functions to work, appropriate aggregate rows need to be generated
using rollup
and cube
functions. See GROUP BY.
If Applicable To is blank it means the function is applicable to both Pivot DataFrame and Grouped DataFrame (think of it as a Pivot with no pivoted columns).
First lets setup some data.
% of Grand Total
It is easy to aggregate data using functions such as SUM and COUNT. But often we want to know
the relative performance and for that we need the % of Grand Total. This is possible using the
function df.excel.pogt()
which is short for Percent Of Grand Total.
The DataFrame should be constructed using ROLLUP/CUBE clauses that generate the grand total partition.
% of Parent Total
What if you want the performance related to not grand total but a specific level of detail. This can be
achieved using excel.popt
which is short for Percent of Parent Total.
% of
% of Row Total
Once we organize data in pivot tables, that opens up additional ways of analyzing data.
For example, we might want to know the percent of the row total instead of the grand total.
That is, in the following example, after organizing data by Region along rows and Sales Channel
along columns, we want to know % of revenue by Sales Channel type within each Region.
The df.excel.port()
API which is short for Percent Of Row Total is used to achieve this.
% of Column Total
This is similar to Percent of Row Total described above except the percentage is computed using
the column totals. In the example below, we are interested in the percent of Sales Channel type by Region
compared to the Sales Channel type for the entire world. The API df.excel.poct()
which is short for
Percent Of Column Total is used to achieve this.
% of Parent Row Total
Pivot tables can have more than 1 dimension along the rows. In such cases, we might be interested in the percentage with respect to the column totals at the parent row. So this is like Percent of Column Total but instead of the column total at the grand total row, it uses the column total at the parent row.
In the following example, we have Region and Order Priority in the rows. For rows with both Region and Order Priority
the comparison is with the column totals at the Region row and for rows with just Region the comparison is
with respect to column totals at their parent row which is the grand totals row. The df.excel.poprt()
API which is short for
Percent Of Parent Row Total is used to achieve this.
% of Parent Column Total
PIVOT tables can also have more than 1 dimension along the columns. In such cases, we might be interested in the percentage with respect to the row totals at the parent column. So this is like Percent of Row Total but instead of the row total at the grand total column, it uses the row total at the parent column.
In the following example, we have Sales Channel and Order Priority in the columns. For columns with both Sales Channel
and Order Priority the comparison is with the row totals at the Sales Channel column and for columns with just Sales Channel
the comparison is with respect to the row totals at their parent column which is the grand totals column.
Using the df.excel.popct()
API which is short for Percent Of Parent Column Total, this can be achieved.
Running Total In and % Running Total In
Running total or cumulative total aggregates along a specified field and its sort order. The df.excel.rti()
API can be used for this
which is short for Running Total In.
Difference from and % Difference from
Difference from and associated % Difference from can be calculated by specifying base field(s)
and their corresponding base item(s). excel.differenceFrom()
and excel.percentDifferenceFrom()
API
can be used to calculated these values.
Rank
Ranks can be calculated smallest to largest or largest to smallest. This is done using the excel.rankAsc
and excel.rankDesc
API.
Index
Use excel.index()
API for computing the index of a value in a pivot table.