Skip to main content

SQL in Excel

Spreadsheet software made it possible for non-technical users to do programming without even realizing it. Just typing a simple formula and having all the rows automatically compute using it is nothing but magic for a non-programmer. But did you know Excel also makes non-technical users use powerful analytic SQL queries without knowing anything about SQL? Yes, and we will delve into that aspect of SQL in Excel in this page. Several Excel functions are analyzed from SQL perspective.

SQL Frames provides both Aggregate and Analytic Functions that are SQL compatible. These functions makes it easy to solve most requirements. Most of the Excel Show Values As options can be mapped to these Aggregate and Analytic functions.

The plugin is available using df.excel. Below are the list of API followed by examples.

NameAPIApplicable To
% of Grand Totalexcel.pogt
% of Parent Totalexcel.popt
% ofexcel.po
% of Row Totalexcel.portPivot
% of Column Totalexcel.poctPivot
% of Parent Row Totalexcel.poprtPivot
% of Parent Column Totalexcel.popctPivot
Running Total Inexcel.rti
% of Running Total Inexcel.porti
Difference Fromexcel.differenceFrom
% Difference Fromexcel.percentDifferenceFrom
Rank Smallest to Largestexcel.rankAsc
Rank Largest to Smallestexcel.rankDesc
Indexexcel.indexPivot
Required summaries

In order for these functions to work, appropriate aggregate rows need to be generated using rollup and cube functions. See GROUP BY.

note

If Applicable To is blank it means the function is applicable to both Pivot DataFrame and Grouped DataFrame (think of it as a Pivot with no pivoted columns).

First lets setup some data.

Loading...

% of Grand Total

It is easy to aggregate data using functions such as SUM and COUNT. But often we want to know the relative performance and for that we need the % of Grand Total. This is possible using the function df.excel.pogt() which is short for Percent Of Grand Total.

Loading...
Grand Total Row

The DataFrame should be constructed using ROLLUP/CUBE clauses that generate the grand total partition.

% of Parent Total

What if you want the performance related to not grand total but a specific level of detail. This can be achieved using excel.popt which is short for Percent of Parent Total.

Loading...

% of

Loading...

% of Row Total

Once we organize data in pivot tables, that opens up additional ways of analyzing data. For example, we might want to know the percent of the row total instead of the grand total. That is, in the following example, after organizing data by Region along rows and Sales Channel along columns, we want to know % of revenue by Sales Channel type within each Region. The df.excel.port() API which is short for Percent Of Row Total is used to achieve this.

Loading...

% of Column Total

This is similar to Percent of Row Total described above except the percentage is computed using the column totals. In the example below, we are interested in the percent of Sales Channel type by Region compared to the Sales Channel type for the entire world. The API df.excel.poct() which is short for Percent Of Column Total is used to achieve this.

Loading...

% of Parent Row Total

Pivot tables can have more than 1 dimension along the rows. In such cases, we might be interested in the percentage with respect to the column totals at the parent row. So this is like Percent of Column Total but instead of the column total at the grand total row, it uses the column total at the parent row.

In the following example, we have Region and Order Priority in the rows. For rows with both Region and Order Priority the comparison is with the column totals at the Region row and for rows with just Region the comparison is with respect to column totals at their parent row which is the grand totals row. The df.excel.poprt() API which is short for Percent Of Parent Row Total is used to achieve this.

Loading...

% of Parent Column Total

PIVOT tables can also have more than 1 dimension along the columns. In such cases, we might be interested in the percentage with respect to the row totals at the parent column. So this is like Percent of Row Total but instead of the row total at the grand total column, it uses the row total at the parent column.

In the following example, we have Sales Channel and Order Priority in the columns. For columns with both Sales Channel and Order Priority the comparison is with the row totals at the Sales Channel column and for columns with just Sales Channel the comparison is with respect to the row totals at their parent column which is the grand totals column. Using the df.excel.popct() API which is short for Percent Of Parent Column Total, this can be achieved.

Loading...

Running Total In and % Running Total In

Running total or cumulative total aggregates along a specified field and its sort order. The df.excel.rti() API can be used for this which is short for Running Total In.

Loading...

Difference from and % Difference from

Difference from and associated % Difference from can be calculated by specifying base field(s) and their corresponding base item(s). excel.differenceFrom() and excel.percentDifferenceFrom() API can be used to calculated these values.

Loading...

Rank

Ranks can be calculated smallest to largest or largest to smallest. This is done using the excel.rankAsc and excel.rankDesc API.

Loading...

Index

Use excel.index() API for computing the index of a value in a pivot table.

Loading...