Summarise Data using a Pivot table

team-member-1

Rishi Sapra

Technical Community leader, speaker, trainer and evangelist specialising in Power BI and Azure. Formally recognised by Microsoft as a Most Valuable Professional (MVP), Fast Track Recognised Solution Architect (FTRSA) and Microsoft Certified Trainer (MCT).

Tags:

View/Download Topic Example Workbook

Instructions: You can type values or formulas into cells above but to access Excel functionality, or if it is easier to work directly in Excel, open the file by clicking on the Excel Icon at the bottom.

If the file does not display correctly try right clicking in the area above and selecting ‘Refresh’. If you are still unable to view the file properly, or it contains unsupported features, click here to download a copy.

Overview

Pivot tables are used as a way of taking raw data and grouping/summarizing it in order to make it more meaningful.

In the example above (taken from the Microsoft Northwind sample database), we have a fairly large amount of raw data which we want to make sense of. The first step in this is to understand what type of conclusions we want to draw – for example what are the meaningful number(s) to analyse (e.g. this might be shipping fees in this case) and what would it make sense to analyse these numbers by? (e.g. by state, by customer).

Say using the above data, we wanted to get an analysis of shipping fees by state and payment type for closed items only. We could list all the unique combinations of state and payment type out individually (e.g. using a filter for unique values) and then use a SUMIFS/COUNTIFS formula against each one, but a quicker and neater way would be to present the data as a pivot table, similar to the one shown in the screenshot below.

This pivot table is the analysis we need – it shows the total shipping fees for each combination of payment type and state. It’s also linked to the data in a dynamic way – if our underlying data changes then when we refresh the pivot table it will update to reflect the changes.

The format of the underlying data is important – it needs to be in as raw a data format as possible. By that I mean it shouldn’t contain summaries or totals but instead should contain columns that have repeated values which you can then summarise by. Your data should be organised in rows as far as possible – e.g. in the example above we want to have a column for ‘Payment Type’ and a column for ‘State’ rather than having data relating to different states or payment types combined. It should also contain at least one key column (with numerical values) which it makes sense to aggregate by (e.g. sum/count) – like shipping fees in this case.

Creating the pivot table

To insert a pivot table go to the Insert tab and then select Pivot Table -> Pivot Table. You will be presented with the below screen shot:

In the Table/Range box click on the range selector icon and select the entire data set you want to analyse using a Pivot table. The first row you select must contain the column headings. If the number of rows in the dataset will change over time then either select entire columns as the range or use a dynamic named range or table (Note that if you use entire columns the column headings must be in row 1). Then select the location of the pivot table (which defaults to the sheet/cell you currently have selected).

You will then be presented with a blank pivot table and a Field List (containing all your column headings) – you can drag and drop the fields you require from the field list into one of the 4 buckets listed below it:

  • Report Filter: Where we want to restrict which items we include in our analysis, we put this as a report filter. In this example we want to restrict our Pivot table by only having it show items where the StatusID is ‘New’ so we select StatusID as the report filter field. (What value we filter by can change, all we’re setting here is the field to filter by)
  • Column Labels: What we want to slice the data by (along the top). In this case we want to split by both Payment Type and State. Which we put as a column label versus a row label is purely a matter of presentation.
  • Row Labels: What we want to slice the data by (along the side). Having multiple fields here will allow us to view our data by hierarchy of the fields – for example by state and then by Payment Type (in a tree view style grid).
  • Values: The numeric field which we want to aggregate (e.g. sum/count) . In this case we want the Sum of Shipping values (by Payment type/state) so we drag the ‘Shipping Fee’ field into this bucket. Sometimes it defaults to a count of the values but if we click on it and select ‘Value Field settings’ we can change this to a sum instead.

Our completed pivot table, complete with arrows indicating the part of the pivot table each bucket relates to is shown below:

Setting Pivot Options and Drilling-down into your data

Once you have set up your pivot table, it is useful to know how to change the options so that it displays your data in the way you want.

If you right click in the row label area (the list of categories along the left), you will be presented with ‘Field settings’ for the category (Field) you selected on. This brings up the dialog box below.

The ‘Subtotals & Filters’ tab is useful for where you have multiple row labels and you want to see a sub-total for one or more of these. For example, if we had both ‘State’ and ‘Payment Type’ as row labels (so we were viewing our data by State and then by Payment Type) then adding a subtotal for the State Field would show us a total by state. (So for example if NY had a total fee of 300 paid by Check, 200 by Card and 100 by Cash then each of these items will be shown separately under NY and then we will see a NY sub-total of 600 underneath). It is also possible to show a sub-total not just by Sum but also average/max/min etc. Note that the fields you want to sub-total by should be to the furthest left – listed at the top in the ‘Row Labels’ bucket (you can drag and drop fields in each bucket to change their position).

The ‘Layout & Print’ tab allows you to configure some display options. Again most of these are useful for where you have multiple row label fields. For example if you want to view the data in separate blocks (as well as or instead of sub-totalling them) then you can click ‘Insert blank line after each item label.’ It’s worth spending some time experimenting with these field options

Right clicking anywhere on the Pivot Table and selecting ‘Pivot Table Options’ allows you to specify some options for the pivot table generally as shown in the screen shot below

There are quite a few here and much of this depends on personal taste as to how you (or the people reviewing the files!) are used to seeing data in pivot tables presented. Personally, a few of the options that I will usually set are:

  • Giving the Pivot Table a more meaningful name – useful when referring to the Pivot table in formulas (e.g. the GETPIVOTDATA formula)
  • Setting the layout to Classic ‘PivotTable layout’ – when you have multiple row label fields this determines how they are displayed relative to each level.
  • Showing Error values in a more friendly way (under Format)
  • Sorting Field List A to Z (In the Display tab)
  • Having the Pivot table automatically refresh on opening the workbook (under the Data tab). Note that even if this is selected you should ensure you periodically refresh the pivot table (right click and select ‘refresh’) each time the source data changes.
  • If you were to right click on a pivot table value instead (in the middle of the pivot table – the numbers you are totalling/aggregating) and select ‘Value Field Settings’ then you’ll be able to change some of the more fundamental options of how the pivot table aggregates your data, shown in the dialog box below. This dialog box can also be obtained by -clicking on the field name in the ‘values’ bucket and selecting ‘Value Field Settings’

    There are multiple options for how you want to aggregate the numbers you are analysing in your pivot table (in this case how we want to aggregate the shipping fees, being analysed by State and Payment type). The standard method of aggregating is to sum them (displaying the total shipping fee by state/payment type) or count them (displaying the number of records by state/payment type, irrelevant of the value), though you can also display a maximum, minimum or average for example.

    The second tab in this dialog box is also very powerful. Instead of (or as well as) showing the total shipping fee by state/payment type, we could show what proportion of the total shipping fee is attributable to each state/payment type combination. To do this under ‘Show Values as’ we select ‘% of Grand Total’. Alternatively if we had multiple row label fields (e.g. by State and then Payment Type), we could show what proportion of the shipping fee sub-total for each state is attributable to each payment type (e.g. for NY 20% is paid by check, 30% by Credit Card etc). To do this we select % of ‘Parent Row Total’.

    Where we want to display both a total and a corresponding percentage value we need to repeat the field in our Values bucket – drag the same field into there twice, and then set different value field settings for each.

    Finally, another very useful feature of pivot tables is the ability to drill-down into any value within the pivot table and see the underlying records which make it up. To do this simply double click on any value In the pivot table (in the values area) – a new sheet will be generated which will show the underlying records which make up that number. This can be used on subtotals too – clicking on a subtotal value will allow you to go from a total to seeing the underlying breakdown.

Try it Out

In the Topic Example Workbook, Create a Pivot Table in a new worksheet (with the source as the data in the ‘Data’ worksheet) by following the steps outlined above.

Drill down into the total shipping fee for state NV (value of 205), this should generate another worksheet showing the data values which make up that number.
A demonstration is shown in the video clip below. Resist the temptation to view the answers file below until you have tried this!

Answers

Once you have completed the ‘Try it Out’ exercise, the completed Excel file should look like this:

If the file does not display correctly try right clicking in the area above and selecting ‘Refresh’. If you are still unable to view the file properly, or it contains unsupported features, click here to download a copy.

Links and other resources

YouTube Video – How to Create Pivot Tables in Excel 2007

PowerPivot Data Analyst 1 – What is PowerPivot?