### Rishi Sapra

Technical Community leader, speaker, trainer and evangelist specialising in Power BI and Azure. Formally recognised by Microsoft as a Most Valuable Professional (MVP), Fast Track Recognised Solution Architect (FTRSA) and Microsoft Certified Trainer (MCT).

Tags:

Instructions: You can type values or formulas into cells above but to access Excel functionality, or if it is easier to work directly in Excel, open the file by clicking on the Excel Icon at the bottom.

If the file does not display correctly try right clicking in the area above and selecting ‘Refresh’. If you are still unable to view the file properly, or it contains unsupported features, click here to download a copy.

### Overview

This topic looks at understanding arrays and array formulas in Excel, which are powerful tools for working with sets of data as they allow you to write formulas that reference or return a range of cells in a customised way. More specifically, the topic looks at using aggregate functions (e.g. SUM, COUNT, MIN, MAX) as part of an array formula. Doing this gives you the power to manipulate or evaluate cell contents individually before using them as part of your formula. As an example, consider the dataset in the Topic Example workbook, in the situation where we want to do some analysis on the sales by Andrew. We could write a SUMIF/COUNTIF formula to find the total number of sales or total quantity sold by Andrew, but what if we wanted to find the highest (or lowest) amount sold by him? Excel doesn’t have a ‘MAXIF’ or ‘MINIF’ function which allows us to do this, but we can write this function ourselves using array formulas.

### Understanding Array Formulas

Put simply, an array is just a series of values which are all treated together as part of the same collection. Because cells within an Excel worksheet store individual values, a range of cells is essentially an array. Like with a range, you can perform an aggregate function on any array (for example sum or count the cell values). The advantage of doing this using an array formula rather than just passing in a range of cells is that you can choose to include only particular cells in that range to evaluate/aggregate (much like a SUMIF/COUNTIF formula). These aggregate functions evaluate each element of the array in turn and aggregate them to return a single result; alternatively you can also use array formulas to return an array of values which you can then populate into a separate range.

An Aggregate function used with an array formula allows you to take a range of cells and individually evaluate each one to return an overall result. In this sense many Excel formulas are, at their core, array formulas. For example COUNTIF takes a range of cells, individually evaluates each to see if it meets a particular condition and if it does includes it in the overall count.

You can tell Excel to treat values together as part of the same collection (array) by typing the formula containing values /a cell range into the formula bar but then instead of just pressing enter, hold down Ctrl and Shift and then press enter. This will insert curly brackets around the formula indicating that it is an array formula. Every time the formula is edited the array formula must be entered again using Ctrl + Shift + Enter or else it may return the wrong result.

To see an example of this, consider the data grid shown in cells G13:I15 in the Topic Example Workbook. This range of cells (G13:I15) is an array – we can either use an aggregate function on it such as =SUM(G13:I15) which will return a single value result (of 45), or we can use a formula which returns an array instead of a single value result. An example of such a function is TRANSPOSE which looks at a range of cells, takes the values which are across columns and puts them into rows, and takes the data which are across rows and puts them into columns (Similar to the Transpose Paste Special Option). If you highlight a blank area that has the same height/width as the data grid (3 columns wide by 3 rows tall) and then type =TRANSPOSE(G13:I15), and press Ctrl + Shift + Enter so that the formula becomes { =TRANSPOSE(G13:I15)} then you will see that the data grid appears in the cell range that was highlighted, but with the data that was in rows now in columns and vice-versa. If the formula was entered without Ctrl + Shift + Enter then it returns #VALUE.

### Aggregate functions with Array Formulas

Aggregate functions allow you to take a range of cells and do something with them-such as sum them or count them. If you just referenced a range of cells in a formula such as =C2:C11 then, because Excel can’t return multiple values in a single cell, the formula by will just return the first value in the range – 100 (in cell C2). In order for the formula to return a meaningful result you have to tell Excel what to do with the range values. Therefore if you did =COUNT (C2:C11) it would return a valid result of 4500- COUNT is an example of an aggregate function which takes an array of values and does something with them.

But what if you wanted to not count all the values in C2 to C11 but just the ones greater than 300? Consider the formula =IF(C2:C11>300,C2:C11)). The first element of this is what to check for in the range (checking for all values greater than 300) and the TRUE condition is just the same range, i.e. telling Excel to return (as an array) all the values which are greater than 300. If you type this in, Excel will just return FALSE, indicating that not all the values in the range C2:C11 are greater than 300 (because Excel can’t return multiple values in a cell it tries to evaluate it as a single result). In order to return something meaningful, we need to do something with the range of values returned – e.g. COUNT them or SUM them. So to count the number of values greater than 300 we could have a formula that reads =COUNT(IF(C2:C11>300,C2:C11))

If you enter the formula just like this Excel will return the wrong result of 1 (there are actually 6 values greater than 300). But this is just because Excel doesn’t recognise it as an array formula- to get Excel to calculate this function you have to press Ctrl+Shift+Enter when in the formula bar- Excel will then insert curly brackets either side of it to indicate it is an array formula. So Our completed formula looks like {=COUNT(IF(C2:C11>300,C2:C11))} which returns 6 as expected.

Of course this is exactly what the COUNTIF function does [Try it-compare =COUNTIF(C2:C11,”>300″) with your array formula-they should give the same result], but what we’ve done here is built it up from scratch, essentially going under the hood of how Excel formulas work.

Returning to the original point of the topic, if we want to find the maximum quantity sold by Andrew even though there is no ready-made ‘MAXIF’ formula that we could use, we can instead write an array formula that returns the array of cells in column C where the corresponding value in Column A is equal to “Andrew” and then use the aggregate function MAX to find the maximum value in this range. The array of cells in the range C2:C11 where the corresponding value in A2:A11 is “Andrew” is represented by the formula =IF(A2:A11=”Andrew”,C2:C11). Putting this as part of the MAX function to aggregate the results, and entering as an array formula, our formula becomes {=MAX((IF(A2:A11=”Andrew”,C2:C11))} which returns 600. Looking at the data we can see that this is correct – Andrew made 3 sales in total – one for a quantity of 100, one for 200 and one for 600; therefore Andrew’s highest sale value is 600.

So we are using an array formula to return a range of values which meet the condition specified by our IF statement – i.e. rather than taking all the cells in the range C2:C11 we are only taking a subset. This has other applications; for example if we wanted to do an average of the cells C2:C11 whilst ignoring any 0 values, then we could write a formula such as {=AVERAGE(IF(C2:C11<>0,C2:C11))} as an alternative to using AVERAGEIF. Or if our dataset contained both positive and negative numbers and we wanted to find the total absolute value of the range then rather than having a separate column with the absolute numbers (using the ABS function) we could just write an array formula that reads {=SUM(ABS(C2:C11))} – this takes each value in the range C2:C11 and converts it into an absolute number before summing it.

It gets even more interesting when we reference multiple arrays (cell ranges) within an array formula – for example we could count the number of rows in the range G13:H15 where the value in Column G is higher than the corresponding value in Column H by using a formula such as {=COUNT(IF(G13:G15>H13:H15,G13:G15))}. Other applications of array formulas include examples in other topics such as doing a lookup and returning multiple values which uses an array formula to return the range of row numbers where the cells in a certain column are equal to a value specified.

### Try it Out

In the Topic Example workbook:

• Build up an array formula by filling in the yellow cells as per the description in Column G.
• Transpose the Data grid in cells G13:I15 by highlighting a grid of the same size underneath and using the TRANSPOSE function as described above.

A demonstration is shown in the video clip below. Resist the temptation to view the answers file below until you have tried this!