Optimise file size/speed
Overview
I’m sure anyone who uses Excel on a regular basis has come across the frustration of trying to work with large files which crash frequently and become almost unmanageable to work with, often for no apparent reason.
If the file size suddenly increases unexpectedly and then increases further every time you save (even if you haven’t made any changes!) then it’s likely the file is corrupt and you’re better off reverting to an earlier saved version of possible.
Fortunately this is quite rare-most of the time files become large is because of the way formulas are written, data sets are referred to or because of stray cells. All of these can be corrected and the file size should then reduce, with the file becoming much quicker.
Below I’ve mentioned my 5 top tips for optimising file size/speed. This is by no means comprehensive but is a good start and should resolve the vast majority of speed/size issues.
Minimise the used range of each worksheet
Often the reason file sizes are so big are because of stray values in cells well outside the range of cells you’re actually using within the worksheet (This is particularly the case in Excel 2007-2010 which has over 1 million rows and 16000 columns). To find a stray cell on a worksheet, use the Keyboard shortcut Ctrl + End (or Ctrl + G -> Special -> Last Cell) which will take you to the last used cell in the worksheet. If this is a cell outside of the range you’re using +n the worksheet then you will need to delete the excess rows/columns not being used.
To do this, highlight the entire row beneath the last used row you’re actually using and press ctrl + Shift + Down Arrow to highlight all the rows from there to the end of the worksheet (row 1048576), then right click and select Delete. Do the same with columns – select the entire column to the right of the last one actually used and then select all to the end of the worksheet by pressing Ctrl + Shift + Right Arrow, then delete. If you’re unable to delete all the rows/columns at once (the application might complain of insufficient resources) then try just deleting only the last used row and then click save (shown in the video clip below)
This will need to be repeated for all worksheets in the workbook. After each deletion of unused rows/columns save the file by pressing Ctrl + S. Once you’ve gone through the entire workbook with this, close and re-open and the workbook should be smaller and quicker. Sometimes the deletion of rows/columns has to be done piecemeal due to memory constraints and if it is proving difficult to do this it may be better to create the worksheet again, copy and pasting data over from an existing one.
Optimise formulas and calculation
The extent and frequency of formula recalculations is the main thing that will determine the speed of the workbook. Every time something is changed in the workbook, Excel will recalculate precedent/dependent cells (cells which link to it and from it) which can be slow when working with large datasets linked by formulas.
To prevent this you can set calculation to Manual (rather than Automatic) by clicking on the ‘Calculation Options’ command on the ‘Formulas’ tab and selecting ‘Manual’. The status bar (in the bottom left corner of the workbook) will then show ‘Calculate’ indicating that the workbook is in Manual calculation mode. Clicking on this word or pressing F9 will force a calculation manually. Use Shift + F9 to recalculate the entire workbook. Whilst there isn’t a problem per se with having a workbook in Manual calculation mode, if you happen to forget it can sometimes be quite frustrating when formulas aren’t returning the results you expect!
Regardless of the calculation mode, formulas should be written in such a way that they trigger off the minimum amount of calculation required. For example, rather than referencing entire rows/columns in a formula, it is better to limit it to the specific range that you need. (e.g. reference $A$6:$A$1000 rather than $A:$A). If the number of rows/columns changes over time (so you need your reference to be dynamic), consider using a Dynamic named range or better yet a Table.
Also avoid the use of formulas which take up more processing power – these include VLOOKUP] (use INDEX/MATCH instead), Array formulas, OFFSET and INDIRECT. (These last two are known as Volatile functions – the entire workbook is recalculated each time something is changed, not just dependent/precedent cells). Of course these formulas are sometimes needed but try and use them sparingly, particularly in workbooks that are already quite large, and restrict the ranges across which you use them – trying to reference an entire column in an array formula will almost certainly cause your machine to freeze!
Finally, don’t have formulas/links for the sake of it. If you only need static data values then copy and paste special Values with the data rather than keeping them as formulas. Try to limit the amount of layers between your source data and reporting or the amount of times you link to the data, and it is much quicker to use a pivot table to analyse a large data set compared to a series of SUMIF formulas for example.
Save in a current file format
Excel 2007 introduced new file formats with file extensions including .xlsx, .xlsm or .xlsb rather than .xls. Files in this new format are generally much smaller (particularly binary files – .xlsb extension)
If you are working on workbooks created in Excel 2003 they will be opened in Compatibility mode – you will see this in the title next to the name of the file. To convert it into the new file format, click on the Office button and click on ‘Convert’. Excel will then close and re-open the file in the new file format and the file size should be much smaller.
Note that when saving workbooks down in a new format, if you have macros in the file you will need to save it as a .xlsm or .xlsb file. Also beware that any macros that reference files by their full filenames (including the extension) might not work if the file extension changes, though it should be quick for someone who knows the macro to change.
Delete excess Cell styles, Objects and Named ranges
When sheets are copied into workbooks and files are edited by multiple people over time, the workbook often becomes cluttered with lots of references, styles, objects or names which can be aren’t needed and which slow down the workbook unnecessarily.
To see the list of name references you have in the workbook, click on ‘Name Manager’ under the Formulas tab of the ribbon and you will be presented with the Name Manager dialog box as shown below.
If there are lots of names here which you don’t recognise (especially those that have a reference to an unrecognised file in the ‘Refers to’ field or have a #Ref value in the Refers To/Value fields) then it it likely that they are not required and can be deleted. To see all the named ranges that refer to an error or have a reference is not recognised, click on the Filter drop down and select ‘Names with errors’. To delete names, highlight any that can safely be deleted (select multiple by holding down shift or ctrl and selecting them), and then press Delete.
Having too many cell styles will have less of an impact on the speed of the workbook but once it reaches its limit (approx 64,000) you may experience the error message ‘Too many different cell formats’ and end up losing formatting applied in the spreadsheet. To see the custom styles you have in the workbook, click on the ‘Cell Styles’ drop-down in the Home tab and review the styles in the top section (Custom) – if this section doesn’t exist then this means there are no custom styles. You can delete a custom styles by right clicking on it and selecting ‘Delete’. Of course doing this for 64,000 custom styles isn’t the most exciting way to spend several hours so you may want to revert to using VBA or start a new file if this is the case.
Having multiple objects in the workbook (e.g. text boxes or other shapes) will slow down performance of the file as each of these need to be re-drawn each time a sheet with them is brought into focus. Sometimes the objects aren’t easily visible – for example if they are small or in an obscure place of the workbook. To highlight any shapes on a worksheet, click on ‘Find and Select’ under the Home tab and click ‘Select Objects’. If there are objects on the worksheet, the cursor will change to an arrow and dragging it across an area will highlight all the objects/shapes within that area – select them and press delete to get rid of any which you don’t need.
Deal with large datasets differently
Excel 2007 and 2010 can deal with much larger amounts of data – over 16000 columns and 1 million rows – but just because Excel can deal with this amount of data doesn’t mean that it will handle it well or that you should try and stretch it to this capacity.
Large datasets should ideally be put into an Access (or SQL) database – you can then bring the data back into Excel by using external data connections. The Data Connection type that is easiest to work with is MS Query – to create this select ‘From Other Sources’ from the Data tab and then ‘From Microsoft Query’. You will then be presented with a series of data source including an Access database or even other Excel workbooks (which is potentially better and quicker than just linking to a large volume of cells located in other External workbooks).
- Navigate Using Keyboard Shortcuts and the Quick Access Toolbar
- Dynamically refer to named ranges or sheets by using INDIRECT [Advanced]
- Build your own ‘MAXIF’/’MINIF’ functions using Array Formulas [Advanced]
- Use a VLOOKUP (Lookup Left to Right)
- Use an INDEX/MATCH (Lookup either way)
- Summarise Data using a Pivot table
- Sum/Count rows based on multiple conditions or values within a range – using SUMIFS/COUNTIFS
- Format data and Align text effectively
- Refer to cells by using Named Ranges