7 Excel lookup formulas

  • Tagged with:
  • Design,
  • Errors,
  • Excel,
  • INDEX,
  • INDIRECT,
  • Lookups,
  • MATCH,
  • named ranges,
  • Spreadsheets,

Many spreadsheet users will know that when you need to pick out a value from a list, it’s a good idea to use a lookup formula. There are numerous ways to do this, but which is the best?

Each formula or method has its advantages and disadvantages depending on the circumstances. It’s hard to cover all of these circumstances, so we’ll focus on one specific, yet common, situation.

Let’s say you have a unique list of items, say “fruit”. Each type of fruit has a numerical value associated with it — perhaps the quantity or price of the item. Somewhere else in your spreadsheet you want to enter the fruit item you’re interested in and have a formula return the specific value for this fruit item. Let’s further assume that the items are not sorted in any particular order.

In this table, we want to enter the name of the fruit in cell B10, and we want the value in cell C10 to automatically pick out the correct quantity in the source data.

So, now that we’ve set the scene, let’s look at 7 different ways to do this in Excel.

Method 1 – SUMIF

C11: =SUMIF(B4:B7,B11,C4:C7)

SUMIF is a very useful way to do conditional sums. It lets you specify the range of items (i.e. the list of fruit names), the criteria (i.e. the particular fruit you want to find), and the result range (i.e. the quantities associated with the items). It does not require the list to be sorted in any order, but for this to work as a lookup each item in the list needs to be unique. Furthermore, it will only work for numerical values in the result range. If you wanted to return say the colour of the fruit, SUMIF would not work.

Method 2 – INDEX and MATCH

D11: =INDEX(D4:D7,MATCH(B11,B4:B7,FALSE))

INDEX and MATCH is probably the method I use most often to perform exact lookups. For some reason it just makes the most sense to me, and has clear advantages over VLOOKUP and HLOOKUP. If the items in the list are not unique then it returns the associated value for the first one it finds. Although this indicates a problem with your source data, this seems better to me than SUMIF which would effectively aggregate all the items it finds. Furthermore, INDEX and MATCH works with non-numeric data, so would work fine in the situation discussed above regarding picking a colour out of the list. Finally, the ranges do not have to be contiguous, and there is a more advanced version of INDEX and MATCH which can match criteria in two dimensions i.e. rows and columns.

Method 3 – VLOOKUP

E11: =VLOOKUP(B11,B4:E7,4,FALSE)

This is similar to the INDEX and MATCH approach but with one significant disadvantage — it requires you to specify the column number of the data you want returned (in this case the 4th column of the range). It is possible to calculate this column number dynamically, but this is hard, opaque and error-prone. Furthermore, the formula requires that the left-most column of the range contains the items you’re searching. This is unnecessarily restrictive and is not required by the INDEX and MATCH approach.

Method 4 – DGET

F11: =DGET(B3:F7,F3,B10:B11)

This is a neat method and probably should get more use. By all accounts it is a computationally efficient method, and when working with arrays it allows for flexible criteria and boolean logic e.g. “apples” OR “bananas” (e.g. DSUM, DCOUNT, etc.). One drawback is that it requires a criteria range to be set up in a particular way (B10:B11 in the formula) and this may not suit your worksheet layout and may break later if you insert rows in the wrong place. In general, I would avoid this approach and still go with INDEX and MATCH.

Method 5 – Array IF

G11: {=SUM(IF(B4:B7=$B$11,G4:G7))}

Arrays are incredibly powerful and intuitive when you really get them. Sometimes thinking about them can cause your brain to leak out of your ear though, so be careful. In this case, using an array IF is no better than SUMIF and has the disadvantage of being more advanced – so less skilled users of your model may be confused. Also arrays can only be entered by holding down Ctrl and Shft at the same time when hitting Enter. Again, this is not something less skilled users will know about — not that they should be messing with your formulas anyway.

Method 6 – Range Names

H11: =INDIRECT(B11)

This assumes that you’ve created range names for your source data. You can do this easily / automatically by selecting your source data (B4:I7) and hitting Ctrl Shft F3, then choose the Left column option and hit Enter. This will create range names out of all item names. The INDIRECT formula is the way to get Excel to use the input value you type in B11 as if you’d written =Bananas in cell H11. This method is bad design! The main problem is that you now need to maintain and ensure the integrity of a whole lot of range names. If you later change the item Bananas in your source list to be Kiwi Fruit, you’ll need to delete your Bananas range name and add one for Kiwi Fruit. There are so many things that can/will go wrong with this method, and we generally recommend not using range names at all in your models.

Method 7 – PivotTables

I11: =GETPIVOTDATA(“PivotTable”,B13,”FRUIT”,B11)

PivotTables are very useful for analysing data set up in a database format. They can certainly be used to lookup data as we’re doing here, but they require you to create a PivotTable and ensure that it is being refreshed when your source data changes. This does not happen automatically like normal spreadsheet functions. It also requires you to use and understand the GETPIVOTDATA formula, which can be a bit fiddly. In this case, using a PivotTable approach is a bit like using a sledge-hammer to kill an ant, so it’s not our preferred approach.

Conclusion

So, while none of these methods are foolproof, the winner is clearly method 2 – INDEX and MATCH. To further improve your exact lookups, you should 1) check your source items are unique, 2) use data validation to ensure the result item is from this list, 3) deal with potential errors with IFERROR or ISERROR. Finally, although not done in the above examples, remember to get your $ signs in the right places in case you need to copy and paste this formula to other cells.

6 Comments so far Join the Discussion

  1. AdamV

    26th June 2010 at about 2am

    Any reason you missed off using a straight LOOKUP?

    This has many of the advantages of Index/Match in terms of flexibility of locations of lookup array and value array, is generally easier to ‘get’ for many users at first, and can be used when you want to deliberately do an inexact search (eg values between category boundaries).

    I also like to use Index/Match quite a lot, especially if I need to lookup a value once and return severla values for it, which means I can do the MATCH once and keep the value in a (hidden) column, then INDEX against that several times for maximum efficiency.

  2. Darren Miller

    26th June 2010 at about 4pm

    Thanks for your comment Adam. The only reason I left out LOOKUP is that it can’t be used where the lookup range is not sorted in ascending order. INDEX and MATCH does the same thing but doesn’t require the lookup range to be sorted.

  3. Justin

    30th June 2010 at about 5am

    While I normally use Index/Match, I’ve recently discovered the power of SUMPRODUCT. It can easily replace SUMIF & Array SUM IF’s (no CSE!). It can also be used as a lookup formula.

  4. Barbara Martin

    24th July 2010 at about 12pm

    All right, that’s very interesting. I started my internet search for bananas simply because I wanted to locate a good recipe for banana pie. Some how I came across your website. I am incredibly glad I took the time to read your page. Interesting how the web can move you around and aid you to learn so very easily. Thanks for the excellent insight. Iím off to locate a excellent banana pie recipe.

  5. Bob Phillips

    15th August 2010 at about 9am

    Apart from the fact that I am the polar opposite of you on range names, I find this post alarmingly simplistic.
    You seem to offer a host of poor choices to validate your conclusion, when in reality there are only three options here IMO. I don’t think you should use any sort of SUM function when you are doing a lookup; INDIRECT is a bad function that should be avoided; and as you say pivots are over-kill for this problem.

    That leaves us:

    - INDEX/MATCH – your preferred solution. All of the reasons for not using it were arguing against the SUM alternatives, which are not valid at all IMO. Your arguments for using it were for a situation entirely different to the one you proposed; you seemed to be saying this is my favourite tool so I will use it for all situations

    - DGET – a good solution, probably the best IMO. Your proffered reason for dismissing it are really weak, I know that people do seem loathe to use the database functions but they should learn to use Excel to its full power

    - VLOOKUP – nothing wrong with this, and if you are turned off by DGET, this is the one I feel should be used in the situation you gave. Your argument that … it requires you to specify the column number of the data you want returned … is just not accurate if you include the headings in the VLOOKUP (dare I say a good design in my opinion is to include the headers and a range name for the whole lookup table), as you can use =VLOOKUP(B11,B3:C7,MATCH(“Quantity”,B3:C3,0),FALSE) No need to specify the column. Agreed, you can’t do a lookup-left (one of the things MS should have added in 2007 rather than all the glitz), but as I said before, that wasn’t part of your spec., and I believe in using the appropriate tool for the job, not the same one every time.

  6. Darren Miller

    20th August 2010 at about 11am

    Bob, thanks for your comment.

    I respect your point of view, but don’t understand why you feel VLOOKUP is a better solution for the example provided.

    If I needed to parameterise which column contains the results then perhaps VLOOKUP would be better, but in a simple case like this INDEX and MATCH works fine, and in fact it has fewer arguments than the VLOOKUP formula you provided (4 versus 6).

    One of the other reasons I prefer INDEX and MATCH is that it is more ‘generic’ than VLOOKUP, i.e., the syntax is the same for vertical ranges and horizontal ranges; it works with the lookup and result ranges left to right or right to left, or indeed with multiple other columns or rows between the two; and it provides a powerful 2-D lookup which lets you pick a cell out of matrix based on looking up values in rows AND columns.

Join the discussion

Popup

Register your Interest

Please register to keep up to date with our developments.

Register your Interest

Please register to keep up to date with our developments.

mac data recovery software | pricexaminer.com | gaussian elimination calculator online