The SUBSTITUTE function allows you to change text in a string, or a range of strings, but there is more to it than that.
By combining it with other functions you can use it to do things like count how many times a section of text appears in a cell or range, or find the location of all occurrences of a string in a chunk of text.
SUBSTITUTE
The syntax for the function is
=SUBSTITUTE( text, old_text, new_text, [instance])
- text - The text to search.
- old_text - The existing text we intend to change
- new_text - The new text to be inserted
- instance - Which occurrence of old_text to replace e.g. the 1st, 2nd, 3rd etc. Square brackets [ ] indicate that it's optional to provide this value. If this is omitted, then all occurrences are changed.
If we have a cell with the text 'apple, banana, grape, peach, apple' we can use SUBSTITUTE to change every apple to orange
change only the 2nd occurrence
change only the 6th occurrence
There isn't a 6th occurrence, but SUBSTITUTE doesn't generate an error which is good so you don't have to use IFERROR to display a 'nice' error message.
Download the Example Workbook
All of the formulae I use in this post are in the example workbook. Download it to follow along as you read.
Enter your email address below to download the workbook.
Count How many Times a String Appears
Because SUBSTITUTE can replace every occurrence of a substring in another string, you can use this to count how many times that substring appears.
Let's start by working out the length of our string 'apple, banana, grape, peach, apple' using LEN
If we then substite 'apple' for a null string "" we are deleting every occurrence of apple from the string. This will shorten the length of the string by 5 every time 'apple' is removed.
As 'apple' appears twice, the end result is a string that is 10 characters shorter than when we started. Subtracting one from the other we get 34 - 24 = 10.
Divide 10 by the length of 'apple' to get the number of times it appears in our original string
Find Location of Substring in String
If we want to know the position in our string of every occurrence of 'apple' we can use a combination of FIND (or SEARCH for case-insensitive searching) to return the position of the substring, and SUBSTITUTE to replace 'apple' with a special string we will look for. Let me explain.
FIND returns the position of the first occurrence of a substring in a string, so if we ask FIND to tell us where 'apple' is in 'apple, banana, grape, peach, apple' it'll give us a 1. How do we find the 2nd, and other occurrences if they exist?
By using SUBSTITUTE to change specific occurrences of the substring we are looking for. But we need to be sure we change the string to something that we know won't already be in the string, so we will use an ASCII non-printing character. These are ASCII codes 0 to 31 but character 1 is commonly used.
What we get is
=FIND(CHAR(1),SUBSTITUTE(A6,"apple",CHAR(1),1))
and we are using CHAR to convert the value 1 to the corresponding ASCII character to insert into our string.
The 4th parameter supplied to SUBSTITUTE tells it to look for the nth occurrence of a substring, in this case we tell it to look for the 1st occurrence.
By changing this value we can tell it to look for the 2nd occurrence, and so on
=FIND(CHAR(1),SUBSTITUTE(A6,"apple",CHAR(1),2))
We can now use these formulae in adjacent cells to give us the position of each 'apple' in the string.
Find Location of Substring in String Using an Array Formula
By modifying the formula we just used we can turn it into a more flexible array formula. What we have to figure out is how to dynamically tell SUBSTITUTE which occurrence of the substring to change. This can be done using the COLUMN (or ROW) functions.
If our string is in cell A10, this is column 1. If our formula is in B10 that's column 2. Subtracting one from the other gives us a 1 which we can use to tell SUBSTITUTE to change the first occurrence of 'apple' in the string.
Likewise, if we have the same formula in C10 (column 3) we take 1 from 3 to give us 2 which we use to tell SUBSTITUTE to change the 2nd occurrence, and so on.
The formula looks like this
=IFERROR(FIND(CHAR(1),SUBSTITUTE(A10,"apple",CHAR(1),COLUMN()-COLUMN(A10))),"")
So COLUMN() gives us the number of the column the formula is in, and COLUMN(A10) gives us a 1. So each time we move 1 cell to the right of A10 we increase this value and hence the occurrence we are telling SUBSTITUTE to look for.
Wrapping the whole thing in IFERROR allows us to deal nicely with errors caused by FIND not finding the substring.
To enter the array formula, select the cells beside your string where you want the results to go, in my case I select B10 and C10. Enter the formula and press CONTROL+SHIFT+ENTER as we are entering an array formula.
Working With Ranges
Counting
We can count the number of times 'apple' appears in a range by modifying the formula we used above for counting occurrences in a cell.
If our data is in A2:A6 our formula becomes
=LEN(A2:A6)-LEN(SUBSTITUTE(A2:A6,"apple",""))/LEN("apple")
This will give us the number of times 'apple' appears in each cell A2 through to A6 but because of the way LEN works, it can't handle arrays properly, it uses implicit intersection to give us the number of times 'apple' appears in the cell on the row where the formula is.
Confused? Let's look at this image.
The formula in cells B2:B6 is the same. It's working out the number of times 'apple' appears in the range A2:A6, but you'll notice that the result for each row is not the same.
This is because LEN can't handle the whole range so it just returns the number of times 'apple' appears in the cell on the same row as it.
In row 2 (cell A2) 'apple' appears once. In row 3 (cell A3) 'apple' appears once, in row 4 (cell A4) 'apple' does not appear, and so on.
By wrapping this in SUMPRODUCT
=SUMPRODUCT(LEN(A2:A6)-LEN(SUBSTITUTE(A2:A6,"apple","")))/LEN("apple")
We get a total showing us how many times 'apple' appears in the range.
Using an Array Formula
By using SUM instead of SUMPRODUCT you get the same result, but the formula must be entered as an array formula, with CONTROL+SHIFT+ENTER.
Substituting Text in a Range
If you supply a range to SUBSTITUTE it can replace all occurrences of a substring in that range. This must be entered as an array formula.
Using the instance parameter works as expected.
Anton Redelinghuys
Good day.
I need to limit the number of times a specific text entry is allowed in a range.
I.e monthly customer visit limits
Customer A can only be visited twice during a month and so if I enter Customer A in for the third time the cell changes red or gives warning note.
I tried =SUBSTITUTE(text, old_text, new_text, instance) but this seems only to work in a specific cell and data as a string. I essentially want this checked in a range
Catalin Bombea
Hi Anton,
Can you upload a sample file with an example of how this should work? Here is a link to our forum where you can create a new topic and upload the sample file.
Catalin
Glenn Case
Phil:
A couple of questions:
1) In the “Find Location of Substring in String Using an Array Formula,” why do you use the array formula to find the location rather than just using an absolute reference to the first column (other than that this is the topic of that section? Is there a reason to use an array vs a non-array formula here? i.e., if I use the following, it will work the same as the array formula when copied to the adjacent cells:
=IFERROR(FIND(CHAR(1),SUBSTITUTE($A$10,”apple”,CHAR(1),COLUMN()-COLUMN($A$10))),””)
Trying to determine what the increased complexity of the array formula is doing for us.
2) Sort of the same question on substituting text in a range, i.e., it seems this can be done without the array formula by making A10 an absolute reference.
3) Lastly, ref your last paragraph above on using the instance parameter, the reason it does nothing when you use 2 is because there is only one occurrence of “apple” in each string. Add another, and it works (i.e., make the string in A2 of the Implicit Intersection sheet read “apple, banana, apple” and B2 will read 2.) 2 refers to the number of occurrences of “apple”, not to the position in the string. I found this confusing at first until I realized what was going on.
I appreciate these tutorials, I usually learn something. Keep it up!
Philip Treacy
Hi Glenn,
For 1) and 2) yes you can do it that way too. I showed how to use an array formula partly as an example of how to do it, but I would argue that using an array formula means you only have to use/maintain 1 formula as opposed to a formula in each cell for every occurrence you want to find/replace. Horses for course I guess.
3) Why didn’t I realise that, doh. Of course, I even wrote about implicit intersection. Thanks, I’ll amend the post.
Regards
Phil