Ever copy and paste some data from a web page, or import it from another program, and end up with some funky and usually unwanted characters like these nonprinting characters below?
The Excel CLEAN Function can help you to remove them, but it has some limitations.
You see it was designed to remove the first 32 nonprinting characters in the 7-bit ASCII code (values 0 through 31) from text. Blah, blah, blah… Sorry, let me give you an example before you fall asleep.
Excel CLEAN Function Syntax
It’s pretty straight forward:
=CLEAN(text)
Where ‘text’ can be a reference to the cell containing the text you want to clean. See the example in cell E2 below:
Warning: more boring techy talk coming up, but I know some of you will want to know this stuff.
The CLEAN Function Can’t Do It All
In the Unicode character set, there are additional nonprinting characters (values 127, 129, 141, 143, 144, and 157).
Plus if you’re familiar with HTML then you’ve probably stumbled across the HTML entity; which is character 160.
To remove these you need to use the SUBSTITUTE function.
SUBSTITUTE Function
The SUBSTITUTE function replaces new_text for old_text in a text string.
The syntax is slightly more complicated that the CLEAN Function but nothing drastic:
=SUBSTITUTE(text,old_text,new_text,[instance_num])
The arguments are as follows:
- text – cell containing the characters you want to replace.
- old_text – the text to be replaced.
- new_text – the new text you want inserted.
- Instance_num – if omitted, it replaces every occurrence of the old_text; Otherwise, only the specified occurrence of the old_text is replaced in the text string.
It can do everything CLEAN can do and more like this:
Taking the formula displayed in cell C4 above, in English it reads:
Locate the text in cell A4, and find the character number 127, and replace it with nothing (as denoted by the two sets of double quotes).
Of course with SUBSTITUTE you first have to know the character number you're trying to clean up. It's just as well then that you can use the CODE function to find this out.
Other SUBSTITUTE Formulas
The SUBSTITUTE function isn't limited to working with the CHAR function. You can use it to replace any text.
For example, let's say we wanted to replace the text 'Iron Man' with 'Captain America' in this text:
Cell A1: Iron Man is the greatest Avenger
Your SUBSTITUTE formula would be:
=SUBSTITUTE(A1,"Iron Man","Captain America")
And now it reads:
Cell A1: Captain America is the greatest Avenger
I'm not sure my 4 year old would agree with my changes so it's just as well he can't read yet!
What About TRIM
Now you might think that the TRIM function can replace blank spaces at the end of text, but unfortunately it can’t handle these special characters.
TRIM is only good for removing the standard space character number 32 from text. But don’t forget it ignores single spaces between words.
moiz
Thanks for the tip.Can the substitute function be used to change letter case in XL? If not, is there an alternate way?-moiz
Mynda Treacy
Hi Moiz,
To change the case of letters you need to use:
UPPER – all upper case.
LOWER – all lower case.
PROPER – for the first letter in each word to be capitalised.
Kind regards,
Mynda.
Piotr
Good to know about characters which are blank spaces – thanks!
Once upon a time I had an adventure with character 160 when the data were taken from some accounting software…
Mynda Treacy
🙂 You’re welcome, Piotr. I can only imagine the fun you had detecting 160!
TomD
Learned a lot — thanks a bunch!
BTW — what was the function you used in columns B & E to get the length? I’m still learning.
Tom D.
Mynda Treacy
Hi Tom,
Thanks for your kind words.
The function to get the length is LEN. It goes like this:
and it’ll return the length of the text in cell A1.
Too easy 🙂
Kind regards,
Mynda.