A while ago I wrote a routine using the .Find method to find missing numbers in a sequence. So if we had 1, 3, 5, the missing numbers are 2 and 4.
I had some feedback that as the sequence grew longer, the .Find method became increasingly slower, to the point that your hair grew quicker.
I am surprised to hear this as I’d expect a routine provided by Microsoft to be as least as fast as anything us mortals could write.
I chose to use .Find really for its ease of use, I wouldn’t have to write any loops to check each number. But when the length of time it takes to list missing numbers becomes frustratingly long, it’s time to look for another way.
One of our readers provided some code (thanks Bryan) where he compared two ways of listing missing numbers in a sequence, the first using .Find and the second using loops to check a collection created from the number sequence.
What the code revealed was that by the time you got to 10,0000 numbers in the sequence, the difference between the two approaches was already noticeable .Find was taking 50+ seconds, whereas looping through a collection took less than 2 seconds.
I then received a message from another blog reader, Dennis, saying he was trying to find missing numbers in a 90,000+ long sequence of numbers, with .Find taking over half an hour
So, I thought I’d better check this out myself and thanks to the code Bryan provided I came up with this.
What The Code Does
- Find missing numbers (integers) in a sequence
- The lowest number in the sequence is the starting point, the highest number is the end point
- Can include negative integers
- Numbers can be repeated
- No sorting of the sequence is required
- The sequence (input range) can contain empty cells
- The input range can be non-contiguous
Testing
The purpose of this testing was to compare the .Find method against an approach looping through a collection created from the range of cells containing the number sequence.
Whether or not this approach is totally scientific isn’t really important, what is important is that the two approaches are compared in the same way. Apples with apples.
The most important thing is what you, or the user, experiences.
If one approach is lightning fast and the other is slower than a snail going uphill, then we don’t really need to make sure our testing is accurate down to the microsecond.
That said, the testing routines use a function called MicroTimer, provided by Microsoft for accurate timing.
It uses Windows API calls to the computer's high-resolution timer and is faster and more accurate than the VBA Time function. It can measure intervals down to microseconds.
The results of these tests will be different each time you run them. Things like the number of missing numbers, whether you are using Debug.Print or not, and other processes running on your computer will mean each test will be slightly quicker or slower than the last one.
But to give you an idea of the variation I’m talking about, in four runs of the Collection routine with 10,000 numbers the code executed in 1.73s, 1.78s, 1.75s and 1.70s. So unless you are The Flash, you won’t even notice this.
The testing code was run in Excel 2013 and 2010, and each routine was run four times. Firstly with a sequence of 10 numbers, then 100, 1 000, 10 000 and finally 100 000 numbers in the sequence. The execution times for the 4 runs was averaged.
Actually that’s a bit of a lie. By the time I got to 100,000 numbers in the sequence the .Find method was taking so long I just ran it twice. But I did test the others four times each.
The Testing Routines
The testing code uses the RANDBETWEEN function to randomly create numbers between 1 and the largest number in the sequence (10, 100, 1 000, 10 000 or 100 000) in a column starting from A1.
Because RANDBETWEEN is volatile, if we wrote the missing numbers to the worksheet, RANDBETWEEN would recalculate each time we did this, changing the numbers in our sequence and this would mess up our test.
So the missing numbers, and any other output we want, is written to the Immediate window in the VB editor using Debug.Print. To show the Immediate window in the VB Editor either press CTRL+G or go to the View menu ->Immediate Window
Results
With only 10, 100 or 1000 numbers in the sequence, the difference between .Find and the collection/loops routine isn’t perceptible, all the results coming back in under a second.
But with 10,000 numbers in the sequence, .Find takes about 51 seconds, the collection/loop code finds all the missing numbers in less than 2 seconds!
With 100,000 the difference is even more staggering with .Find taking more than an hour and 20 minutes to find all the missing numbers in the sequence. The collection/loop code did it in under 9 seconds.
I was amazed.
I don’t know what is going on in the .Find method but beyond 1000 numbers in a sequence, it seems to be incredibly slow at this particular job. .Find is very good if you want to do a one-off location of a single thing within a range, but for this particular application, looping is by far the better approach.
Below is a table of results (measured in seconds) for each of the tests starting with 10 numbers in a sequence and working through to 100 000 numbers.
.Find | Collection/Loop | |||
2010 | 2013 | 2010 | 2013 | |
10 | 0.0025 | 0.0054 | 0.0026 | 0.0020 |
100 | 0.029 | 0.039 | 0.013 | 0.016 |
1 000 | 0.664 | 0.769 | 0.237 | 0.173 |
10 000 | 50.87 | 50.38 | 1.74 | 1.76 |
100 000 | 5037.68 | 5072.39 | 8.83 | 8.15 |
The Finished Code
After the testing I had to modify the code so that it gave us results we could use in the workbook, not just printing to the Immediate Window. All the code can be found in this workbook.
When you open the VB Editor (ALT+F11) you’ll see 2 modules named Testing_Code (guess what’s in there!), and FindMissingNumbers.
Testing_Code contains the MicroTimer function, and two subs called Using_Collection_Loops and Using_Find. You can run either of these subs to do your own testing.
The array lngNumValues is set to hold five values which are how many numbers in the sequence you test, i.e. 10, 100, 1000, 10 000 or 100 000. You can change these values to whatever you like.
Specifying which value in the array lngNumValues to use in testing determines how many numbers are tested. For example lngMax = lngNumValues(4) sets lngMax to 10 000, so the code will use a random sequence of 10 000 numbers.
When you run either of the testing subs, you’ll see the missing numbers displayed in the Immediate window, followed by a duration for the code execution.
How To Use the FindMissingNumbers Sub In Your Workbook
In the FindMissingNumbers module is the FindMissingNumbers sub.
On the first sheet of the workbook I’ve added a big green button, click this and the macro will run. You can also run the macro in a number of other ways. Read this article to learn how to run a macro using a shape, shortcut keys, an icon on your Ribbon, or the Quick Access Toolbar.
When the macro starts the first thing you need to do is specify the input range. You can either use your mouse to select the range, or type it in
Then you specify where you want the missing numbers to be listed. You can select a single cell and the results will be output into that column.
If you select more than one cell across a row, the results will be output across that row.
Enter your email address below to download the sample workbook.
Disclaimer – Please test the code yourself, it may not work in your environment. All code provided as is without any warranty
Option Explicit '********************************************************** ' Find missing integer numbers in a range ' Author : Philip Treacy, MyOnlineTrainingHub ' With thanks to Bryan upon whose code this is based ' For more info go to https://www.myonlinetraininghub.com/find-missing-numbers-really-fast '********************************************************** Sub FindMissingNumbers() Dim InputRange As Range, OutputRange As Range Dim LowerVal As Single, UpperVal As Single, count_i As Single, count_j As Single Dim NumRows As Long, NumColumns As Long Dim Horizontal As Boolean Dim rng As Range Dim col As Collection Dim varDummy As Variant 'Default is to output the results into a column Horizontal = False On Error GoTo ErrorHandler 'Ask for the range to check Set InputRange = Application.InputBox(Prompt:="Select a range to check :", _ Title:="Find missing values", _ Default:=Selection.Address, Type:=8) 'Find the lowest and highest values in the range/sequence LowerVal = WorksheetFunction.Min(InputRange) UpperVal = WorksheetFunction.Max(InputRange) 'Ask where the output is to go Set OutputRange = Application.InputBox(Prompt:="Select where you want the result to go :", _ Title:="Select cell for Results", _ Default:=Selection.Address, Type:=8) 'Check the number of rows and columns in the output range NumRows = OutputRange.Rows.Count NumColumns = OutputRange.Columns.Count 'If there are more columns selected than rows, output is to go horizontally If NumRows < NumColumns Then Horizontal = True 'Reset the number of rows to 1 so that output is into first row NumRows = 1 Else 'Reset the number of columns to 1 so that output is into first column NumColumns = 1 End If Set col = New Collection ' First add all items to collection (ignore errors when duplicates are added) For Each rng In InputRange On Error Resume Next col.Add rng.Value2, CStr(rng.Value2) On Error GoTo 0 Next rng count_j = 1 ' Loop through every possible nubmer in range (error returned when number not found) On Error Resume Next For count_i = LowerVal To UpperVal varDummy = col(CStr(count_i)) If Not Err.Number = 0 Then 'Output the missing number to the sheet If Horizontal Then OutputRange.Cells(NumRows, count_j).Value = count_i count_j = count_j + 1 Else OutputRange.Cells(count_j, NumColumns).Value = count_i count_j = count_j + 1 End If Err.Clear End If Next count_i On Error Resume Next Exit Sub ErrorHandler: If InputRange Is Nothing Then MsgBox "ERROR : No input range specified." Exit Sub End If If OutputRange Is Nothing Then MsgBox "ERROR : No output cell specified." Exit Sub End If MsgBox "An error has occurred. The macro will end." End Sub
Venkatesh
Very useful tool ! Thanks a lot !.
Philip Treacy
No worries.
MME
Goodevening,
Thank you so much for your code and clear explanation. I was looking for this solution and this really helped me A LOT!
At the moment I’m trying to amend your code for my needs, but do not know exactly how to proceed.
I want to report the missing numbers but only those missing numbers that start with (1st digit) the same digit as the value in Cell C1 (for example C1 = 2)
The numbers that need to be checked are between 10000 till 99999.
So if the value in C1 is 2, I would like to report all missing numbers from 20000 till 29999.
I only need 25 numbers to be reported.
Can you help me to get this accomplished?
Thank you for you time in advance, I really appreciate it.
Rgds,
MME
Catalin Bombea
hi MME,
Here is the code you should try;
Option Explicit
'**********************************************************
' Find missing integer numbers in a range
' Author : Philip Treacy, MyOnlineTrainingHub
' With thanks to Bryan upon whose code this is based
' For more info go to https://www.myonlinetraininghub.com/find-missing-numbers-really-fast
'**********************************************************
Sub FindMissingNumbers()
Dim InputRange As Range, OutputRange As Range, FirstDigitRange As Range
Dim LowerVal As Single, UpperVal As Single, count_i As Single, count_j As Single
Dim NumRows As Long, NumColumns As Long
Dim Horizontal As Boolean
Dim Rng As Range, FirstDigit As Long
Dim col As Collection
Dim varDummy As Variant
'Default is to output the results into a column
Horizontal = False
On Error GoTo ErrorHandler
'Ask for the range to check
Set InputRange = Application.InputBox(Prompt:="Select a range to check :", _
Title:="Find missing values", _
Default:=Selection.Address, Type:=8)
Set FirstDigitRange = Application.InputBox(Prompt:="Select the cell with first digit :", _
Title:="Select the cell with first digit", _
Default:=Selection.Address, Type:=8)
'Ask where the output is to go
Set OutputRange = Application.InputBox(Prompt:="Select where you want the result to go :", _
Title:="Select cell for Results", _
Default:=Selection.Address, Type:=8)
'Find the lowest and highest values in the range/sequence
LowerVal = FirstDigitRange.Cells(1) * 10000 + 9999
UpperVal = WorksheetFunction.Min(99999, WorksheetFunction.Max(InputRange))
'Check the number of rows and columns in the output range
NumRows = OutputRange.Rows.Count
NumColumns = OutputRange.Columns.Count
'If there are more columns selected than rows, output is to go horizontally
If NumRows < NumColumns Then Horizontal = True 'Reset the number of rows to 1 so that output is into first row NumRows = 1 Else 'Reset the number of columns to 1 so that output is into first column NumColumns = 1 End If Set col = New Collection On Error Resume Next ' First add all items to collection (ignore errors when duplicates are added) For Each Rng In InputRange If Rng.Value2 > LowerVal And Rng.Value2 <= UpperVal Then col.Add Rng.Value2, CStr(Rng.Value2) Next Rng On Error GoTo 0 count_j = 1 ' Loop through every possible nubmer in range (error returned when number not found) On Error Resume Next For count_i = LowerVal To UpperVal varDummy = col(CStr(count_i)) If Not Err.Number = 0 Then 'Output the missing number to the sheet If Horizontal Then OutputRange.Cells(NumRows, count_j).Value = count_i count_j = count_j + 1 Else OutputRange.Cells(count_j, NumColumns).Value = count_i count_j = count_j + 1 End If Err.Clear End If 'add exit point after first 25 numbers If count_j > 25 Then Exit For
Next count_i
On Error Resume Next
Exit Sub
ErrorHandler:
If InputRange Is Nothing Then
MsgBox "ERROR : No input range specified."
Exit Sub
End If
If FirstDigitRange Is Nothing Then
MsgBox "ERROR : No input range specified for first digit."
Exit Sub
End If
If OutputRange Is Nothing Then
MsgBox "ERROR : No output cell specified."
Exit Sub
End If
MsgBox "An error has occurred. The macro will end."
End Sub
Rick Rothstein (MVP - Excel)
On my system, the following code is more than 10 times faster than the code you posted (for 999,999 cells arranged in 3 columns)…
Philip Treacy
Hi Rick,
Thanks for writing a faster way to do this. I’ll put into my Excel and give it a spin.
Regards
Phil
Rick Rothstein (MVP - Excel)
I’ll be interested in your results run on your computer (the 999,999 cells process in less than 1/2 second using your code on my computer using XL2010, not the nearly 9 seconds you reported).
Also, I forgot to mention that my code only outputs down a single column… it does not have the ability to output horizontally.
Philip Treacy
Hi Rick,
I couldn’t get your code to run when I had 3 x columns of 999,999 numbers. Each time the code generated ‘Run time error 13: Type mismatch’ on this line
I did test it on a smaller set of numbers though. It did run but I noticed an error. If you have the sequence 2,5,7 the missing numbers are 3,4 and 6. Your code outputs just 3 and 4. I tried several different sets of numbers and your code always missed out the last missing number.
Regards
Phil
Rick Rothstein (MVP - Excel)
My code will only run against a single column… that is what you posted for your example, so I figured the list would only be in a single column. If three columns, which way would the “wrap” occur… down one column then to the next column and so on OR across one row and then down to the next row and so on?
As for the 999,999 rows of data… yeah, that will overwhelm the Application.Transpose function call by quite a bit… I think 64000 may be the upper limit on the number of element that can be Transposed.
As for the missing numbers… stupid mistake on my part. The last line of code…
OutRng(1).Resize(UBound(Nums)) = Application.Transpose(Nums)
should have been this…
OutRng(1).Resize(UBound(Nums) + 1) = Application.Transpose(Nums)
Philip Treacy
Hi Rick,
Your original post said that your code was for ‘999,999 cells arranged in 3 columns’.
I’ve corrected the last line of your code above.
Cheers
Phil