A common task in Excel is to compare two lists to identify which items are duplicates, different or missing from one list compared to the other.
We can use Power Query* to easily compare two lists like these staff lists below:
For referencing purposes in this tutorial I’ve formatted the names that are different in each list above in red.
Tip: I recommend you format your lists in an Excel Table before loading into Power Query.
*Power Query is available in desktop versions of Excel 2010 and 2013 as a free add-in which you can download here. In Excel 2016 it’s available on the Data tab in the Get & Transform group.
Power Query for Excel 2010 and 2013 Office 365 is only available with Office 365 ProPlus and Enterprise licences.
Excel Compare Two Lists with Power Query
Step 1: Load the tables into Power Query.
Select any cell in Table1 > go to the Power Query tab (Excel 2016 Data tab) > From Table.
This opens the Power Query editor window. Go to the Home tab > Close & Load To (as shown below):
In the Load To dialog box select ‘Only Create Connection’ > Load:
Now you should have one query listed in your Workbook Queries pane:
Repeat for Table2 so you have two queries in your Workbook Queries pane:
Step 2: Merge the Queries
Now we need to merge the queries. Right-click on one of the queries in the Workbook Queries pane > Merge:
This will open the Merge dialog box. Table1 should already be listed (because it was the one you right-clicked), so you just need to select Table2 as the second table you want to merge.
Then select the columns in each table that you want to match. The selected columns will be highlighted in green fill. If you have more than one column you want matched then hold down CTRL and select them in order so that the column numbers, which are automatically inserted, correspond to one another as you can see below.
Lastly you need to choose the join kind. There are 6 different join kinds but we’re only interested in the last 3.
- Inner – will return only those rows that are present in both tables i.e. the rows that match or duplicate.
- Left Anti – will return rows that are present in the first table but not in the second table.
- Right Anti – will return rows that are present in the second table but not in the first table.
Let’s take a look at each one in turn.
Note: if you don’t have a drop down list for Join Kind you will need to upgrade your Power Query add-in to the latest version.
Inner Join Kind
Continuing on from above, select Inner in the Join Kind list > click ok in the Merge dialog box. The Query Editor opens and you can see the results of Table1 but the results for Table2 are in ‘NewColumn’:
Click on the double arrow beside NewColumn to expand it. Uncheck ‘Use original column name as prefix’ > Click OK.
Now you can see the results of both tables (image below) but notice there are only 17 names because 4 names (2 in each table) have been omitted since they didn’t match.
So we can see the Inner join returns a table of only matching, or duplicate rows and removes any rows that don’t match.
Left Anti Join Kind
Next repeat the process except this time choose ‘Left Anti’ in the join kind list. This will return rows that only appear in the first table and not the second table as you can see in the Query Editor below:
Expanding NewColumn will return nulls since those names aren’t in the second table:
You can select the last two columns and press DELETE to get rid of them.
Right Anti Join Kind
Lastly repeat the process and choose Right Anti in the join kind list. Right Anti returns rows that appear only in the second table, not the first.
The results in the Query Editor are a bit different this time since there aren’t any results for Table1 (which are the first two columns shown in the Editor):
However, once you expand ‘NewColumn’, which is the second table, the items appear:
Now you can just select the first two columns and Delete them from the query so you’re left with just those names missing from the first list.
Download the Workbook
Enter your email address below to download the sample workbook.
kevyiw
Great Explanation (& feature)– I discovered that Power Query added this capability (not sure when it was added). This is great for identifying the rows that do not match. One question, is there a way to compare two tables and highlight the “affected” columns/fields to make it easier. I have a list where I am comparing over 20 columns. While in most cases the entire row is not present in the other. However, sometimes there is mismatch of some other data field(s). I can merge the two anti joins and sort so they appear in two consecutive rows, but still requires manual process of reviewing both rows. It would be nice to have the fields highlighted to know where to focus — example:
Address List A
123 Main St.
Address List B
123A Main St.
In example above the other 20 columns that matched would NOT be highlighted both Address columns in both list (A & B) would be highlighted
Mynda Treacy
There’s no functionality in Power Query to highlight the differences, but you could load the table to Excel and use Conditional Formatting to colour code the differences.
Kamran Ali
Hi,
It is wonderful….
I used it to compare tables of contents of two different books of the same topic/subject.
Power Query compared it effectively.
But there is a problem that it only compare exactly the same data.
E.g. I want it to compare the word “Interpretation” in “Table 1” with “Interpretation Clause” in “Table 2”.
Please let me know, how is this possible?
Thanks…
Catalin Bombea
Hi,
Power Query has a fuzzy match functionality. When you match 2 columns, expand advanced options and you should be able to see fuzzy matching options, where you can define a custom match percentage for example.
J
Thanks. This is really helpful.
I had a cost report and then got sent another cost report with the same headings but the totals are different. I’m trying to find out what rows in the costs report are the same, what has been added to the new cost report and what has been removed from the old cost report. I’ve tried doing the above but when I add the total of Inner Join Kind to Left Anti Join Kind and to then Right Anti Join Kind I’m getting a total which is more than it’s original data. I’ve tried looking for duplicates but there are none.
Any ideas why this happening?
Catalin Bombea
Hi,
What fields are used as keys for matching? Is there an ID column, or there are multiple fields that makes the record unique?
Normally, after merging 2 tables based on one or more key fields, you should expand only the Value from the second table, and compare it to table 1 value in a calculated column, to return True or False (if they are equal or not).
When expanding, if there are multiple matches, you will have options to expand into rows, it will not aggregate the values.
Carlos
This was VERY helpful, really useful information.
Catalin Bombea
Thank you, glad to hear it was useful for you.
Regards,
Catalin
Marie Del Grande
When I merge the 2 tables. Common names are combined but other names are lost and not kept in the spreadsheet
Mynda Treacy
Hi Marie,
Do you have a new column added called ‘NewColumn’? If so, you can click the double headed arrow on the NewColumn header to expand the columns.
Mynda
Thamer Alqershi
Hi,
Can’t I create a query with a filter which should be linked to a drop list? For example, I will have a drop list with some countries names and I want my query table to be adjusted based on the country name selected.
Mynda Treacy
Yes, that can be done with a PivotTable.
Michele George
Thank you for this post 🙂 I am just wondering why null appears in the data when you do a right anti join?
I am trying to use merge on power query to find errors between two tables of data (all the data in both tables should be the same so an error is when the data changed from one table to another). The ID numbers are the same in both tables, but there are many other columns with information that should stay the same in both tables and sometimes changed. This is a long shot but do you have any idea how to use merge on power query to find differences in data where the ID numbers correspond to each other?
Mynda Treacy
Hi Michele,
Nulls are empty cells in Power Query. They represent rows where you have mismatched data. If you use the Full Outer join you will see the mismatched data in both tables. If you’re still stuck please post your question and a sample Excel file in our Excel forum where we can help you further.
Mynda
Rich
I am new to Power Query but let’s say that the ID numbers were amounts and you wanted to show the difference between the 2 tables by customer. I guess we could create a helper column and use index and match or vlookup to calculate the difference but can Power Query perform this automatically?
My 2 tables have thousands of rows so another thought would like to create 2 Pivot Tables in Power Query but wanted Power Query to calculate the difference between the same customers appearing in both pivot tables. Any ideas?
Catalin Bombea
Hi Rich,
You should merge with Power Query those 2 tables, and do the calculations inside Power Query.
If you need help on your project, please upload a sample file with your tables on our forum, we will help you process them.
Catalin
Andrews
Hi,
I recently migrated from Office 2013 to office 2016.
I cannot find the JOIN KIND option when trying to do a Merge.
As a result it’s not clear how to do anti Joins in Excel 2016.
Spent some time on Google Search, but did not get any good resource.
Would you know how to do this?
Mynda Treacy
I suspect despite having a new install of Office 2016 that the version of Power Query bundled with that install is an older version. If you are an Office 365 customer then you would be entitled to updates to Excel and therefore, Power Query. The timing of the updates is based on your update channel (current, deferred, first release etc.). You can find out the updates for your channel here: https://technet.microsoft.com/en-us/office/mt465751.aspx?f=255&MSPPError=-2147217396
If you’re not an Office 365 subscriber then unfortunately you cannot update Excel 2016 to get the latest Power Query join options. If your Merge dialog box looks like this:
Then you’ll only have the limited join functionality of the older version, which is the equivalent of an inner join where it only includes matching rows, or if you uncheck the ‘Only include matching rows’ box it’s the equivalent of a left-outer join, where all from the first table and only matching from the second table are merged.
Mynda
Chris Wagner
Thanks Mynda,
This exercise presented some powerful tools with very useful applications.
Mynda Treacy
Thanks, Chris. Glad you found it useful.
Grainne Duggan
Hi Mynda
Thanks for this fabulous post. I had to compare two lists from Access and Excel, with different fields and data types and then import the combined result into Access. A manual comparison would have taken a day or two. I not only compared the lists but changed column order and data types. I even took the merged table and merged it a second time rather than doing a lookup table for province/state abbreviations. It took me under an hour – love it!
Mynda Treacy
Wow, what a wonderful result, Grainne. I’m glad you’re finding Power Query useful.
Mynda
Nate O
Thanks for sharing. This is very helpful. I have to do exercises like this all the time.
Mynda Treacy
Glad it was useful for you, Nate.
Ricardo
Hello Mynda
I enjoy Reading your blog and the solutions to problems you work on. Keep it up. I find this solution, using PowerQuery, however, too “bulky”. It is much faster to “Compare” files using MS Word. I understand you must use Excel but for ease and speed, MS Word the way to go.
Mynda Treacy
Hi Ricardo,
You use Word to compare lists? I can’t imagine how.
Mynda
Ricardo
Hello Mynda. I first copy the columns/fields to compare to Word documents Mynda1 and Mynda2. In MS Word –> Review –> Compare load both files and it gives the differences quickly. I understand that this does not show the capabilities of powerquery but it is much less cumbersome and quicker.
Mynda Treacy
Hi Ricardo,
Thanks for expalining. That’s a lot of copying and pasting which is fine for a one off job, but if those lists change/update and you want to compare them again then you have to repeat all the steps, whereas in Power Query you simply click the Refresh button and it picks up the new data and produces a new report.
Mynda
Asif Khan
Thanks Mynda,
First of all I am big fan of your site and of course Power BI features are just awesome. Secondly this Power Query this features is great and I wonder what others functions we can do with Power Query without which it takes hell of time to to get the solution.
Mynda Treacy
Cheers, Asif. Glad you liked the post and Power Query. There are tons of ways you can use Power Query to get and clean data and I cover them in my Power Query course.
Mynda
Ron MVP
Funny thing, this subject was discussed in the MS answers forum:
They came up with 4 macros that brought the comparisons to character level. If you are looking for character comparisons, take a look here:
https://answers.microsoft.com/en-us/office/forum/office_2013_release-excel/how-to-compare-2-columns-and-highlight-the/1129b03c-08b2-4d86-9994-2cd62d3f5ebe
Mynda Treacy
Thanks for sharing, Ron.
Alissa Wright
Hi Mynda! I have the latest version of Office 365 and Excel, but I didn’t have the option to choose the join kind. Any ideas why?
Great seeing you at the Data Summit last week in Bellevue!
Mynda Treacy
Hi Alissa,
It was great seeing you too 🙂
If you don’t have the Join Kind then you must have an earlier version of Power Query. There are different release speeds for Office 365 so you may be on a later one, however this has been out since last year (I think) so you should have it by now, assuming you’re on automatic update.
You could try forcing the update by downloading the latest version here:
https://www.microsoft.com/en-us/download/details.aspx?id=39379
Hope that helps.
Mynda
Peter Stratton
Really good, thanks very much!
Mynda Treacy
You’re welcome.
Regards
Mynda
Rita
I’m becoming a bigger and bigger fun of Power Query. And your blog. 🙂
Mynda Treacy
Hi Rita,
Great 🙂 Glad to hear that you are enjoying the site.
Regards
Mynda