Power Query
Power Pivot
May 26, 2022
Hi all. I receive monthly payroll data and one of the tables I use is dd every month. The data is staff structure, so there are always duplicate values. I need the query to remove duplicates from Personal Reference (which needs to be a Unique Identifier) and only keep the version where the Date is the most recent date. How do I do this?
Power Query
Power Pivot
May 26, 2022
Hi. Please find attached the output I get from my Power Query.
I need it to remove duplicates on Personal Reference:People (column I) so that I can use this as the Unique Identifier. I need it to remove all those versions and just keep the latest version.
So, you can see that Personal Reference:People number 0000025 appears each mt, and I would only like to keep the latest version (so date 01/02/23):
01/01/2023 | 0000025 | KB |
01/02/2023 | 0000025 | KB |
01/04/2022 | 0000025 | KB |
01/05/2022 | 0000025 | KB |
01/06/2022 | 0000025 | KB |
01/07/2022 | 0000025 | KB |
01/08/2022 | 0000025 | KB |
01/09/2022 | 0000025 | KB |
01/10/2022 | 0000025 | KB |
01/11/2022 | 0000025 | KB |
01/12/2022 | 0000025 | KB |
In this example, Personal Reference:People number 5006718 last appeared on the 01/12/2022 file so this would be the record we would need to keep:
01/04/2022 | 5006718 | EB | |
01/05/2022 | 5006718 | EB | |
01/06/2022 | 5006718 | EB | |
01/07/2022 | 5006718 | EB | |
01/08/2022 | 5006718 | EB | |
01/09/2022 | 5006718 | EB | |
01/10/2022 | 5006718 | EB | |
01/11/2022 | 5006718 | EB | |
01/12/2022 | 5006718 | EB | 05/12/2022 |
Trusted Members
October 18, 2018
If I understood you correctly, I first grouped your table by Personal Reference and Max Date. I then merged (joined) that table back onto the original table to give you only the data for the Max Date for the Personal Reference
See the attached. I was unable to open your query as it was linked to your PC.
Power Query
Power Pivot
May 26, 2022
That is amazing, Alan! Thank you so much for your help.
Sadly, there's one last bug. PAYROLL_Structure returns 169 unique rows, but that increases to 178 rows with Merge1. Looking at it, that's because 8 out of 169 Personal Reference:People are not unique as those 8 people have more than one role within the organisation (something I hadn't realised before doing this!)
Can you suggest any way around this, because otherwise I'm stuck without a Unique Identifier for example, is it possible to add a _1 or _2 to that final data set where there's a duplicate?
Many thanks!
1 Guest(s)