Forum

Cleaning up a Merch...
 
Notifications
Clear all

Cleaning up a Merchant Name list.

7 Posts
3 Users
0 Reactions
684 Views
(@kelca)
Posts: 6
Active Member
Topic starter
 

I have a list of Merchant category code groups, merchant category codes and merchant names.  The problem is there are almost 87,000 merchant names on it and for the analysis being done we want to group all the stores from the same store brand together.  For example a store called Mark's Work Warehouse may be listed as #store number Mark's; or as Mark's #store number or some  other variation of a the name of the chain with the store number but then there are stores with Mark's in the name that have nothing to do with Mark's Work Warehouse so need to be left out of the list for this particular store.  Wal-Mart is even worse for varieties in how it is listed as the Merchant Name.

One of the group working on the analysis we are doing used filtering and searching to try to identify and get these stores name down to store brand rather then individual store but I'm trying to figure out the best way to use Power Pivot so when we repeat this with newer data, we don't have hours of work to clean up the merchant list.

What is a good way to do this?  I was looking at the Power Query video on transformation tables but not sure with the number we have listed and the variety of patterns for store names and store number that it would work.

Thanks

Karen

 
Posted : 08/04/2022 9:07 am
(@jstewart)
Posts: 216
Estimable Member
 

There's fuzzy matching in PQ. Without an example of the data, I'm not sure we'll be able to give adequate advice. 

 
Posted : 08/04/2022 11:28 am
(@kelca)
Posts: 6
Active Member
Topic starter
 

I've removed the merchant list from the overall data mix and limited it to one merchant category.

One easier example (hopefully) is No Frills, it can be listed as

(name) No Frills (Store number)

No Frills (store number)

(name) No Frills Store

Nofrills (name) (Store number)

 

There may or may not be spacing so it can be Nofrills or No Frills and at times it may be 's at the end instead of just the s.  They are all the same chain and it would be nice to know how much we spend overall at the various No Frills stores.

Thanks for any help.

Karen

 
Posted : 09/04/2022 8:18 am
Alan Sidman
(@alansidman)
Posts: 223
Member Moderator
 

In the column containing the Store Name, click on the Sort drop down.  Select Text filer.  Select Begins with No Frill and select the Or function in the window and select begins with NoFrill.  You now have all the No Frills store available for analysis.

 
Posted : 09/04/2022 6:39 pm
(@kelca)
Posts: 6
Active Member
Topic starter
 

Alan

We have done that.  I'm looking for a way that we can automate it.  With close to 87,000 individual stores on the entire list and with a lot of time spent with the filters we got down to over 23,700.  However, we don't want to have to repeat that process everytime we refresh the data (probably quarterly).  

I was hoping to avoid macros, as I have very limited experience with programming and maintaining them.  So was looking at some of the features of power query.  We are already using power query to clean up and set up DIM tables for using in power pivot.  

Thanks

Karen

 
Posted : 12/04/2022 7:25 am
(@jstewart)
Posts: 216
Estimable Member
 

Karen,

I am trying to get fuzzy grouping to work, I haven't gotten the syntax right and my version doesn't have it in the ribbon. I didn't have time to play with it this weekend, but I'm going to try and play tonight.

 
Posted : 12/04/2022 10:52 am
(@jstewart)
Posts: 216
Estimable Member
 

Hey Karen,

Fuzzy matching is your answer. I got it to work, but I think a fuzzy join of two tables would be better than fuzzy grouping. Here's a video showing exactly what you want. 🙂 Go, be the hero of the office.

 
Posted : 13/04/2022 12:37 am
Share: