Hi Mynda,
Refer to Conditional Column w/ Lookup from a 2nd Table
I'm looking to do the same thing as Digalo asked above so hope this is how I re open this tread?
Is there a method to do this with a much larger data set in the "lookup" table in your example file.
I can spend the time to build out a massive step using M for an add conditional column but I presume there is a way as i want to re use the look up table on a near daily basis for each time an get a new data set for processing.
Cheers Tony
I would do this with a parameter query as shown in the attached. Create a new table of key words. Insert a column and filter for the key word with an if Text.Contains statement.
Thanks Alan,
I appreciate the input but can't see what the parameter key achieves and appears to my PQ newbie eye as just a third table so I might elaborate a bit more as my explanation was not great.
I have a data set with items that are assigned by brand (Make Column) and depending on the model details (Model Column) they are also assigned into a Category (Category Column) and then a sub category (Sub Category Column) and these 4 columns are part of 20 columns needed for each item in the data set. I have extracted sample data and build out a reference list of brands to use in a conditional look up for the Make and it is over 1,000 lines and likley to keep growing slowly.
We have around 20 categories and they have 350 sub categories. The new data I am collecting uses similar but not matching sub categories to ours so I have mapped these and created a 3 column table that I also want to use in a conditional look up to find their sub cat in a text string and then assign ours in the new column.
While both of these tasks can be done building out a super large conditional look up I'm assuming PQ will allow me to refer to these data sets so they can stay as tables and be updated and so I can also use them on other builds.
Digalo's example has his look up table in the same sheet as his source data, but I can't work out how set up my query to connect to my reference tables that are not in the same sheet or in the same work book.
Hopefully someone might have done this or similar before.
Sorry if that is confusing.
Thanks again for any feedback.
Suggest you upload a sample of your data that is representative of your actual data. In this manner we can work with what you have and not Digalo's data and better understand your situation without having to guess at what you file looks like.
Hi Alan,
3 sheets attached
Sample source data that has my cleaning query
Refence table for "Makes", i have cut this down to about a dozen examples but is normally passing through 1,200 lines and will adjust over time as I know I have some more cleaning work to get it right
Refence table for "Categories and Sub Categories", this one is pretty much the full list but may change slightly over time.
In the sample data, my next steps are:
Add a new new column for Make from the Make and Model column using the Makes reference list
Add a new new column for Cat,SubCat from the Make and Model column using the Cat,SubCat reference list
Appreciate any ideas
This does not look like a PQ issue but rather a VBA type solution. I will be looking into writing some VBA code later today to address this.
It may be more efficient using Arrays but that is not my forte. I will do this with loops which will be a bit slower. I envision using the the Instr function within VBA.
More to come later today.
Alan
For the make and Model, still in PQ
#"Added Conditional Column14" = Table.AddColumn(#"Renamed Columns - Make and Model", "Make", each if Text.Contains([Description], "CATERPILLAR") then "Caterpillar" else if Text.Contains([Description], "JOHN DEERE") then "John Deere" else if Text.Contains([Description], "VOLVO") then "Volvo" else if Text.Contains([Description], "KOMATSU") then "Komatsu" else if Text.Contains([Description], "BOBCAT") then "Bobcat" else if Text.Contains([Description], "GENIE") then "Genie" else if Text.Contains([Description], "TOYOTA") then "Toyota" else if Text.Contains([Description], "JLG") then "JLG" else if Text.Contains([Description], "JCB") then "JCB" else if Text.Contains([Description], "CASE") then "Case" else if Text.Contains([Description], "FORD") then "Ford" else if Text.Contains([Description], "MACK") then "Kenworth" else if Text.Contains([Description], "INTERNATIONAL") then "International" else if Text.Contains([Description], "MITSUBISHI") then "Mitsubishi" else if Text.Contains([Description], "KENWORTH") then "Kenworth" else "Other")
Note that this is case sensitive.
Cannot help with other issue as there are to many. VBA may be a solution, but it may take a long time to process a thousand rows.
Thanks again Alan,
FYI - I ran the merge tables with a real data set of 500 lines and then tried the full list of Keywords for Makes and Categories and each of those done separately grew out about 600,000 & 400,000 rows of data. It ran surprisingly quickly but also highlighted the problem in my keywords data such as sub categories for Cabin and also Cabin & Chassis that are both relevant but create two entries so in the end makes for more cleaning.
Thanks for conditional column code, that list of makes is just on 1,200 so I will test a Concat built out code for all 1,200 and see how that goes and will keep looking for a solution.
Cheers
Hi Tony,
You can use List.Intersect to retrieve the make
List.Intersect( { Text.Split(Text.Proper([Make and Model])," "), Makes[Make]}){0}
But the problem with the Category and SubCategory, as you've discovered, is that looking up a word can return multiple matches e.g. Auger matches Auger and Auger Bit; Boat matches Boat, Boat Engine, Boat Trailer etc. I haven't figured out a way to resolve that problem. Without knowing which part(s) of the Make and Model string to lookup, it's very difficult.
Regards
Phil
Amazing Phil,
Thanks for that, works a treat and has shown me about half a dozen new steps / tricks / methods that I would have taken a long time to find.
Awesome.
Cheers Tony
No worries.