January 31, 2020
Hi there, I have a few Power Queries where I want to fill a column based upon text containing in another column. I am using the nested if then else conditional column with hardcoded words (where column contains this word). This leads to a long line of if then elses which is hardly maintainable. I would like to put all those words in an separate table. But there is a priority of words which define the result, so an if then else is very useful because the processing ends after a find.
So this is an example, a menu will be defined its ingredients.
MENU
potato, beans, meatball | |
spaghetti, tomato | |
tomato, lettice | |
spaghetti, meatballs | |
baked potatoes | |
spaghetti bianco |
INGREDIENTS (with type and priority)
lettice | veggie | 1 |
tomato | veggie | 1 |
beans | veggie | 2 |
potato | carbs | 3 |
spaghetti | pasta | 4 |
meatball | meat | 5 |
If a menu contains one of the ingredients, then the menu is defined by it's highest ingredient.
Any ideas how I can accomplish this simple task?
Thx, Jeroen
October 5, 2010
Hi Jeroen,
In your Excel workbook the logic you are using to determine the type of the meal doesn't make sense. According to the workbook the type of the meal is as follows:
potato, beans, meatball | veggie (beans) |
spaghetti, tomato | veggie (tomato) |
tomato, lettice | veggie (lettuce) |
spaghetti, meatballs | meat (meatballs) |
baked potatoes | carbs (potato) |
spaghetti bianco | pasta (spaghetti) |
With potato, beans, meatball you say it's veggie because of beans - beans has priority 2.
With spaghetti, meatballs you say it's meat because of meatballs - meatball has priority 5.
With potato, beans, meatball the lowest priority wins. With spaghetti, meatballs the highest priority wins. The logic is inconsistent.
Besides, how can a meal with meatballs be veggie?
Please clarify the logic to working out the meal type.
Regards
Phil
January 31, 2020
Hi Phil,
It is just an example. There is no logic in it. Sorry for that.
The actual data is financial data where keywords are submerged in text. You can view this as expenses.
When using the conditional column with 30 keywords in a nested if then else structure, I can determine the type of expense. Keywords are used multiple times and multiple keywords are used in one record/text, so the priority defines the type of data/expenses. Example: keyword invoice means supplier (type), but if there is a keyword car and/or invoice then this means car expenses (type), so keyword car comes up higher then invoice because it is more specific.
In stead of editing the if then else structure every time a new keyword arises, I would like to add the keywords to a separate table which is used in connection with my data table. The order in the lookup table can define the priority, but I probably would like to have a priority column in case some kind of sorting comes falling out of the sky. You never know.
To make this algorithm clear, I came up with this example of a menu and ingredients and priority 1 (high) to 5 (low).
The meaning does not make sense. It's just the math.
Grtx, Jeroen
Moderators
January 31, 2022
To make it work I also changed the data set-up a little. Please see if the attached does what you had in mind. If not, perhaps a good idea to come-up with an example that does has some more realistic logic.
Note that I can't follow your logic why "Spaghetti, Meatballs" becomes "Meat" (prio 5) and not "Pasta" (prio 4). As I understand the priority number indicates 1 (highest) to 5 (lowest). So, spaghetti ranks higher than meatballs.
Answers Post
January 31, 2020
Hi Riny,
Thx, your solution does the trick by using the split column delimiter. I adapted your solution to meet the result I wanted in this example. I attached the result.
However this will not work in the dataset I am using. The textstring (column) where the keywords are in is long and there is no specific delimiter. I could use the space to split the column on every word or number, but that would create a huge table (of records).
I will try this solution on my dataset, but I am still open for other suggestions.
And I will try to think of an example which gives a better real life experience.
Thx!
Kind regards, Jeroen
January 31, 2020
Hi Riny,
I created a new example and used your technic. It was a bit of a struggle to get the right "(double) rows" deleted, but I found your line in the editor and added the Table.Buffer to my "Dutch M-language" to buffer the sort. Then the remove duplicate rows worked like a charm.
#"Sorted Rows" = Table.Buffer ( Table.Sort(#"Filtered Rows",{{"Index", Order.Ascending}, {"priority", Order.Ascending}}) ),
I will do some more testing with fuzzy logic to see if keywords can be "like" in stead of straight matching.
In the current if then else where I use "if XXX contains" I can use a subset of characters in a word, which is efficient.
When I have to make a join on keywords, then all keywords need to be present in the list, not just a few characters (subset).
It's still a work in progress, but I have more tools and knowledge now to continue building this query.
Thx!
Jeroen
Moderators
January 31, 2022
Good to see you are making progress. Perhaps that Phil's blog article (link below) about finding keywords within text strings is helpful. I believe it does exactly what you need, if I'm not mistaken.
1 Guest(s)