September 6, 2019
Hello,
I have request and I don't know if Macros can do it or not. please I need your help.
I extracted data from many website, please check below example.
Company | HQ_Z | HQ_Craft | HQ_Hoovers | HQ_BlombergPublic | HQ_[World Of Manufacturers] | HQ_TheWallstreet |
Autoliv, Inc. | Sweden | USA | China | Sweden | USA | USA |
As you see in table many countries for same company, and I want to get Trusted country via compare all extracted countries and determine which one come from many sources.
In above example, we have six values as below
Three sources say USA
Two sources say Sweden
One Source say China
so in this case, the Trusted value is "USA", as it is have the largest count of sources.
Please check attached file to get other samples.
The input will be company name and other columns contain many values and total of sources.
Out put: Three columns as below.
1- The big Matching Count >>> the count for big sources which have same values.
2- Matching Percentage >> the ratio between the big count for sources which have same values / total of sources have values.
3- Trusted Data >>> value which come from the big count of sources.
Thank you very much; your support is greatly appreciated.
Thanks;
Marsil
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
September 6, 2019
Hi Catalin,
You are amazing!
Wow! This is awesome and it works like a charm. Thank you so much!!!
Yes it is working correctly without any issue. I just have below question, please can you check.
Some sources show different Country for same company so I extract data from DB using SQL by below structure
ZID | Company | HQ_Z | HQ_Craft | HQ_Hoovers | HQ_BlombergPublic | HQ_[World Of Manufacturers] | HQ_TheWallstreet | HQ_Finviz | HQ_Alacrastore | HQ_OldCrunchbase | HQ_Investing | Total of sources |
1000451 | Ashland Inc. | United States | Covington, KY, US | India | United States | United States | USA | Netherlands | United States | 8 | ||
1000451 | Ashland Inc. | United States | Covington, KY, US | India | United States | United States | USA | United States | United States | 8 | ||
1000451 | Ashland Inc. | United States | Covington, KY, US | United Kingdom | United States | United States | USA | Netherlands | United States | 8 | ||
1000451 | Ashland Inc. | United States | Covington, KY, US | United Kingdom | United States | United States | USA | United States | United States | 8 |
The result using Power Query as below.
1000451 | Ashland Inc. | 8 | 18 | United States |
So I changed the IDs to make each row has it is own ID and I got good result as I want the Trusted Data per row then I will get DISTINCT for "Company & Country" then I will take Country which have the big count of sources as below. Do you think is there any issue will happen if I do this?
1 | Ashland Inc. | 8 | 4 | United States |
2 | Ashland Inc. | 8 | 5 | United States |
3 | Ashland Inc. | 8 | 4 | United States |
4 | Ashland Inc. | 8 | 5 | United States |
How kind you are to help me.
Thanks;
Marsil
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
September 6, 2019
Hi Catalin,
Great so I will make the ID's unique.
Last question please.
I will use this file for other types of data like I will compare Found Year, Phone, ....etc.
So I face as issue to determine the count of sources which I past it in "M column" as I use this function =COUNTIFS(R3:AD3,"*",R3:AD3,"<> ")
as it is count the characters not numbers, so can you help me to find other function work with characters and numbers please.
Or If there option on Power Query to count the sources and fill M Column ?
Thanks for helping me accomplish my goal.
Thanks;
Marsil
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
September 6, 2019
That's a huge help - thanks!
I added above formula and it is running without any issue.
I did other thing and I want to know your comment please.
I removed Changed Type step due sometimes I will want to remove column from sources or change it's names as the sources not fixed as may be I got from other sources like Amazon, yahoo finance, ...etc.
Also may be I will add more columns for sources.
I tested the result and it's okay, but I want your help to review this update in attached file to make sure there is no issue on any step in code as it is first time to make edit in Power Query.
Thanks is not enough, but really don't know what else say to you. Anyway have to say thanks again.
Thanks;
Marsil
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
You're welcome.
If works, there is nothing to review.
Code is already using Unpivot Other Columns in order to handle any new possible columns, if the source has ZID and Company.
The only issue can come from source data, if it contains columns other than HQ columns, those will be counted as HQ.
September 6, 2019
I think above issue won't happen as I removed Changed Type step to make the file don't depend on columns names.
Example I changed the columns names as below and it is working without any issue.
ZID | Company | A | B | C | D | E | F |
1000512 | Autohome Inc. | Netherlands | Netherlands | Netherlands | Netherlands | Netherlands | Netherlands |
Are u intent this or other thing? as I will depend on this file on urgent project.
Thanks;
Marsil
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
I didn't say that you will have different HQ column names, what I said is that you should not have other column than HQ columns, regardless of the HQ columns names.
For example, if besides HQ column you have another new column like "Company Manager", or "Count of Floating Ducks", the query will not return the correct counts.
September 6, 2019
Ofcourse, when I work on this, I will work per feature foe example after I finish HQ and get the final result, I will start with another features like Phone.
I won't work in different values on the same time.
below table will cause issue.
ZID | Company | Phone_Google | HQ_Yahoo | email_Craft |
1000512 | Autohome Inc. | 2938373 | China | XXX@fre.com |
But below one is correct
ZID | Company | email_Google | email_Yahoo | email_Craft |
1000512 | Autohome Inc. | YYY@fre.com | XXX@fre.com | XXX@fre.com |
Thanks again. Couldn't have done it without you.
Tahnsk;
Marsil
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
September 6, 2019
First, I wanted to send my thanks for all your assistance.
Here all details about what I will do, please check.
I build profile for many companies contain many features like Company Headquarter, Phone, Address, Founded Year, Stock Exchange Name, Email, EmployeesCount, Industry, FiscalYear, Logo and MarketCap
And I do this via extract these data from many sources like The Wall street, Blomberg, …etc. and will verify the accuracy by comparing the data which I extracted it.
So the file which you sent me will help on comparison to get the Trusted Data, I think all of features can use the same Trusted Data process.
But I think there are two feature won’t be done with Trusted Data process, please check below.
- As not all sources write the address with same format.
Ex: CompanyName: News Corp
The Wall street_Address: 1211 Avenue of the Americas New York New York 10036 United States
Alacrastore_Addres: 1211 Avenue of the Americas New York, NY 10036 United States
Marketscreener_Address: 1211 Avenue of the Americas New York, NY 10036
Owler_Address: 1211 Avenue of the Americas New York, New York10036
- Logo, I got the image URLs and I downloaded it.
Please can you help how I can do this as I try to get any idea.
Thank you very much; your support is greatly appreciated.
Thanks;
Marsil
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
September 6, 2019
Wow! Thank you so much. This is awesome.
I checked the result which you sent me and it is great.
I want to test other address but I can't use the file as I face this issue "Expression.Error: The import Table.FuzzyNestedJoin matches no exports. Did you miss a module reference?"
So I attached some images, please check it and let me know how I can solve it.
Deep thanks Catalin.
Thanks;
Marsil
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
Make sure first that your version of PQ has the fuzzy match function.
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
September 6, 2019
Result is very good. thanks so much Catalin.
IT team installed latest version of power query but I still face the same issue, I checked Google and I found some news about Fuzzy Matches with the Power Query is valid for office 360 subscribers only, is this correct? please confirm.
Thaks;
Marsil
1 Guest(s)