Hello,
I'm working on an API connection for COVID tracking.
In Column 1 [Country_EN], I have many values:
Ireland
United Kingdom
United Kingdom
United Kingdom
United Kingdom
United Kingdom
United Kingdom
United Kingdom
etc.
The reason there are many United Kingdoms, is each row represents a region. Column 2 is [1/1/2020], column 3 is [2/1/2020] and so on. I need to keep all the columns as is, however, I need to group all the United Kingdoms in [Country_EN] and sum those values in the following columns. Essentially I'll be left with only two rows -
Ireland
United Kingdom
Any help is greatly appreciated! Thanks
Hi Adam,
Without actually seeing your data I can't give the best answer. You could perhaps use Power Query though that maybe overkill. You could use SUMIFS or a PivotTable.
Please supply the workbook, or at least a sample with the identical data structure (columns,rows).
Regards
Phil
Hi Adam,
You can use the Group By tool in Power Query to group the data based on the Country column.
However, you said you have the dates going across the columns...depending on how you plan to analyse and summarise this data, I recommend you also use Power Query to unpivot it so you have a date in one column and the value in another. Then you can use PivotTables and SUMIFS formulas the way they were intended. i.e. working with tabular data.
Mynda
Thank you both for responding. I'm really hoping I can keep the file light and handle most (if not all) of the work in Power Query, but I'll trust your judgments. I've attached the file, if you wouldn't mind taking a peek. I've also noticed, at least on my end, when I refreshed this morning it did not pull in all the current data. The CSV online says the most recent date is June 7, yet the file only updates to June 6. Could there be some sort of caching issue on my end?
Thanks again for your time!
With regards to the column count issue, it seems a limit of 140 was hard-coded into the Source step. I've learned that removing that section fixes the problem (Columns=140). If you've got any thoughts on preventing that in the future it'd be appreciate, too. Thanks again!
I'm pretty sure I was able to solve the problem by taking the source data, and pivoting it so all the dates were in one column and all the regions in the other, then pivoting it back, essentially, which I think should have combined the duplicates. Unless I'm mistaken
Hi Adam,
Please see attached and compare to your solution.
After unpivoting the date columns, I've grouped by Country and Date to give the final table.
I've removed the Columns=140 too.
Regards
Phil
Hi Adam,
There is no way to prevent the 'Columns=n' clause being automatically entered in the query. You just have to remember it's there and if you're expecting more columns in future, then remove it manually from the M code.
I agree with Phil's solution. You should not be pivoting the data again in Power Query. This is a job for the PivotTable or your later analysis. If you have your data already pivoted then it will be difficult to use Excel's built in tools and functions like SUMIF/S, COUNTIF/S and PivotTables and Slicers etc.
You should always start with tabular data. That's what Power Query is designed to output.
Hope that points you in the right direction.
Mynda
Thank you both Mynda and Philip - you're rockstars. Philip your data matched what I was able to come up with and I couldn't be happier. Also, Mynda, your tabular data page is great. I'm going to share it with my boss for future reference to help us align in our goals.
Thanks again, very much.
no worries 🙂