Phil hello. I didn't succeed yet to solve that question. Let me know if the files I gave you in the previous answer are available and OK for you.
Hi Bruno,
This is getting very messy and complicated. I've looked at the 3 files you've most recently supplied and unfortunately that just confuses me a little more. Let me explain.
The last file I provided to you pgt-Try02.xlsx was based on the original file you gave me. This included loading your source data file from a location on your PC which was C:UsersBruno MelkiDocuments Bruno ProjetsExcel Excel_mini_Projects Corona Data_Extract_Python Corona_Data.xlsx
The word document you most recently provided shows that you are getting an error trying to load that file. The only reason for this would be that the file no longer exists at that location. So you have either moved it, renamed it, or deleted it.
Looking in your Word doc you show a screenshot of the file you are using which you can see isn't the same as the file/path you gave me in your first file.
You've now got a folder called Data_Israel after the Corona folder, this was not in your original source path - hence the error. Change the path and this particular error will go away.
In the 3 files you've most recently given me, neither of the Excel workbooks are called Corona_Data.xlsx and this is supposed to be your source data file. I assume that you intend to use the CSV file you provided as source data, but if I change the path/file my query is loading are we going to end up with another problem like the 1st one I just explained? Because not only has the file/path name changed, the method of loading that file has changed because it's CSV and not XLSX.
Also note that the CSV isn't the same format as the Excel file you were using as source data. The Excel file has an extra Index column so there's another potential problem with any query I write that then does not match file(s) at your end.
Upon loading this CSV as source, I now have new problems. The column Cumulative_verified_cases contains values like <15 which cannot be converted to a number. If it cannot be converted to a number that column can't be used to work out the daily change that you are after. You will need to fix this before you can go any further.
For the sake of this example I've replaced all errors with the value 0.
So the first part of the query looks like this
// Change this path to load your own source file
Source = Csv.Document(File.Contents("D:tempBrunoMelkicorona_city_table_ver_006.csv"),[Delimiter=",", Columns=8, Encoding=65001, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers", {{"Cumulative_verified_cases", Int64.Type}, {"Date", type date}}),
#"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type", {{"Cumulative_verified_cases", 0}}),
#"Changed Type1" = Table.TransformColumnTypes(#"Replaced Errors", {{"City_Code", Int64.Type}}),
#"Sorted Rows" = Table.Sort(#"Changed Type1",{{"City_Code", Order.Ascending}, {"Date", Order.Ascending}}),
So with all of that done I can now proceed to get the daily change in infections that you are looking for.
The part of the query that does this is
BufferedValues = List.Buffer(#"Sorted Rows"[Cumulative_verified_cases]),
BufferedGroup = List.Buffer(#"Sorted Rows"[City_Code]),
RT = Table.FromList(Daily_Beat(BufferedValues, BufferedGroup), Splitter.SplitByNothing(), {"GRT"}, null, ExtraValues.Error),
It is creating 2 buffered lists and passing these to the Daily_Beat function, which returns a list.
This list is then combined with the original table, and some final sorting and column re-ordering is done to get the data the way you'd want to see it.
Columns = List.Combine({Table.ToColumns(#"Sorted Rows"),Table.ToColumns(RT)}),
#"Converted to Table" = Table.FromColumns(Columns,List.Combine({Table.ColumnNames(#"Sorted Rows"),{"GRT"}})),
#"Reordered Columns" = Table.ReorderColumns(#"Converted to Table",{"City_Name", "City_Code", "Date", "Cumulative_verified_cases", "GRT", "Cumulated_recovered", "Cumulated_deaths", "Cumulated_number_of_tests", "Cumulated_number_of_diagnostic_tests"}),
#"Sorted Rows1" = Table.Sort(#"Reordered Columns",{{"City_Code", Order.Ascending}, {"Date", Order.Ascending}})
With regards to how the Daily_Beat function works, it takes the 2 lists which are the city code and the daily cumulative infections and uses List.Generate to calculate the daily change in infections for each city code.
I've added comments to the function for you to explain what it's doing but you can read up on List.Generate here
https://docs.microsoft.com/en-us/powerquery-m/list-generate
To gain an understanding of the functions used in the query and Daily_Beat function refer to
https://docs.microsoft.com/en-us/powerquery-m/power-query-m-function-reference
Hope that goes some way to clearing up the problems.
Here's the file with working solution. You will need to change the location of the source CSV file.
Regards
Phil