July 17, 2018
Hi all,
I am sure I have done this before but can't remember how
I have a folder with csv files that I need to combine, they contain sales forcasts for the next 14 days so everyday a new csv file is generated with columns for the next 14 days..
We keep the old data to refer back to and also the number of products can vary month to month and the file date is above the headers. I know how to extract that its dealing with the different header names that I am stuck on.
Ive attached some example files so you can see what I need.
John
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
Hi John,
Try this solution: https://www.myonlinetraininghu.....ng-content
December 23, 2020
Firstly, John, remove that first row in every testx.csv files because it is redundant as it seems the entry in row 2, column B relates to the date give in row 1rnThat has the benefit of allowing Query to use the new first row as the header row for each file.
I then agree with Mynda that you need to create a separate Query for each .csv file ... no big deal in this example.
Then once you have Appended your files, you can unpivot the data, Close & Load it and create a Pivot Table very easily now and analyse your data.
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
Using the File attached, any csv files can be combined, no matter how different their headers are.
Of course, like Duncan said, first row is irrelevant and must be removed. The image attached shows how the result looks like.
However, this demonstrates that you can combine from folder, without having to create a query for each source file.
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
@Duncan:
If you tried to open the function, you actually invoke the function and that indeed asks for a file path.
The file is supposed to be used as is, the only change you need to do is to change the FOLDER path from settings sheet. Then you refresh the query, data will be extracted from ALL csv files from that folder with a single query.
If you want to look at the GetCSV function, avoid opening it directly, this action invokes the function and creates a new query. Instead, open another query, from the queries list on the left select the function and open the Advanced Editor, it's the only way to see the code without invoking the function.
December 23, 2020
Thanks again, Catalin,
I read the file more carefully this time and pasted my own folder path into the first sheet and it worked as you designed it.
I know the Advanced Editor, thanks, even though I will not say I am an M expert at all. I have installed a couple of other functions before.
I tried an example of my own in your file but I think I have found that your function does not pick up the column headers from those csv files ... I will look at your code after Christmas Day and see what I might suggest ... either because of my file or because of the code.
Merry Christmas if you celebrate it!
Duncan
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
Hi Duncan,
The function is adjusted for this topic, John's test files have the headers AFTER the first row. You have to remove the line in GetCSV function that removes the first row.
The code for this scenario is:
(FilePath)=>
let
#"Imported CSV" = Csv.Document(File.Contents(FilePath),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]),
Skip = Table.Skip(#"Imported CSV",1),
PromoteHeaders = Table.PromoteHeaders(Skip, [PromoteAllScalars=true])
in
PromoteHeaders
For a normal csv, the code should not have the Skip first row step:
(FilePath)=>
let
#"Imported CSV" = Csv.Document(File.Contents(FilePath),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]),
PromoteHeaders = Table.PromoteHeaders(#"Imported CSV", [PromoteAllScalars=true])
in
PromoteHeaders
July 17, 2018
Hi Guys,
Thanks for all the replies so far but one key thing is missing the date in the first row isn't irrelevant.
If you look at the result csv I need to filter by the product and that date, it basically tracks forecasts for example
on the 1st it was forecast that on the 3rd abc would require 1 unit
but on the 2nd the forecast for the 3rd had changed to 2 units
date | product | 01/01/21 | 02/01/21 | 03/01/21 | 04/01/21 | 05/01/21 | 06/01/21 | 07/01/21 |
01/01/21 | abc | 1 | 1 | 1 | 1 | 1 | ||
02/01/21 | abc | 2 | 2 | 2 | 2 | 2 |
so I need to keep that date in the first row linked to each product in that csv and of course durin a year that will end up at 365 csv's which is why I was trying to avoid a query for each csv file.
at the moment I am looking at 2 querys one to pull the first line date linked to a id based on csv name and a second to do as in the GetCsv function then merge based on the csv name
regards
John
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
If you need to keep the first value, you can add it as a new column. This is how the function should look like:
(I collected the first cell value in a parameter before removing the first row, then added the value in a new column)
let
Source = Csv.Document(File.Contents(FilePath),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]),
FirstCellValue = Table.ToColumns(Source){0}{0},
Skip = Table.Skip(Source,1),
PromoteHeaders = Table.PromoteHeaders(Skip, [PromoteAllScalars=true]),
AddColumn = Table.AddColumn(PromoteHeaders, "Date", each FirstCellValue)
in
AddColumn
A normal query , will look like:
Source = Csv.Document(File.Contents("C:\CsvFile.csv"),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]),
FirstCellValue = Table.ToColumns(Source){0}{0},
Skip = Table.Skip(Source,1),
PromoteHeaders = Table.PromoteHeaders(Skip, [PromoteAllScalars=true]),
AddColumn = Table.AddColumn(PromoteHeaders, "Date", each FirstCellValue)
in
AddColumn
The only difference between the function and this query is that the first line disappears. That's what converts any normal query to a function.
Because the file path will no longer be provided, hard type the sample file path:
File.Contents("C:\CsvFile.csv")
This will allow you to view the steps, add more processing if needed.
After you are done with converting one of the csv files, convert the query back to a function.
This is the best way to build a function. Start with a simple query from one of the files, then convert the query into a function by adding the first row with one or more parameters as needed, building such a custom function is very easy.
Answers Post
December 23, 2020
I understand the points you make John but I still wonder about that first line. As far as I can tell, the date in cell B2 is exactly the same date that is in cell B3 of all of your files. That is why I said that first line is redundant and it would make your life a little easier if you were to work with cell B3 and remove that first line. Otherwise, Catalin has your problem in hand, I think!
1 Guest(s)