Notifications

Clear all

Combine files with different header names

Power Query

Last Post by Duncan Williamson 5 years ago

12 Posts

3 Users

0 Reactions

1,577 Views

RSS

John Langham-Service

(@lanser)

Posts: 63

Estimable Member

Topic starter

Hi all,

I am sure I have done this before but can't remember how

I have a folder with csv files that I need to combine, they contain sales forcasts for the next 14 days so everyday a new csv file is generated with columns for the next 14 days..

We keep the old data to refer back to and also the number of products can vary month to month and the file date is above the headers. I know how to extract that its dealing with the different header names that I am stuck on.

Ive attached some example files so you can see what I need.

John

Posted : 22/12/2020 10:09 am

Catalin Bombea

(@catalinb)

Posts: 1937

Member Admin

Hi John,

Try this solution: https://www.myonlinetraininghub.com/excel-forum/power-query/multiple-csv-files-and-changing-content

Posted : 23/12/2020 12:16 pm

Duncan Williamson

(@duncanwil)

Posts: 11

Active Member

Firstly, John, remove that first row in every testx.csv files because it is redundant as it seems the entry in row 2, column B relates to the date give in row 1rnThat has the benefit of allowing Query to use the new first row as the header row for each file.

I then agree with Mynda that you need to create a separate Query for each .csv file ... no big deal in this example.

Then once you have Appended your files, you can unpivot the data, Close & Load it and create a Pivot Table very easily now and analyse your data.

Posted : 24/12/2020 10:03 am

Catalin Bombea

(@catalinb)

Posts: 1937

Member Admin

Using the File attached, any csv files can be combined, no matter how different their headers are.

Of course, like Duncan said, first row is irrelevant and must be removed. The image attached shows how the result looks like.

However, this demonstrates that you can combine from folder, without having to create a query for each source file.

Posted : 24/12/2020 10:32 am

Duncan Williamson

(@duncanwil)

Posts: 11

Active Member

Thanks Catalin

I did take a look at your Get CSV function earlier but when I tried to get it to import all three csv files at once, it asked me to give it a file path rather than a folder path. That led to three files, three queries ... did I use the function wrongly?

Posted : 24/12/2020 4:16 pm

Catalin Bombea

(@catalinb)

Posts: 1937

Member Admin

@Duncan:

If you tried to open the function, you actually invoke the function and that indeed asks for a file path.

The file is supposed to be used as is, the only change you need to do is to change the FOLDER path from settings sheet. Then you refresh the query, data will be extracted from ALL csv files from that folder with a single query.

If you want to look at the GetCSV function, avoid opening it directly, this action invokes the function and creates a new query. Instead, open another query, from the queries list on the left select the function and open the Advanced Editor, it's the only way to see the code without invoking the function.

Posted : 24/12/2020 11:55 pm

Duncan Williamson

(@duncanwil)

Posts: 11

Active Member

Thanks again, Catalin,

I read the file more carefully this time and pasted my own folder path into the first sheet and it worked as you designed it.

I know the Advanced Editor, thanks, even though I will not say I am an M expert at all. I have installed a couple of other functions before.

I tried an example of my own in your file but I think I have found that your function does not pick up the column headers from those csv files ... I will look at your code after Christmas Day and see what I might suggest ... either because of my file or because of the code.

Merry Christmas if you celebrate it!

Duncan

Posted : 25/12/2020 9:36 am

Catalin Bombea

(@catalinb)

Posts: 1937

Member Admin

Hi Duncan,
The function is adjusted for this topic, John's test files have the headers AFTER the first row. You have to remove the line in GetCSV function that removes the first row.
The code for this scenario is:

(FilePath)=>
let

#"Imported CSV" = Csv.Document(File.Contents(FilePath),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]),
Skip = Table.Skip(#"Imported CSV",1),
PromoteHeaders = Table.PromoteHeaders(Skip, [PromoteAllScalars=true])

in
PromoteHeaders

For a normal csv, the code should not have the Skip first row step:

(FilePath)=>
let

#"Imported CSV" = Csv.Document(File.Contents(FilePath),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]),
PromoteHeaders = Table.PromoteHeaders(#"Imported CSV", [PromoteAllScalars=true])

in
PromoteHeaders

Posted : 25/12/2020 10:01 am

Duncan Williamson

(@duncanwil)

Posts: 11

Active Member

There you are! Solved.

Thanks, Catalin, I will be using that function from now on!

Duncan

Posted : 26/12/2020 7:12 pm

John Langham-Service

(@lanser)

Posts: 63

Estimable Member

Topic starter

Hi Guys,

Thanks for all the replies so far but one key thing is missing the date in the first row isn't irrelevant.

If you look at the result csv I need to filter by the product and that date, it basically tracks forecasts for example

on the 1st it was forecast that on the 3rd abc would require 1 unit

but on the 2nd the forecast for the 3rd had changed to 2 units

date	product	01/01/21	02/01/21	03/01/21	04/01/21	05/01/21	06/01/21	07/01/21
01/01/21	abc	1	1	1	1	1
02/01/21	abc		2	2	2	2	2

so I need to keep that date in the first row linked to each product in that csv and of course durin a year that will end up at 365 csv's which is why I was trying to avoid a query for each csv file.

at the moment I am looking at 2 querys one to pull the first line date linked to a id based on csv name and a second to do as in the GetCsv function then merge based on the csv name

regards

John

Posted : 30/12/2020 10:40 am

Catalin Bombea

(@catalinb)

Posts: 1937

Member Admin

If you need to keep the first value, you can add it as a new column. This is how the function should look like:

(I collected the first cell value in a parameter before removing the first row, then added the value in a new column)

(FilePath)=>
let
Source = Csv.Document(File.Contents(FilePath),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]),
FirstCellValue = Table.ToColumns(Source){0}{0},
Skip = Table.Skip(Source,1),
PromoteHeaders = Table.PromoteHeaders(Skip, [PromoteAllScalars=true]),
AddColumn = Table.AddColumn(PromoteHeaders, "Date", each FirstCellValue)
in
AddColumn

A normal query , will look like:

let
Source = Csv.Document(File.Contents("C:CsvFile.csv"),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]),
FirstCellValue = Table.ToColumns(Source){0}{0},
Skip = Table.Skip(Source,1),
PromoteHeaders = Table.PromoteHeaders(Skip, [PromoteAllScalars=true]),
AddColumn = Table.AddColumn(PromoteHeaders, "Date", each FirstCellValue)
in
AddColumn

The only difference between the function and this query is that the first line disappears. That's what converts any normal query to a function.
Because the file path will no longer be provided, hard type the sample file path:
File.Contents("C:CsvFile.csv")

This will allow you to view the steps, add more processing if needed.

After you are done with converting one of the csv files, convert the query back to a function.

This is the best way to build a function. Start with a simple query from one of the files, then convert the query into a function by adding the first row with one or more parameters as needed, building such a custom function is very easy.

Posted : 31/12/2020 4:46 am

Duncan Williamson

(@duncanwil)

Posts: 11

Active Member

I understand the points you make John but I still wonder about that first line. As far as I can tell, the date in cell B2 is exactly the same date that is in cell B3 of all of your files. That is why I said that first line is redundant and it would make your life a little easier if you were to work with cell B3 and remove that first line. Otherwise, Catalin has your problem in hand, I think!

Posted : 31/12/2020 6:22 pm

26 Forums
7,331 Topics
32.2 K Posts
17 Online
35.8 K Members

Forum Icons: Forum contains no unread posts Forum contains unread posts

Topic Icons: Not Replied Replied Active Hot Sticky Unapproved Solved Private Closed

Combine files with different header names

Super Globals

Options and Features