Multiple pdfs with multiple pages

(@stratocumulus)
Posts: 7
Active Member
Topic starter
 

Hi All,

I am a bit stuck combining a large number of PDFs, each containing multiple pages, into one query. The data I need is in the pages (Page001, Page002, etc.); the tables (Table001, Table002, etc.) do not contain the data I am after.

As far as I am aware, the "import from folder" data query offers no way to select multiple pages.

A single pdf file with multiple pages opens as follows:

let
    // Read pages 1 to 5 of a single PDF
    Source = Pdf.Tables(File.Contents("C:\Temp\File_001.pdf"), [StartPage=1, EndPage=5])
in
    Source

How can I loop the above for a number of files, the names of which I specify in a dynamic table on an Excel Worksheet?

I could, for example, create a function like the one below and invoke it, but that still leaves me manually creating one query per file.

(FileID) =>
let
    // Build the full path from the file ID and read pages 1 to 5
    Source = Pdf.Tables(File.Contents("C:\Temp\" & FileID & ".pdf"), [StartPage=1, EndPage=5])
in
    Source

Help greatly appreciated.

 
Posted : 06/08/2022 3:10 am
(@catalinb)
Posts: 1937
Member Admin
 

Hi Sjaak,

When you import from a folder, you get a table listing all the files in that folder. Add a new column to this table that calls the function you mentioned, which processes one file. The formula for the new column should look like:

=ProcessFile(FileID)

This gives you a column containing the extracted data. Remove the other columns and expand the new column.
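
One possible variant of the function, offered as a sketch rather than as the poster's own code, takes the binary from the Content column of the folder table instead of a file name, so no path has to be rebuilt. It also keeps only the page rows, since the question says the data is in the pages and not in the tables. The parameter name FileContent is hypothetical, and the filter assumes that the Kind column returned by Pdf.Tables marks page rows with the value "Page".

```
(FileContent as binary) as table =>
let
    // Read pages 1 to 5 of the PDF supplied as binary content
    Source = Pdf.Tables(FileContent, [StartPage = 1, EndPage = 5]),
    // Keep only the page rows (assumes Kind = "Page" for those rows)
    PagesOnly = Table.SelectRows(Source, each [Kind] = "Page")
in
    PagesOnly
```

If the function is saved as ProcessFile, the custom column formula in the folder query would then be =ProcessFile([Content]).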

 
Posted : 10/08/2022 12:28 am
(@stratocumulus)
Posts: 7
Active Member
Topic starter
 

Hi Catalin,

Many thanks for your helpful reply.

It worked like a charm!

I've named my function "ProcessFile" and called it from the query by adding a column to the data file table you mentioned. Next, I expanded the data in the newly created column.

My code now reads:

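let
    Source = Folder.Files("C:\Temp"),
    // Pass the file name without its ".pdf" extension, since ProcessFile appends it
    #"Added Custom" = Table.AddColumn(Source, "Custom", each ProcessFile(Text.BeforeDelimiter([Name], ".pdf"))),
    // Expand the Data column of the tables returned by ProcessFile
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Data"}, {"Data"})
in
    #"Expanded Custom"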
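
For the original question of driving the import from a list of file names kept on a worksheet, a variation might look like the sketch below. It assumes an Excel table named FileList with a single column named FileID (both names are hypothetical) and a ProcessFile function that accepts the file ID as text, as in the first post.

```
let
    // Read the worksheet table; "FileList" and "FileID" are hypothetical names
    FileList = Excel.CurrentWorkbook(){[Name = "FileList"]}[Content],
    // Make sure the IDs are text so they can be concatenated into a path
    Typed = Table.TransformColumnTypes(FileList, {{"FileID", type text}}),
    // Invoke the function once per listed file
    AddedCustom = Table.AddColumn(Typed, "Custom", each ProcessFile([FileID])),
    // Expand the Data column of the tables returned by ProcessFile
    Expanded = Table.ExpandTableColumn(AddedCustom, "Custom", {"Data"}, {"Data"})
in
    Expanded
```

Adding or removing rows in the worksheet table and refreshing the query would then change which files are read, without creating one query per file.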

 
Posted : 10/08/2022 10:13 am