June 10, 2023
Hi,
I have 100's of pdf's like the one attached. I would like to be able to import them all and reconfigure into a table like the one I have also attached. I have tried it with PowerQuery but I don't know enough to make it work.
I am running Office 365 on Windows 11.
Your assistance would be greatly appreciated.
Wayne
Power Query
Power Pivot
Xtreme Pivot Tables
Excel for Decision Making
Excel for Finance
Excel Analysis Toolpak
Power BI
Excel
Word
Outlook
Excel Expert
Excel Customer Service
PowerPoint
November 8, 2013
Hi Wayne,
You should use Get Data> From File> From Folder, then you can filter for a specific file extension-.pdf in your case.
Example attached, change the folder path from Path sheet to your pdf folder path.
sample queries:
let
FolderPath = Excel.CurrentWorkbook(){[Name="FolderPath"]}[Content]{0}[Folder Path],
Source = Folder.Files(FolderPath),
#"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".pdf")),
#"Added Custom" = Table.AddColumn(#"Filtered Rows", "Custom", each GetPDFData([Content])),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom"}),
ColNames = Table.ColumnNames(Table.Combine(#"Removed Other Columns"[Custom])),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", ColNames, ColNames)
in
#"Expanded Custom"
Code for the function that reads data from a single pdf file GetPDFData:
(bin as binary)=>
let
#"Imported PDF" = Pdf.Tables(bin),
#"Filtered Rows1" = Table.SelectRows(#"Imported PDF", each ([Kind] = "Table")),
#"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows1",{"Data"}),
ColNames = Table.ColumnNames(Table.Combine(#"Removed Other Columns"[Data])),
#"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", ColNames, ColNames)
in
#"Expanded Data"
1 Guest(s)