September 13, 2020
SO, there is this online PDF file with a table in it.
I need to put this as a one whole table in Excel.
Any thoughts?
Link to the file:
https://finance.ec.europa.eu/s.....sia_en.pdf
Trusted Members
October 18, 2018
Moderators
January 31, 2022
Not sure what you want to do with it but the following query will generate one large table with 3630 rows. But it's not pretty. Just point the Source step to the PDF on your own machine.
let
Source = Pdf.Tables(File.Contents("C:\data\faqs-sanctions-russia-export-related-restrictions-russia_en.pdf"), [Implementation="1.3"]),
#"Expanded Data" = Table.ExpandTableColumn(Source, "Data", {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22", "Column23", "Column24", "Column25", "Column26", "Column27", "Column28", "Column29", "Column30", "Column31", "Column32", "Column33", "Column34", "Column35", "Column36", "Column37", "Column38", "Column39", "Column40", "Column41", "Column42", "Column43", "Column44"}, {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22", "Column23", "Column24", "Column25", "Column26", "Column27", "Column28", "Column29", "Column30", "Column31", "Column32", "Column33", "Column34", "Column35", "Column36", "Column37", "Column38", "Column39", "Column40", "Column41", "Column42", "Column43", "Column44"}),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Expanded Data", {{"Column28", type text}, {"Column29", type text}, {"Column30", type text}, {"Column31", type text}, {"Column32", type text}, {"Column33", type text}, {"Column34", type text}, {"Column35", type text}, {"Column36", type text}, {"Column37", type text}, {"Column38", type text}, {"Column39", type text}, {"Column40", type text}, {"Column41", type text}, {"Column42", type text}, {"Column43", type text}, {"Column44", type text}}, "en-GB"),{"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22", "Column23", "Column24", "Column25", "Column26", "Column27", "Column28", "Column29", "Column30", "Column31", "Column32", "Column33", "Column34", "Column35", "Column36", "Column37", "Column38", "Column39", "Column40", "Column41", "Column42", "Column43", "Column44"},Combiner.CombineTextByDelimiter(" ", QuoteStyle.None),"Doc")
in
#"Merged Columns"
Moderators
January 31, 2022
Trusted Members
October 18, 2018
Moderators
January 31, 2022
I did not select the 'Multiple items' box for reason. Selected just one of the tables. Doesn't matter which one. Then delete the Navigation step that PQ generates and you are left with a table containing a Data column of all pages with or without tables in the same order as you find them in the PDF. Now you can expand that column and merge all of them into one.
Selecting 'Multiple items' will generate 76 separate queries. Tables first, then pages. These would then have to be appended to each other in the correct order. Quite some work that can be avoided when you follow the steps I described above.
The following users say thank you to Riny van Eekelen for this useful post:
Alan SidmanSeptember 13, 2020
Well, I've done as you Riny suggested, and it's really ugly, can't use it.
As I look at this PDF, I think that who ever created it, must have gone through some 'table' app, can't imagine they've done it in Word for example, so I thought it could be some (better) way to make these PDF tables nest in Excel...
1 Guest(s)