Active Member
February 9, 2021
I'm trying to connect power query to a large PDF (tables come in ok but have errors) so that I can clean and then analyze the data. I've pulled just few pages to start but having issues. Some of the data however is coming in with say 4 lines of text in a single cell but the corresponding values come in in separate cells. I've been doing a series of transposes but not able to figure out how to get the data to line up correctly.
Any thoughts on how to transform this? See attachmentJuly 16, 2010
Hi Alexander,
Welcome to our forum and thanks for sending in your PDF via email. As it contains confidential data I won't share it here, but the solution is to separate the 7 columns into their own queries and transform each column one by one using the split by delimiter into rows with the line break special character as the delimiter. Add an Index column to each query and then merge them back together using the Index column as the common denominator.
I'll email the Excel file containing the solution back to you so you can see the finished result.
Mynda
Active Member
February 9, 2021
Thanks Mynda. Got it and I was able to recreate the same table. A couple of questions though.
1- Making each column it's own query - Is there a fast way to do this or do you just have to right click and make new query then convert to table...then run delimiters, filters and index?.. Then do the same thing on every column?
2- Is there a way to copy or apply the same steps from the other query's to save time?
3- The real PDF I'm going to bring in has 70 tables and 7-9 columns per table. Seems like this would take a lot of time unless there's a something I'm missing.
You can post screenshots of the data if it will help others.
Active Member
February 9, 2021
Thanks Mynda - Just to follow up on one of my prev questions -
to make each column it's own query per your advice - Is there a fast way to do this or do you just have to right click and make new query then convert to table...then run delimiters, filters and index?.. Then do the same thing on every column?
July 16, 2010
There's no quick way to create a new query for each column. You can either create one query, apply the steps, then copy that query and edit the column name references to point to the next column. Or you can create all of the queries and apply all of the steps in each query one by one.
Both techniques require you to copy the query. You can do the copying in the Query Editor by right-clicking the query in the left-hand pane > duplicate. Or you can do it in Excel in the Queries and Connections pane > select the query > Ctrl+C > Ctrl+V then open the query editor to edit them.
1 Guest(s)