June 18, 2021
Hi there everyone!
This may be a slightly stupid question, as I am very much out of my depth! I'm a lawyer who has been tasked with finding a way of tracking time recording of employees internally.
I've been doing a bit of research and stumbled across PowerQuery
have a particular problem with the "Unpivot Other Columns" function.
Despite my best efforts to convince people further up the chain to record data in an unpivoted format, they are insisting 'in the interests of user friendliness' that each employee time records by client in a new column in this format:
Date | Client A | Client B | Client C | Client D etc etc
The file name of each employee will identify the employee.
Thanks to Mynda's videos on unpivot/cleaning up poorly structured data, I thought I had the solution. PowerQuery the folder with all the employee excel files having tables in the format above, add the employee's name using the file name so the table starts to look like this:
Employee Name | Date | Client A | Client B | Client C | Client D etc etc
select "Employee Name" and "Date" and then "Unpivot Other Columns".
Now, this actually seems to work, which was awesome for me! But I've hit a stumbling block.
To test the resilience of this, I just added a "Client E" to just one of the employee files. For some reason, PowerQuery was not picking this up as an extra column at all when all the files are combined into a single table (i.e. the step called "Expanded Table Column1"). This is despite the fact that when I open that speciifc Employee's File by expanding just that table in PowerQuery, the extra column for Client E is there. So what I'm able to deduce (I think!) is that something is going wrong at the "Expanded Table Column1" step.
Am I doing something wrong? Or is it not possible to have extra columns added to individual files then be combined into a single table after the first PowerQuery is completed?
October 5, 2010
You've already mentioned the solution to this which is to record data in the correct, tabular format.
As it is, when PQ imports data in columns it uses one of the files in the folder as its 'template' for how the data is laid out. PQ expects that the number of columns, and the column names, are the same for each file.
The purpose of loading files from a folder is to append them together so if the number of columns is different it doesn't know what to do. If one file has a column for Customer D and the other files don't, you'll end up with a column for Customer D that is mostly empty because only 1 employee recorded time for that customer. Multiply that for many customers and employees and you'll end up with lots of empty/null values allover the place. Very messy.
When a file does not fit the template then you get problems like the ones you are experiencing.
I don't understand how entering data in a tabular format is not user friendly?
Whilst it may be a hard sell to those people 'up the chain' a tabular format is the correct way to record data. If it's not you'll just keep running into issues like this.
June 18, 2021
November 1, 2018