I have a question about Power Query which is admittedly vague, but I’ll try to be as specific as I can. I have 2 queries, Query A and Query B. Query A loads about 1,000 rows from a table; it has 1 row for each key field. Query B loads about 30,000 rows from an external XLS file; it contains transactional data and has multiple rows for each key field. I augment each query with additional calculated columns. My augmented data is stored in queries ModQueryA and ModQueryB. Then, I reference ModQueryA as a new query and merge that query with ModQueryB, and call the result FinalQueryA.
It all works but when some of those queries refresh, there’s a message in the query list window that says XX MB from (the original data source). I thought it was loading the original data source to execute the query update. However, the MB loaded far exceed the size of the original files. In fact, XX MB counts up to about 1 GB before completing the refresh. And it takes several minutes to complete the refresh.
My question is whether there is any way to make the refresh more efficient, and therefore quicker. Per my Google search, I added a step in the data import of the source files to be:
BufferTable = Table.Buffer(Source)
However, that doesn’t appear to help.
When I limit the rows of each source query to 50 rows, let’s say, the queries definitely update more quickly. So, size does matter. But I do need all the data loaded in a production environment.
Although I have a lot of calculated columns, I would imagine that my model is not as complex as other PQ models out there. Hardware is not an issue since I have 32 GB memory on my PC.
Is there a way to troubleshoot this to enable quicker execution of my full PQ model?
I see no one has replied yet to my question although there have been 50+ views. I have been looking at some other threads related to slow Power Query performance, and one topic that comes up are queries that reference other queries that reference other queries, etc.
I was trying to modular in my approach to adding necessary calculations to my queries, but it seems that may be causing the issues. I'm going to try to refactor my PQ code to see if I can produce the desired results with no or as few references to other queries as possible.
That might improve performance...
In my personal experience, Merge operations always slow things down. The only way to see references I've seen is through Query Dependencies, but I don't think it provides any further insight. I would be interested in speeding up query pulls as well seeing as I don't have any ability to modify the data sources...