New Member
May 27, 2022
Hello!
I'm new to advanced add-ins like Power Pivot but I'm currently dealing with it at work - there is an Excel data model that was built by someone else that combines queries in Power Query linked to SQl source and results in one Power Pivot which is basically PnL of the company. Each month I have to update this model since SQL source tables are themselves updated (new month data is added or for some tables I need to upload new versions of them to SQL). Then these SQL tables are loaded to Power Query where they are transformed from raw data to the format needed and also act as references for other queries. Not that I each month create new queries - I just change the link in Navigation step to the new SQL tables I just created or in case of SQL tables that just got new month data I do nothing, queries that link to them just need to be updated. Then these queries are consolidated in one main query that references them all and also makes some transformations. This main query is then loaded to Excel data model where changes applicable for all data are made (I don't know much about this step since I never used Excel data model before). The result is Power Pivot which is simply loaded to Excel sheet.
The problem is that each new month OR whenever I need to make any changes and update this pivot, Resfresh All function in Excel takes more than 40 minutes to perform. Since new data adds every month and we are only in May, by the end of the year it could be around 2 hours. This is a problem because when the model is updated each period, it's imperfect and needs different changes - not technical, but rather financial - which I can't make quicky and immediately see the result. Just for info, the initial data in SQL from all the tables that are used is approximately 2 mln rows now.
For better understanding I'm attaching a scheme of what the whole process looks like. I tried some tricks like turning off dowload of background data for PQ, added Table.Buffer to the main query, then went to connections and turned off 'Refresh this connection on Refresh all' and 'Enable background refresh' functions for all queries except the main one. I also created a query that has SQL database as its source and then linked all queries to this one instead of SQL tables. Those are just random chaotic actions, I read about them on different forums including this one but since I don't know how they work exactly and whether they are applicable in my situation, they didn't change anything. Maybe they are not effective because I don't see the whole picture and therefore missing someting or doing something wrong.
The ideal situation would be if the pivot updated quickly and at the same time I could still make changes at different stages of the process - in SQL, Power Query or Excel - sometimes it's easier to change data in SQL by one simple code line, and in case data from SQL is not updated this would be a problem. I want some kind of balance between flexibility in manipulating the model and time needed to update the model to reflect the results of these manipulations. Any help would be appreciated.
P.S. I use Excel 2016 on PC. Microsoft SQL Server Management Studio as SQL program.
July 16, 2010
Hi Mitryaeva,
I would have your SQL database administrator create a "view" of the tables you want to bring into Power Query, rather than having Power Query bring in all the tables and then manipulate them to give you the view you want. It's more efficient to do it this way. This post explains more on views vs tables in relation to Power BI, but remember that Power Query is the same in Excel and Power BI, so it's relevant here too.
This should remove the need for multiple queries which are likely being called multiple times due to the dependencies from one query to the next.
Mynda
1 Guest(s)