• Skip to main content
  • Skip to header right navigation
  • Skip to site footer

My Online Training Hub

Learn Dashboards, Excel, Power BI, Power Query, Power Pivot

  • Courses
  • Pricing
    • Free Courses
    • Power BI Course
    • Excel Power Query Course
    • Power Pivot and DAX Course
    • Excel Dashboard Course
    • Excel PivotTable Course – Quick Start
    • Advanced Excel Formulas Course
    • Excel Expert Advanced Excel Training
    • Excel Tables Course
    • Excel, Word, Outlook
    • Financial Modelling Course
    • Excel PivotTable Course
    • Excel for Customer Service Professionals
    • Excel for Operations Management Course
    • Excel for Decision Making Under Uncertainty Course
    • Excel for Finance Course
    • Excel Analysis ToolPak Course
    • Multi-User Pricing
  • Resources
    • Free Downloads
    • Excel Functions Explained
    • Excel Formulas
    • Excel Add-ins
    • IF Function
      • Excel IF Statement Explained
      • Excel IF AND OR Functions
      • IF Formula Builder
    • Time & Dates in Excel
      • Excel Date & Time
      • Calculating Time in Excel
      • Excel Time Calculation Tricks
      • Excel Date and Time Formatting
    • Excel Keyboard Shortcuts
    • Excel Custom Number Format Guide
    • Pivot Tables Guide
    • VLOOKUP Guide
    • ALT Codes
    • Excel VBA & Macros
    • Excel User Forms
    • VBA String Functions
  • Members
    • Login
  • Blog
  • Excel Webinars
  • Excel Forum
    • Register as Forum Member
  • Login

Table.Buffer w/ Thousands of Records|Power Query|Excel Forum|My Online Training Hub

You are here: Home / Table.Buffer w/ Thousands of Records|Power Query|Excel Forum|My Online Training Hub
Avatar
sp_LogInOut Log In sp_Registration Register
sp_Search Search
Advanced Search
Search
Forum Scope




Match



Forum Options



Minimum search word length is 3 characters - maximum search word length is 84 characters
sp_Search Search
sp_RankInfo
Lost password?
sp_CrumbsHome HomeExcel ForumPower QueryTable.Buffer w/ Thousands of Record…
sp_PrintTopic sp_TopicIcon
Table.Buffer w/ Thousands of Records
Avatar
Adam Smith
Member
Members
Level 0
Forum Posts: 46
Member Since:
December 18, 2018
sp_UserOfflineSmall Offline
1
March 11, 2021 - 3:10 am
sp_Permalink sp_Print

Hello, in conjunction with a previous post of mine:

https://www.myonlinetraininghu.....mps#p19048

I've been able to modify my table of 30k+ records and return the information I'm looking for (I believe), however, even when wrapped in Table.Buffer, it refreshes at roughly 10 rows per second and will take quite a while to get through the 30k.

Is this likely due to the calculation involved in power query just being too large, or something I'm not doing with Table.buffer?

Avatar
Catalin Bombea
Iasi, Romania
Admin
Level 10
Forum Posts: 1844
Member Since:
November 8, 2013
sp_UserOfflineSmall Offline
2
March 11, 2021 - 4:11 pm
sp_Permalink sp_Print sp_EditHistory

Hi Adam,

The FilterTable function is executed at each row, filtering the source data to get the exit time for each record is not efficient.

It can be improved, instead of filtering the entire initial table, we can filter a smaller table containing Exits only, already sorted to avoid sorting at each record in the FilterTable function. Filtering a smaller table already sorted should be faster.

let
Source = Excel.CurrentWorkbook(){[Name="AdamS"]}[Content],
ChangedType = Table.TransformColumnTypes(Source,{{"Field Date Time", type datetime}}),
FilteredRows = Table.SelectRows(ChangedType, each ([Direction] = "Entry")),
Exits=Table.SelectRows(ChangedType, each ([Direction] = "Exit")),
ExitsSorted = Table.Buffer(Table.Sort(Exits,{{"Field Date Time", Order.Ascending}})),

FilterTable = (tbl as table, Person as text, Company as text, EntryDate as datetime)=>
let
Filtered = Table.SelectRows(tbl, each ([Name] = Person and [Company] = Company and [Field Date Time] > EntryDate))
in Table.FirstN(Filtered,1),

AddedCustom = Table.AddColumn(FilteredRows, "Exit", each FilterTable(ExitsSorted,_[Name],_[Company],_[Field Date Time])),
#"Expanded Exit" = Table.ExpandTableColumn(AddedCustom, "Exit", {"Field Date Time", "Direction"}, {"Exit.Field Date Time", "Exit.Direction"}),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Exit",{{"Exit.Field Date Time", type datetime}})

in
#"Changed Type"

If you can prepare sample data for 1-5 thousand records or more, I can test it on my side to see what else can be improved

Avatar
Catalin Bombea
Iasi, Romania
Admin
Level 10
Forum Posts: 1844
Member Since:
November 8, 2013
sp_UserOfflineSmall Offline
3
March 11, 2021 - 4:20 pm
sp_Permalink sp_Print sp_EditHistory

Because you mentioned power BI, you could write a measure to calculate the Exit dates...

Split the initial data into 2 tables (one for Entry, one for Exits)

In Entry table, add a measure to calculate the Exits from Exits table.

Should be much faster than PQ, which is mostly for data transformations, not complex calculations.

Avatar
Adam Smith
Member
Members
Level 0
Forum Posts: 46
Member Since:
December 18, 2018
sp_UserOfflineSmall Offline
4
March 11, 2021 - 9:58 pm
sp_Permalink sp_Print sp_EditHistory

Thank you for your responses, so very much.

I have been struggling mightily with this and I worry a great deal my current solution won't be sustainable. With regards to Power BI, where our company's solution is currently implemented, a fantastic user in the official PowerBI forum provided this calculated column for me which returns the total number of minutes on site, which I then convert to hh.mm:

 

Time on Site (Minutes) (Unique Identifier) =
IF (
'Gate Scans'[Transit Direction] = "Exit",
VAR aux_ =
CALCULATE (
MAX ( 'Gate Scans'[FieldDateTime] ),
ALLEXCEPT ( 'Gate Scans', 'Gate Scans'[Company], 'Gate Scans'[SBIID1], 'Gate Scans'[Trade] ),
'Gate Scans'[FieldDateTime] < EARLIER ( 'Gate Scans'[FieldDateTime] )
)
RETURN
IF ( NOT ISBLANK ( aux_ ), 24 * 60 * ( 'Gate Scans'[FieldDateTime] - aux_ ) )
)
 
 
 
 
The problem is, after 25K records I'm getting memory allocation errors in the Power BI service, which is why I returned to Excel to see if I couldn't perform the calculations outside of PBI.
 
I managed to implement the code you just provided, though curiously on one step you filter to Entry then the very next step you filter back to Exit. I'm not sure I understand that behaviour, though it's beside the point I think. The code works, and it's quicker until I get to around 10k and it begins to slow.
 
I've also wondered if I couldn't break down the PBI solution above into a couple different stages to ease the burden.
 
 
 
I've added a file with 20k dummy records. Currently YTD we're up to ~110K rows, so I figure we'll be into ~450-500k by the year's end. I would love to not be constrained by a rolling 30 days. 
 
 
Also, your code only worked for me when I changed the source to a table and not a queried SharePoint folder, but that may be a problem for another time.
 
Your help, as always is greatly appreciated.
Avatar
Catalin Bombea
Iasi, Romania
Admin
Level 10
Forum Posts: 1844
Member Since:
November 8, 2013
sp_UserOfflineSmall Offline
5
March 12, 2021 - 12:06 am
sp_Permalink sp_Print

Hi Adam,
The last query provided took 2 min and 20 seconds on my PC to process those 20k records, generated a little over 10k rows.

If we extrapolate, 100k records will take around 12 min. Will that work for you?

Avatar
Adam Smith
Member
Members
Level 0
Forum Posts: 46
Member Since:
December 18, 2018
sp_UserOfflineSmall Offline
6
March 12, 2021 - 2:16 am
sp_Permalink sp_Print

Hi Catalin,

I suppose it would get the job done. Though for our purposes, where ~35k = 1 month ~110k = 3 months and and ~450k = 1yr, I'm not sure anyone has the appetite for 3 months. Maybe it's better than 1. 

I guess my hope in all this was a stronger solution that wouldn't burden the system in the same way the calculated DAX column does, yet also performs faster and gives us a whole year view.

My suspicion is that using your query for a rolling 30 days is the most viable, stable and efficient method of accomplishing the goal.

If you wouldn't mind answering a couple questions. 1) Can you think of a better way of utilizing the DAX solution that accomplishes the same goal with less strain 2) Assuming I attempt to implement your PQ code with our SharePoint folder source, I was getting errors. It only worked when I ran the query against a flat table. Would you have any thoughts on that?

Avatar
Catalin Bombea
Iasi, Romania
Admin
Level 10
Forum Posts: 1844
Member Since:
November 8, 2013
sp_UserOfflineSmall Offline
7
March 12, 2021 - 2:45 am
sp_Permalink sp_Print sp_EditHistory

PP is much faster, that's why I mentioned it.

See attached PP version, takes a few seconds for 20 k test data.

I split the data into Entry and Exits, the calculated column formula is very simple, no relationship needed.

Avatar
Catalin Bombea
Iasi, Romania
Admin
Level 10
Forum Posts: 1844
Member Since:
November 8, 2013
sp_UserOfflineSmall Offline
8
March 12, 2021 - 3:00 am
sp_Permalink sp_Print

Getting data from sharepoint folder is not the same thing as getting data from local file or folder, you have to adjust the query.

Microsoft 365 has the option to get data from SP folder: Get Data> From File> From Sharepoint Folder

Avatar
Adam Smith
Member
Members
Level 0
Forum Posts: 46
Member Since:
December 18, 2018
sp_UserOfflineSmall Offline
9
March 12, 2021 - 5:42 am
sp_Permalink sp_Print

Thanks for sending along the sample file, though unfortunately it's confused me a little. I see Query1 is the AdamS table, Entry2 has the corresponding Exit timestamps, Entry fails to load - it provides "Expression.Error: 5 arguments were passed to a function which expects between 2 and 4." and Exits is loaded to the data model, which I'm assuming populates the pivot table.

When I refresh Query1 it took about 2:15, as well as Entry2. For only ~10k records this seems like a high amount of time, especially if we hope to go over 1yr of data.

Also, just to be clear, the DAX measure I'm using in Power BI is from a SharePoint folder, (which I've also tried using in Excel Power Query). The SharePoint Folder model appears to not work with this Excel query you're providing.

Avatar
Catalin Bombea
Iasi, Romania
Admin
Level 10
Forum Posts: 1844
Member Since:
November 8, 2013
sp_UserOfflineSmall Offline
10
March 12, 2021 - 6:15 am
sp_Permalink sp_Print sp_EditHistory

Query1 and Entry2 are 2 different PQ solutions, the processing time is almost the same, but I doubt that it can be made faster than it is.

Entry and Exits are loaded to data model. If you get an error in Entry, you might have an older version of PQ.
Try = Table.AddIndexColumn(FilteredRows, "Index", 0, 1) instead of = Table.AddIndexColumn(FilteredRows, "Index", 0, 1, Int64.Type)

Again, just to be clear:

Getting data from sharepoint folder is not the same thing as getting data from local file or folder, you have to adjust the query.
Microsoft 365 has the option to get data from SP folder: Get Data> From File> From Sharepoint Folder

If the PP version is acceptable, you can replicate the structure in your PBI model, you have everything you need in the file provided.

The SharePoint Folder model appears to not work with this Excel query you're providing.

there can be many reasons for this, my guess is that you did not updated properly the query to the PBI version.

Of course, without seeing what you did and without proper error description, I have to take your word for it 😉

Avatar
Adam Smith
Member
Members
Level 0
Forum Posts: 46
Member Since:
December 18, 2018
sp_UserOfflineSmall Offline
11
March 13, 2021 - 2:20 am
sp_Permalink sp_Print

I really appreciate you taking the time to support this. I am building two reference tables in PBI from the Gate Scans source, (Entry and Exit). I will attempt to replicate what you've done in PP. I see you created a key for Name and Company, would the Unique SBIID1 column not be sufficient?

Avatar
Catalin Bombea
Iasi, Romania
Admin
Level 10
Forum Posts: 1844
Member Since:
November 8, 2013
sp_UserOfflineSmall Offline
12
March 13, 2021 - 2:30 am
sp_Permalink sp_Print

I used the fields you have indicated originally:

Is there any easy way to subtract one line from the previous line given the name and company match[..]

Use as key whatever you know it can be used as an identifier. I have no idea what that SBIID represents, you know best.

Avatar
Adam Smith
Member
Members
Level 0
Forum Posts: 46
Member Since:
December 18, 2018
sp_UserOfflineSmall Offline
13
March 13, 2021 - 6:11 am
sp_Permalink sp_Print

I was optimistic about the PP solution initially. I had no idea how to make it work in PBI so I went back to your Excel file and removed the two tables I didn't need, and then swapped the AdamS dummy table with the correct information (100k rows). Once I did that, it removed the two columns for Entry and Exit time stamps from the pivot table. I've attempted to re-add the Entry column first, and it's been taking at least 20 minutes to put back on there (still seems to be working away as I type this). Something tells me the PP solution may not be viable after all.

GateScans query is now 114k, Entry 57,834 and exit 56,332. 

I feel like I've reached the end of the road though.

Avatar
Catalin Bombea
Iasi, Romania
Admin
Level 10
Forum Posts: 1844
Member Since:
November 8, 2013
sp_UserOfflineSmall Offline
14
March 13, 2021 - 2:39 pm
sp_Permalink sp_Print

Columns are removed from a pivot table ONLY if you overwrite the headers as well when you paste the real data into source table..

Why are you replacing headers? Paste data.

Avatar
Adam Smith
Member
Members
Level 0
Forum Posts: 46
Member Since:
December 18, 2018
sp_UserOfflineSmall Offline
15
March 16, 2021 - 8:50 pm
sp_Permalink sp_Print

Please forgive the delay in responding. I had to change the data source to connect with our actual source data, at which point it lost those two columns, which I was unable to re-add.

I ran the report last night using the original filtering solution (in the link at the very top), and it worked, though it took about 7 hours to get through the 59k rows for the three months. I meant to only have one month selected. That may be a solution, though, to the point of this topic, I've got the code in Table.Buffer and it didn't seem to increase the speed at all which tells me I may not quite understand what it does.

 

Finally, in the file you sent me above, where you had a query for Entry scans and another for Exit scans, can you explain how you intended to rejoin the data? I saw only one index column and I couldn't figure it out (which is why I re-tried the original solution).

Avatar
Catalin Bombea
Iasi, Romania
Admin
Level 10
Forum Posts: 1844
Member Since:
November 8, 2013
sp_UserOfflineSmall Offline
16
March 17, 2021 - 1:54 am
sp_Permalink sp_Print

can you explain how you intended to rejoin the data? I saw only one index column and I couldn't figure it out (which is why I re-tried the original solution).

I thought it's pretty clear it's a power pivot solution. Therefore, you have to open power pivot window to see the Exit Time column added to the Entry table.

That Exit Time formula simply reads the exit time from the Exits table.

Avatar
Adam Smith
Member
Members
Level 0
Forum Posts: 46
Member Since:
December 18, 2018
sp_UserOfflineSmall Offline
17
March 17, 2021 - 4:41 am
sp_Permalink sp_Print

Ah, I didn't have PP add-in turned on. Thanks for your help, again. Your patience and support have been deeply appreciated.

sp_Feed
Go to top
Forum Timezone: Australia/Brisbane
Most Users Ever Online: 245
Currently Online: Nesha St, Syed Hussain, Riny van Eekelen
Guest(s) 8
Currently Browsing this Page:
1 Guest(s)
Top Posters:
SunnyKow: 1432
Anders Sehlstedt: 880
Purfleet: 414
Frans Visser: 346
David_Ng: 306
lea cohen: 237
Jessica Stewart: 219
A.Maurizio: 213
Aye Mu: 201
jaryszek: 183
Newest Members:
Ashley Hughes
Herbie Key
Trevor Pindling
Stevan Kanjo
Erin Sheldon
Nikita Bhatia
Sheilah Taylor
Clare Webber
David Jenssen
Dominic Brosnahan
Forum Stats:
Groups: 3
Forums: 24
Topics: 6526
Posts: 28577

 

Member Stats:
Guest Posters: 49
Members: 32810
Moderators: 2
Admins: 4
Administrators: Mynda Treacy, Philip Treacy, Catalin Bombea, FT
Moderators: Velouria, Riny van Eekelen
© Simple:Press —sp_Information

Sidebar

Blog Categories

  • Excel
  • Excel Charts
  • Excel Dashboard
  • Excel Formulas
  • Excel Office Scripts
  • Excel PivotTables
  • Excel Shortcuts
  • Excel VBA
  • General Tips
  • Online Training
  • Outlook
  • Power Apps
  • Power Automate
  • Power BI
  • Power Pivot
  • Power Query
microsoft mvp logo
trustpilot excellent rating
Secured by Sucuri Badge
MyOnlineTrainingHub on YouTube Mynda Treacy on Linked In Mynda Treacy on Instagram Mynda Treacy on Twitter Mynda Treacy on Pinterest MyOnlineTrainingHub on Facebook

Sign up to our newsletter and join over 400,000
others who learn Excel and Power BI with us.

 

Company

  • About My Online Training Hub
  • Disclosure Statement
  • Frequently Asked Questions
  • Guarantee
  • Privacy Policy
  • Terms & Conditions
  • Testimonials
  • Become an Affiliate
  • Sponsor Our Newsletter

Support

  • Contact
  • Forum
  • Helpdesk - For Technical Issues

Copyright © 2023 · My Online Training Hub · All Rights Reserved. Microsoft and the Microsoft Office logo are trademarks or registered trademarks of Microsoft Corporation in the United States and/or other countries. Product names, logos, brands, and other trademarks featured or referred to within this website are the property of their respective trademark holders.