December 13, 2017
Hello,
I would like to scrape a webpage with Power Query. When I open the page in a browser, it shows 50 entries, and additional entries appear only when I scroll to the bottom. These are not additional pages: the additional entries are shown on the same page, under the same URL.
In Power Query I also see only 50 entries, but there I can't scroll down to get additional entries.
Adding a wait time like this (e.g. 50 seconds) does not help to get all entries. Still only 50 entries are shown, i.e. included in the HTML code:
Web.BrowserContents("https:/...", [WaitFor = [Timeout = #duration(0,0,0,50)]])
Any chance to force a webpage to deliver all entries?
Regards,
Matthias
Moderators
January 31, 2022
December 13, 2017
Hi Riny,
It is a webpage containing information that I should not share.
Right now I can only find webpages that use page numbering, e.g. Blog • Page 2 of 61 • My Online Training Hub
But it is not rare for webpages to deliver only a subset initially and add entries after scrolling down, so it should be possible to find an example. Perhaps you know one or can find one too.
Thanks,
Matthias
Moderators
January 31, 2022
Perhaps you'll find a solution here:
October 5, 2010
Hi Matthias,
PQ can't interact with a website in this way. You could use Power Automate though.
You can start a browser, then send the END key to the browser window to make it scroll to the end of the page and load more data.
You then of course need to create the PA steps to scrape the data you need. You can't use PA and PQ together to scrape data.
Regards
Phil
December 13, 2017
Thanks Riny, but that can only be used if the URL changes for each page, or for every 10 entries.
Phil, "You can't use PA and PQ together to scrape data" means there is no way to do this with PQ. I had hoped for some kind of URL modification like ?sort=desc, or a kind of filter added to the URL, which could make the query work incrementally.
Thanks,
Matthias
October 5, 2010
Hi Matthias,
The behaviour you describe from the webpage indicates that the loading of new data is triggered by scrolling to the bottom of the page. As such, it must be triggered by JavaScript. Unfortunately there is no way for PQ to make web pages scroll like this and/or trigger JS. But Power Automate can.
If you could load more data by adding a parameter to the URL then you could use PQ, but it doesn't sound like this is possible from what you have said, and without being able to access the webpage, I can't check.
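If the site did accept such a parameter, the paging could be sketched in PQ roughly as follows. This is only a minimal sketch, not a tested solution: the URL, the offset parameter, the CSS selectors, and the names GetPage, PageSize and MaxPages are all hypothetical placeholders, since I can't see the actual page.

```powerquery
let
    // ASSUMPTIONS: the URL, the "offset" parameter and the CSS selectors
    // are hypothetical placeholders, not taken from the site discussed here.
    BaseUrl  = "https://example.com/entries",
    PageSize = 50,
    MaxPages = 20,  // safety cap in case the site ignores the parameter

    // Download one page and extract its rows as a table
    GetPage = (offset as number) as table =>
        let
            Source = Text.FromBinary(
                Web.Contents(BaseUrl, [Query = [offset = Number.ToText(offset)]])
            ),
            Rows = Html.Table(Source, {{"Title", ".title"}}, [RowSelector = ".entry"])
        in
            Rows,

    // Request page after page until one comes back empty or the cap is hit
    Pages = List.Generate(
        () => [Index = 0, Rows = GetPage(0)],
        each [Index] < MaxPages and Table.RowCount([Rows]) > 0,
        each [Index = [Index] + 1, Rows = GetPage(([Index] + 1) * PageSize)],
        each [Rows]
    ),

    // Stack the per-page tables into one result
    AllRows = Table.Combine(Pages)
in
    AllRows
```

The MaxPages cap is there because a site that ignores the parameter would keep returning the same 50 entries, and the loop would otherwise never stop.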
Regards
Phil
December 13, 2017
Hi Phil,
Yes, it is the typical scroll-to-the-bottom-to-load-more-data behaviour, and the code mentions type="text/javascript".
I can use ?order=XYZ to sort the data (ascending) according to different criteria. There might be other parameters, but that is all I found.
Thanks again,
Matthias