Problem Scraping Paginated Table - Resets Order After Each Page | Power Query | Excel Forum | My Online Training Hub

Mike Campbell
New Member, Level 0, Forum Posts: 1, Member Since: February 11, 2022
Post 1, February 11, 2022 - 3:19 am

I found your video "Scrape Data from Multiple Web Pages with Power Query" and found it very helpful. After following the steps, I realized the resulting table contained duplicates. I assumed I had done something wrong and tried it several ways, then realized that the website orders its listings randomly on each request, so going from page 1 to 2, 3, etc. resets the random order. If it were just duplicates, that would be easy to fix, but each duplicate row in the query takes the place of a record that the random ordering left out.

URLs (the page number at the end is the PageStart variable):

Example 1: https://www.nachi.org/certifie.....e/us?page=1

Example 2: https://www.nachi.org/certifie.....browse/us/florida?page=1

The attached file has Example 1, which is all listings for the United States across 100+ pages. I started scraping Example 2 because searching for the whole country doesn't show which state they're in. I included Example 1 in the file because it's less complicated and has the same duplicate problem. I tried looking at the source code of the page for help figuring out what might be causing the problem. I can see that it might be using a cookie to change "no-js" or "has-js." I don't know much about coding but that might be the trigger that resets the random order. That's my best guess, any help would be greatly appreciated!

 

EDIT: After uploading, I noticed I mistyped the source URL in the function. I originally had "https://www.nachi.org/certified-inspectors/browse/us?page="&PageStart&"1" but should have removed the 1 on the end, leaving &PageStart&"". The duplicate problem is the same either way; the mistyped version just requested pages 11, 21, 31, etc. instead of 1, 2, 3...
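To illustrate why the stray "1" produces pages 11, 21, 31, here is a minimal sketch of the same string concatenation in Python rather than Power Query M. The helper name `page_url` and its `suffix` parameter are hypothetical and exist only for this illustration; the base URL is the one quoted in the post.

```python
# Hypothetical helper mirroring the concatenation in the post:
#   "...?page=" & PageStart & "1"   (mistyped)
#   "...?page=" & PageStart & ""    (corrected)
BASE_URL = "https://www.nachi.org/certified-inspectors/browse/us?page="


def page_url(page_start: int, suffix: str) -> str:
    """Build the page URL by plain string concatenation."""
    return BASE_URL + str(page_start) + suffix


# With the trailing "1", page numbers 1, 2, 3 become 11, 21, 31.
mistyped = [page_url(n, "1") for n in (1, 2, 3)]

# With an empty suffix, the URL ends in the intended page number.
corrected = [page_url(n, "") for n in (1, 2, 3)]

for bad, good in zip(mistyped, corrected):
    print(bad, "->", good)
```

Because the suffix is appended as text, not added as a number, every requested page number ends in 1 until the suffix is removed.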

Attachments: 

Capture.PNG- screenshot of website source code

Capture2.PNG- Duplicates on the table coming from different pages

Web Scraping test.pbix- sample file/data

Philip Treacy
Admin, Level 10, Forum Posts: 1551, Member Since: October 5, 2010
Post 2, February 14, 2022 - 2:01 pm

Hi Mike,

That website isn't returning its results correctly. It shouldn't be duplicating results like that. There isn't much we can do to prevent this, as all the code does is query the website.

All you need to do is Remove Duplicates on the Member URL column.

Regards

Phil
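As a rough sketch of what removing duplicates on the Member URL column does to the combined pages, here is the same logic in Python rather than Power Query M. The sample rows, the `example.com` URLs, and the function name `remove_duplicates` are all hypothetical; only the "Member URL" column name comes from the thread.

```python
def remove_duplicates(rows, key="Member URL"):
    """Keep the first row seen for each distinct value of `key`."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


# Hypothetical scrape result: page 2 repeats a listing already on page 1.
pages = [
    [
        {"Name": "Inspector A", "Member URL": "https://example.com/a"},
        {"Name": "Inspector B", "Member URL": "https://example.com/b"},
    ],
    [
        {"Name": "Inspector B", "Member URL": "https://example.com/b"},
        {"Name": "Inspector C", "Member URL": "https://example.com/c"},
    ],
]

# Append all pages into one table, then drop repeated Member URLs.
combined = [row for page in pages for row in page]
unique = remove_duplicates(combined)

print(len(combined), "rows scraped,", len(unique), "unique")
```

Note that this only drops the repeated rows. As the first post points out, each duplicate took the place of a listing that was never returned, so comparing the unique row count with the total the site reports would show how many records are still missing.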

Forum Timezone: Australia/Brisbane

Copyright © 2023 · My Online Training Hub · All Rights Reserved. Microsoft and the Microsoft Office logo are trademarks or registered trademarks of Microsoft Corporation in the United States and/or other countries. Product names, logos, brands, and other trademarks featured or referred to within this website are the property of their respective trademark holders.