The ability to get data from the web with Power Query is super handy and Power Query’s user-friendly GUI makes it quick and easy.
More importantly Power Query gives you an opportunity to clean the data before loading it into Excel. Then you can refresh it with one click to get updates.
The current Power Query limitation* is that only tables formatted in HTML <Table> tags can be easily imported by Power Query. Some web pages build tables using JavaScript, which is not covered in this tutorial.
Let’s use this English Premier League data from the Sky Sports website as the example:
Note: At the time of writing, the URL above provided the data used in this example, but the layout and content of this page is out of my control and may change at some point in the future.
How to Get Data from the Web with Power Query
Step 1: Copy the URL for the web page containing the table. I’m using https://www.skysports.com/premier-league-table
Step 2: Excel 2016 onward – Data tab > From Web
Excel 2013 and earlier – Power Query tab > From Web:
Note: If you don’t see the Power Query tab in Excel 2010 or 2013 you can download it here.
Step 3: Paste your URL into the ‘From Web’ dialog box then click OK:
Step 4: At the ‘Navigator’ dialog (image below), the left-hand pane provides a list of tables available in the web page.
The first item, ‘Document’, contains the page HTML code, so it’s of no use to us, but any remaining tables typically contain data you can get with Power Query. Remember, it will only show you tables built using HTML table tags.
Clicking on the table called 'Premier League 2018/19' gives you a preview of the data in the right-hand pane:
And selecting the ‘Web View’ places a green box around the table on the web page itself:
Tips:
- You can view the Navigator dialog box in full screen by clicking the full screen icon in the top right of the dialog box.
- If you want to import multiple tables then check the ‘Select Multiple items’ box in the left-hand pane.
Step 5: Once you’ve selected the table you want to import, click on the ‘Transform Data’ button. This opens the Power Query editor window which gives you an opportunity to clean the data before loading it into Excel or the Power Pivot Data Model.
Once you’re done cleaning the data you’re ready to load it to Excel or the Power Pivot Data Model.
Step 6: Home tab > Close and Load To:
Which opens the Import Data dialog box:
Tip: If you load it to the Data Model, be sure to select ‘Only create connection’ so that you don’t duplicate the data in your file i.e. once in the worksheet and again in the Power Pivot data model.
Step 7: Refresh the data. To get updates from the web page simply go to the Data tab of the ribbon and click ‘Refresh All’:
Or if you have multiple queries you can open the queries and connections pane:
In Excel 2013 and earlier go to the Power Query tab > Show Pane :
Find the query in the list > right-click > Refresh:
*Power Query Web Limitation
As I mentioned earlier, Power Query is great at getting data from the web where it’s formatted as an HTML table, but not tables generated using JavaScript. You can easily tell if the table is HTML or JavaScript by inspecting the web page source code and looking for the HTML table tags.
To do this, right-click some empty space on the web page > View Page Source (or similar depending on the browser you use):
CTRL+F to open the Find dialog box. Enter ‘<table’. This will highlight any instances of a HTML table tag.
If you find HTML table tags then it confirms there is a table that Power Query can get from that page, however it doesn’t guarantee it’s the table you actually want, as there may be other tables on the page.
More Power...Query
Power Query can get data from a myriad of places and has loads of tools for cleaning and transforming data. Here are some more examples:
And one of the most common tasks, unpivoting: Unpivot with Power Query
If you’d like to learn Power Query, and you should 😉, please consider my Power Query course.
Jayasai
Dear sir,
I am unable to find advanced option in from web popup, please help me in enabling it.
I am using ms office 2016
Thank you in advance waiting for your reply.
Mynda Treacy
Hi Jayasai, You can try updating your installation of Excel to see if you get this feature, but if it’s not there after updating, then it’s not available in your version of Excel, sorry Mynda
Peter Dube
I have recently upgrades to Office 365 Pro Plus. I have a practice from the past to import a HTML table from a web site into Excel that includes HTML hyperlinks. In the previous version of excel, there was an ‘options’ popup in the process and I could choose ‘full html format’ and the links were retained in excel. Now that option is gone. Are you aware of a way to load from a web data source an retain the html hyperlinks?
Thank you
Mynda Treacy
Hi Peter, I don’t remember being able to retain the link status when importing data from the web with Power Query, and I’m not aware of any way to do it now, sorry. Have you tried adding a column to the Excel Table that Power Query outputs with the HYPERLINK function that references the text URL and converts it to a clickable link? Mynda
Terry davies
I have a power query that gets a web-based database and loads it to Excel.
The web page consists of the last 10,000 rows of the database, with the latest row on top.
If, when I refresh the query, 50 rows have been added, then the oldest 50 rows are no longer on the database. Therefore, when my data is refreshed I am left with the latest 10,000 rows but I have lost the oldest 50 rows.
Is there any way I can refresh the query but not delete the older rows in my original table, so my table grows with each refresh?
Mynda Treacy
Hi Terry,
Power Query is designed to get the data from the source, including any changes. If your source changes then you’d need to intervene with VBA to take a copy of the data before the refresh so you can retain those old rows. There’s no setting in Power Query to do this, sorry.
Mynda