Active Member
June 2, 2022
Hi All,
I need a help with MS Query Data from Web.
I have more URLs from one webpage to process, but some of the URLs might not be existing in the time. In MS Query is all defined and with correct URLs it works perfect, but as I have in the list of URLs not existing ones, whole process of the getting data stops and my query doesnt show any data.
Is it possible to make query to skip wrong datasources and to go to next one and so on?
Thank you in advance,
Ch.
July 16, 2010
Hi Chloe,
Welcome to our forum. I presume you're talking about Power Query and not the old Microsoft Query?
If so, you can use this http error handling technique for Power Query.
Mynda
Active Member
June 2, 2022
Thank you Mynda,
I have studied your link, but the issue is, that I know some links are wrong. I have created table with links based on prefix. Some of the links are the future ones, which will be created as time goes by.
So simply I need, when errorr occures, to skip non existing website (yet) and to scrape data from next ones.
Reason for that is, that I want to create a tool which will automatically check the new URLs on the target http as they are rising up.
Any advice on that?
M.
Active Member
June 2, 2022
Dear Philip,
1. we need to update prices given by the state authority almost every month in a year into our ERP system.
https://www.health.gov.sk/Clan.....kzp-202204
1b. here you can see the excel file (link) with actual pricelist we need to obtain from their web and here are the data.
https://www.health.gov.sk/Zdro.....202204.xls
2. In MS query I have created table (as function) of two existing and the future links, because the script will be always the same for instance in July will be:
https://www.health.gov.sk/Zdro.....acia/zkzp/202207/Zoznam_ZP_202207.xls
which of course this doesn´t exist yet. But it will be.
3. Therefore In MS query I have created a list of two actual and future links which I need to check on web. As soon the query finds out the wrong address I need to skip those and load the next one which has data. Then I don´t need to visit website everyday to wait for their next excel link on web.
4. Now it works perfectly with existing websites (already published) and based on filter I separe the older data from the freshest which I use than for import into our ERP system.
5. Maybe there is also another way how to do it, how regularly check the website for most actual data. But problem is that as they publish new data in linked excel sheet, it is in the new URL. But result shuld be to get most actual data (prices) from ministry as soon as I refresh my data in Excel.
Thank you very much for any Idea to this issue.
Ch.
PS: In the attachment you will find the excel with MS Query settings.
October 5, 2010
Hi Chloe,
There's no file attached, you need to click on Start Upload after selecting the file.
But, I think what you are trying to do can be done with a query like this
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"YM", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each Excel.Workbook(Web.Contents("https://www.health.gov.sk/Zdroje/Sources/kategorizacia/zkzp/" & [YM] & "/Zoznam_ZP_" & [YM] & ".xls"), null, true)),
#"Removed Errors" = Table.RemoveRowsWithErrors(#"Added Custom", {"Custom"}),
#"Kept Last Rows" = Table.LastN(#"Removed Errors", 1),
Custom = #"Kept Last Rows"{0}[Custom],
Sheet1 = Custom{[Name="Sheet1"]}[Data],
#"Changed Type1" = Table.TransformColumnTypes(Sheet1,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}, {"Column8", type text}, {"Column9", type text}, {"Column10", type text}, {"Column11", type text}, {"Column12", type text}, {"Column13", type text}, {"Column14", type text}, {"Column15", type text}, {"Column16", type text}, {"Column17", type text}, {"Column18", type text}})
in
#"Changed Type1"
Dates in YM format e.g. 202206 are loaded from an Excel table, and the query tries to load a workbook for all of those dates.
Any dates in error are removed and the latest, most recent, file is then left. This data is loaded into Excel - see the attached workbook for an example.
Regards
Phil
Answers Post
1 Guest(s)