Power Query: Blog List – Part 2
6 December 2023
Welcome to our Power Query blog. I would like to get a list of all the Power Query blogs I have posted. Today, I extend the query to extract data from all pages.
I started writing this blog series way back in 2016:
There have been many developments since then, some of which I have reported upon. Since I am going to revisit more of the areas that have been improved, I am going to start with a reminder of what we have covered so far. In order to do this, I am going to use Power Query in Power BI, since there are some functions that will help me to access web data, which are not yet recognised in Power Query (Get & Transform) in Excel.
Last week, I noted that to get a full list, I am going to need several pages:
I extracted and tidied up the data for the first page:
Now I am happy with this page, I shall go back to the website to check out the URL for the second page:
If I compare the URL I used for the first page:
https://www.sumproduct.com/blog/power-query-blogs
to the URL for the second page:
https://www.sumproduct.com/blog/power-query-blogs?page=2
I can see I have more characters, i.e. “?page=2”
If I go back to the query I created last week, and look at the source step:
It is reasonable to assume that I could add the characters “?page=1” to this, and still get the same results. I try this:
As I expected, the results are the same. However, I want to extract all the pages, therefore, I am going to use a parameter for the page number. On the Home tab, I can create a ‘New Parameter’ from the ‘Manage Parameters’ dropdown:
I create P_PageNumber:
I make this a ‘Decimal Number’ (since I do not have the option of making it a whole number) and give it an initial value of 1. If I use this parameter in the Source step of my query, I should see no change to the data extracted. Instead of the current M code:
= Web.BrowserContents("https://www.sumproduct.com/blog/power-query-blogs?page=1")
I need to use the following code:
= Web.BrowserContents("https://www.sumproduct.com/blog/power-query-blogs?page="&Number.ToText(P_PageNumber))
I need to use the function Number.ToText() for P_PageNumber, because the file location string is a character datatype, and P_PageNumber is a number. As before, I can test that this works by checking that the results are the same:
Next time, I will convert this query into a function and generate a list of page numbers to concatenate my data.
Come back next time for more ways to use Power Query!