Power Query: Project Population – Part 12
12 July 2023
Welcome to our Power Query blog. This week, I solve the formatting issue in my merged query and apply some buffering.
I have found some information on population growth provided by The World Bank, which I am using as an example of how to transform real-life data.
I have been transforming the data, and last week, I merged the Data query with the other queries that I had already merged to form query Merge1.
I noticed that each row is much taller than it was before. Logically, I might assume that this was caused by a row from Data, but this is not the case. The culprit is the Special Notes column.
Now I have multiple rows for each country, it’s become more obvious that I need to improve the way that the text is being formatted in this column. In the Transform tab, within the Text section, there is an option on the Format dropdown to ‘Clean’. This will remove any ‘hidden’ text commands like the carriage return.
When I choose to Clean the Special Notes column, the row height returns to normal:
I can now rename the query. Since I know that the rows merged from Data are those that pertain to single countries, I will call the merged query Population Estimates by Country.
Now I have merged the data, my query is noticeably slower to refresh. I can buffer the data, rather than reading it in as the merge progresses. To do this, I use Table.Buffer() function. I start with the first step, Source.
The M code is currently:
= Table.Join(Country, {"Country Code"},
#"Country-Series", {"CountryCode"}, JoinKind.LeftOuter)
I can use Table.Buffer() for both queries being merged:
= Table.Join(Table.Buffer(Country), {"Country Code"},
Table.Buffer(#"Country-Series"), {"CountryCode"},
JoinKind.LeftOuter)
Whilst this has no impact on the content of the output, it should speed up the refresh. Note that the extent buffering will improve refreshing depends on the conditions that affect the memory available, from the version of the app to the capability of the computer. The only way to find out, is to test it.
Next, I move on to the second step, ’Merged Queries’:
This time the M code is:
= Table.NestedJoin(Source, {"Country Code"}, Data,
{"Country Code"}, "Data", JoinKind.LeftOuter)
I can use Table.Buffer() on a step, since each step is a snapshot of the current query:
= Table.NestedJoin(Table.Buffer(Source), {"Country
Code"}, Table.Buffer(Data), {"Country Code"}, "Data",
JoinKind.LeftOuter)
This again, has no impact on the data produced:
My changes should speed things up for users of this query.
Next time, I’ll take a look at the rows in Data that didn’t link in the ‘Merged Queries’ step.
Come back next time for more ways to use Power Query!