Power Query: Project Population – Part 3
3 May 2023
Welcome to our Power Query blog. This week, I keep on keeping on the transformation of selected data from a public source.
I have found some information on population growth provided by The World Bank, which I am using as an example of how to transform real-life data.
I have downloaded the Excel file, and in Part 1, I extracted the queries I needed:
I reduced the data by selecting only the columns I wanted to keep:
Last time, I identified and removed any unnecessary rows:
I also transformed the System of National Accounts column:
I renamed these steps to make it easier to follow the transformations:
This time, I am going to look at the column Latest Population Census:
This column currently contains mixed data types, which is why Power Query has given it a type of ‘Any’. I would like to transform this column into two [2] columns, one with the year, and one with any additional notes. On the Transform tab, I choose to split the column ‘By Positions’:
This prompts me for where to split the column, remembering that Power Query counts from zero [0]:
Clicking on ‘Advanced options’ also indicates how to enter multiple positions to split the data. For this example, I just need to split at position 4, however, if I only enter 4, then the column after position 4 would be retained. In order to preserve the year and have two [2] columns, I need to enter ‘0,4’ in the Positions box. I take the default to ‘Split into Columns’.
This gives me two [2] new columns.
Note that I did not have an option to name my split columns. I can add a step to rename them, but instead I will change the M code in the ‘Split Column by Positions’ step:
I change the code from:
= Table.SplitColumn(Table.TransformColumnTypes(#"Changed SNA
to Number", {{"Latest population census", type text}},
"en-GB"), "Latest population census",
Splitter.SplitTextByPositions({0, 4}), {"Latest population census.1",
"Latest population census.2"})
to:
= Table.SplitColumn(Table.TransformColumnTypes(#"Changed SNA
to Number", {{"Latest population census", type text}},
"en-GB"), "Latest population census",
Splitter.SplitTextByPositions({0, 4}), {"Latest population census",
"Population census notes"})
I rename the step, and delete ‘Changed Type1’. It no longer uses the correct column names, and it left Latest population census with data type ‘Text’.
Next time, I will look at why the column Latest population census was not assigned a numerical data type.
Come back next time for more ways to use Power Query!