Power Query: Project Population – Part 4
10 May 2023
Welcome to our Power Query blog. This week, I continue to transform selected data from a public source (just for a change).
I have found some information on population growth provided by The World Bank, which I am using as an example of how to transform real-life data.
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image1-1682012020.png/30c8a60bb2d578f0a6a55550914399e2.jpg)
I have downloaded the Excel file, and in Part 1, I extracted the queries I needed:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image2-1682012052.png/51488dbb4817b730e113004b4d036fe0.jpg)
I reduced the data by selecting only the columns I wanted to keep:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image3-1682012072.png/08662dfef7a944c912e97d22adf64839.jpg)
In Part 2, I identified and removed any unnecessary rows:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image4-1682012099.png/744f8a79830db037afd3c11c2544f808.jpg)
I also transformed the System of National Accounts column:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image5-1682012118.png/51802e8be784554bb99ac0c81ea41b30.jpg)
I renamed these steps to make it easier to follow the transformations:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image6-1682012136.png/993a2712fefdd9da3e749ad6ff3e6f3f.jpg)
Last time, I transformed the column Latest population census into two [2] columns, Latest population census and Population census notes :
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image7-1682012163.png/a48aefee82a1cc71742ba4aef6578021.jpg)
In the process, I noticed that when Power Query generated a ‘Changed Type1’ step, the data type assigned was ‘Text’ for Latest population census. This suggests that not all the data in this column corresponds to a year. I can use the filter dropdown to look at the values in this column:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image8-1682012250.png/0630115cfe4a38d0bcaf069ca542e81b.jpg)
‘Guer’ is not a year! Not all the data has been split correctly. If I want to include this with the rest of my data, I need to transform it separately. I filter on ‘Guer’ to see the data.
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image9-1682012369.png/a64976c6ca0403d3078cf375359a8a9b.jpg)
This data has been provided per island, and not per country like the rest of the data. I would like Latest population census to be ‘2015’, and then for Population census notes to include everything that is currently split across the columns. I start by merging the columns, by selecting them and using ‘Merge Columns’ on the Transform tab:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image10-1682012392.png/8d17453a6916baadf84c05be085292c0.jpg)
I can recreate Population census notes:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image11-1682012409.png/fd73e6fa580f9ad40b73b091a1fa3bf1.jpg)
Annoyingly, Power Query renames my column, even though the previous version of Population census notes no longer exists!
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image12-1682012433.png/39d1d44848b307edb2d78fbdc23e42a8.jpg)
I need to rename the column created by changing the M code from:
=
Table.CombineColumns(#"Extract data for CHI",{"Latest population
census", "Population census
notes"},Combiner.CombineTextByDelimiter("",
QuoteStyle.None),"Population census notes.1")
to:
= Table.CombineColumns(#"Extract
data for CHI",{"Latest population census", "Population
census notes"},Combiner.CombineTextByDelimiter("",
QuoteStyle.None),"Population census notes")
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image13-1682012490.png/263468a922b3eaca990709655cf50fe8.jpg)
Now, I need to recreate Latest population census. I can do this from the ‘Add Column’ tab, where I can extract the data using ‘Text Between Delimiters’ on the ‘Extract’ dropdown:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image14-1682012624.png/e9d9aee8bd7660d23dbaaf2041b1b67a.jpg)
I enter delimiters ‘: ‘ (note the trailing space) and ‘;’:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image15-1682012682.png/5b98a6e209e47f92b7b5edbdf6e7bde2.jpg)
This gives me the year:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image16-1682012707.png/02eeca7d2ba86d715392ec9c31e85f6a.jpg)
I had no option to define the column name, so I amend the step from:
= Table.AddColumn(#"Merged Columns", "Text Between
Delimiters", each Text.BetweenDelimiters([Population census notes],
": ", ";"), type text)
to:
= Table.AddColumn(#"Merged Columns", "Latest population
census", each Text.BetweenDelimiters([Population census notes], ":
", ";"), type text)
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image17-1682012787.png/069a1333b3a729426a4aacbfd42a3773.jpg)
I need to rename the steps (simply right-click and ‘Rename’) to make them easier to follow:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image18-1682012836.png/201237a0fa2ed1763bcc4b01e7690e9c.jpg)
Now to add this to the rest of the data. I can use ‘Append Queries’ from the Home tab:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image19-1682012862.png/88e32b8960e06c48295d058ca5c1c426.jpg)
I choose to append the current query to itself: I will amend this step once I have the M code:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image20-1682012883.png/d92ceb049ab92297f6fa76dea114e79e.jpg)
This gives me the ‘Append Query’ step:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image21-1682012907.png/150b972f60ea34e102290126876ed9a8.jpg)
I change the M code from:
= Table.Combine({#"Create Pcensus Year for CHI",
#"Create Pcensus Year for CHI"})
to:
= Table.Combine({#"Create Pcensus Year for CHI",
#"Create PCensus Notes"})
This appends this line of data to the rest of the country data which is in the ‘Create PCensus Notes’ step:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image22-1682012979.png/5026677ae33d47a051ebd68cd0b9ff1f.jpg)
This means I have two [2] rows for Country Code ‘CHI’. There are several ways I could select the correct row. I choose to change the data type for column Latest population census to ‘Whole Number'.
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image23-1682013013.png/2079309d66f63d988ef9b2f80f437534.jpg)
Note the red on the bar under Latest population census. This means I have an error, which is the original ‘CHI’ row. I can right-click on the ‘Column quality’ bar and choose to ‘Remove Errors’:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image24-1682013035.png/0dcde50dfffd84078855c8ea52cf9175.jpg)
This gets rid of the extra row, and I rename the ‘Changed Type1’ step:
![](http://sumproduct-4634.kxcdn.com/img/containers/main/./image25-1682013063.png/7efbf0cdb1dd9d05ce0c43134708e60f.jpg)
Next time, I’ll continue transforming the remaining columns.
Come back next time for more ways to use Power Query!