Power Query: Project Population – Part 4
10 May 2023
Welcome to our Power Query blog. This week, I continue to transform selected data from a public source (just for a change).
I have found some information on population growth provided by The World Bank, which I am using as an example of how to transform real-life data.
data:image/s3,"s3://crabby-images/73f74/73f74cf3e158ff640b37c6cf7b22853e63815d70" alt=""
I have downloaded the Excel file, and in Part 1, I extracted the queries I needed:
data:image/s3,"s3://crabby-images/f96ce/f96ceb406aad67162cfd2978893f070d71dbb091" alt=""
I reduced the data by selecting only the columns I wanted to keep:
data:image/s3,"s3://crabby-images/dce9b/dce9ba453db84e2bbc106ab7e5568b06d70001b2" alt=""
In Part 2, I identified and removed any unnecessary rows:
data:image/s3,"s3://crabby-images/ceb83/ceb830d51b108f3d858d3b706def5f9e5e183be8" alt=""
I also transformed the System of National Accounts column:
data:image/s3,"s3://crabby-images/395e9/395e9dbc33124edbdbe3a917a2e2c97e3ecb88e8" alt=""
I renamed these steps to make it easier to follow the transformations:
data:image/s3,"s3://crabby-images/0545a/0545abc200dd6dcdefca180d6fcd74484a2df698" alt=""
Last time, I transformed the column Latest population census into two [2] columns, Latest population census and Population census notes :
data:image/s3,"s3://crabby-images/132ed/132eda9a4ca0bdad29a0b6e713335e10cc392f8f" alt=""
In the process, I noticed that when Power Query generated a ‘Changed Type1’ step, the data type assigned was ‘Text’ for Latest population census. This suggests that not all the data in this column corresponds to a year. I can use the filter dropdown to look at the values in this column:
data:image/s3,"s3://crabby-images/625f4/625f415c6edece3f202c0e65e019a5906a31697f" alt=""
‘Guer’ is not a year! Not all the data has been split correctly. If I want to include this with the rest of my data, I need to transform it separately. I filter on ‘Guer’ to see the data.
data:image/s3,"s3://crabby-images/76eec/76eecb045610fe2b9c9702517b6289bb817ff484" alt=""
This data has been provided per island, and not per country like the rest of the data. I would like Latest population census to be ‘2015’, and then for Population census notes to include everything that is currently split across the columns. I start by merging the columns, by selecting them and using ‘Merge Columns’ on the Transform tab:
data:image/s3,"s3://crabby-images/57eee/57eeec38db6dcdca9bdfbe5bc8fe9ea4827f9d98" alt=""
I can recreate Population census notes:
data:image/s3,"s3://crabby-images/f2942/f294235c4f699c418f75cd4a2cbe1b2caf9059c0" alt=""
Annoyingly, Power Query renames my column, even though the previous version of Population census notes no longer exists!
data:image/s3,"s3://crabby-images/d99ad/d99adf305cafddc115942d0f4419fbc1e788ba5c" alt=""
I need to rename the column created by changing the M code from:
=
Table.CombineColumns(#"Extract data for CHI",{"Latest population
census", "Population census
notes"},Combiner.CombineTextByDelimiter("",
QuoteStyle.None),"Population census notes.1")
to:
= Table.CombineColumns(#"Extract
data for CHI",{"Latest population census", "Population
census notes"},Combiner.CombineTextByDelimiter("",
QuoteStyle.None),"Population census notes")
data:image/s3,"s3://crabby-images/ac403/ac40389c42a92db80a69867d6b85fcc1a36e8c89" alt=""
Now, I need to recreate Latest population census. I can do this from the ‘Add Column’ tab, where I can extract the data using ‘Text Between Delimiters’ on the ‘Extract’ dropdown:
data:image/s3,"s3://crabby-images/f6650/f6650c42b7b6b3fdfefb60a34fc2d5a6fc627924" alt=""
I enter delimiters ‘: ‘ (note the trailing space) and ‘;’:
data:image/s3,"s3://crabby-images/9c651/9c65113c6c9b1ef5bde00233af7b6d2119d2caf7" alt=""
This gives me the year:
data:image/s3,"s3://crabby-images/ef76f/ef76f2a7b52f612a96d164cd94f8d7518bd51336" alt=""
I had no option to define the column name, so I amend the step from:
= Table.AddColumn(#"Merged Columns", "Text Between
Delimiters", each Text.BetweenDelimiters([Population census notes],
": ", ";"), type text)
to:
= Table.AddColumn(#"Merged Columns", "Latest population
census", each Text.BetweenDelimiters([Population census notes], ":
", ";"), type text)
data:image/s3,"s3://crabby-images/8583e/8583e677423ee76fd490cd74069e15d097e16e62" alt=""
I need to rename the steps (simply right-click and ‘Rename’) to make them easier to follow:
data:image/s3,"s3://crabby-images/902c2/902c205db213854917de908ab8f4b7682eb318d6" alt=""
Now to add this to the rest of the data. I can use ‘Append Queries’ from the Home tab:
data:image/s3,"s3://crabby-images/cc6a9/cc6a98ca9cf06985cb4a866078b05892f5dab1dc" alt=""
I choose to append the current query to itself: I will amend this step once I have the M code:
data:image/s3,"s3://crabby-images/99ddd/99dddbd566196e0c9de3236be3f933d28b6c366a" alt=""
This gives me the ‘Append Query’ step:
data:image/s3,"s3://crabby-images/0262f/0262fec8c052e2997125f446863c73e03e09b6c7" alt=""
I change the M code from:
= Table.Combine({#"Create Pcensus Year for CHI",
#"Create Pcensus Year for CHI"})
to:
= Table.Combine({#"Create Pcensus Year for CHI",
#"Create PCensus Notes"})
This appends this line of data to the rest of the country data which is in the ‘Create PCensus Notes’ step:
data:image/s3,"s3://crabby-images/cdc64/cdc64ee1a2dc7e633d163530ad63b722e5d8f499" alt=""
This means I have two [2] rows for Country Code ‘CHI’. There are several ways I could select the correct row. I choose to change the data type for column Latest population census to ‘Whole Number'.
data:image/s3,"s3://crabby-images/c70b2/c70b2c073af8db2a536e1f6bd113c959c877d8dc" alt=""
Note the red on the bar under Latest population census. This means I have an error, which is the original ‘CHI’ row. I can right-click on the ‘Column quality’ bar and choose to ‘Remove Errors’:
data:image/s3,"s3://crabby-images/e71f0/e71f0711b4bc8cb7f2b2b71c855e90247959ab9b" alt=""
This gets rid of the extra row, and I rename the ‘Changed Type1’ step:
data:image/s3,"s3://crabby-images/fba15/fba15f8ea90c01eb5aa3838b56e33b59b7063246" alt=""
Next time, I’ll continue transforming the remaining columns.
Come back next time for more ways to use Power Query!