Power Query: Project Population – Part 5
17 May 2023
Welcome to our Power Query blog. This week, I complete the first query that I downloaded from a public source.
I have found some information on population growth provided by The World Bank, which I am using as an example of how to transform real-life data.
data:image/s3,"s3://crabby-images/03eb7/03eb73583921c9de172e3485533e3275fa7f123e" alt=""
I have been transforming the data, and last week, I created and tidied the Latest population census column.
data:image/s3,"s3://crabby-images/bb755/bb7558d84daeff8bedd326f50d0facfd3edffafc" alt=""
I also kept the Population census notes, which I need to tidy up. To begin, I move the column after Latest population census. I can do this by selecting the column and dragging it to the end, or by right-clicking and choosing to Move ‘To End’:
data:image/s3,"s3://crabby-images/32337/323371cbc81e277256bed4b9275a6cdb1c1e1739" alt=""
I examine the contents of the column by looking at the values on the Filter dialog:
data:image/s3,"s3://crabby-images/f5c16/f5c1639d28c7999fb7ccccf0a0e4bccc2062cf59" alt=""
I need to remove the full stops (.) from the beginning of the some of the values. I can do this by using ‘Replace Values’ from the right-click menu:
data:image/s3,"s3://crabby-images/2aff7/2aff7bd32aa9b5a87303fe1caf9a799985912762" alt=""
When I click OK, my data looks better, but I need to remove the leading spaces:
data:image/s3,"s3://crabby-images/91ad0/91ad0bc9f06d945bd6630b73225c614d996f5ebc" alt=""
I can do this by using ‘Trim’ from the Format dropdown on the Transform tab:
data:image/s3,"s3://crabby-images/ebd53/ebd5398840115658bc8f288a50bed13316c9d3ee" alt=""
My data looks better. However, the recent steps are a little ambiguous:
data:image/s3,"s3://crabby-images/06cd4/06cd4f6b3f317a83b8a533123ae07ce962cd9990" alt=""
Renaming them will make it easier to see what I have done:
data:image/s3,"s3://crabby-images/5b562/5b5622cdb8530e9c154f97277a25c038663e05b9" alt=""
Moving on to the National accounts base year column, a quick look at the data might suggest that I use the same approach that I used for Latest population census, where I converted the column to a number to easily identify any values that are not years in order to remove them.
data:image/s3,"s3://crabby-images/23e0d/23e0d4c01c2dc61ecd553c3d4e4303755eef1a75" alt=""
However, if I view the values on the Filter dialog, I see that there are combined year values, which cannot be expressed as a number:
data:image/s3,"s3://crabby-images/a4312/a431218c42c0f21dac106407a7a67e19ac74e6d7" alt=""
In order to preserve this data, I am going to use a different approach. If the first four [4] characters are not numeric, then I want to replace the current value with null. I go to the ‘Add Column’ tab, where I choose to Extract ‘First Characters’:
data:image/s3,"s3://crabby-images/67873/678738df6ccfb00f6649bcdc84574c4c5962a716" alt=""
This takes me to a dialog where I can select the first four [4] characters. Note I cannot choose the name of the new column, but as it will be temporary, this doesn’t matter here.
data:image/s3,"s3://crabby-images/76ebc/76ebc319ff7f9f6e285a9e2d4732e5d3d7a18534" alt=""
This gives me a new column First Characters containing the first year, which I can change to data type ‘Whole number’:
data:image/s3,"s3://crabby-images/d2bc3/d2bc3ec766bd8d5fecedf857154978882af78ded" alt=""
This has identified the values that did not contain a year as errors. I can right-click and choose to ‘Replace Errors’. I shall replace them with null:
data:image/s3,"s3://crabby-images/b5693/b5693a5565a92ae71c35b6a0042fd9548dd3dcd6" alt=""
Now I can add a ‘Conditional Column’ from the ‘Add Column’ tab:
data:image/s3,"s3://crabby-images/29e86/29e86580eeaa12fe0d925f1ce59962eadf2241c0" alt=""
In the new column name, I have capitalised ‘Accounts’ so that I am not trying to use the same name as the existing column. If the temporary column First Characters is null (meaning that National accounts base year did not have a year in the value), then the new column is set to null, otherwise it will have the same value as National accounts base year.
This gives me a new column, and I can delete National accounts base year and First Characters.
data:image/s3,"s3://crabby-images/df975/df975b71080ed09e842c9c25fe23a9f2d1d78c30" alt=""
I tidy up by changing the new column to data type Text, and rename the steps.
data:image/s3,"s3://crabby-images/e3ed7/e3ed752a0421f7d4cb5ddfb0ceed10fa9d712cab" alt=""
Now that all the columns in this query have consistent data, next time, I will look at the data that I can add from the other queries I imported.
Come back next time for more ways to use Power Query!