Power Query: PDF Pandemonium – Part 3
29 September 2021
Welcome to our Power Query blog. This week, I continue transforming some data that is coming in from a PDF file.
The tent business is doing well, and the UK division have plans to expand the workforce. I have a PDF file, and it contains tables for 10 stores. Last week, I created two Reference Queries, and this week I will continue transforming Pay Scales.
data:image/s3,"s3://crabby-images/a4c1b/a4c1baef80de2492e53599aed4985a3f8d55b178" alt=""
I start by keeping the columns I will be needing for this table.
data:image/s3,"s3://crabby-images/197ba/197ba59e2fd1c863803861a873df0c576d1649b2" alt=""
This means I can concentrate on the data I need to transform for this table.
data:image/s3,"s3://crabby-images/4effe/4effe66a2bbb431e850538ea7d4cae32049436be" alt=""
Having checked the data, I actually only need the pay scale columns, so I select them whilst holding down the CRTL key and click on ‘Remove Other Columns’.
data:image/s3,"s3://crabby-images/a55db/a55db62daa5cb9b0b37dc545db481827826135c1" alt=""
Power Query incorporates this into the existing ‘Remove Other Columns’ step.
data:image/s3,"s3://crabby-images/d9964/d9964f5bcae846c0233d3718c06f91aa474dd24c" alt=""
I can remove empty rows from the Home tab.
data:image/s3,"s3://crabby-images/e652d/e652d53c40564b42a5992b3348a82b8521f79390" alt=""
I can now look at how to transform my data from this into a useful table.
data:image/s3,"s3://crabby-images/eaa1f/eaa1f06cfe824d2da5663f76fedea3f9ebf01e15" alt=""
I want to transpose the data, but if I do this with some of the data I need in the column headings, I will lose it. First, I need to demote the column headings so that I have the information in a row. I can do this from the Home tab.
data:image/s3,"s3://crabby-images/5b5ad/5b5adcf08100ff0ca36199e24ce1d6025e7f7da7" alt=""
This creates a ‘Change Type’ step which I delete as I am not ready to decide column types yet. I am now ready to transpose my data, using the option on the Transform tab.
data:image/s3,"s3://crabby-images/027c2/027c270cb5bfa72c02fbc7b56a9377b82b794201" alt=""
My data is starting to take shape.
data:image/s3,"s3://crabby-images/f25f2/f25f2fd46cd1ccc1a3381439dd6fed737a543087" alt=""
I can rename the headings.
data:image/s3,"s3://crabby-images/c5bde/c5bdef0ec96f2ca5bcc99494db3aa438ea29c3fe" alt=""
I want to show a start and end salary, rather than have the information in one column. I can split the Salary column from the Transform tab.
data:image/s3,"s3://crabby-images/f0d0d/f0d0d47a06a0fd55f24da1b27ddd2034bdfa127f" alt=""
From the dropdown, I choose to split ‘By Delimiter’; this brings up a dialog.
data:image/s3,"s3://crabby-images/959a5/959a5b45b199af65da83886b291e7780b8775c15" alt=""
I choose to split by space at each occurrence of a space, because this will give me a column with the lower and upper limit.
data:image/s3,"s3://crabby-images/e1a3b/e1a3b55f9ac3a52e009bdc310e79674d575823ff" alt=""
I delete the automated ‘Changed Type’ step again. I don’t need Salary.2, so I can remove this. As usual, I do this by selecting the columns I want to keep and ‘Remove Other Columns’.
data:image/s3,"s3://crabby-images/de602/de602c8beac249fe61a454c3b4f1ae4dfdb2781a" alt=""
I want Salary.1 and Salary.2 to be numeric columns, so I need to remove the £ signs. I can do this by selecting the columns and replacing £ with blank. I start by using ‘Replace Values’ on the Transform tab.
data:image/s3,"s3://crabby-images/8fb3c/8fb3c2abf5403b9431f265de05a042ee41923e3d" alt=""
This provides a dialog where I can enter the details.
data:image/s3,"s3://crabby-images/7699b/7699b2195d0c900aaed36343da71cbfdf6ea344e" alt=""
This will remove the £ signs.
data:image/s3,"s3://crabby-images/b6812/b6812bc05588a4ede5b57562bc8d44da46c15387" alt=""
Next, I change both columns to whole numbers. I can do this from the Home tab or the Transform tab, or by using the right-click menu, and changing the data type.
data:image/s3,"s3://crabby-images/d7fc2/d7fc2628a8e86a322da2a50c8249a2e5967ceecf" alt=""
I get an error, but I can use ‘Replace Values’ again; this time, I choose ‘Replace Errors’.
data:image/s3,"s3://crabby-images/e3872/e3872e72ad75ae90368577a5abff536e82baca11" alt=""
I want to replace it with null, not zero [0], since zero is the starting point for ‘Pay Scale A’.
data:image/s3,"s3://crabby-images/3181d/3181dcb88fc7acd8e968b1fa3313729947f5a2ff" alt=""
Next, I need to transform the Percentage Increase column. If I make it the data type Percentage, I will get values of over 100: I need to divide the values by 100 first, which I can do from the Transform tab, but first I change the data type to ‘Decimal Number’. This will allow me to access the ‘Standard’ dropdown.
data:image/s3,"s3://crabby-images/51cb6/51cb67c21bd559926cc1cb08595c98480f61f2ff" alt=""
I choose ‘Divide’ and enter 100 in the dialog.
data:image/s3,"s3://crabby-images/eba0e/eba0eec9cb3c999dffd184c89496e66069493daa" alt=""
I can now set the correct data types for all of the columns.
data:image/s3,"s3://crabby-images/92877/9287762272b39e08df1c1f5cc98a580db8bd93f6" alt=""
I rename the salary columns, and my table is ready to ‘Close & Load’.
data:image/s3,"s3://crabby-images/db657/db657da94ef991302b7e555ad6074f470545dd40" alt=""
Next time, I will look at the Stores query.
Come back next time for more ways to use Power Query!