Power BI Blog: Sydney’s Register of Food Penalty Notices – Part 5
28 November 2019
Welcome back to this week’s Power BI blog series. This week, we are going to look at creating more custom categories that simplify current data.
We’ve been digging through the register of food penalty notices and as we were looking through the different offence codes, we noticed two things:
- One, the offence codes were split into two categories, ‘Individual’ or ‘Corporation’
- 90% of all the offences fell into the general categorisations of ‘Fail to comply with Food Standards Code – Individual’ or ‘Fail to comply with Food Standards Code – Corporation’:
The first thing we want to do is simplify the offence codes by combining the Individual and Corporation tags together. How do we do this without a giant nested IF statement? We can go back into the Query Editor and select the Offence code column then select the ‘Split Column By Delimiter’ option on the Ribbon:
The delimiter should be set to Custom then we enter “ - ”, including the spaces in the delimiter will eliminate the need to remove the spaces later on. Then select the ‘Each occurrence of the delimiter’ option and hit OK, viz.
Then, we rename the three columns appropriately and change their types where necessary:
We may then close and apply the changes to the data.
Now to deal with the second point, where 90% of all the offences fell into the general categorisation of ‘Fail to comply with Food Standards Code – Individual’ or ‘Fail to comply with Food Standards Code – Corporation’. After some further investigation, we found that there are 177 different “Nature & circumstances of alleged offence” items (from now on we are going to refer to these as ‘descriptions’). Replacing each and every categorisation will be relatively tricky in Power Query, so let’s adopt another approach using DAX.
Using DAX, we can use a lookup table to group each description into our own defined categories, to allow for easier data analysis (we have done this on our Power Pivot blog series, you can read about it here).
We are going to use the following lookup table Nature_ReCat (the following screenshot is not exhaustive):
There is a bit of work involved in this step, as we had to go through all of the possible ‘Nature & circumstances of alleged offence’, remove all duplicates, and remove all rows that have text that repeat within each other. Then, go through all of the unique row values that we want Power BI to lookup and assign a custom category to that row.
After loading the table into our model in Power BI, we can create a custom column:
Category Stepping = CONCATENATEX(
Nature_ReCat,
IF(
SEARCH(
FIRSTNONBLANK(
'Nature_ReCat'[Lookup],
1
),
'Main Data'[Nature & circumstances of alleged offence],
,
9999
)
<>9999,
Nature_ReCat[Category],
""
)
)
Now we have a column of categories that is much more reader friendly.
We included ‘stepping’ in the name in this column because we will need to combine it with the previous Offence Description column in the next step:
Custom Penalty Descriptions =
IF('Main Data'[Category Stepping]=BLANK(),
'Main Data'[Offence Description],
'Main Data'[Category Stepping]
)
We create a column called Custom Penalty Descriptions, it combines the Offence Description column with our custom category name together. Moving on to our report we can use the Custom Penalty Descriptions as a filter:
We now have more descriptive categories for our data, lets try selecting ‘Unclean Kitchen Surface’:
That’s alarming, not sure we can really stomach that.
That’s it for this week. Check back again in seven days when we talk about formatting our dashboard and creating measures for this dataset.
In the meantime, please remember we offer training in Power BI which you can find out more about here. If you wish to catch up on past articles, you can find all of our past Power BI blogs here.