Power Query: See it, Save it, Sort it - Part 7
29 June 2022
Welcome to our Power Query blog. This week, I refine my grouping and show how to preserve data order while grouping.
In Power Query: See it, Save it, Sort it – Part 1, I started with some data for my imaginary salespeople:
and extracted it into Power Query, in order to perform some transformations.
Last week, I grouped my data so that I had a sum of amounts for each date and a Names column containing all the salespeople who had made sales that day.
However, I had values for Names with repeating names, e.g. ‘Paul, Paul, John, John, John, Paul, Paul, Paul, Paul’
I want to only show each name once. I need to amend the ‘Grouped Rows’ step again. The M code is this step is currently:
= Table.Group(#"Removed Columns", {"Date"}, {{"Amount", each List.Sum([Amount]), type nullable number}, {"Names", each Text.Combine([Name], ", "), type nullable text}})
I need to look again at the section of this code that creates the Names column:
each Text.Combine([Name], ", ")
Before I combine the names, I need to make sure I only use unique values. I can do this by using the function List.Distinct().
each Text.Combine(List.Distinct([Name],Comparer.OrdinalIgnoreCase), ", ")
I have embedded the List.Distinct() function inside the Text.Combine() function so that I get a unique list of names and then combine them. Table.Group() takes the rows that satisfy the grouping – in this case rows where the Date is the same, and treats each column of data in those rows as a list, then applies aggregations to that list. This is why List.Distinct() works here. The full ‘Grouped Rows’ step is now:
= Table.Group(#"Removed Columns", {"Date"}, {{"Amount", each List.Sum([Amount]), type nullable number}, {"Names", each Text.Combine(List.Distinct([Name],Comparer.OrdinalIgnoreCase), ", "), type nullable text}})
The second operator (Comparer.OrdinalIgnoreCase) ignores the case, so that ‘Paul’ will be treated as the same as ‘paul’, for example. This step now gives me the Names column in the format I require:
Taking the first row as an example, the value in Names is not in alphabetical order:
I can rectify this by sorting my data before I group it. I go to the ‘Removed Columns’ step. I sort on Date first, using the dropdown on the right of the heading:
I am prompted to insert a step:
I Insert the step, and repeat the process for Name:
Having sorted the data the way I want it, I return to the ‘Grouped Rows‘ step:
Well that’s not what I wanted: the names are still not being combined in alphabetical order! The problem is, the Table.Group() function is also sorting my data by Date. There is a way to prevent this: I can buffer the data in the ‘Sorted Rows1’ step into memory to protect it from changes. I am going to change the ‘Grouped Rows’ step again. Currently, it groups the data in ‘Sorted Rows1’:
= Table.Group(#"Sorted Rows1", {"Date"}, {{"Amount", each List.Sum([Amount]), type nullable number}, {"Names", each Text.Combine(List.Distinct([Name],Comparer.OrdinalIgnoreCase), ", "), type nullable text}})
However, I may use the Table.Buffer() function to preserve the order of the data:
= Table.Group(Table.Buffer(#"Sorted Rows1"), {"Date"}, {{"Amount", each List.Sum([Amount]), type nullable number}, {"Names", each Text.Combine(List.Distinct([Name],Comparer.OrdinalIgnoreCase), ", "), type nullable text}})
When I apply this change, the order of the combined names in Names changes:
So now you know what the title of this series of blogs is about! By saving the data into memory and protecting it from changes, I have preserved the sort order.
Come back next time for more ways to use Power Query!