top of page

Four Steps to Create the Perfect Key Column in Power BI

Updated: Jun 30, 2023

There is a saying amongst data people that goes something like this: “if your primary and foreign keys don’t match—your data model won’t work.”


typing on a computer

Okay, I made that saying up.


It really should be a saying—right? I mean, it could be catchier, but it is 100% true.


Fact is, making your keys match is an essential piece of data modeling and thus data analysis. I’ve run into a few issues with keys in my time working with Power BI. So, I have created a general outline of what to do with your keys inside M/Power Query, (PRIOR to creating relationships in Power BI desktop, or AFTER creating the relationships and realizing they need to be improved).

I call this the Clutter (CLUTR) Method. This name pulls from the M functions that, (if applied), can prevent or fix many of these key issues.


These functions are:

Lower/Upper(case)

Trim, and

Remove Duplicates (Distinct)


CLEAN(Text.Clean) removes line feeds (#(lf)) and other non-printable characters from text values. No matter the source, these non-printable and error-causing texts often show up without much fanfare or introduction. As you can imagine, if one side of a key pairing contained “#(lf)” and another did not—the matching would be ruined. These little additions can be impossible to find, so CLEAN should be used every time—as soon as an issue arises.


LOWER and UPPER (Text.Lower and Text.Upper) makes all text either lowercased or uppercased. The purpose of employing this function is two-fold: (1) unifying the text on both sides of a key-pairing, and (2) making the removal of duplicates more accurate.


Sources tend to approach the casing of text differently. Even within Power BI itself that is the case, (for example, M v. DAX). By making all keys uppercased or lowercased, you ensure the text of keys are matchy—matchy, in a good way.


This leads naturally into the removal of duplicates issue that arises without standardizing text across sources. If there is text with just one case different from an otherwise similar text—both versions will be kept in M if a remove duplicated function is run on that column.


Admittedly, this could be what you want, but I’m guessing (more often than not), that is not what you want. This is a very important thing because one-to-many or one-to-one relationships are impossible if the intended one side has just 1 pair of the same value. Also, M and DAX read cases differently, meaning a remove duplicates done in M could still leave duplicates in DAX.


TRIM(Text.Trim) removes spaces before or after a value in a field. These spaces can be basically impossible to spot, much like the additions the Clean function removes. TRIM (and CLEAN) should be used every time you have a text key column, or at the very least, as soon as issues arise with you text key column.


REMOVE DUPLICATES (Table.Distinct or List.Distinct) removes all the duplicates within the selected columns (or column in the case of List.Distinct). As mentioned earlier, this is often essential for one-to-(m)any relationships to be successfully formed. This should be used on every table you expect to be on the one-side of a relationship.


Employing the CLUTR principle should get you through any issues you’re having with a text key column. Not only that, if employed on all text key columns (issue or not), you’ll not only avoid errors, you’ll avoid the unseen errors that mismatching keys always produce. Thankfully, each of these functions is incredibly easy to do using the UI in Power BI’s Power Query (as Doug Burke so kindly pointed out).


If you employ CLUTR and find you enjoy it, let me know in the comments below. And if you want to come up with a better “saying” about relationships—I want to hear it; and I’ll use it now and then if I like it!


Please contact JourneyTEAM for a.

705 views

Comments


bottom of page