Codementor Events

Building Calculated Columns in Power BI

Published Oct 25, 2018
Building Calculated Columns in Power BI

Learn how to build calculated columns in Power BI through various use case scenarios in this article by Manuel Quintana, the coauthor of Microsoft Power BI Quick Start Guide.

Calculated columns are stored in the table in which they are assigned, and the values are static until the data is refreshed. There are many use cases for calculated columns, but the two most common are as follows:
• Descriptive attributes
• Concatenated key columns

Creating a calculated column

Now you are going to create your first calculated column. Before you get started, though, you first need to know that Power BI Desktop has IntelliSense. IntelliSense will help you out a lot when writing code, as you will discover very soon.

This built-in functionality autocompletes your code as you go, and it also helps you explore and discover new functions in the DAX language. In order to take advantage of IntelliSense, you simply need to start typing in the formula bar. Now you are ready to start writing DAX!

Click on the Data View—this is located on the left side of the Power BI Desktop screen. Next, click on the customer table from the Fields list. Once the customer table has been selected, click New Column—this is found under the modeling ribbon, as shown in the following screenshot:

1.png

You will now see the text, Column =, in the formula bar. First, name the new column by replacing the default text of Column with Full Name. Then, move your cursor to after the equals sign and type a single quote character.

Immediately after typing the single quote character, a list of autocomplete options will appear. This is IntelliSense at work. The first option in this list is the name of the table you’ve currently selected—Customer. Click the Tab key and the name of the table will automatically be added to the formula bar, as shown in the following screenshot:

2.png

Next, type an opening square bracket into the formula bar followed by a capital letter F. Once again, you will immediately be presented with autocomplete options. The list of options has been limited to only columns that contain the letter f, and the first option available from the dropdown is First Name. Click tab to autocomplete. The formula bar should now contain the following formula:

Full Name = 'Customer'[First Name]

The next step is to add a space, followed by the last name. There are two options in DAX for combining string values. The first option is the concatenate function. Unfortunately, concatenate only accepts two parameters, so if you have more than two parameters, your code will require multiple concatenate function calls. On the other hand, you also have the option of using the ampersand sign (&) to combine strings.

The ampersand will first take both the input parameters and convert them into strings. After this data conversion step, the two strings are then combined into one. Let's continue with the rest of the expression. Remember to use the built-in autocomplete functionality to help you write code.

Next, add a space and the last name column. To add a space—or any string literal value for that matter—into a DAX formula, you’ll need to use quotes on both the sides of the string. For example, " " inserts a space between the first and last name columns. The completed DAX formula will look like the following:

Full Name = 'Customer'[First Name] & " " & 'Customer'[Last Name]

String functions – Month, Year

Now that you have completed your first calculated column, it’s time to build a calculated column that stores the month–year value. The goal is to return a month–year column with the two-digit month and four-digit year separated by a dash, making it "MM-YYYY".

Select the Date (order) table and then click New Column from the modeling ribbon. Write the following code in the formula bar and then hit Enter:

Month Year = 'Date (Order)'[Month Number of Year]

As you begin validating the code, you will notice that this only returns the single-digit month with no leading zero. Your next attempt may look something like the following:

Month Year = "0" & 'Date (Order)'[Month Number of Year]

This will work for single-digit months; however, double-digit months will now return three digits. Take a look at the following screenshot:

3.png

To improve upon this and only return the two-digit month, you can use the RIGHT function. The RIGHT function returns a specified number of characters from the right side of a string. Modify your existing DAX formula to look like the following:

Month Year = RIGHT("0" & 'Date (Order)'[Month Number of Year], 2)

The rest of this formula can be completed quite easily. First, to add a dash, the following DAX code can be used:

Month Year = RIGHT("0" & 'Date (Order)'[Month Number of Year], 2) & "-"

Complete the Month-Year formula by combining the current string with the calendar year column:

RIGHT("0" & 'Date (Order)'[Month Number of Year], 2) & "-" & 'Date (Order)'[Year]) 

You may have noticed that the Year column is of the whole number data type, and you may have expected that this numeric value would need to be converted to a string prior to the combine operation. However, remember that the ampersand operator will automatically convert both inputs into a string before performing the combine operation!

Format function – Month Year

As with any other language, you’ll find that there are usually multiple ways to do something. You’ll now learn how to perform the calculation that you saw in the previous section using the FORMAT function.

The FORMAT function allows you to take a number or date column and customize it in a number of ways. A side effect of the FORMAT function is that the resulting data type will be text. Perform the preceding calculation again, but this time using the FORMAT function.

Make sure you have the Date (order) table selected, and then click on Create a New Calculated Column by selecting New Column from the modeling ribbon. In the formula bar, write the following expression:

Month Year Format = FORMAT('Date (Order)'[Date], "MM-YYYY")

Age calculation

How about determining the age of each customer now? The Customer table currently contains a column with the birth date of every customer. This column, along with the TODAY function and some DAX, will allow you to determine each customer's age. Your first attempt at this calculation may be to use the DATEDIFF function in a calculation that looks something like the following:

Customer Age = DATEDIFF('Customer'[Birth Date], TODAY(), YEAR)

The TODAY function returns the current date and time. The DATEDIFF function returns the count of the specified interval between two dates; however, it does not look at the day and month, and therefore does not always return the correct age for each customer.

To return the exact age of the customer, you can select the Customer Age column from the previous step and rewrite the formula to look like the following:

4.png

The FORMAT function is used to return the two-digit month and two-digit day for each date (the birth date and today's date). Following the logical test portion of the IF statement are two expressions. The first expression is triggered if the logical test evaluates to true, and the second expression is triggered if the result of the test is false.

So, if the customer's month and day combo is less than or equal to today's month and day, then their birthday has already occurred this year, and the logical test will evaluate to true, which will trigger the first expression. If the customer's birthday has not yet occurred this year, then the second expression will execute.

SWITCH() – age breakdown

Now that you have the customer's age, it's time to put each customer into an age bucket. For this example, there will be four separate age buckets:
• 18-34
• 35-44
• 45-54
• 55 +

The SWITCH function is preferable to the IF function when performing multiple logical tests in a single DAX formula. This is because the SWITCH function is easier to read and makes debugging code much easier.

With the Customer table selected, click New Column from the modeling ribbon. Type in the completed DAX formula for the following example:

5.png

The preceding formula is very readable and understandable. There are three logical tests, and if a customer age does not evaluate to true on any of those logical tests, then that customer is automatically put into the 18-34 age bucket.

You may have noticed that the second and third logical tests do not have an upper range assigned. For example, the second test simply checks whether the customer's age is 45 or greater. Naturally, you may assume that a customer whose age is 75 would be incorrectly assigned to the 45–54 age bucket.

However, once a row evaluates to true, it is no longer available for subsequent logical tests. Someone who is 75 would have evaluated to true on the first logical test (55 +) and would no longer be available for any further tests.

It's finally time to create a relationship between the temperature table and internet sales table. The key on the Temperature table is a combination of the region name and the month number of the year. This column combination makes a single row unique in this table, as shown in the following screenshot:

6.png

Unfortunately, neither of those two columns currently exist in the Internet Sales table. However, the Internet Sales table has a relationship to the Sales Territory table, and the Sales Territory table has the region.

So, you can determine the region for each sale by doing a simple lookup operation. Well, it should be that simple, but it's not quite that easy. Here’s why.
Calculated columns do not automatically use the existing relationships in the data model. This is a unique characteristic of calculated columns; calculated measures automatically see and interact with all relationships in the data model. Why is this important?

The following screenshot shows a new column on the Internet Sales table, trying to return the region name from the Sales Territory table. Take a look at the following screenshot:

7.png

Note that there is no IntelliSense, and that the autocomplete functionality is unavailable as you type in "Sales Territory". The reason for this is that the calculated column cannot see the existing relationships in the data model, and that’s why it doesn’t automatically return the column you want from another table.

There is a much more complicated explanation behind all this, but for now, suffice to say that navigation functions (RELATED and RELATEDTABLE) allow calculated columns to interact with and use existing relationships.

If you rewrite the following DAX formula with the RELATED function, then you’ll notice that IntelliSense has returned, along with the autocomplete functionality that was previously discussed:

8.png

Now it's time to create a Temperature Key column on the Internet Sales table. Create a new column on the Internet Sales table and then type in the following DAX formula:

9.png

Now that the temperature key has been created on the Internet Sales table, create the relationship. Click Manage Relationships from the home ribbon and then click New... to open the Create Relationship window. Then, complete the following steps to create a new relationship. The relevant fields and entries for each step are marked out in the following screenshot:

  1. Select Internet Sales from the first drop-down selection list
  2. Select the Temperature Key from the list of columns
  3. Select Temperature from the second drop-down selection list (scroll right)
  4. Select Key from the list of columns
  5. Click OK to save your new relationship:

10.png

Hope you found this article interesting and helpful. If you want to learn more about the remarkable features of Power BI, check out Microsoft Power BI Quick Start Guide. A step by step guide that covers the fundamentals of Power BI capabilities right from the basics, this book is for aspiring Business Intelligence professionals who want to build amazing dashboards and visualizations to make data come alive.

Discover and read more posts from PACKT
get started
post commentsBe the first to share your opinion
Show more replies