ALLURE ANALYTICS
  • Home
  • About Us
  • Our Services
  • Blog
  • Our Team
  • Contact Us

October 9, 2022

  • By  Allurean
  • 0 comments

Data Modeling in Power BI

I wanted to highlight in this post some of my best practice tips for Power BI data modeling.

Data modeling is one of the foundations of your Power BI reports, so it’s important to set it up correctly. It’s crucial to create a well-developed data model to help you fully communicate your organization’s information. Data modelling helps you optimize your tables and columns. Doing it right will save you so much time.

 

These tips and best practices will help you create a more organized model, making it easier for you to understand the relationships in your data set. Once you’ve fully grasped how to set up data models, you can easily create intuitive and meaningful Power BI reports.

Benefits of using a Good Data Model:

• Increased report performance
• Less complex DAX to be used and accurate results
• Queries will load faster

Disable Autodetect new relationships option

If you add a new table to the model that has field names (and values) matching with other tables in the model, then Power BI automatically creates a relationship between each of the existing tables and the new table, and this can be sometimes too many new relationships.

This can lead to an unexpected relationships and would cause data issues.

Disable Auto date/time option

The main purposes of the auto date/time are to support convenient time intelligence reporting based on the date columns loaded into the data model.

Instead of using this date hierarchy which is not complete, just create a Date Calendar Table. How to create a Date Calendar Table can be found here

For each date column that generates a hidden auto date/time table, it will result in an increased model size and also extend the data refresh time.

Enable Visual cross-filtering

When setting up your data model, start by organizing your tables. One of the best ways to do this is by using the Waterfall technique (layout). Another way to do this effectively is by using the Star Schema approach.

The Layout method – Waterfall or stacked around the fact table(s) as a STAR model, does not matter as long as you understand it

My personal preference is the Waterfall layout, as you can visualize how the data will be filtered (also called “Lookup” Layout)

STAR SCHEMA

The *STAR* Layout

One of the ways is by using the Star Layout

  • Fact table at the center of the star
  • Dimension (Lookup) tables at the end points of the star
You can see that the Fact Table, fctSales, is at the center while the Dimension Tables are positioned to look like a star.

The Waterfall Layout

The second approach and layout for data modelling in Power BI is called the Waterfall technique. You set up your Dimension Tables at the top and the Fact Tables at the bottom.

Here are some of the benefits of this approach:

  • Arranges tables in rows (Dimension tables in the first row, Fact tables in the second row).
  • Easy to visualize the dimensions “falling” to the Fact table.
  • Displays all relationships and makes it easier to see errors.

 

When using this layout, you need to remember these details:

  • Delete any relationships autogenerated by Power BI.
  • Manually create all relationships.
  • Use the “Manage Relationships” dialogue to maintain relationships.

Handling Multiple Fact Tables

Do not create direct relationship between two fact tables! Follow Star Schema!

So maybe you are thinking of having a multi-fact table model, how do you go about doing this?

You will need to do the following. I have provided an example from the few points above.

1.      Identify the two fact tables — Actuals & Forecast

2.      Agree on the granularity — Monthly, with the ability to drill

3.      Identify the common dimensions — Country and Dates (create a FK)

4.      Link the dimensions — Only Many to one (One to Many) cardinality

Hide Foreign Key (FK)

What is a foreign key? In your data model, you create relationships between a Primary Key (PK) and a Foreign Key (FK). An example of that is demonstrated below.

In the dimProduct table, you have a list of all products of your company. Each product is listed exactly one time, so you have unique values. This is your Primary Key.

In the fctSales table, a product could have multiple sales or no sales at all. This is the Foreign Key.

The Foreign Key should not be used in the Report View and should therefore be hidden (right click on it and select Hide in report view or click on the eye icon to hide).

 

RELATIONSHIP CARDINALITY

Each model relationship is defined by a cardinality type. There are four cardinality type options, representing the data characteristics of the “from” and “to” related columns. The “one” side means the column contains unique values; the “many” side means the column can contain duplicate values.

If a data refresh operation attempts to load duplicate values into a "one" side column, the entire data refresh will fail.

The four options, together with their shorthand notations, are described in the following bulleted list:

  • One to many (1:*)
  • Many to one (*:1)
  • One to one (1:1)
  • Many to many (*:*)

The one-to-many and many-to-one cardinality options are essentially the same, and they’re also the most common cardinality types.

When configuring a one-to-many or many-to-one relationship, you’ll choose the one that matches the order in which you related the columns. Consider how you would configure the relationship from the Product table to the Sales table by using the ProductID column found in each table. The cardinality type would be one-to-many, as the ProductID column in the Product table contains unique values. If you related the tables in the reverse direction, Sales to Product, then the cardinality would be many-to-one. 

Please check the below example of Many to one cardinality relationship.

One-to-one cardinality

A one-to-one relationship means both columns contain unique values. This cardinality type isn’t common, and it likely represents a suboptimal model design because of the storage of redundant data.

For more information on using this cardinality type, see One-to-one relationship guidance.

Many-to-many cardinality

A many-to-many relationship means both columns can contain duplicate values. This cardinality type is infrequently used. It’s typically useful when designing complex model requirements. You can use it to relate many-to-many facts or to relate higher grain facts. For example, when sales target facts are stored at product category level and the product dimension table is stored at product level.

When you create a relationship in Power BI Desktop, the designer automatically detects and sets the cardinality type. Power BI Desktop queries the model to know which columns contain unique values. For import models, it uses internal storage statistics; for DirectQuery models it sends profiling queries to the data source. Sometimes, however, Power BI Desktop can get it wrong. It can get it wrong when tables are yet to be loaded with data, or because columns that you expect to contain duplicate values currently contain unique values. In either case, you can update the cardinality type as long as any “one” side columns contain unique values (or the table is yet to be loaded with rows of data).

Avoid using Many to Many cardinality. This can lead to unexpected results. Use it if you know what are you doing!
Try to follow the best practice and use Star Schema approach. In most cases the cardinality there is Many to One (One to Many)

Data type should match

The two fields that you are connecting to each other should have the same data type. Otherwise, when you create a relationship, you get an error.

The below example the relationship is based on ProductKey which is numeric (number)

Many to one (*:1)

The ProductKey column in the dimProduct table contains unique values.

Cross filter direction

Each model relationship is defined with a cross filter direction. Your setting determines the direction(s) that filters will propagate. The possible cross filter options are dependent on the cardinality type.

You have to avoid bi-directional relationships which are denoted by double-directional arrowheads. Bi-directional relationships can lead to inconsistent results (ambiguity) and often require a more complicated DAX code.
Bi-directional relationships can be used carefully i.e. using SnowFlake Schema

Alternatively, to avoid using bi-directional relationships, you can add a filter to the slicer visual to exclude items that are blank. It is worth noting that you would need to do this for each visual that uses fields across dimension tables.

Active or Inactive Relationship?

Only one active relationship is allowed! Inactive relationship is dotted line

There can only be one active filter propagation path between two model tables. However, it’s possible to introduce additional relationship paths, though you must set these relationships as inactive. Inactive relationships can only be made active during the evaluation of a model calculation. It’ is achieved by using the USERELATIONSHIP DAX function.

An example of using Inactive relationships is so called Role Playing Dimensions

A role-playing dimension is a dimension that can filter related facts differently. For example, at Adventure Works, the date dimension table has three relationships to the reseller sales facts. The same dimension table can be used to filter the facts by order date, ship date, or delivery date.

Orders Shipped =
CALCULATE(
    COUNTROWS(fctSales)
    ,USERELATIONSHIP(‘Date’[Date], fctSales[ShipDate])
)

Create Measure Table

In order to have a well structured report and keep your data model organized, it’s best to organize measures in separate tables. See below how to create dedicated Measures Tables:

1. Add a new table

2. Name the table and load it to the report

3. The new table has been added to the report

4. Allocate a new measure to this table

Create your first measure:
 
Sales Actuals = SUM(fctSales[SalesAmount])

5. Hide Column1

Format your measures

DAX measures can become quite complex. To still be able to read complex DAX functions, you should consider adding some formatting to the formula in order to increase readability. 

Alternatively, you can use Dax Studio or daxformatter.com to beautify your measures
Use CamelCase naming for hidden columns

1. Add line breaks. Use Shift + Enter to add a line break (Alt + Enter)

Sales Actuals =
SUM(fctSales[SalesAmount])

2. Indent to create a structure. Use Tab key to indent

Sales Actuals Red =
CALCULATE(
SUM(fctSales[SalesAmount]),
    dimProduct[ColorName] = “Red”
)

3. Add comments.
For more complex DAX measure, you can add comments to explain what the measure is calculating.
* For short use // or --
* For multiple code lines you can use block comment /* your dax code */

Sales Actuals Red =
//Calculate only Red products
CALCULATE(
SUM(fctSales[SalesAmount]),
    dimProduct[ColorName] = “Red”
)
Sales Actuals =
CALCULATE(
SUM(fctSales[SalesAmount])
/*
CALCULATE(
SUM(fctSales[SalesAmount]),
    dimProduct[ColorName] = “Red”
)
*/
Use variables to optimize your DAX measures. The variable is calculated one time and can be reused and thus minimize the query time instead of referencing to other measures

Try to use good variable names like _TotalSales

Sales % Red =
VAR _RedProducts = CALCULATE(SUM(fctSales[SalesAmount]), dimProduct[ProductName] = “Red”)
VAR _TotalActuals = SUM(fctSales[SalesAmount])
RETURN
    DIVIDE(_RedProducts, _TotalActuals)

Organize your measures in folders

Often, you end up having a big number of measures in a report. To still have the overview, you can organize measures in folders. To create and manage folders, you need to go to the Model view.

The newly created measure folder will appear in your model.

Once a folder has been created, you can also use drag and drop to move measures in between folders.

Create Date Dimension Table

Date table will not be discussed in this post. How to create a date dimension can be found here

Reduce calculated columns using RELATED function​
* Give good names to calculated columns, measures, variables.
* Use CamelCase or underscore.
Examples: TotalSales (measure)
Sales (fact table) or Fact_Sales or fctSales
DimProduct (Dimension table)
_LastMonth (variable)
Use UPPERCASE for formulas in DAX measures or calculated columns (use intellisense) like SUM, CALCULATE etc.
Sales Actuals =
sum(fctSales[SalesAmount])
Format your measures

Disconnected tables and their purpose

Disconnected tables are useful in a variety of ways, allowing a user to select a value that drives the behavior of a dynamic measure, or by grouping values and performing some kind of dynamic calculation.

Typical use of disconnected tables is toggle slicers, i.e. MTD/YTD

Similar result can be achieved using Field Parameters

This table was created using the Enter Data feature and includes only one column named Period, which includes the following values:

Current
MTD
YTD
Same
Per LY

PY

 

The purpose of this post is to demonstrate effective data modeling and not DAX fundamentals but one cannot exist without the other. For our purposes, my model in this example contains five measures that calculate the Store Sales Qty with different time series variations. The Store Sales Qty measure simply returns the Sum of the Quantity field and the Store Sales Qty MTD calculates the month-to-date value for a given date. I’ve created another measure named Store Sales Qty Selected Period which is shown below. When a value in this table is selected, this measure returns the results of one of the other measures.