This article describes the differences between HASONEVALUE and ISINSCOPE, which are two useful DAX functions to control the filters and the grouping that are active in a report.

HASONEVALUE and ISINSCOPE are two DAX functions that are mostly used by developers to check the current level of a hierarchy. It is very common to see code snippets like the following, to show the value only at the *Year* level and below:

IF ( ISINSCOPE ( Date[Year] ), TOTALYTD ( … ) )

You can obtain a similar behavior by using HASONEVALUE:

IF ( HASONEVALUE ( Date[Year] ), TOTALYTD ( … ) )

To most DAX coders, the two versions of the formula seem to yield the same result. In fact, this is not true: ISINSCOPE and HASONEVALUE serve different purposes, and they have a different impact on performance. Learning when to use each function is important in order to write efficient formulas.

Even though the two functions are different in their implementation, their net result is very close. Therefore, in order to choose which one to use, it is necessary to clearly understand the scenarios where they produce a different result.

Let us start with the definition of the two functions:

- HASONEVALUE checks that a column has exactly one value visible in the current filter context.
- ISINSCOPE checks that a column is currently being filtered because it is part of a grouping performed by a summarizing function.

With the aid of a report, let us check how the functions behave in different cells of the matrix.

Focus on the highlighted cell:

- *Brand* has one value visible, because it is filtered by the matrix. *Brand* is also being grouped by the query generated by Power BI to create the matrix. For this reason, both HASONEVALUE and ISINSCOPE are TRUE.
- *Category* has only one value visible, due to the filter coming from the slicer. This is why HASONEVALUE is TRUE. But because *Category* is not part of the matrix, ISINSCOPE is FALSE.
- *Calendar Year* has two values visible, because of the slicer. It is not part of the matrix; consequently, no grouping is happening. Both functions yield FALSE.
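As a mental model only, the two checks can be sketched in Python. The function names are invented for the sketch, and the logic is a deliberate simplification of the engine's behavior, not its actual implementation:

```python
# Toy model (NOT the engine's implementation):
# HASONEVALUE: exactly one value of the column is visible in the filter context.
# ISINSCOPE: the column is one of the group-by columns of the visual's query.

def has_one_value(visible_values):
    return len(set(visible_values)) == 1

def is_in_scope(column, groupby_columns):
    return column in groupby_columns

groupby = {"Brand"}  # the matrix groups only by Brand

# Brand: one visible value, and grouped by the matrix -> both TRUE
print(has_one_value(["Contoso"]), is_in_scope("Brand", groupby))
# Category: one visible value from the slicer, but not grouped -> TRUE, FALSE
print(has_one_value(["TV and Video"]), is_in_scope("Category", groupby))
# Calendar Year: two visible values, not grouped -> FALSE, FALSE
print(has_one_value(["CY 2008", "CY 2009"]), is_in_scope("Calendar Year", groupby))
```

The sketch makes the asymmetry explicit: HASONEVALUE only looks at how many values survive the filters, whereas ISINSCOPE only looks at where the grouping comes from.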

The first point, *Brand*, deserves a better explanation. How come a matrix – or any visual – is performing a grouping? When Power BI needs the results for a visual, it runs a DAX query. The query often uses SUMMARIZECOLUMNS to perform a grouping by the columns (*Brand*, in this case) that the user puts on rows or on columns. Therefore, each cell is evaluated in a filter context that contains one value for a column currently being grouped by.

As a reference, this is the query that generates the matrix – just a bit polished to make it more readable:

```dax
EVALUATE
SUMMARIZECOLUMNS (
    ROLLUPADDISSUBTOTAL ( 'Product'[Brand], "IsGrandTotalRowTotal" ),
    TREATAS ( { "TV and Video" }, 'Product'[Category] ),
    TREATAS ( { "CY 2009", "CY 2008" }, 'Date'[Calendar Year] ),
    "Sales_Amount", 'Sales'[Sales Amount]
)
```

As you can see from the query, *Sales Amount* is computed by SUMMARIZECOLUMNS during the grouping by *Product[Brand]*. The two TREATAS (the slicers) perform a filter, but they do not participate in the grouping. *Product[Brand]* is being grouped by, and it also participates in the filtering.

At the grand total, the scenario is different because the grand total is still computed by SUMMARIZECOLUMNS, but without any grouping active. This is why at the grand total *Brand* neither has one value, nor is it in scope.

As a rule of thumb – you use ISINSCOPE when you want to detect which level of a hierarchy you are browsing in a report; on the other hand, you use HASONEVALUE to check if the user has selected exactly one value either from a slicer, or when using a grouping in a visual.

In other words, you use ISINSCOPE when the check you need to perform is strictly tied to the visual, like when you need to change the behavior of a measure at the grand total. You use HASONEVALUE when a measure can be computed only when a single value is selected, either from a visual or from a slicer. For example, a year-to-date value only makes sense when a single year is selected, no matter where the filter comes from.

There is a big difference between the two functions in terms of performance. HASONEVALUE needs to count the values of a column. The value of ISINSCOPE does not depend on how many values are visible: either there is an active grouping, or there is not. Consequently, ISINSCOPE is faster than HASONEVALUE.

At first sight, it looks like ISINSCOPE should be preferred: it is faster and lets you distinguish between grouping in the visual and filtering with a slicer. Nevertheless, here is a caveat: ISINSCOPE does not guarantee that there is a single value. When ISINSCOPE is TRUE, there can be either zero or one value. This is not an issue in most scenarios – but sometimes the difference between zero and one is relevant. In that scenario, HASONEVALUE is slower but provides a better solution.

As an example, look at the following query. It was generated by Power BI, with a measure that ranks the orders based on the order number. Because the ranking should only be displayed when the row in the table shows a single order number, we protect the calculation with ISINSCOPE:

```dax
DEFINE
    MEASURE Sales[Ranking] =
        VAR OrderNumbers =
            CALCULATETABLE ( VALUES ( Sales[Order Number] ), ALLSELECTED () )
        VAR CurrentOrder =
            SELECTEDVALUE ( Sales[Order Number] )
        VAR Result =
            IF (
                ISINSCOPE ( Sales[Order Number] ),
                RANKX ( OrderNumbers, Sales[Order Number], CurrentOrder, ASC )
            )
        RETURN Result
    VAR __DS0FilterTable =
        TREATAS ( { DATE ( 2009, 11, 8 ) }, 'Date'[Date] )
EVALUATE
SUMMARIZECOLUMNS (
    'Sales'[Order Number],
    'Product'[Product Name],
    __DS0FilterTable,
    "Ranking", [Ranking],
    "Sales_Amount", 'Sales'[Sales Amount]
)
```

On our Contoso model, this query – that works on a single day – takes around 14 seconds to run. If you are interested in optimizing DAX, we can only suggest that you stop reading, download the demo file and run the query in DAX Studio. Spend some time not only optimizing it – given the topic of the article, the optimization should be an easy feat – but also understanding exactly why it is so slow. Trust us, it will be time well spent!

The problem with this query is that it summarizes by *Sales[Order Number]* and *Product[Product Name]*. Because the two columns are in different tables, DAX needs to inspect the full cross-join of their values.

Now, the cross-join of *Sales[Order Number]* and *Product[Product Name]* is huge: 21,601 x 2,517 = 54,369,717 rows. Most of the combinations do not have transactions in *Sales*: given an order number, it contains few products, not all of them. Regardless, the engine needs to inspect the full cross-join because of the way the query was authored. For each row, it needs to decide whether to compute the ranking or not.
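A quick sanity check of the cross-join arithmetic quoted above (the counts come from the article's Contoso sample):

```python
# Size of the cross-join the engine must inspect.
order_numbers = 21_601   # distinct values of Sales[Order Number]
product_names = 2_517    # distinct values of Product[Product Name]

combinations = order_numbers * product_names
print(f"{combinations:,}")  # 54,369,717
```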

The calculation of the ranking is protected with ISINSCOPE. However, the order number is always in scope, even in cells where it has zero visible values because that product appears in no orders. Despite being fast, ISINSCOPE is not the function to use here. If we replace ISINSCOPE with HASONEVALUE, the query goes from taking 14 seconds to taking 24 milliseconds:

```dax
VAR Result =
    IF (
        HASONEVALUE ( Sales[Order Number] ),
        RANKX ( OrderNumbers, Sales[Order Number], CurrentOrder, ASC )
    )
```

This does not mean that HASONEVALUE is preferable. As stated earlier, the semantics of the two functions is different. Although both functions can be used in many scenarios, knowing the difference between the two helps you make an educated choice in scenarios like the one shown in this article.

**DAX is simple, but it is not easy.** I cannot remember how many times I read this sentence on social media, attributed to me. For good reason: I have repeated this sentence at almost every classroom course I delivered over the years, at conferences, user group meetings… it is my mantra.

The sentence itself is not mine. It came as the answer to a question I routinely ask my students at the end of a training: “What do you think of DAX now?”. The course was in Dublin, around 10 years ago, and a student answered that. I stole his sentence, and it became a mantra for both my DAX students and myself. This sentence is a powerful synthesis of DAX.

Anyway, the question is, “Why is DAX not easy, if it is simple?”. In this article I want to expand on the topic. I do not pretend to be thorough, or to provide an authoritative answer to the question. I just want to share my experience teaching DAX to my students, and to highlight the reasons why – to many – DAX is some sort of black magic, mastered by a few wizards and passed through their voice down to a small number of enlightened apprentices.

The truth is quite different: anybody can learn DAX; at SQLBI we are no wizards. Or maybe we are… In which case I want to share our secrets, so that any of my readers can join the secret wizardry of DAX developers!

Functional languages are extremely elegant, from a mathematical point of view. Yet, they are also very counter-intuitive. As humans, when we describe an algorithm, we follow steps. We do not provide a function that computes the result: we provide the steps to reach the result.

Imagine you need to assemble a new piece of furniture that you just brought home. Let us say it is a chair. **You open the instructions and find this**:

CHAIR = ADD ( USESCREW ( Back ), REPEAT ( 4, ADD ( USESCREW ( Leg ), Seat ) ) )

That would be depressing, wouldn’t it? If you translate it from functional to iterative, it says: add a leg to the seat using a screw, 4 times. Once that is done, use some other screws to add the back to the seat. Et voilà, you built a chair!

Although I hope nobody would ever give me functional instructions in a furniture kit, with DAX you need to learn to reason in a functional way. Gaining this skill takes time and effort.
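As a toy illustration, here is the same chair "algorithm" in both styles. All names (`use_screw`, the part labels) are invented for the sketch; the point is only the contrast between the two ways of thinking:

```python
from functools import reduce

def use_screw(part, base):
    """Attach a part to the current assembly and return the new assembly."""
    return base + [part]

# Iterative: step-by-step instructions, the way humans describe algorithms.
chair = ["Seat"]
for _ in range(4):
    chair = use_screw("Leg", chair)   # add a leg to the seat, 4 times
chair = use_screw("Back", chair)      # then add the back

# Functional: one expression that *is* the result, the way DAX expressions work.
chair_fn = use_screw(
    "Back",
    reduce(lambda base, part: use_screw(part, base), ["Leg"] * 4, ["Seat"]),
)

print(chair == chair_fn)  # True: same chair, two ways of thinking
```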

Functional languages are nothing new. They were around back in the time of LISP, in the prehistory of IT – yes, I was around at the time. They never gained a lot of popularity, mainly because they are hard to learn. With that said, an algorithm expressed in a functional language can be optimized much more easily than the same algorithm expressed in an iterative language – meaning step-by-step instructions, as you would expect for your new furniture kit. Being functional is part of the price you need to pay to have a fast language.

If you struggle with the functional nature of DAX, how can you improve your experience? Practice, practice, and practice again. At some point, your brain starts to read functional languages. Practice is the only thing that helps here. “Thinking functional” is hard and counterintuitive; practice is your only weapon. Which leads us to reason #2.

DAX is our job: we spend all day writing and teaching DAX. All day, every day. **This is how you become an expert: by practicing.** With DAX, that means practicing a lot!

When it comes to Power BI, most users do not spend their entire time preparing reports day in, day out. This makes perfect sense: a report is meant to produce numbers, and users are expected to reason off of those numbers, not just produce them. Any regular user would need DAX when preparing a report, but this is a tiny fraction of that person’s daily job. So what happens is that you have a few days to prepare a report, and then weeks, maybe months go by during which you do not write any DAX. Until you need to update the report. Guess what? You struggled to learn certain details when you first authored the report, and after weeks without practice you discover that you forgot everything. Starting over is frustrating. You eventually succeed with the next step, and then you go through another round of months without writing DAX. And the cycle goes on.

Is there a solution to this? Yes. **Incorporate DAX in your daily job.** This means that if you work in a team comprised of many report builders, maybe you can become the DAX expert so that you spend more time writing DAX and less time analyzing numbers. If you cannot work all day long on DAX, at least try to allocate some time every week to work on your craft. Do not let months go by without writing a single formula or understanding a new concept.

The same could be said of any skill. Is DAX any different from other languages? Yes and no. After a few months not practicing SQL, you would have no problem writing a simple SELECT statement. But changing a complex query with five joins and two correlated subqueries is not a trivial task if you have lost your confidence with SQL. DAX is not very different, though details are more important and less evident. DAX is less verbose than SQL, after all.

The entire DAX language is based on just a few concepts:

- **evaluation contexts**
- **iterators**
- **context transition**
- **expanded tables**

I have not been able to find a fifth important topic. If you master these four, you are a DAX guru. If you stopped at the first three, you are a DAX ninja. With the first two, you are already a good expert. The thing is: you need to master them, not only have some basic knowledge of what they are. Moreover, these are foundational concepts: they have nothing to do with specific functions.

So, how do you become a DAX guru?

Learning DAX does not mean learning what all the functions do in every detail. You can be a guru without knowing the meaning of the third argument of RANKX. For that kind of detailed knowledge, you can read articles, dax.guide, and many other sources of information at any time.

The main foundational topics are different. You need to learn and understand the foundations really well. You need to make them yours. Profoundly. By practicing and by authoring a lot of code; by understanding exactly what happens under the hood. **Do not stop working on a formula just because it works…** You need to have a profound understanding of how it works. This means mastering the foundations of DAX.

How can you learn these foundational topics? **Be curious!** Do not write a formula with the sole goal of producing a number. Be curious to understand exactly how the formula is evaluated and whether it is really the best way to compute that number. If your code generates an error, do not try to fix it by adding a CALCULATE here and there until it works. Yes, I know: we have all been there. But you need to go further: stop, read the code, understand exactly what is wrong with it. Only when you know the problem, is it time to fix it.

This actually comes as a consequence of the previous points: DAX is a language based on mathematics, built on a few foundational topics. If DAX were a new mathematical theory, this would be perfect. Unfortunately, DAX is not a mathematical theory: it is the language you use to build reports. Still, it is based on the same kind of concepts a mathematical theory is. Now, if you are a very logical type of person, then you will probably love DAX. By very logical, I mean you like to spend hours on enigmatography and puzzles. If on the other hand that is not your cup of tea, then DAX will take some getting used to. Read on!

Luckily, there is an easy way to avoid this problem: **think about DAX as if it were a game**. When you start, do not use it to solve real problems. Instead, start with simple games. Like: “Hey, no matter what, I want to write a single expression that computes the average age of my customers at the time they bought their first pair of shoes from my store”. You are likely to spend hours, maybe days trying to solve this. And yet that will be the most fruitful time ever invested on your path to learning DAX. You will have forced yourself to learn the foundations, in a brave attempt to strive for perfection.

If your mind goes straight to: “I need this number, does not matter if the formula is elegant or fast, I just want it computed”, then… well, you could end up wasting a lot of time. Attempting to learn DAX by just solving practical problems is not the way to do it: you need theory. Learning theory from practice is a much harder path to follow. The good thing is that – if you learn DAX as a game – it will be a lot of fun. Trust us, learning DAX is fun – if you’re doing it right!

Details… details everywhere. Humans do not love small details; we want to understand the big picture, leaving details for later. But DAX can be quite unforgiving if you forget a small detail in a formula. You forgot that the row context does not propagate through relationships? You get an inaccurate result. DAX will turn its nose up and think that you should have known better, my friend. You do not remember that a table filter in CALCULATE works on the expanded table? Guess what? The only thing DAX uses to warn you of the problem is an inaccurate result. There is nothing surprising about this behavior: DAX computes what you ask it to compute. **If you ask the wrong question, you obtain the wrong answer.** Plain and simple.

How come you asked the wrong question? This, again, goes back to the foundational theory of DAX: you need to master the basic theory. If you are serious about becoming a DAX guru, you need to not only understand it, but master it. You will master the theory once you have practiced with tons of formulas and you understand exactly how the pieces are bound together in a formula. Once you have tried several variations of the same code, with the sole purpose of understanding exactly how a small change affects the results. Once you have learned all the details. After a lot of practice you reach such a level of proficiency, that writing the formula comes naturally. Remember, you only set the bar at this level once you are prepared to graduate from DAX expert to DAX ninja, or even DAX guru.

When you have reached this point, DAX has no secrets. I know I am repeating myself, but the thing is: if your formula produces a wrong result, then it is likely that you forgot one detail. Do not fix the formula until you understand what the wrong number is. I know it does not look useful to be able to describe what a wrong number represents. In fact, it is: it is extremely useful. Understanding what the wrong number is forces you to discover exactly where the problem is. If the result of your formula is wrong, it means your brain still does not master the basic concepts. Stop, learn the basics again, and then proceed.

If you think that writing DAX in a text box is a depressing experience, and you are wondering why we do not have a fully featured and powerful editor… well, I totally feel your pain. Building a complex model with tens of DAX measures is like writing an 800-page book on your cell phone, where you can see one paragraph at a time. Whatever the reason, authoring DAX in a text box is probably the only thing that remained constant since the first DAX edition, in Power Pivot for Excel.

Although there has long been a strong need for a good DAX editor, Microsoft has not provided one. Moreover, users have not put enough pressure on Microsoft to obtain a better editor. Indeed, you can find several ideas on ideas.powerbi.com, but they have not received enough votes to push the Microsoft engineers to consider them. This is such a pity. To me, the absence of a good editor is one of the main reasons why learning DAX looks hard.

I have seen many students start to be enlightened when, during the training, they are given the option of using DAX Studio instead of Power BI to write DAX. By writing simple queries, by looking immediately at the results, by formatting the code with a keystroke, their brain is suddenly freed from many small user interface constraints and it has the option of quickly trying different variations of the same expression. As a teacher, I have learned to recognize when a student’s eyes start to light up as they are suddenly making sense of what they are doing. It often happens exactly at that point.

**If you want to learn DAX, start with DAX Studio.** Download it; learn it; use it to try different pieces of DAX code. Trust me, it is time very well spent. While it is true that Microsoft is lacking a good editor for DAX, it is also true that they are a very open company, humble enough to let others fill the gaps when they know there are some. The integration of Power BI with DAX Studio and Tabular Editor is only going to get better with time. Therefore, I am confident this reason – though compelling – is going to fade away pretty soon.

If you have read everything so far, this last point should be no surprise. The way many people learn a new language is by googling examples, trying them on their model, correcting the errors and finetuning things until it works. There is nothing wrong with this approach: it works with most programming languages. To be honest, it works with most topics. That is how we learned to speak! No theory, just a lot of examples and a lot of mistakes.

DAX is more like mathematics, statistics, physics. **You cannot learn DAX just by trial and error.** You need to learn it the same way you learn a mathematical topic: you study it. You proceed to the next step only when the previous one is understood and solid.

I know, most of the users learn DAX because they want to solve a specific problem. They are not in it to learn a new programming language; they just want the next report done by yesterday. Unfortunately, this approach is not very productive with DAX, because of the additional time required to fix the calculation when the result is not the one you expected.

If you want to learn DAX, take the time required to learn it the right way. Do not try to solve complex problems before you have the knowledge required to understand exactly what you are doing. Once you master the basic concepts, you will be able to solve more complex problems.

If you need the moving average over the last 30 working days, you cannot search on the web for the formula, copy & paste it and hope for the best. I mean, you can, if you are proficient enough to fix any issue you might encounter. First, gather the knowledge about the fundamentals, then copy & paste will work just fine.

Time to draw some conclusions about why DAX is simple, but not easy. To me, it is all a matter of how you approach the learning process. DAX is simple, meaning that the fundamentals of the language are not complex. They are simple, and there are few of them. But you need to train yourself again and again, until you really master and understand them. If you succeed, then DAX is also easy. If you do not, or you just run too fast towards your objective, then learning DAX becomes very hard.

As I said in the introduction: at SQLBI we are not wizards. We have been practicing for years. We wanted to understand. We are not alone at all. A whole generation of great DAX developers is rising. We can see this in the increasing quality of blog posts of authors and developers who write elegant and efficient DAX code. I bet all these developers spent a lot of time on learning the fundamentals.

If I were to give a single piece of advice to any newbie in DAX, this would be it: practice the fundamentals. It is the same advice any seasoned coach would give an eager young athlete.

The.

Fundamentals.

Do not give up when everything still seems fuzzy: go back to the fundamentals, study them once more. Good results will follow.

We know it worked for us. It will work for you too.

**Enjoy DAX!**

This article describes how to use DAX calculated columns to sort names that look like duplicates at a certain level of a hierarchy, but are unique when considering their full path within the hierarchy.

In a Tabular model and in Power BI, one column can be sorted using a different column. For example, we sort the month name by the month number, obtaining the sequence we are used to (January, February, March, …) instead of the default alphabetical sorting order (April, August, …). Generally speaking, the month number is the *sort key*, whereas the month name is the *sorted column*.

In order to use a column as the sort key for a sorted column, there is a strict requirement: for each value of the sorted column, there must be one and only one value of the sort key. In other words, the relationship between the sort key and the sorted column must be one-to-one. Therefore, we need 12 month numbers to sort the month names. However, if the month name column includes the year (such as “January 2007”), then we need 12 values for each year in the *Date* table.

In most scenarios, this restriction is not an issue. If you think about the months, March always comes after February. Therefore, no matter how you filter, slice and dice your data, the order between February and March is written in stone.

However, there are scenarios where the sort order depends on the selection and the same value can be before or after other values. These scenarios are trickier to solve. To be honest, they are also fun problems to work out! Let’s see an example.

We created a simple model by downloading the USA 2010 census information. Then we restricted the data to only the counties and removed all the small counties – counties under 45,000 residents. The reason for this massaging of the data is purely aesthetic: we wanted less data and we wanted there to be duplicates. Based on this dataset with one single table, we created the following report which isolates three states.

There are three different “Washington County” in three different states. We chose states with a small number of counties just to reduce the report size. States and counties are sorted alphabetically, with the default settings.

If you were to choose a state with more counties, there are a lot of options. It would be great if the report were able to show the most relevant ones first, which can be achieved by sorting the counties by number of residents. This way, the larger counties are shown first.

We can sort the full matrix by *Population*, but doing so involves some limitations. This is shown in the picture below. On the left, the original matrix; on the right, we have the same matrix sorted by *Population* in descending order.

There are at least two problems. First, we wanted to sort the counties, but we ended up sorting the states too. Second, the *Population* measure must be present in the matrix. You cannot sort a matrix by a column that is not part of the report. For example, if we want to keep the same sorting order in a slicer, then we cannot rely on Power BI to sort the visual by a measure.

It would be better to use a sort key for the *County* column so that its sorting order depends on *Population*. Unfortunately, it turns out that the same “Washington County” occupies different positions in different states: it comes fifth in Utah, third in Rhode Island, and third again in Vermont. There are 15 different Washington counties in our sample dataset, and we isolated only three states for the sake of simplicity; their sort order can certainly differ in every state. Moreover, there are likely multiple counties with exactly the same number of residents. Therefore, *Population* cannot be used directly as the sort key, because it might be non-unique.

We have two problems to solve. First, sorting the same value in the *County* column with different sort orders depending on the state. Second, using a sort key that – though based on *Population* – cannot be just the *Population* itself because of possible duplicated values.

The final goal is a model that makes it possible to create a report like the following.

The slicer shows counties sorted by *Population*, even though *Population* is not part of the slicer itself. In the matrix, states are sorted alphabetically whereas counties are sorted by *Population*. If you look carefully, you should notice that *Population* does not sort the matrix, which uses the default sorting.

Let us solve the first problem: we want a sort key that does not contain any duplicated value. By using a composite ranking of the population and the county key, we obtain a number that starts with 1 for the most populated county and increases by 1 for each county. In case there are ties in the population, the second key (the county key) ensures the result is unique. Because the RANKX function only accepts one expression for the ranking, we multiply the population by a factor large enough to leave room for the largest key:

```dax
Sort Key =
VAR NumberOfRows = COUNTROWS ( Census )
VAR CountyPopulation =
    SUMMARIZE ( Census, Census[Key], Census[Population] )
VAR Result =
    RANKX (
        CountyPopulation,
        Census[Population] * NumberOfRows + Census[Key]
    )
RETURN Result
```

The *Sort Key* calculated column has a unique value for each county, which increases according to the corresponding *Population* value. Therefore, the report always sorts the counties by *Population* regardless of the data being sliced by *State* or not.
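The composite-key trick can be sketched in Python with made-up data (three hypothetical counties, not the census model): rank by population, using a unique key as a tie-breaker so the resulting rank is itself unique.

```python
# Illustrative data only: two counties tie on population.
counties = [
    {"key": 1, "name": "Washington County", "population": 500},
    {"key": 2, "name": "Adams County",      "population": 500},  # tie
    {"key": 3, "name": "Lincoln County",    "population": 900},
]

n = len(counties)

# Multiplying the population by the row count leaves room for the key as a
# tie-breaker, mirroring Census[Population] * NumberOfRows + Census[Key].
def composite(county):
    return county["population"] * n + county["key"]

ranked = sorted(counties, key=composite, reverse=True)
sort_keys = {c["name"]: rank for rank, c in enumerate(ranked, start=1)}
print(sort_keys)
# Lincoln ranks first; the tie between Washington and Adams is broken by the key.
```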

Once we have a sort key, we still cannot use the Sort by Column feature of the data model; indeed, several counties have the same name, like “Washington County”. We must make all the *County* names unique without changing how they are displayed. The Handling customers with the same name article describes how to accomplish this goal by adding zero-width spaces to the county name; this generates different county names for different states, even though the displayed county name is always the same – like “Washington County” in our example:

```dax
Name Unique =
VAR PlaceInSameState =
    CALCULATETABLE (
        SUMMARIZE ( Census, Census[Name], Census[Population] ),
        ALLEXCEPT ( Census, Census[Name] )
    )
VAR Ranking =
    RANKX ( PlaceInSameState, Census[Population] )
VAR Result =
    Census[Name] & REPT ( UNICHAR ( 8204 ), Ranking - 1 )
RETURN Result
```

At this point, we can sort the *Census[Name Unique]* column by using the *Census[Sort Key]* column: both columns have corresponding unique values in a one-to-one relationship.
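The invisible-suffix idea can be sketched in Python. Note one simplification: the DAX column derives the number of suffix characters from a population ranking, whereas this sketch simply numbers duplicates in order of appearance.

```python
# Append N-1 zero-width non-joiners (U+200C, the character UNICHAR(8204)
# returns in DAX) to the Nth duplicate: the strings become unique while
# rendering identically on screen.
ZWNJ = "\u200c"

def make_unique(names):
    seen = {}
    result = []
    for name in names:
        count = seen.get(name, 0)
        result.append(name + ZWNJ * count)  # 0, 1, 2, ... invisible characters
        seen[name] = count + 1
    return result

names = ["Washington County", "Washington County", "Washington County"]
unique = make_unique(names)

print(len(set(unique)))                   # 3 distinct strings...
print({n.rstrip(ZWNJ) for n in unique})   # ...all displayed as "Washington County"
```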

The technique described in this article is compelling. Sorting columns the right way changes the usability of a model dramatically, letting users focus on the values that are most relevant to them. You can probably think of columns in your model that could be sorted in a non-trivial way. Categories, products, customers: all these entities could be sorted using values computed in columns, ranked, and then applied to the entity itself. We are not saying that the default alphabetical sorting is wrong. Just that sometimes you can turn your model into a more usable one by using more specific sorting techniques.

I recently implemented Power BI models by extracting data from Azure Data Lake Storage (ADLS) Gen 2, and I wanted to share a lesson learned when using this connector.

In this project, I had tens of thousands of CSV files stored in the data lake and organized in different folders according to the Folder Path.

The user interface of Power Query provides the user with two options:

- File System View
- CDM Folder View

The CDM Folder View is not useful for plain CSV files. By providing the URL of the storage and choosing the File System View, you get access to all the files stored in the data lake, regardless of their hierarchical structure. The interesting thing is that when you provide a path, you get the list of all the files included in any subfolder. The list is flat; the hierarchical folder is simply an attribute (Folder Path) of each file in the result.

With Power Query you can apply filters to the list obtained through the File System View option, thus restricting the access to only those files (or the single file) you are interested in. However, there is no query folding for this filter: every time you refresh the data source, Power Query reads the list of all the files, and the filters on the folder path and the file name in the M query are applied to this list only client-side. This is very expensive, because the entire list is also downloaded when the expression is initially evaluated, just to get the structure of the result of the transformation.

A better way to manage the process is to specify the complete folder path in the URL, traversing the hierarchy so as to get only the files relevant to the transformation – or the exact path of a single file, if you know it. For example, the data lake I used had one file for each day, stored in a folder structure organized as yyyy/mm, so every folder holds up to 31 files (one month).

By specifying the folder for April 2020 (2020/04), I get only the 30 files in the list – instead of the tens of thousands obtained when specifying the root folder for the data lake.

In my project, I modified the transformation of a single CSV file into an M function, so the Sample File transformation was extracting a single file from the data lake; it did so by applying a filter to the list of thousands of file names read from the data lake.

This was the initial code to extract the 15.csv file from a certain folder of the data lake:

let
    Source = AzureStorage.DataLake ( "https://<storagename>.dfs.core.windows.net/" ),
    #"Filtered Rows" = Table.SelectRows (
        Source,
        each ( [Folder Path] = "https://<storagename>.dfs.core.windows.net/weblogs/2020/04/" )
            and ( [Name] = "15.csv" )
    ),
    Navigation1 = #"Filtered Rows"{0}[Content]
in
    Navigation1

This is an improved version of the code, where the filter over the folder has been moved to the initial URL:

let
    Source = AzureStorage.DataLake (
        "https://<storagename>.dfs.core.windows.net/weblogs/2020/04/"
    ),
    #"Filtered Rows" = Table.SelectRows ( Source, each ( [Name] = "15.csv" ) ),
    Navigation1 = #"Filtered Rows"{0}[Content]
in
    Navigation1

This way, we are still reading all the files of April 2020 in the data lake, corresponding to the final 2020/04 branch of the URL. This step alone solved the performance issue – but if you need a single file and you know its name, there is a better way to do that. In the connection string, specify the complete path including the filename.

This way, the list only has one file.

In my code, I specified the complete path including the filename as the argument of *AzureStorage.DataLake*:

let
    Source = AzureStorage.DataLake (
        "https://<storagename>.dfs.core.windows.net/weblogs/2020/05/01.csv"
    ),
    CsvContent = Source[Content]{0}
in
    CsvContent

The performance difference is huge, especially when you use this transformation to create an M function in Power Query. The same code must be executed for every file, so reading three years of data requires calling the same function over 1,000 times. Reading a list of one file 1,000 times is much better than reading a list of thousands of files 1,000 times!

The lesson is that when using the *AzureStorage.DataLake* connector in Power Query, you should always provide a path that restricts the files returned – or that points to a single file, if you know the filename.

This article explains how to show different customers with the same name in a Power BI report by using zero-width spaces, thus simplifying the presentation without adding visible characters to make the names unique.

Sometimes, we deal with cases where two or more entities of the same type have the same name. If we take the example of customers, the data model stores both the customer code and the customer name to correctly identify each customer. The customer code is what identifies each of the two customers with the same name as being unique.

The Contoso sample database contains many customers with the same name. For example, look at the following table: there are two customers named Jordan Allen. They have the same name but a different customer code. Indeed, they live in different states and their customer history is likely to be very different.

If we just leave the customer name in the table and remove all the columns that make them different, Power BI merges the two customers, showing misleading figures. Looking at the following matrix, the user may incorrectly conclude that the customer Jordan Allen purchased 810.94 USD worth of products. However, this never happened because that number is the sum of the purchases of two different customers with the same name.

The query generated by Power BI groups by customer name. If two customers share the same name, then they are shown as a single customer.

One option to solve the scenario is to append the customer code to the customer name, to guarantee that customer names are always unique. Using the customer code is an effective technique, but the result is unpleasant to the eye because we pollute the name column with the customer code.

There is another technique that we can use to make the names unique, while keeping the visualization of the name the same. We can use the zero-width space, which is a very special kind of space character in the Unicode set.

Zero-width spaces are special characters that affect how words are printed in some writing systems. In this article, we are not interested in their typographic usage, but rather in their unique characteristic of being invisible to the eye. You can learn more at https://en.wikipedia.org/wiki/Whitespace_character. Among the many zero-width characters, we chose the zero-width non-joiner, Unicode character 8204.

The idea is the following: if we add a zero-width space at the end of the second Jordan Allen, it looks the same as the first Jordan Allen, but it is different because the name contains the hidden space at the end. What if we have more than two customers with the same name? Well, DAX is there for us. We can author a new calculated column for the customer name, where we append to the original name a suitable number of zero-width spaces to guarantee that – no matter what – the customer name is unique.

To create the column, we must identify the customers with the same name, rank them using the customer code, and then add spaces: zero spaces to the first customer, one space to the second customer, two spaces to the third customer, and so on. In general, once we have the ranking position of the customer, we add as many zero-width spaces as its ranking position minus one. The *Name Unique* calculated column implements this technique:

Name Unique =
VAR CustomersWithSameName =
    CALCULATETABLE (
        SUMMARIZE ( Customer, Customer[Customer Code], Customer[Name] ),
        ALLEXCEPT ( Customer, Customer[Name] )
    )
VAR Ranking =
    RANKX ( CustomersWithSameName, Customer[Customer Code],, ASC, DENSE )
VAR Blanks =
    REPT ( UNICHAR ( 8204 ), Ranking - 1 )
VAR Result =
    Customer[Name] & Blanks
RETURN
    Result

To human eyes, the result in the *Name Unique* column is identical to the original *Name* column. However, DAX sees the result differently because there are zero-width spaces at the end of the name of different customers with the same name. Indeed, the *Name Unique* column in the matrix produces the visualization below.

The highlighted customers with the same name are displayed in different rows, despite having the same name. The zero-width space appended to the customer name does not increase the visual length of the name itself. Indeed, the Contoso database contains 385 customers with no name. If we appended a regular space, the length of the last of those empty customers’ names would be enormous: 384 spaces. By using a zero-width space, the name still looks empty, even though the longest one contains 384 characters.
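A quick Python sketch shows why the technique works (the names are invented, and the ranking here follows encounter order rather than the customer code used in the DAX column): the suffixed names compare as different strings while remaining visually identical.

```python
# The zero-width non-joiner (U+200C) is the same code point returned by
# UNICHAR ( 8204 ) in DAX: invisible on screen, yet part of the string.
ZWNJ = chr(8204)

def make_unique(names):
    """Append (rank - 1) zero-width non-joiners to duplicated names."""
    seen = {}
    result = []
    for name in names:
        rank = seen.get(name, 0)           # occurrences met so far
        seen[name] = rank + 1
        result.append(name + ZWNJ * rank)  # first occurrence: no suffix
    return result

unique = make_unique(["Jordan Allen", "Jordan Allen", "Kayla Garcia"])
# unique[0] and unique[1] print identically, yet compare as different:
# unique[0] == unique[1]           -> False
# len(unique[1]) - len(unique[0])  -> 1 (one hidden character)
```

A grouping engine – like the one behind a Power BI matrix – now sees two distinct keys, so the two customers land in two different rows.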

To simplify the user’s browsing experience, it may be worth showing the *Name Unique* column, hiding the original *Name* column, and renaming them so that the user sees only a *Name* column containing the unique names. This way, the expected behavior of separating customers with the same name is the default behavior. That said, the original *Customer[Name]* column is still useful. For example, the original *Customer[Name]* column is necessary to check whether a customer name is empty or not. The *Customer[Name Unique]* column never contains an empty string, whereas *Customer[Name]* might contain an empty string.

This technique works just fine with most local cultures and languages. Be mindful that the zero-width space has a specific meaning in some languages. Therefore, it is worth checking that you can safely add a zero-width space at the end of a name in your specific culture without changing its visualization. However, if you can afford to use it, this technique opens up some exciting opportunities in Power BI that are not limited to just separating the names in a column. For example, this technique is useful to implement advanced sorting techniques in hierarchies, as we will show in a future article.

DAX Studio is the best tool to analyze the performance of DAX queries. In this webinar recorded for powerbi.tips, Marco Russo shows how to collect metrics in DAX Studio and investigate the storage engine and formula engine costs of some sample queries.

In this article, we share an idea to reduce the friction in using Power BI, by extending the DAX language with new visual calculations.

At SQLBI, we teach DAX. We have been teaching DAX to thousands of people all over the world. Both in-person and online, we have met an extremely diverse crowd of students: different cultures, ages, backgrounds, experiences, and requirements. If there is a single statement all our students would agree on, it is the following: learning DAX is much harder than one would think. Moreover, we think anybody – even non-SQLBI students – would agree on this: DAX is not simple. We feel the same way! It took both of us years of hard work and study before we felt we had reached proficiency in DAX.

This article is not a rant about DAX. At SQLBI, we love DAX. Instead, the topic of this article is an analysis of the reasons why DAX is hard to learn; this article searches for ways to make DAX easier. We conclude with a proposal: adding visual calculations to the language would greatly reduce the friction in using Power BI. Be prepared: this is a long article. Not hard to read, but quite long. There are many topics to cover, and we hope the article is just the beginning of an interesting conversation with our readers and with the Microsoft engineers.

Before diving into the reasons why DAX is hard, let us see if that applies to any programming language. If all programming languages were hard and DAX was no exception, there would be nothing to discover. Being hard would just be a standard.

It turns out that programming can be easy. Think about the Excel formula language: everybody uses Excel. Not everybody is a master of Excel, and yet they use the language at whatever level of complexity is required for their specific needs. You start using Excel with simple math, and then you discover ranges; maybe you delve into tables, arrays, and CSE formulas. Anyway, you do not need to master all the details of the language to build gorgeous reports. You start learning, and you stop when your needs are met. At every step on your Excel learning curve, you feel powerful; the sense of satisfaction for each of your small victories is huge.

Excel is somewhat unique in this sense. It relies on an extremely powerful language, which is also surprisingly easy to use. In our opinion, the secret ingredient of Excel is that the language is *visual*. In Excel, there is a direct connection between what you see on the screen and the result you want to obtain. Do you want to sum two values? Click on the first value, hit plus, click on the second value, job done. Throughout the entire process, you see what you are doing. If the number is wrong, you immediately see the result.

Let us elaborate on this with an example. Imagine you have some data in an Excel workbook, and you want to compute a sales running total. The solution is straightforward: the value of the previous row plus the current row.

Do you see how visual Excel is? Using colors, it drives your eyes easily to the cells referenced in code. The result is satisfying, to say the least: you compute a running total by knowing only how to sum two cells, and how to copy & paste to repeat the formula in all the rows where you want the result to appear.

Visual programming goes one step further than that. Not only do you see the result immediately, based on numbers visible on the screen; the code for the cell is visible in the same place too. You do not need to author the formula in a different environment: formulas, parameters, and results are all found within the same worksheet.

If you want to achieve the same result in Power BI, things are quite different. Frustrating, to put it mildly. Let us analyze what you need to do:

- First, you need to know that you want to write a measure. This is not trivial: if you want a result in a Power BI report, you need a measure.
- Second, you need to know your model. In our case, we have a single table only, with four rows. In the real world, things are very different. The table contains columns and values that might be different, in name and content, than what you see on the screen.
- Third, you need to figure out an algorithm that works the same way on all the rows. A single DAX expression is evaluated in every cell of the report. In our Excel example, the first cell (D3) has a specific formula: being the first, its value is just the current value. You sum the previous value with the current value starting from the second row. In Excel, two cells in the same table can thus have different formulas, whereas Power BI does not offer this feature: all the cells are controlled by the same formula.

Let us say that you come up with the correct algorithm, either on your own or by searching on the Web; you find out that the formula is, “sum the values of sales that are found before or equal to the current date”.
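Expressed outside DAX, that sentence is already an algorithm. The following Python sketch (with an invented three-row Sales table) makes concrete the property that trips up Excel users: one and the same expression is evaluated independently for every row of the result, with no special case for the first row.

```python
# Running total: for each date, sum all sales dated on or before it.
# ISO date strings compare correctly as plain text.
sales = [                      # stand-in for the Sales table
    ("2020-01-01", 100),
    ("2020-01-02", 200),
    ("2020-01-03", 50),
]

def running_total(current_date):
    # "sum the values of sales that are found before or equal to the current date"
    return sum(amount for date, amount in sales if date <= current_date)

# The same expression runs for every row of the report.
result = {date: running_total(date) for date, _ in sales}
# result -> {"2020-01-01": 100, "2020-01-02": 300, "2020-01-03": 350}
```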

The last step is the hardest: how do you write the code? Here is a possible answer:

Running Total :=
VAR CurrentDate =
    MAX ( Sales[Date] )
VAR PreviousSales =
    FILTER ( ALL ( Sales ), Sales[Date] <= CurrentDate )
VAR Result =
    SUMX ( PreviousSales, Sales[Amount] )
RETURN
    Result

And here is what you need to know, to author it:

- You need MAX in order to find the current date, because you want the last date visible in the current filter context; the latter happens to be a filter applied to the Sales table by the visual when you are slicing by date.
- You need ALL over Sales, in order to get rid of the filter context. It would otherwise limit the number of rows visible in Sales.
- You need FILTER in order to restrict the result of ALL to the sales that took place before the current date.
- You need SUMX in order to iterate over the result of FILTER; this in turn retrieves the value of Amount in the row context of the iteration introduced by SUMX.

None of these concepts is intuitive. Most newbies fail in authoring a simple running total because they are not familiar with things like row context, filter context, iteration. It does not matter how easy it is for an experienced DAX developer to author this formula. In the self-service BI world, anybody should be able to author a running total easily. No PhD required, just some googling and a hint of common sense.

There is more. The entities that you manipulate in Excel are cells. Yes, you can work with arrays, tables, and more advanced structures. But by just referencing cells and ranges, you can already leverage tremendous analytical power. In Power BI, there are no cells. There are tables, columns, relationships, and measures. You do not see these entities in your report. Tables and relationships are the foundation of the model you are working on – they are mostly invisible in a report. In a report, you see the result of calculations and aggregations. Tables are in the background, they are not what you see. Still, measures work on tables and columns.

Bear in mind that we are not preaching that Excel is better than Power BI. We are not comparing them at all. All we are doing is trying to understand what makes Excel so much easier than Power BI, because we want to steal the secrets of Excel to make Power BI a better tool. Being visual, in our opinion, is the most powerful of those secret ingredients of Excel.

Simplicity in Excel comes with limitations. The very fact that you can mix cells with different formulas in the same table comes at a price. Indeed, Excel presents severe limitations when it comes to the number of rows allowed in a table. Theoretically, you can reach a limit of one million rows in a worksheet. In practice, as soon as you reach a few hundred thousand rows, performance becomes so bad that you start rethinking your report. Power BI, on the other hand, handles tens of millions of rows in the blink of an eye. Billions are also a viable option – not scary at all when using Power BI Premium.

Similarly, Excel is super-easy as long as you work with a single table. If you need to work on a model containing multiple tables, you end up relying on VLOOKUP – again with a huge performance hit. Power BI, on the other hand, handles models with multiple tables seamlessly.

Therefore, we can conclude that programming need not be hard. DAX can be made simpler by adding some visual elements to it. That said, making DAX more visual should not disrupt its performance. After all, we do not want to go back to Excel; we want to improve DAX.

The reason DAX is not visual like Excel lies deep in the roots of the tool. Power BI is the result of merging multiple tools into a single product. Over time, Microsoft developed Analysis Services Tabular (AS), Power View, and Power Query. Merging these three products was the genesis of Power BI. Although the old tools were merged into a single product, they remain different tools based on different technologies:

- AS is the engine that stores your data: it is based on tables, relationships, measures, and columns. It is an extremely powerful database that provides Power BI with its unprecedented power and speed.
- Power Query is the ETL tool introduced in Power Pivot first. It has a great ETL language, and it can mashup data from virtually any source. It provides Power BI with its connectivity to external data sources.
- Power View was originally introduced in Excel and SharePoint as the next data visualization tool. Only when it was moved into Power BI did Power View really spread its wings and start to fly, at the heights we see now. Power View lost its name and identity – nobody talks about Power View anymore – but it gained the unconditional love of Power BI users.

Now the thing is that these three products are integrated together in Power BI, but they are not a single tool: they are still three souls in one body. For sure, they communicate together. Power Query loads from external sources and provides data to AS, which stores it in its database. Power View queries AS to produce visuals. AS is the core of Power BI: it sits in the middle between the data sources and the final visualization. AS is also responsible for any calculations over the data. This is the reason why measures are stored in AS.

Although AS is the core, you do not interact with AS. What you see in the end is Power View, the tool that produces the visuals in your reports and dashboards. These visuals are generated by Power View out of queries executed by AS. AS has no clue whether the result of a query lands in a matrix, a line chart or a slicer. AS receives a query, computes it, and returns the result. At the cost of repeating ourselves: AS is responsible for calculations, Power View for the visualization.

In AS, calculations occur through measures. Measures are entities stored and executed by AS. AS speaks tables and relationships. AS does not even know about the existence of visuals. This is the reason why a measure cannot reference values in a visual. A measure lives in a different world: the world of AS. In other words, there is no way a measure can reference something like “the previous row in a matrix”, or “the subtotal at the year level”. These are visual representations of values, but AS is not visual at all.

So far, we have seen that in order for a language to be easy, it needs to be visual. Moreover, we also saw that DAX, Power BI’s programming language, cannot be visual. Should we conclude that DAX will never be a visual language – therefore we need to live with its complexities? Well, the purpose of this article is quite the opposite: we would like to build extensions to the DAX language to get the best of both worlds: a language that can speak both “tables and relationships” and “visuals, rows and columns”. The extension we would like to discuss is *Visual Calculations*. Let us elaborate on this.

The data in each visual is populated by sending a query to the underlying model in AS. The result of this query is a table containing the most detailed data gathered to produce the visual. Let us call this table *ReportData*.

You can think of ReportData as the result of exporting the data of a matrix in Power BI to a .CSV file, for later massaging with Excel.

ReportData in and of itself is just a name for the result of a query. What if we extend DAX to add calculations on top of this table? For example, the percentage against the grand total could be expressed this way:

% Total :=
DIVIDE (
    ReportData[Sales Amount],
    TOTAL ( ReportData[Sales Amount] )
)

There is no TOTAL function in DAX. Think about it as a new function that returns the total of the table resulting from the query that the matrix sent to the model.

The same calculation in DAX would be:

% Total :=
DIVIDE (
    SUM ( Sales[Sales Amount] ),
    CALCULATE (
        SUM ( Sales[Sales Amount] ),
        ALLSELECTED ()
    )
)

If the new functionality were restricted to only totals, it would not be very useful. What about the percentage over the category total? We bet many DAX developers would have to think for a while before coming up with the correct answer:

% CategoryTotal :=
DIVIDE (
    SUM ( Sales[Sales Amount] ),
    CALCULATE (
        SUM ( Sales[Sales Amount] ),
        ALLSELECTED (),
        VALUES ( Product[Category] )
    )
)

Whereas by working with ReportData, the calculation could be as easy as using a hypothetical PARENTTOTAL function:

% Category :=
DIVIDE (
    ReportData[Sales Amount],
    PARENTTOTAL ( ReportData[Sales Amount] )
)

Before we move on, you can notice that the new syntax we are suggesting works regardless of the columns used in the report. Indeed, by using the content of ReportData and a higher-level function like PARENTTOTAL, it would work on the report – returning the percentage against the parent total no matter which columns are being used for slicing and dicing. The DAX code, on the other hand, works in a report that is slicing by category; it would fail if you sliced by other columns.
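PARENTTOTAL is hypothetical, but its intended semantics are easy to pin down. This Python sketch – using an invented ReportData table with categories and subcategories – computes the same percentage over the parent total, independently of which columns produced the rows:

```python
# Sketch of a hypothetical PARENTTOTAL over ReportData: the total of the
# measure for all rows sharing the same parent (here, the category).
report_data = [  # (category, subcategory, sales_amount)
    ("Audio", "Headphones", 100.0),
    ("Audio", "Speakers",   300.0),
    ("TV",    "LCD",        600.0),
]

def parent_total(category):
    return sum(amt for cat, _, amt in report_data if cat == category)

pct = [(sub, amt / parent_total(cat)) for cat, sub, amt in report_data]
# pct -> [("Headphones", 0.25), ("Speakers", 0.75), ("LCD", 1.0)]
```

Notice that the sketch never mentions the underlying Sales table or the Product[Category] column by name: it only navigates the rows of the query result, which is exactly the point of visual calculations.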

What about the running total we were showing at the beginning of the article? In this visual language, it could look something like:

Running Total :=
ReportData[Sales] + PREVIOUS ( ReportData[Running Total] )

It looks like a recursive definition. It is not. Using PREVIOUS, you could reference the previous row in ReportData. The engine can be smart enough to implement it using a windowing function, or an iteration.
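A minimal Python sketch of the windowing interpretation – an invented three-row ReportData, scanned once while carrying the previous result forward – shows that no recursion is involved:

```python
# PREVIOUS implemented as a single forward pass (a window), not recursion.
report_sales = [100, 200, 50]   # ReportData[Sales], in visual order

running = []
previous = 0                    # stands for the blank PREVIOUS on the first row
for value in report_sales:
    current = value + previous  # ReportData[Sales] + PREVIOUS ( Running Total )
    running.append(current)
    previous = current          # the window slides forward by one row
# running -> [100, 300, 350]
```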

Are we suggesting a new programming language? Not at all. The idea is to extend DAX, not to replace it. This feature could be implemented as part of DAX, with a user interface on top of it to make it simpler. What follows is neither DAX nor a new project being developed, it is just an example of what such DAX extension might look like:

EVALUATE
ADDVISUALCOLUMNS (
    SUMMARIZECOLUMNS (
        'Product'[Category],
        'Product'[Subcategory],
        "Amt", [Sales Amount]
    ),
    "Pct", DIVIDE ( [Amt], PARENTTOTAL ( [Amt] ) )
)

The ADDVISUALCOLUMNS function processes the result of SUMMARIZECOLUMNS by adding new columns. In the new column definitions, you may access specialized functions to navigate the table returned by the inner SUMMARIZECOLUMNS.

The advantage of this approach is that the calculations in ADDVISUALCOLUMNS reference values that are directly shown on the screen in a Power BI visual. Remember: any number you see in a visual is returned by a SUMMARIZECOLUMNS query executed on AS. The visual knows the names of these columns and their content. This extension of the language would let the visual know which value you are clicking on when you select any cell of the visual.

DAX would not become totally visual. It would still be structured with the usual tables and relationships. Yet the existence of ReportData tables makes certain calculations possible, without the names of the tables and columns in the model being known: you only need to point and click on a cell in the visual. Like you would do in Excel.

Obviously, we do not want all users to learn this syntax. A good user interface in Power BI could let users define the set of visual calculations in a simpler way, with a code generator that implements the more intricate DAX syntax.

The goal of visual calculations is not to replace DAX. DAX is still there for power users and model builders. More casual users can experience great satisfaction by building non-trivial calculations in a visual way. If and when they are ready to move to the next level, the full AS DAX is waiting for them.

On the other hand, developers and more experienced users could use this new feature as an enhancement of DAX to author queries and/or complex calculations. Indeed, ADDVISUALCOLUMNS would be extremely handy for everyone. Just think, for example, that you could use this feature to precompute a table and then perform further calculations on the resulting table, reducing the need to compute certain values again and again – better usage of the cache, better performance.

Of course, it is just an idea. A lot of work needs to be done in defining the features of visual calculations. For instance, we need a definition of a set of features. It should include the navigation through the table – previous row, next row, running sums – and maybe some time intelligence logic because time intelligence is almost everywhere. Do we need ADDVISUALROWS, in addition to ADDVISUALCOLUMNS? We do not have an answer yet; it is something to think about.

Moreover, the result of any query in DAX consists of a table with many rows. Power BI visuals oftentimes pivot their data, showing for example the year on columns instead of rows. Do we need to be able to refer to the previous column, too? In some way, the goal of this feature is to express calculations using the entities seen in the visual. Therefore, the concepts of row, column, and subtotal should be present in the navigation functions.

For example, should subtotal rows be present in the ReportData table or not? For non-additive measures, the presence of sub-totals would be needed, along with more complexity in the language. Avoiding them might prevent some calculations, keeping the language simpler.

We could go on and show further ideas on this feature for days, but the goal of this post is not to define a new feature. Instead, we want to show the generic idea and – most importantly – ask for your help.

We published a new idea on ideas.powerbi.com (DAX calculations at visual level of reports), referencing this article and providing a brief description of the idea described here. There are at least two ways you can help:

- Use the comments on this article to express your opinion. What kind of functions do you think would be useful to have for the visual calculation? Does the idea sound interesting? Do you think you or your colleagues may benefit from visual calculations?
- Vote for the idea if you like it, and spread the word so that – if it reaches a good number of votes – Microsoft engineers might be tempted to actively think about visual calculations and finally develop the feature.

Finally, if you honestly believe that this idea would only mess up Power BI by adding a useless layer of complexity, then say that. We are truly interested in unbiased opinions from you. Together, we can better understand the implications of visual calculations and – hopefully – apply pressure to bring it to life in the best way.

Starting from the May 2020 version of Power BI Desktop, regardless of the Windows locale settings DAX always uses standard separators by default. This change does not affect most Power BI users around the world; but if you use the comma as a decimal separator and 10,000 is just ten and not ten thousand, then you are affected by this change.

First of all, you can restore the previous behavior as I describe in this post, but the recommendation and the default are now to use the standard DAX separators. I want to describe why this (I think good) choice was made and how I contributed to making this happen.

If you used Power BI Desktop before the May 2020 version and your Windows locale settings use list and decimal separators other than comma (,) and period (.), you always had a different user experience when writing DAX code. In all examples and articles, you have been seeing code like this:

Taxes := SUMX ( Sales, Sales[Quantity] * Sales[Net Price] ) * 0.10

In your Power BI Desktop, this code should have been written as:

Taxes := SUMX ( Sales; Sales[Quantity] * Sales[Net Price] ) * 0,10

Over time, Power BI Desktop inherited from Excel this idea of customizing separators in a formula. But Excel goes beyond that and translates the function names. While this could be a good idea for the user interface, when applied to a language it has the negative side effect of making the code samples available on the Internet harder to use with various languages.

Luckily, Power BI never went so far and only limited the translation to the separators. The more common situation is having the semicolon (;) as a list separator and the comma (,) as a decimal separator. However, having a different presentation of the DAX formulas in the user interface creates several issues:

- Internally, DAX is not translated according to locale settings. The engine uses the period character (.) as a decimal separator and the comma (,) as a list separator. The PBIX file always uses this DAX syntax. Therefore, the Power BI user interface must translate the expressions back and forth by changing these separators.
- There could be bugs in this conversion. I have seen several of these bugs over the years, and some of them were never fixed because of the ambiguity of certain expressions when using the comma as a decimal point separator.
- The copy & paste of an expression from a web page is often a pain when there is a different separator convention at play. This “conversion tax” does not help people be more productive.
- Tools like DAX Studio and DAX Formatter had to deploy an additional effort to support options for separators. The time spent developing and maintaining these features could have been better spent on more productive features.
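To see why this conversion is fragile, consider a deliberately naive sketch in Python: a plain character swap from localized to standard separators works on simple formulas, but would also rewrite separators inside string literals – which is why the real conversion needs a full parser and why bugs crept in over the years.

```python
# Naive localized -> standard conversion: ';' is the localized list
# separator, ',' the localized decimal separator. A real implementation
# must parse the formula; this blind swap breaks on text like "1,5".
def to_standard(localized_dax):
    return localized_dax.replace(",", ".").replace(";", ",")

standard = to_standard("SUMX ( Sales; Sales[Quantity] * Sales[Net Price] ) * 0,10")
# standard -> 'SUMX ( Sales, Sales[Quantity] * Sales[Net Price] ) * 0.10'
```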

In October 2019, I suggested implementing an option to choose separator characters in Power BI Desktop regardless of the Windows locale settings.

As part of the Power BI Contributor program, I have been able to work on this feature from December 2019 to April 2020. While the feature did not require months of development, the process of defining requirements, getting approval, implementing the code, and testing use cases requires coordination with many people – and this takes time.

Starting from May 2020, by default Power BI Desktop uses what I call “the standard separators” and ignores the Windows locale settings. We finally have one DAX syntax. You can copy and paste a code sample without having to worry about separators.

In order to minimize the impact, the following message appears the first time you edit DAX code in Power BI.

To be more accurate, this message dialog box is shown when:

- You enter the DAX formula bar to edit a measure, a calculated column, or a calculated table, **AND**
- the separators used by the Windows locale settings are different than the “standard” ones for DAX (comma as a list separator, period as a decimal separator), **AND**
- it is the first time you enter the DAX formula bar, **AND**
- you have never clicked the “Dismiss” button in the message dialog box.

So if you close the window by clicking on the close button, the same message appears again when you edit another DAX formula. Once you click “Dismiss”, the message is no longer displayed when you edit a DAX formula.

At this point, you should be happy. Really, you should!

Now, what if you prefer to keep your old habits? What if you just need to help Power BI users who do not like the new behavior, and you cannot convince them that it is right? You can restore the previous behavior of Power BI, which respects the localized separators. The Regional Settings dialog box shows the DAX separators section you see below, if and only if the Windows locale settings provide a different choice of list and decimal separators other than the standard ones used by DAX.

You cannot see the “DAX separators section” if your Windows locale settings are set to “English US”.

DAX Formatter and DAX Studio will continue to support the conversion of separators. However, going forward the feature should be taken as is – no more bug fixes, no more improvements, more time available for other features in the backlog!

This article shows how to use the USERELATIONSHIP function in DAX to change the active relationship in a CALCULATE function.

If two tables are linked by more than one relationship, you can decide which relationship to activate by using USERELATIONSHIP. Indeed, you can only have one active relationship between any two tables. For example, you start with a model where *Sales* is related to *Date* through the *Date* column, like in the following picture.

If you create a second relationship, based on the *Delivery Date* column, the second relationship is an inactive relationship.

Inactive relationships do not propagate filters by default. Consequently, if you filter by *Year*, the report shows the sales whose *Order Date* belongs to the filtered year. *Delivery Date* is not used as part of the default filtering. In order to filter *Sales* based on *Delivery Date*, you can temporarily activate a relationship using USERELATIONSHIP.

USERELATIONSHIP is a CALCULATE modifier, which instructs CALCULATE to temporarily activate the relationship. When CALCULATE has computed its result, the default relationship becomes active again.

The *Delivery Amount* measure below computes the amount delivered within a certain time period, in contrast with the ordered amount returned by the *Sales Amount* measure:

Delivery Amount :=
CALCULATE (
    [Sales Amount],
    USERELATIONSHIP ( Sales[Delivery Date], 'Date'[Date] )
)
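The effect of switching the active relationship can be sketched outside DAX. The following minimal Python sketch (using made-up sample rows, not the article’s Contoso data) lets the caller choose which date column filters the rows, which plays the role of USERELATIONSHIP:

```python
from datetime import date

# Tiny Sales table: each row has an order date and a delivery date.
# These rows are illustrative sample data, not taken from the article.
sales = [
    {"order_date": date(2019, 12, 28), "delivery_date": date(2020, 1, 3), "amount": 100},
    {"order_date": date(2020, 2, 10), "delivery_date": date(2020, 2, 14), "amount": 250},
    {"order_date": date(2020, 11, 30), "delivery_date": date(2021, 1, 2), "amount": 80},
]

def sales_amount(rows, year, date_column):
    """Sum 'amount' over the rows whose chosen date column falls in 'year'.

    Passing a different date_column mimics USERELATIONSHIP: it decides
    which of the two Sales->Date relationships filters the fact rows.
    """
    return sum(r["amount"] for r in rows if r[date_column].year == year)

# Default relationship: filter by order date.
print(sales_amount(sales, 2020, "order_date"))     # 330
# USERELATIONSHIP-style: filter by delivery date instead.
print(sales_amount(sales, 2020, "delivery_date"))  # 350
```

Same rows, same year filter: only the column used to relate *Sales* to *Date* changes, and so does the result.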

You can compare *Delivery Amount* with *Sales Amount* in the following figure.

The arguments of USERELATIONSHIP are the two columns that form the relationship. The order of the columns is not relevant, even though it is common practice to use the column of the many-side of the relationship (*Sales* in our example) first, and the one-side (*Date* in the example) second. With that said, it is just a standard convention; the opposite order does not alter the result.

The relationship must already be in the model. You cannot use USERELATIONSHIP to create a relationship on the fly. USERELATIONSHIP can only activate an existing relationship.


This article shows how to use DAX and conditional formatting together to highlight the minimum and maximum values in a matrix in Power BI.

Conditional formatting is a handy feature in Power BI; combined with DAX skills, conditional formatting can produce excellent results. In this example, we show how to highlight relevant information in a matrix by formatting the desired cells.

We look at the sales of Contoso in different countries and years. If we look for geographical areas where some brands do not sell well, we can start the investigation with a matrix like the one below.

If we were to search for specific kinds of information by just looking at the numbers, we would be on our own. The matrix itself is not highlighting the most relevant information, nor guiding our eyes where it matters. The conditional formatting feature in Power BI can improve data visualization, highlighting the higher and lower values with different background colors.

However, there are cases where we want to identify the best and worst figures in a matrix full of numbers. In other words, we want to highlight the cell with the maximum value – the best combination of continent and brand in the previous picture – and the minimum value – the worst combination of brand and continent. The following picture shows a matrix that highlights the best/worst cases.

We can obtain this result by mixing conditional formatting of the background with a measure that returns a number indicating whether the current cell is the minimum or the maximum of all the cells in the matrix. As we discuss later, we can use the same technique to highlight cells with an arbitrary business rule.

We need a *MinMax* measure that rebuilds the full result of the matrix in a temporary table. It then identifies the minimum and maximum values among all the ones computed. Finally, it checks if the current value of the measure is the minimum, the maximum, or any other value:

MinMax =
VAR Vals =
    CALCULATETABLE (
        ADDCOLUMNS (
            SUMMARIZE ( Sales, 'Product'[Brand], Store[Continent] ),
            "@SalesAmt", [Sales Amount]
        ),
        ALLSELECTED ()
    )
VAR MinValue = MINX ( Vals, [@SalesAmt] )
VAR MaxValue = MAXX ( Vals, [@SalesAmt] )
VAR CurrentValue = [Sales Amount]
VAR Result =
    SWITCH (
        TRUE (),
        CurrentValue = MinValue, 1, -- 1 for MIN
        CurrentValue = MaxValue, 2  -- 2 for MAX
    )
RETURN
    Result

The *MinMax* measure must return a number because otherwise it cannot be used in the conditional formatting panel. The *MinMax* measure returns 1 only for the cell with the minimum value, 2 for the cell with the maximum value, and blank otherwise.
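The core logic of the measure can be mirrored in a short Python sketch: aggregate by brand and continent, then flag each cell against the minimum and maximum of all the aggregated values. The figures below are hypothetical, not the Contoso numbers from the report:

```python
# Pre-aggregated (brand, continent) cells, mirroring the table that
# SUMMARIZE/ADDCOLUMNS builds in the MinMax measure. Hypothetical data.
rows = [
    ("Contoso", "Europe", 120), ("Contoso", "Asia", 340),
    ("Litware", "Europe", 95),  ("Litware", "Asia", 210),
]

def min_max_flag(cell_value, all_values):
    """Return 1 for the minimum cell, 2 for the maximum, None (blank)
    otherwise - the same contract as the MinMax measure."""
    if cell_value == min(all_values):
        return 1
    if cell_value == max(all_values):
        return 2
    return None

values = [amount for _, _, amount in rows]
flags = {(brand, cont): min_max_flag(v, values) for brand, cont, v in rows}
print(flags)
```

Only the cells holding the global minimum (95) and maximum (340) get a flag; every other cell stays blank, so the conditional formatting rule colors exactly two cells.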

We configure the conditional formatting of the matrix so that it uses the information provided by the *MinMax* measure to color the cell background with the following rule.

By using this configuration, we obtain the desired result.

We used this technique to highlight the minimum and maximum values in the matrix. In other reports, you can use the same approach to highlight cells based on different business rules. For example, you can highlight the presence of large transactions in the underlying data even though the report only shows aggregated values, or you can highlight a difference with a reference value that goes beyond a predefined threshold.

Because the measure must internally rebuild the exact values shown in the matrix, the measure used for the conditional formatting strongly depends on the matrix where it is being used. For example, if a user removes the *Continent* attribute from the columns and replaces it with the *Year* attribute, then the custom formatting rule stops working or could return unexpected results.

As you can see, simple measures like *MinMax* can make a significant difference in the usability of a report. Therefore, they are the perfect candidate for report measures, built in a specific report for the sole purpose of improving the consumption user experience.

This article describes how to optimize a slow Power BI report with a large number of card visuals, by obtaining the same graphical layout with a smaller number of visuals.

Every visual element in a Power BI report must complete a number of tasks to provide its result. Visuals showing data must generate one or more DAX queries to retrieve the required measures by applying the correct filters. The execution of these queries increases the wait for the end user and increases the workload on the server – especially when multiple users access a report at the same time. In order to improve the performance and scalability of a report, the best practice is to reduce the number of visuals consuming data on any one page of a report.

The focus is on a single page of the report. Power BI only fetches data and builds the visualizations required for the active page of a report. When the user switches their focus to a different page, the waiting time only depends on the visuals of the new page. The content of the other pages of that report is not relevant as far as performance is concerned. The goal is to reduce the number of visuals in a single page of a report. This makes it challenging to achieve the desired report layout; however, we can look for the right visualization once we realize that the number of visuals in the same page is negatively affecting the user experience.

For example, the following report shows one card visual for each box representing the sales of a brand in a different color.

The report contains 33 card visuals, one column chart, and one matrix, each populated by a different DAX query. There are another four text boxes and one image that do not come from DAX queries. The rendering time of this page executed on a very powerful desktop is 2.7 seconds. We consider this to be unbelievably bad performance, since the slowest DAX query in the report runs in 22ms. You can see the exact timing on your computer by enabling the **Performance Analyzer** pane. You then:

- Click on **Start recording**.
- Click **Refresh visuals**.
- **Sort by** Total time in **Descending** order.

By clicking on **Refresh visuals** again, we could see a different sort order; still, the overall execution time should be similar. In this example, the slowest visual in the page was the “Red – Contoso – Card” which displays 44.50K as the *Sales Amount* of red products for the Contoso brand. By expanding this node in the list provided in the Performance Analyzer pane, you can see that most of the time is spent in “Other”.

The meaning of “Other” is that this visual had to wait for other tasks to be completed before being able to execute the DAX query and render its result. Because the report contains 40 visuals, 35 of which execute at least one DAX query, some visuals must wait for other visuals to complete their task before they are able to perform any action.

You can also export this result in a JSON file using the Export link, and then analyze the events in more detail. For example, the actual cost of the DAX query execution in the Analysis Services (AS) engine is just 123ms, whereas the overall cost of the aggregated DAX query events is 539ms. The difference between these numbers is the overhead generated by 35 requests made to the AS engine. Benchmark data is available in the PerformanceOriginalCards.pbix report included in the samples files.

How can we improve the performance while keeping the same layout? We need a single visual that produces the same visual effect as many cards. For example, by using Card with States by OKVIZ we can create a grid of cards where a measure is repeated for each value of a column. This custom visual is available for free in the AppSource marketplace, and the latest private version is also available at https://okviz.com/card-with-states/ . Even though the marketplace displays the message “May require additional purchase”, all the features are free. All you need to do is register on the OKVIZ website for your free lifetime license.

We replaced each set of card visuals for one of the colors (red, black, and blue) with a single instance of Card with States. We do this by specifying the corresponding color measure (*Red Sales* for the red color) and the *Brand* column as a Category.

By setting the properties of the visuals properly, we obtain an identical rendering for our report.

The real difference is the performance obtained by reducing the number of visuals in the report. If we repeat the Refresh visuals operation in Performance Analyzer, we see that the slowest visual is refreshed in 310ms and that we now have only 5 visuals being generated by DAX queries.

Because we removed 30 visuals from the report, first we greatly reduced the workload on the AS engine, and second we improved the scalability of our report when executed by multiple concurrent users. Last but not least, the refresh of the page went down from 2.7 seconds to 0.3 seconds, saving 88% in execution time. This is a massive improvement of one order of magnitude, achieved without changing any part of the data model.

If you export that result in a JSON file using the Export link, the detailed analysis (included in PerformanceCardWithStates.pbix) shows the following:

- the actual cost of the DAX query execution in the AS engine is now 12ms (formerly 123ms),
- the overall cost of the aggregated DAX query events is 21ms (formerly 539ms).

If you look at the previous screenshots, you might notice that the rendering time of Card with States is 181ms – see Visual display – which is much more than the original card (13ms). However, the aggregated rendering time of all the visuals in the new report is now 764ms; that same operation in the original report took an aggregated rendering time of 1,331ms.

The following table compares the metrics between the original and the optimized reports.

| | Original report | Optimized report |
|---|---|---|
| Card visual | Card (default) | Card with States |
| Number of card visuals | 33 | 3 |
| Number of DAX queries | 35 | 5 |
| Execute Query (AS) event duration | 123ms | 12ms |
| Execute DAX Query event duration | 539ms | 21ms |
| Render event duration | 1,331ms | 764ms |
| User waiting time duration | 2,700ms | 310ms |

You may wonder how it is possible that the report completes the operations in 310ms – which is less than half of the total rendering time in the optimized report. Keep in mind that Power BI can parallelize the execution of different tasks. There is a limit to this parallelism, and 35 visuals cannot be executed and rendered in parallel. When you have only 5 visuals generating queries in a report, the parallelism is more effective.

This article shows that the number of visuals in a report can degrade the performance by one order of magnitude, compared to the results obtained by using a smaller number of visuals. At the time of writing, Power BI does not have a generic “small multiples” feature to automatically create multiple copies of the same visual based on the value of an attribute, such as the *Brand* attribute used in this article. The AppSource marketplace already offers several custom visuals with a specialized “small multiples” option, like Small Multiple Line Chart and Sparkline. The latest version of Card with States by OKVIZ is the first visual providing the “small multiples” option to replace multiple instances of the Card visual, which is a very common source of performance issues in Power BI.

This article describes how to use conditional formatting with a DAX expression to color the rows of a table in Power BI based on the order number instead of using the alternate rows color formatting option.

Power BI offers the capability to show tables with alternating colored rows to make it easier to read the content of a report.

You might want to use a similar alternate coloring style for a different purpose: highlighting all the lines of one same order. If you look carefully at the figure above, you can discover that there are indeed five orders visible, each with one or more products. But there is no visual indication of the rows belonging to the same order. A better format using alternate background colors would be the following.

The background color of the rows depends on *Sales[Order Number]*. The background color switches between white and light gray every time the order number changes, so all the rows of the same order have the same background color and can be easily identified. You cannot obtain this visualization by only using a Power BI style, because the coloring of a row depends on the actual data in it. You can achieve this goal by using the conditional formatting feature in Power BI. You can set the background color of a cell according to the value of a measure. Therefore, you need a DAX formula that returns two values: one for the white rows and one for the gray rows. The value returned by the measure must alternate between those two values with each consecutive order number.

A possible (but incomplete) solution would be a measure that returns 0 for even order numbers and 1 for odd order numbers. However, this formula does not guarantee that the values change for every order. If two consecutive orders in the report both have even numbers, they would be colored the same. If you look at the figures, all the order numbers are even! Nevertheless, the idea of using odd and even numbers looks promising. We only need to transform the order number into an integer that increases by one for each order, regardless of the actual order number.

The RANKX function in DAX serves exactly this purpose. We can rank the rows in the report by their order number, and then color the table in the report depending on whether the rank is odd or even. Moreover, despite its simplicity, this formula is a perfect opportunity to expand on one of the lesser-known arguments of RANKX: its third parameter.

Let us start with the code:

RankOrder =
VAR OrderNumbers =                       -- By using ALLSELECTED ( Sales[Order Number] )
    CALCULATETABLE (                     -- we would get all the order numbers
        VALUES ( Sales[Order Number] ),  -- if there were no explicit filter on the
        ALLSELECTED ( )                  -- order numbers. Therefore, we use
    )                                    -- VALUES / ALLSELECTED to reduce the number
                                         -- of orders evaluated in the ranking.
VAR CurrentOrder =
    SELECTEDVALUE ( Sales[Order Number] )
VAR Result =
    IF (
        HASONEVALUE ( Sales[Order Number] ),  -- Check that only one order is visible.
        RANKX (
            OrderNumbers,           -- Table to build the lookup values.
            Sales[Order Number],    -- Expression to use during the iteration.
            CurrentOrder,           -- Value to rank. We must use SELECTEDVALUE
            ASC                     -- to retrieve the current row in the report.
        )
    )
RETURN
    Result

The *OrderNumbers* variable stores the order numbers that are visible in the table visualization. Be mindful that we did not use *ALLSELECTED( Sales[Order Number] )* because it would return all the values of the *Sales[Order Number]* column. Indeed, ALLSELECTED as a table function returns the filtered values only if the column is being actively filtered. We retrieve the order numbers visible in the current visual by using the VALUES function evaluated with ALLSELECTED as a CALCULATE modifier.

The *CurrentOrder* variable is important too. Each row of the table visual is showing exactly one order number. Therefore, each cell has a filter context that filters exactly one order. In this scenario, SELECTEDVALUE returns the value of the order number currently being filtered. You cannot omit SELECTEDVALUE, because the *Sales[Order Number]* column reference alone requires a row context which is not present in our formula.

The *Result* variable evaluates the ranking if and only if there is only one order number visible, to avoid any calculation for the total and for any subtotal that groups two or more orders. If there is only one order in the filter context, then RANKX provides the ranking.

If you have used RANKX in the past, you were likely ranking a measure like *Sales Amount*. When you apply RANKX to a measure, you typically only use the first two arguments of RANKX. RANKX evaluates the measure during the iteration over the table provided in the first argument; because of the context transition, RANKX evaluates the expression of the measure in a filter context obtained by the row being iterated. After the iteration, RANKX re-evaluates the same measure in the original filter context to obtain the value to rank.

In our case, we are ranking a column value and not the result of a measure. Therefore, we do not rely on a context transition. During the iteration, RANKX evaluates the *Sales[Order Number]* expression within a row context corresponding to the row being iterated. At the end of the iteration, RANKX must obtain the value to rank, but can no longer use the *Sales[Order Number]* expression because the row context is missing in the original evaluation context.

The third argument of RANKX is now useful: RANKX uses the third argument to get the value to rank. If the third argument is missing, RANKX evaluates the same expression passed in the second argument in the original evaluation context, which is why you have probably never used it. In our example, we provide *CurrentOrder* as the third argument of RANKX, getting the rank of the order number displayed in the report.

You can see the result of the *RankOrder* measure in the following picture.

*RankOrder* is a number increasing by one each time the row shows a different order number, regardless of the actual *Order Number* value. We can create the *BandNumber* measure to transform this number into 0 or 1:

BandNumber := MOD ( [RankOrder], 2 )
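The rank-then-modulo idea can be verified with a small Python sketch. The `rank_asc` helper mimics RANKX in ascending order, with an explicit “value to rank” parameter playing the role of RANKX’s third argument; the order numbers are hypothetical, chosen to be all even like in the article’s figures:

```python
def rank_asc(values, value_to_rank):
    """Minimal RANKX-style ascending rank: 1 plus the number of distinct
    values smaller than value_to_rank. The explicit value_to_rank
    parameter plays the role of RANKX's third argument."""
    return 1 + sum(1 for v in set(values) if v < value_to_rank)

visible_orders = [1106, 1108, 1120, 1256]  # hypothetical, all even

# Parity of the raw order number fails: every order lands in the same band.
raw_bands = [n % 2 for n in visible_orders]

# Parity of the rank alternates even when the numbers themselves do not.
rank_bands = [rank_asc(visible_orders, n) % 2 for n in visible_orders]

print(raw_bands)   # [0, 0, 0, 0]
print(rank_bands)  # [1, 0, 1, 0]
```

The raw parity puts all four orders in one band, while the rank parity alternates on every order change, which is exactly what the background color rule needs.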

The *BandNumber* measure is easy to use in the conditional formatting options of Power BI to change the background of all the columns you want to format. Unfortunately, you have to apply this format to every column shown in the visual because it is not possible to set conditional formatting for the entire row.

The report now shows alternate colors depending on the order number. You can hide the *RankOrder* and *BandNumber* measures; indeed, they are just internal measures whose sole purpose is to improve data visualization.

As you see, DAX is not only useful to compute super-complex expressions. You can also rely on DAX to format your report the way you like, by combining the power of DAX with the flexibility of conditional formatting.

VertiPaq Analyzer is a set of tools and libraries to analyze VertiPaq storage structures for a data model in Power BI and Analysis Services Tabular. In this webinar recorded for powerbi.tips, Marco Russo shows how VertiPaq Analyzer has been integrated into DAX Studio, explaining how to use these features to optimize the size and the performance of a data model.

This article describes in which conditions the precedence of calculation groups might return unexpected results when filtering calculation items in both the visuals and the measures present in a report.

When there are multiple calculation groups in a model, we must define their precedence. This ensures we get the expected result when applying different calculation groups to a measure. More details are available in the Understanding Calculation Group Precedence article.

Even in a well-designed model, there are scenarios where a poor combination of calculation group precedence, CALCULATE, and context transition in a DAX formula can translate into inaccurate results. This article shows how much care needs to be taken when working with calculation groups.

You create a model with two calculation groups: *Adder* and *Multiplier*. The purpose of *Adder* is to add a number to the selected measure, whereas *Multiplier* multiplies the selected measure by a factor. Each calculation group contains a single calculation item:

--
--  Calculation Group: Adder
--  Calculation Item: Plus5
--
SELECTEDMEASURE () + 5

--
--  Calculation Group: Multiplier
--  Calculation Item: Mul10
--
SELECTEDMEASURE () * 10

Following mathematics rules, we want to apply the multiplication before the addition. Therefore, the *Multiplier* calculation group has a higher precedence of 200, whereas *Adder* has a lower precedence of 100. To complete the data model used for testing in this article, we also define the *Just2* measure, which simply returns the value 2:

Just2 := 2

The article requires full knowledge of the precedence in calculation groups and the application of calculation items. Before covering the main topic of this article, we warm up your brain and refresh your knowledge of calculation groups with four small puzzles, each one with a piece of DAX code and the result. Your job is to explain why DAX computed that result. Try to explain the result before reading further. The explanation is right after each figure, and these puzzles are super helpful to train your brain on how to think about calculation groups.

Let us start with the first test: why is the result 2, as if the calculation items were not being applied?

The answer is that a calculation item is only applied to a measure reference. If there are no measures in the code, the calculation item has nowhere to be applied. Because the first argument of CALCULATE is just a constant value, the calculation items are not applied at all. No measure, no application.

Warming up? Good. Now it is time for the second puzzle: why is the result 70 and not 25?

The precedence of calculation groups defines the order of **application** of calculation items. Application is not the same as evaluation.

--
--  This is the initial expression
--
CALCULATE (
    [Just2],
    Adder[Addend] = "Plus5",
    Multiplier[Factor] = "Mul10"
)
--
--  The first calculation item being applied is Multiplier
--
CALCULATE (
    [Just2] * 10,
    Adder[Addend] = "Plus5"
)
--
--  The second calculation item being applied is Adder
--
( [Just2] + 5 ) * 10

The order of the calculation items in CALCULATE does not affect the result. Calculation items are only applied following the precedence of their corresponding calculation groups.
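A way to internalize this rule is to model calculation items as functions that wrap an expression, and let precedence (not argument order) decide the composition. This Python sketch is a simplified analogy of the engine’s behavior, using the article’s Adder/Multiplier groups:

```python
# Calculation items modeled as expression wrappers, tagged with the
# precedence of their calculation group (a simplified analogy).
ADDER = (100, lambda expr: (lambda: expr() + 5))        # Plus5
MULTIPLIER = (200, lambda expr: (lambda: expr() * 10))  # Mul10

just2 = lambda: 2  # the Just2 measure

def calculate(measure, *items):
    """Apply calculation items by group precedence, ignoring the order
    of the arguments. Wrapping in ascending precedence order leaves the
    highest-precedence item outermost, matching the article's expansion
    ( [Just2] + 5 ) * 10 rather than ( [Just2] * 10 ) + 5."""
    expr = measure
    for _, wrap in sorted(items, key=lambda item: item[0]):
        expr = wrap(expr)
    return expr()

print(calculate(just2, ADDER, MULTIPLIER))  # 70
print(calculate(just2, MULTIPLIER, ADDER))  # 70 - argument order is irrelevant
```

Both calls return 70: swapping the items in the argument list changes nothing, because the sort on precedence decides the composition.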

Now, the third difficult question. Can we force the order of evaluation of calculation items by using CALCULATE? Why does DAX not follow the order we try to force by using nested CALCULATE functions?

The answer requires a bit more attention here, although the explanation is the same as the previous ones. A calculation item is applied on a measure reference. The outer calculation item from *Adder* does not have any effect on the inner CALCULATE, because it is applied to the *Just2* measure only inside the inner CALCULATE. Therefore, this code is identical to the previous puzzle. Nesting CALCULATE functions is not a way to change the order of application of calculation items, at least not in this example. Things can be much more complicated if we have multiple measure references instead of a simple CALCULATE. Nevertheless, the goal of this article is not to write more complex code. Instead, we want to show how to correctly understand the result of an expression when calculation groups are involved.

Let us complicate the scenario in the fourth puzzle. We define a new *Just2Times10* measure:

Just2Times10 :=
CALCULATE (
    [Just2],
    Multiplier[Factor] = "Mul10"
)

Then, we run the following query: we finally obtain 25 as a result, but why?

The answer is still the same: calculation items are applied on a measure reference. In this case, the measure reference is no longer *Just2*. Because the CALCULATE function evaluates the *Just2Times10* measure, the calculation item is applied to the *Just2Times10* measure. Therefore, *Adder* works on *Just2Times10* and *Multiplier* kicks in only later, when *Just2Times10* is being evaluated:

--
--  This is the initial expression
--
CALCULATE (
    [Just2Times10],
    Adder[Addend] = "Plus5"
)
--
--  The only calculation item being applied is Adder
--
[Just2Times10] + 5
--
--  Just2Times10 invokes another calculation item
--
CALCULATE (
    [Just2],
    Multiplier[Factor] = "Mul10"
) + 5
--
--  Only now does Multiplier kick in
--
( [Just2] * 10 ) + 5
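In the same function-composition analogy used for the earlier puzzle, wrapping the multiplication inside its own measure changes the composition order. The *Adder* item now applies to the *Just2Times10* reference, so the addition wraps the already-multiplied value:

```python
# Simplified analogy: a measure is just a function; applying a
# calculation item to a measure reference wraps that function.
just2 = lambda: 2

# Just2Times10 := CALCULATE ( [Just2], Multiplier[Factor] = "Mul10" )
just2_times10 = lambda: just2() * 10

# CALCULATE ( [Just2Times10], Adder[Addend] = "Plus5" )
# Adder applies to the Just2Times10 reference, not to Just2 itself.
with_adder = lambda: just2_times10() + 5

print(with_adder())  # 25, i.e. ( [Just2] * 10 ) + 5
```

The result is 25 instead of 70: by hiding the multiplication inside a measure, the formula author has overridden the precedence of the calculation groups.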

The code we have seen so far is only a refresher of calculation groups. This warm-up is the longer part of the article. The end goal is to explain that there is a big difference between using a calculation item in CALCULATE and defining a measure that applies the calculation item. When a measure formula applies a calculation item to another measure reference, it overrides the precedence of the calculation groups – whereas when the same CALCULATE applies multiple calculation items, the DAX engine chooses the precedence based on the precedence of the calculation groups involved.

For example, the following query contains two EVALUATE statements. The first EVALUATE groups by *Adder[Addend]* and evaluates the *Just2Times10* measure, whereas the second EVALUATE evaluates the *Just2* measure grouped by both *Adder[Addend]* and *Multiplier[Factor]*. The results of the two EVALUATE statements are different:

The first EVALUATE groups by *Adder[Addend]*. For this reason, *Just2Times10* is included in an implicit CALCULATE function that applies a different calculation item from the *Adder* calculation group to each row of the result. The result produced by the *Just2Times10* measure reference is 25, as in our last puzzle.

The second EVALUATE groups by both *Adder[Addend]* and *Multiplier[Factor]*, so there is only one CALCULATE with the two calculation items applied to the *Just2* measure. DAX applies the calculation items according to the precedence of the corresponding calculation groups and the result is 70.

At the risk of sounding pedantic, let us repeat the concept once more. If you define a measure that applies a calculation item, you are not doing the same operation as if you add a calculation item to a slicer or rows or columns of a visual. By using a measure, you enforce the application order. In contrast, by using a report filter, you are letting the engine choose the application order based on the precedence setting of the calculation groups. These are not the same thing.

As you have seen, the scenario is rather complex to analyze and understand. Yet, as a DAX expert, you want to handle this complexity successfully. So approach any model you build by making sure that precedence, measures, and calculations are all fine-tuned to provide a smooth user experience. As soon as users decide to create their own measure in a report by using a calculation group with CALCULATE, they have the power to break your well-engineered calculations.

Multiple calculation groups in a model can be extremely complex to manage. Before using them in your models, make sure you take the time to understand how they work in depth, and to understand what the implications of using them can be. Well, do not just take the time. Invest lots of time! The good news is that calculation groups are extremely powerful once you master them. Remember: with great power comes great responsibility!

This article explains how to control the result provided by a formula in a DAX measure when the measure is being used in a report with granularity below the granularity supported by the formula.

In the article Obtaining accurate totals in DAX, we described the granularity of a formula as being the level of detail that the formula uses to compute its result. If you are not familiar with formula granularity, please read that article first, because this is the natural continuation.

The starting point is a measure computing the number of months where *Sales Amount* is greater than 30,000:

High Months =
SUMX (
    VALUES ( 'Date'[Calendar Year Month] ),
    IF ( [Sales Amount] > 30000, 1 )
)

In a report at the month level, the formula produces the expected results.

Nevertheless, when expanded at the date level that same formula produces incorrect figures. As you can see in the next figure, the *High Months* measure returns 1 for those days where *Sales Amount* is greater than 30,000.
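The misbehavior is easy to reproduce outside DAX. This Python sketch (with hypothetical daily figures) mirrors the SUMX over the visible months: at month granularity the result is correct, but when the visible rows are a single day, the “month total” degenerates into that day’s amount:

```python
# Daily sales keyed by (month, day). Hypothetical figures, chosen so
# that April exceeds the 30,000 threshold only as a whole month... almost.
daily_sales = {
    ("2019-04", "2019-04-01"): 1200,
    ("2019-04", "2019-04-02"): 31000,  # one unusually large day
    ("2019-05", "2019-05-01"): 14000,
}

def high_months(visible_days):
    """Count visible months whose sales over the visible days exceed
    30,000 - a sketch of SUMX over VALUES ( Date[Calendar Year Month] )."""
    months = {month for month, _ in visible_days}
    return sum(
        1
        for m in months
        if sum(daily_sales[k] for k in visible_days if k[0] == m) > 30000
    )

# Month granularity: April totals 32,200, May only 14,000.
print(high_months(set(daily_sales)))             # 1
# Day granularity: the formula still "finds" a high month, because the
# single visible day alone exceeds the threshold - a misleading result.
print(high_months({("2019-04", "2019-04-02")}))  # 1
```

The day-level result of 1 is exactly the artifact shown in the report: the filter context only contains one day, so the “month total” is really a day total.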

The formula is designed to work at the level of the month and above – like quarter and year. Below the month level – like day – the results are inaccurate. At this point, the important question is what the result below the granularity of the formula should be. There are several possible answers; all of them are sort-of correct:

- It is fine as it is, no need to fix it; users should not use the measure at the day level.
- It always needs to show 1, because each individual date belongs to a month that satisfies the condition.
- It must show a blank, because that granularity is not supported.

If you are ok with the first option, then the remaining part of the article is useless. Still, you are quitting the game too early: sooner or later, a user will use and read the values at the unsupported granularity. A real DAX pro does not let their formulas compute an incorrect value. Under. Any. Circumstances.

The second option, which proves to be much better than the first one, requires some deep reasoning about the filter context. For example, when the report is showing data at the day level, the measure must move the granularity of the calculation to the corresponding month level by removing the filter on the day column, while keeping the filter on the month only. The right way to obtain this behavior is by using ALLEXCEPT in CALCULATE:

High Months (2) =
SUMX (
    VALUES ( 'Date'[Calendar Year Month] ),
    CALCULATE (
        IF ( [Sales Amount] > 30000, 1 ),
        ALLEXCEPT ( 'Date', 'Date'[Calendar Year Month] )
    )
)

ALLEXCEPT removes any filter from the *Date* table, except the filter that is currently being created by the context transition invoked by CALCULATE during the iteration over the months. Please note that in this case ALLEXCEPT is the correct solution – though we often suggest you use ALL/VALUES instead. Indeed, this version of the formula would not work:

High Months (Wrong) =
SUMX (
    VALUES ( 'Date'[Calendar Year Month] ),
    CALCULATE (
        IF ( [Sales Amount] > 30000, 1 ),
        ALL ( 'Date' ),
        VALUES ( 'Date'[Calendar Year Month] )
    )
)

The reason why this formula is wrong is quite subtle: when VALUES is being evaluated, the context transition still has not taken place. Therefore, the filter on the year/month is not effective yet. This is not an issue at the month or day level inside the matrix, but it provides an incorrect result at the grand total level, where no filter is active. ALLEXCEPT being a modifier, it executes after the context transition; it efficiently removes any filter except the one on *Date*[Calendar Year Month] introduced by the context transition.

The third and last option requires blanking the value when the granularity of the report is below the granularity of the formula. The DAX code is not too complicated, but the reasoning behind it requires some special attention.

The formula must detect the granularity of the cell in the report before completing the calculation. If the cell is below the supported granularity, the result must be a blank. In order to detect the granularity of the cell, we can compare the number of rows in the *Date* table at the cell granularity (the filter context of the cell being evaluated) with the supported granularity (the granularity defined by the formula).

Because the formula should work at the month granularity, we create two variables. The *DaysAtCurrentGrain* variable contains the number of visible rows. The *DaysAtFormulaGrain* variable contains the number of rows in a new filter context where we enforce the month granularity. If the two variables return the very same number, then DAX is working in a filter context at the month granularity level or above that (such as the year). If the two numbers are different, it means that the formula is computing the result below the supported granularity, and the result must be a blank:

High Months (3) =
VAR DaysAtCurrentGrain = COUNTROWS ( 'Date' )
VAR DaysAtFormulaGrain =
    CALCULATE (
        COUNTROWS ( 'Date' ),
        ALL ( 'Date' ),
        VALUES ( 'Date'[Calendar Year Month] )
    )
VAR IsSafeToCompute = ( DaysAtCurrentGrain = DaysAtFormulaGrain )
VAR Result =
    IF (
        IsSafeToCompute,
        SUMX (
            VALUES ( 'Date'[Calendar Year Month] ),
            IF ( [Sales Amount] > 30000, 1 )
        )
    )
RETURN Result

The *High Months (3)* measure returns a blank whenever the report goes below the supported granularity – guaranteeing that if a number is shown, it is the correct one.
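The safety check at the core of *High Months (3)* can be paraphrased outside of DAX: count the rows visible at the cell granularity, then count the rows visible once the filter is coarsened to whole months, and only compute when the two counts match. The following Python sketch illustrates that comparison on a toy date table (the table and its shape are assumptions for illustration, not part of the article's model):

```python
# Toy date table: (date, year_month) pairs standing in for the 'Date' table
date_table = [(f"2008-01-{d:02d}", "2008-01") for d in range(1, 32)]

def is_safe_to_compute(visible_rows, full_table):
    # Months that survive the current filter context
    visible_months = {ym for _, ym in visible_rows}
    # Rows visible after enforcing month granularity (the ALL + VALUES step)
    rows_at_month_grain = [row for row in full_table if row[1] in visible_months]
    # Equal counts => the filter is at month granularity or above
    return len(visible_rows) == len(rows_at_month_grain)

print(is_safe_to_compute(date_table, date_table))      # True: the whole month is visible
print(is_safe_to_compute(date_table[:1], date_table))  # False: only one day is visible
```

When a single day is visible, the month-grain count (31 rows) differs from the visible count (1 row), so the measure blanks out, exactly as described above.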

As you can see, computing the right values at any granularity proves to be challenging. Building useful measures in a data model always requires thinking in terms of granularity and accuracy of the results.

DAX offers all the tools that are necessary to obtain the correct result. It is up to you to think in advance about the conditions under which your formula might fail. Once you have identified these critical conditions, you must define what the expected result at the given granularity is, and eventually implement the solution in DAX.

This article describes how to compute visual totals accurately, in case a measure does not provide the right result at the total level but works correctly in other rows of the report.

In simple DAX measures, the total of a report is the sum of its individual rows. For more sophisticated measures, the total might seem wrong because the formula does not include an aggregation over the rows that are visible in the report. For example, if the total of a measure must be the sum of the values displayed in the rows of a report, we consider the expected result a “visual total”, which is a total that corresponds to the visual aggregation of its values sliced by different rows in the report.

Let us elaborate on the topic with a simple example: a measure counts the days in a specific time period on which *Sales Amount* is greater than 30,000. The measure below checks the condition and returns 1 if *Sales Amount* is greater than 30,000:

High Days := IF ( [Sales Amount] > 30000, 1 )

The measure works fine in a report that shows one day in each row. However, the sub-totals and the total look wrong because they show 1 instead of the sum of the individual days.

Be mindful that the total is correct, given the definition of the formula. Yet, it does not correspond to the “visual total” we would like to get. The reason is that the measure is evaluated for each cell of the report. The Total row does not correspond to the day level; it is a row that groups all the days into a single calculation. When DAX shows a value that looks wrong, it is not DAX that is not working. DAX is answering a question, perhaps the wrong one! Let us elaborate on this.

The measure checks whether the sales amount is greater than 30,000. It returns 1 if that is the case, and 0 otherwise. Because Sales Amount is greater than 30,000 in the Total row, the formula returns 1 as requested. The same happens in January. DAX answers the wrong question correctly!
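The difference between the two evaluations can be simulated outside of DAX. The following Python sketch, using made-up daily sales figures (an assumption for illustration), contrasts the "visual" expectation — one flag per day, then summed — with what the naive measure does in the Total row — one test on the grand total:

```python
# Hypothetical daily sales for illustration (not from the article's model)
daily_sales = {"2008-01-01": 31000, "2008-01-02": 12000, "2008-01-03": 45000}

def high_days_per_row(sales_by_day):
    # What we expect "visually": one flag per day, then summed
    return sum(1 for amount in sales_by_day.values() if amount > 30000)

def high_days_at_total(sales_by_day):
    # What the naive measure does in the Total row: one test on the grand total
    total = sum(sales_by_day.values())
    return 1 if total > 30000 else None  # IF with no else returns blank in DAX

print(high_days_per_row(daily_sales))   # 2: two days exceed 30,000
print(high_days_at_total(daily_sales))  # 1: the grand total exceeds 30,000
```

Both answers are "correct" for the question each function asks; only the first one matches the visual total the reader expects.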

The feeling that the total is wrong comes from the visual perception that we – as humans – obtain from the report. Let us look at the same report without the details at the day level.

The results are identical, and we still have the feeling that the yearly total is wrong. The perception is now that the Total should be the sum of the individual months. Not seeing the day rows, we no longer expect the Total to be the sum of individual days.

DAX does not see the report the way we do. Therefore, we need to rephrase the question to obtain the desired result: We modify the formula granularity, which is the level of detail the formula needs to analyze.

The first question to ask yourself is: at what granularity should I check the total? It might be at the day level, the month level, or any other time period. Let us say we set the granularity at the day level. The question becomes: *at the day level, I want to check if the total is greater than 30,000. At any level above the day level, I want the sum of the values computed at the day level*. Asking the question this way gets us closer to the correct formulation, but it is not perfect just yet.

Before getting to the correct formula, one further step is needed. The above description is bottom-up; we started at the detail level, and then the next step was to provide the aggregated result. In DAX, you need to reason top-down: first the aggregation, then the details. In other words, the same question should be stated as: *for each day in the current time period, sum the result of this formula: if the sales of the day are greater than 30,000 then return 1, otherwise return 0.* The question is the same. This time, we first provide the aggregation, and then we specify the calculation to perform at the detail level.

All this reasoning leads to the right formula:

High Days Correct := SUMX ( 'Date', IF ( [Sales Amount] > 30000, 1 ) )

The formula iterates over the *Date* table and computes the result for each day. Finally, it sums the daily results.

It is worth noting that there is no need to specify a different formula to compute a value at the day and the aggregated levels. Indeed, when a single day is selected, the formula behaves correctly: it only performs one iteration. Over any longer period of time, the number of iterations is the number of days in the period. Moreover, because the formula contains the desired granularity, it works no matter the granularity of the report. For example, the following report shows the right data at the month and the year levels, although the individual days are not visible.

Once the question is phrased the right way, DAX behaves nicely. The granularity of the calculation depends on the definition of the formula. Each business and user might have different requirements. For example, to identify months with sales greater than 30,000, the granularity must be the month: *for each month in the current period, sum the result of this formula: if the sales of that month are greater than 30,000 then return 1, otherwise return 0.* The formula requires the same structure as the previous version; it just replaces the day with the month to perform the calculation at the right granularity.

The formula flows logically:

High Months =
SUMX (
    VALUES ( 'Date'[Calendar Year Month] ),
    IF ( [Sales Amount] > 30000, 1 )
)

Indeed, when used in the report at the month level, the results are correct.

The topic of granularity is not yet complete. There is still an issue with the formula above: the problem appears below the granularity supported by the formula. Indeed, the formula works at the month level – but what happens if we expand the month at the day level, showing details where the formula is not designed to work?

The result is correct at the month level, but the *High Months* measure also returns values at the day level. The scenario is similar to the previous one: when the formula works at a granularity it is not designed for, the result is counterintuitive. Fixing this issue requires further deep reasoning about what to do at a higher level of detail (aka granularity). We save this interesting topic for a new article in the coming weeks.

For now, here is our advice: if the total looks wrong, it might be because the DAX formula does not perform the calculation at the right granularity. Think about the business problem, define the right question, and implement the correct DAX expression.

This article describes how to control format strings in calculation groups. Before starting, we suggest you read the previous articles in this series.

Each calculation item can change the result of a measure through the DAX code in the *Expression* property. The author of a calculation item can also control the display format of the result, overwriting the standard behavior of the measure where the calculation item is applied. The *Format String Expression* property can contain a DAX expression that returns the format string to use after applying a calculation item to a measure reference.

For example, the dynamic format string can display the result of the **YOY%** calculation item as a percentage – that is the year-over-year as a percentage.

In this specific example, we are looking at sales volumes. The *Format String Expression* property of the **YOY%** calculation item is always a percentage, regardless of the original format of the underlying measure.

The model in this example has a **Currency Conversion** calculation group that applies a currency conversion to all the measures representing currency amounts. The details of the calculation are not important for the purpose of this article. Instead, we focus on the goal of displaying the converted amount by using a different format for each currency available in the report.

The conversion should not affect measures that are not displaying sales values. For example, in the following report only the *Sales Amount* and *Margin* measures get the currency conversion and the corresponding format, whereas *# Quantity* and *Margin %* are not affected by the **Currency Conversion** calculation group.

The **Report Currency** calculation item applies a custom format string depending on the selected currency, checking whether the measure selected should be converted or not. This is done by the DAX expression included in the *Format String Expression* property.

The complete code of the Format String Expression property is the following:

VAR MeasureName = SELECTEDVALUE ( Metric[Metric], SELECTEDMEASURENAME () )
VAR SkipConversion =
    ( SEARCH ( "#", MeasureName, 1, 0 ) > 0 )
        || ( SEARCH ( "%", MeasureName, 1, 0 ) > 0 )
VAR CurrencyFormat = SELECTEDVALUE ( 'Currency'[Currency Format], "#,0.00" )
RETURN
    IF (
        SkipConversion,
        SELECTEDMEASUREFORMATSTRING (),
        CurrencyFormat
    )

The *MeasureName* variable retrieves the measure selected by looking at the *Metric[Metric]* selection, because the **Metric** selection can override the measure selected in the report. If the **Metric** calculation group is not used or does not have a valid selection, then the SELECTEDMEASURENAME function returns the name of the active measure.

The *SkipConversion* variable controls whether the measure should be converted based on the measure name.

The *CurrencyFormat* variable retrieves the format string from the corresponding column in the *Currency* table, where only one value should be selected to ensure the currency conversion can take place. In case the currency conversion is not needed, the SELECTEDMEASUREFORMATSTRING function returns the format string of the current measure. This format string may already have been defined by another calculation item with higher precedence.
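The name-based test in *SkipConversion* boils down to a substring search: DAX's SEARCH returns 0 when the substring is not found (given the fourth argument), hence the `> 0` comparison. A quick Python equivalent of the same naming-convention check:

```python
def skip_conversion(measure_name: str) -> bool:
    # Measures whose name contains "#" (quantities) or "%" (percentages)
    # must keep their original format string and skip the conversion
    return "#" in measure_name or "%" in measure_name

print(skip_conversion("# Quantity"))    # True
print(skip_conversion("Margin %"))      # True
print(skip_conversion("Sales Amount"))  # False
```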

There are three calculation groups in the model used in this example, with the following precedence order:

- **Metric**: 10
- **Time Intelligence**: 20
- **Currency Conversion**: 30

The calculation group with the highest precedence order is applied first. This is valid for both the *Expression* and the *Format String Expression* properties of calculation items.

This matters when considering how the *MeasureName* variable is evaluated in the **Report Currency** calculation item. Because **Report Currency** has a higher precedence, it is applied to the measure reference while the **Metric** calculation group is still in the filter context and has not yet been applied. For this reason, it is possible to retrieve the current selection of *Metric[Metric]* from the filter context. This technique is required to make sure that the behavior of the **Report Currency** calculation item is consistent regardless of the technique used to choose the measure displayed in a report: both direct measure selection and **Metric** calculation group selection work seamlessly.

The *Format String Expression* property enables the control of the format string of any measure in a dynamic way. The SELECTEDMEASUREFORMATSTRING function allows access to the existing format string – that could be the one originally defined by the measure or the one already modified by another calculation item. Controlling the format string is vital to provide a good user experience when using calculation items that can change the numeric representation of the result of a measure.

I am glad to announce that we just released an additional module about Calculation Groups to the Mastering DAX video course. All current students of the video course will immediately get access to this new section.

We covered calculation groups in The Definitive Guide to DAX book published last year. Yet we waited until the release of the feature in Analysis Services 2019 before recording the video on this section. This way, we were able to show the user interface required to create calculation groups in Tabular Editor. This was not a priority in the book, whose purpose is to explain the language and not the user interface. For avid readers, we have an entire section dedicated to Calculation Groups on SQLBI.com. The content is more up to date than the book, especially the references to the user interface in Tabular Editor.

We decided not to show examples using SQL Server Data Tools for Visual Studio as an editor. The reason is that the user interface to manage calculation groups is far from ideal. You can safely edit a BIM file to add calculation groups in Tabular Editor and go back to Visual Studio if you want. If you want more details about the user interface in Visual Studio, you should refer to the official calculation groups page by Microsoft.

When will this feature be available in Power BI Desktop? We hope very soon, but in the meantime continue to push the request with Microsoft by voting for Support creating calculation groups in Power BI Desktop.

**UPDATE 2020-03-28**

Just a few hours after our announcement, Christian Wade (Principal Program Manager at Microsoft) announced that Calculation Groups will be available in Power BI Desktop by this summer (2020). During the presentation of the XMLA endpoint for Power BI Premium, Christian revealed that Power BI Desktop is going to be more integrated with external tools such as DAX Studio and Tabular Editor. As a consequence, Calculation Groups will be available in edit mode in Power BI Desktop by using Tabular Editor. And there is more: watch the entire Tabular Editor demo from minute 00:34 to 00:47.

A common requirement in authoring DAX code is to provide a default value for the result of a calculation that might otherwise be blank. The classic approach to solving this requirement is by using an IF statement. For example, the following code returns the value of either the *Sales[Net Price]* column or the *Sales[Unit Price]* column, whichever is first to not be blank:

IF ( ISBLANK ( Sales[Net Price] ), Sales[Unit Price], Sales[Net Price] )

When the expressions considered are measures instead of column references, it is a best practice to save the result of the first measure in a variable. The variable improves the performance by ensuring that the measure is evaluated only once:

VAR MyResult = [MyMeasure]
VAR DefaultValue = [Default Measure]
VAR Result =
    IF ( ISBLANK ( MyResult ), DefaultValue, MyResult )
RETURN Result

The alternative to this expression requires two references to the same *MyMeasure* measure, which could result in a double evaluation of the same expression to obtain the same result. The following expression evaluates *MyMeasure* a first time to test whether its value is blank or not, and then a second time to determine the result of the IF function if *MyMeasure* returns a non-blank value:

IF ( ISBLANK ( [MyMeasure] ), [Default Measure], [MyMeasure] )
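The cost of the double reference can be made visible by counting evaluations. In the toy Python sketch below, the "measure" tracks how many times it runs (the measure and its return value are assumptions for illustration):

```python
calls = {"n": 0}

def my_measure():
    # Stand-in for a potentially expensive DAX measure; assume it returns 42
    calls["n"] += 1
    return 42

# Without a variable: the measure reference appears twice and runs twice
calls["n"] = 0
result = my_measure() if my_measure() is not None else "default"
print(calls["n"])  # 2

# With a variable: the measure runs exactly once, regardless of the outcome
calls["n"] = 0
value = my_measure()
result = value if value is not None else "default"
print(calls["n"])  # 1
```

The same reasoning applies to the DAX version: the variable guarantees a single evaluation of *MyMeasure*, whereas the double reference may evaluate it twice when the value is not blank.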

By using the COALESCE function the code is more readable. COALESCE implements the same logic as in the previous examples, evaluating its arguments and returning the first argument that is not a blank:

COALESCE ( [MyMeasure], [Default Measure] )

COALESCE accepts multiple arguments and returns the first argument that is not blank. If the first argument is blank, COALESCE returns the value provided by the expression in the second argument, and so on. If all the arguments are blank, COALESCE returns blank as well.
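The evaluation logic just described — return the first non-blank argument, or blank if all arguments are blank — can be sketched in Python, with `None` standing in for DAX's blank:

```python
def coalesce(*args):
    # Return the first argument that is not blank (None stands in for blank)
    for value in args:
        if value is not None:
            return value
    return None  # all arguments blank => blank result

print(coalesce(None, 42))    # 42
print(coalesce(7, 42))       # 7
print(coalesce(None, None))  # None
```

Note this is only a sketch of the argument-scanning behavior; DAX blanks have their own coercion rules (for example, in arithmetic) that `None` does not reproduce.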

For example, consider a measure showing the Price of a product, using the average of *Sales[Net Price]* or using the *Product[Unit Price]* column if there are no sales transactions. Without COALESCE, the measure looks like this:

Price =
VAR AvgPrice = AVERAGE ( Sales[Net Price] )
VAR UnitPrice = SELECTEDVALUE ( 'Product'[Unit Price] )
VAR Result =
    IF ( ISBLANK ( AvgPrice ), UnitPrice, AvgPrice )
RETURN Result

By using COALESCE, the code becomes much simpler:

Price =
VAR AvgPrice = AVERAGE ( Sales[Net Price] )
VAR UnitPrice = SELECTEDVALUE ( 'Product'[Unit Price] )
VAR Result =
    COALESCE ( AvgPrice, UnitPrice )
RETURN Result

Performance-wise, COALESCE implements an IF condition. For this reason, you should not expect the latter code based on COALESCE to be faster than the former example based on an IF. Nevertheless, using COALESCE improves readability, and this is already a great reason to start using it whenever needed. Moreover, the behavior of COALESCE may be further optimized in future versions of the DAX engine, so that the COALESCE function also ends up improving performance.

**2020-03-15 UPDATE**: The original version of this article published in 2011 did not mention SELECTCOLUMNS, which was introduced in 2015. This article was rewritten in 2020 to provide updated coverage of projection functions in DAX.

The simplest projection is provided by this classic SELECT in SQL:

SELECT * FROM Product

It corresponds to this DAX query:

EVALUATE 'Product'

A common projection consists of selecting just a few columns from the source table. For example, the following SQL query only fetches three columns from the Product table:

SELECT [ProductKey], [Product Name], [Unit Price] FROM Product

In DAX you can obtain the same result by using the SELECTCOLUMNS function, which requires you to specify the column name for each column of the result:

EVALUATE
SELECTCOLUMNS (
    'Product',
    "ProductKey", 'Product'[ProductKey],
    "Product Name", 'Product'[Product Name],
    "Unit Price", 'Product'[Unit Price]
)

Another common projection in SQL adds one or more columns to all the columns of a table. For example, the following SQL query adds the *Unit Margin* column to all the columns of the *Product* table:

SELECT *, [Unit Price] - [Unit Cost] AS [Unit Margin] FROM Product

You can get the same result in DAX by using the ADDCOLUMNS function:

EVALUATE
ADDCOLUMNS (
    'Product',
    "Unit Margin", 'Product'[Unit Price] - 'Product'[Unit Cost]
)

Both SELECTCOLUMNS and ADDCOLUMNS keep the duplicated rows included in the original table expression. Applying a DISTINCT condition can be done in one of two ways. Consider the following SQL query:

SELECT DISTINCT [ProductKey], [Product Name], [Unit Price] FROM Product

The more efficient way to get the same result is by using the SUMMARIZE function:

EVALUATE
SUMMARIZE (
    'Product',
    'Product'[ProductKey],
    'Product'[Product Name],
    'Product'[Unit Price]
)

When the first argument is a complex table expression rather than the simple *Product* table reference of this example, SUMMARIZE might not be usable. In that case, you can apply DISTINCT to the result of SELECTCOLUMNS. However, the following expression should only be used if SUMMARIZE cannot be used:

EVALUATE
DISTINCT (
    SELECTCOLUMNS (
        'Product',
        "ProductKey", 'Product'[ProductKey],
        "Product Name", 'Product'[Product Name],
        "Unit Price", 'Product'[Unit Price]
    )
)

By using SUMMARIZE you cannot change the column names. If you need to rename a column it is advisable to use a SELECTCOLUMNS consuming the result of a SUMMARIZE, in order to achieve the best possible performance. For example, consider the following SQL query:

SELECT DISTINCT [ProductKey] AS [Key], [Product Name] AS [Name], [Unit Price] AS [Price] FROM Product

The corresponding DAX version is the following:

EVALUATE
SELECTCOLUMNS (
    SUMMARIZE (
        'Product',
        'Product'[ProductKey],
        'Product'[Product Name],
        'Product'[Unit Price]
    ),
    "Key", 'Product'[ProductKey],
    "Name", 'Product'[Product Name],
    "Price", 'Product'[Unit Price]
)

Only if SUMMARIZE cannot be used should you resort to this alternative, slower technique:

EVALUATE
DISTINCT (
    SELECTCOLUMNS (
        'Product',
        "Key", 'Product'[ProductKey],
        "Name", 'Product'[Product Name],
        "Price", 'Product'[Unit Price]
    )
)

In general, we always suggest using SUMMARIZE as an equivalent of SELECT DISTINCT in SQL, whereas ADDCOLUMNS and SELECTCOLUMNS are the DAX functions to use to get a projection without removing duplicated rows in the result.
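The SELECT DISTINCT pattern — project a subset of columns, then deduplicate rows — can be mimicked on plain Python records. The toy product list below is an assumption for illustration (it deliberately contains a duplicated row to show the difference):

```python
products = [
    {"ProductKey": 1, "Product Name": "Contoso A", "Unit Price": 10.0, "Color": "Red"},
    {"ProductKey": 1, "Product Name": "Contoso A", "Unit Price": 10.0, "Color": "Red"},
    {"ProductKey": 2, "Product Name": "Contoso B", "Unit Price": 20.0, "Color": "Blue"},
]

def select_columns(rows, columns):
    # SELECTCOLUMNS-like: keep only the requested columns, duplicates preserved
    return [tuple(row[c] for c in columns) for row in rows]

def select_distinct(rows, columns):
    # SUMMARIZE / DISTINCT-like: same projection, duplicates removed
    return sorted(set(select_columns(rows, columns)))

print(len(select_columns(products, ["ProductKey", "Product Name", "Unit Price"])))   # 3
print(len(select_distinct(products, ["ProductKey", "Product Name", "Unit Price"])))  # 2
```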

This article describes how to implement currency conversion for reporting purposes in Power BI.

Currency conversion applied to reporting can include many different scenarios.

- Data in multiple currencies, report with a single currency
- Data in multiple currencies, report with multiple currencies
- Data in a single currency, report with multiple currencies

The rule of thumb is to apply the currency exchange conversion upfront. Therefore, it is a good idea to solve all the scenarios requiring a single currency in the report by converting all amounts at the time of data import into the data model. When you need to create a report in multiple currencies, computing data in advance could be challenging and expensive. Therefore, a dynamic solution based on DAX measures and a properly designed data model makes more sense.

In this article, we only consider the third scenario, “Data in a single currency, report with multiple currencies”. The second scenario could be implemented by transforming data so that it is imported in the model in a single currency, moving the challenge over to the very scenario we describe in this article. An extended description of the three scenarios with other dynamic solutions is included in a chapter of the Analyzing Data with Microsoft Power BI and Power Pivot for Excel book.

The data model we consider includes a *Sales* table with all transactions recorded in USD as a currency. There is also a hidden *ExchangeRate* table that contains the exchange rate for every currency in every month, assigned to the first day of each month. The *AverageRate* column has the value we consider the average exchange rate of the month that must be applied as a conversion rate to all the transactions included in the same month. The following diagram shows the tables that are relevant to this description.

The complete model also includes other tables such as *Product* and *Customer*. The following report shows the *Sales Amount* measure for the sales of Cell phones in 2008. All the amounts are in USD, which is the single currency used to record transactions in the *Sales* table.

The requirement is to enable the user to choose a different reporting currency by using a slicer, obtaining something like the following report where the selected reporting currency is EUR.

The *Sales Currency* measure applies the required currency conversion to the result of the *Sales Amount* measure. In order to achieve good performance in DAX, it is better to aggregate all amounts found in the same currency and exchange rate, and then apply the conversion rate to that total, than to apply the conversion to every single transaction. Because the exchange rate is at the monthly granularity in this scenario, the *Sales Currency* measure is implemented by grouping *Sales Amount* by month, as shown in the following code:

Sales Currency :=
IF (
    ISCROSSFILTERED ( 'Currency' ),
    VAR SelectedCurrency = SELECTEDVALUE ( 'Currency'[Currency Code] )
    VAR DatesExchange =
        SUMMARIZE (
            ExchangeRate,
            'Date'[Calendar Year Month Number],
            'ExchangeRate'[AverageRate]
        )
    VAR Result =
        IF (
            NOT ISBLANK ( SelectedCurrency ),
            IF (
                SelectedCurrency = "USD",
                [Sales Amount],
                SUMX ( DatesExchange, [Sales Amount] * 'ExchangeRate'[AverageRate] )
            )
        )
    RETURN Result,
    [Sales Amount]
)

The *DatesExchange* variable creates a table in memory that has one row for every month. The assumption is that there is a single exchange rate for each month and a single currency selected. If this is not the case, this measure returns inaccurate numbers. It is thus important that you validate these assumptions before implementing this calculation in your model.

The *Result* variable computes the conversion by – for each month – applying the exchange rate to the result of *Sales Amount* for that month. In case there is a multiple selection of currencies, the result is blank, whereas the value of *Sales Amount* is returned without performing any conversion in case *SelectedCurrency* is USD.

You might want to modify the business logic of *Sales Currency* to better adapt the formula to your specific requirements. For example, in case the user selects two or more currencies, the current implementation returns blank, but you might want to raise an error or display a different result in that case.
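The performance argument made earlier — aggregate per month first, then apply one rate per month — can be illustrated with toy numbers. Assuming one rate per month (as the measure does), both approaches produce the same result, but the first performs one multiplication per transaction while the second performs one per month. The transactions and rates below are assumptions for illustration:

```python
# Toy transactions in USD: (month, amount); one exchange rate per month (assumed)
transactions = [("2008-01", 100.0), ("2008-01", 200.0), ("2008-02", 50.0)]
eur_rate = {"2008-01": 0.68, "2008-02": 0.67}

def convert_per_transaction(rows, rates):
    # One multiplication per transaction
    return sum(amount * rates[month] for month, amount in rows)

def convert_per_month(rows, rates):
    # Aggregate first, then one multiplication per month
    # (the approach used by the Sales Currency measure)
    totals = {}
    for month, amount in rows:
        totals[month] = totals.get(month, 0.0) + amount
    return sum(total * rates[month] for month, total in totals.items())

assert abs(convert_per_transaction(transactions, eur_rate)
           - convert_per_month(transactions, eur_rate)) < 1e-9
```

With millions of transactions but only a handful of months, the second approach does dramatically less work, which is why the DAX measure groups by month before applying the rate.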

Defining the currency conversion with a specific measure like *Sales Currency* has two disadvantages:

- **Duplicated code and measures**: The currency conversion should be applied to all the measures displaying a currency amount. If you have three measures (*Sales Amount*, *Total Cost*, and *Margin*) you have to create another three measures to display the value with the required reporting currency.
- **Format string**: The format string of the result of the *Sales Currency* measure does not include the currency symbol. When dealing with a report with a currency selected, it could be hard to understand the currency used for the report by just looking at the numbers.

By implementing the currency conversion in a calculation group, it is possible to solve both problems. At the time of writing, calculation groups are not available in Power BI. Thus, currently the remaining part of this article can only be implemented in Azure Analysis Services or Analysis Services 2019.

The expression of a calculation item that applies the currency conversion to any measure is similar to the definition of the *Sales Currency* measure. It just replaces the *Sales Amount* measure reference with a call to the SELECTEDMEASURE function:

--
--  Calculation Group: Currency Conversion
--  Calculation Item : Report Currency
--
VAR SelectedCurrency = SELECTEDVALUE ( 'Currency'[Currency Code], "USD" )
VAR DatesExchange =
    ADDCOLUMNS (
        SUMMARIZE ( ExchangeRate, 'Date'[Calendar Year Month Number] ),
        "@ExchangeAverageRate",
            CALCULATE ( SELECTEDVALUE ( 'ExchangeRate'[AverageRate] ) )
    )
VAR Result =
    IF (
        ISBLANK ( SelectedCurrency ) || SelectedCurrency = "USD",
        SELECTEDMEASURE (),
        -- Single currency, non-USD selected
        SUMX ( DatesExchange, SELECTEDMEASURE () * [@ExchangeAverageRate] )
    )
RETURN Result

Because this conversion could be applied to any measure, including percentages and non-currency measures such as *# Quantity*, it is a good practice to check whether the measure should be converted or not. In order to minimize future maintenance requirements when new measures are added to the data model, we can define a naming convention for measure names:

- **#** – If the measure name includes “#” then the measure represents a quantity and not a currency value. The currency conversion is not applied.
- **%** – If the measure name includes “%” then the measure represents a percentage and not a currency value. The currency conversion is not applied.

The calculation item definition is longer but more flexible this way:

--
--  Calculation Group: Currency Conversion
--  Calculation Item : Report Currency
--
VAR MeasureName = SELECTEDMEASURENAME ()
VAR SkipConversion =
    NOT ISCROSSFILTERED ( 'Currency' )
        || ( SEARCH ( "#", MeasureName, 1, 0 ) > 0 )
        || ( SEARCH ( "%", MeasureName, 1, 0 ) > 0 )
RETURN
    IF (
        SkipConversion,
        SELECTEDMEASURE (),
        VAR SelectedCurrency = SELECTEDVALUE ( 'Currency'[Currency Code] )
        VAR DatesExchange =
            SUMMARIZE (
                ExchangeRate,
                'Date'[Calendar Year Month Number],
                'ExchangeRate'[AverageRate]
            )
        VAR Result =
            IF (
                NOT ISBLANK ( SelectedCurrency ),
                IF (
                    SelectedCurrency = "USD",
                    SELECTEDMEASURE (),
                    SUMX ( DatesExchange, SELECTEDMEASURE () * 'ExchangeRate'[AverageRate] )
                )
            )
        RETURN Result
    )

By using calculation groups, it is also possible to control the format string in a dynamic way through the *Format String Expression* property of the calculation item. For example, the following expression uses the *Currency[Currency Format]* column as a format string if there is a selection on the *Currency* table:

--
--  Calculation Group: Currency Conversion
--  Calculation Item : Report Currency
--  Format String Expression
--
VAR MeasureName = SELECTEDMEASURENAME ()
VAR SkipConversion =
    NOT ISCROSSFILTERED ( 'Currency' )
        || ( SEARCH ( "#", MeasureName, 1, 0 ) > 0 )
        || ( SEARCH ( "%", MeasureName, 1, 0 ) > 0 )
VAR CurrencyFormat = SELECTEDVALUE ( 'Currency'[Currency Format], "#,0.00" )
VAR Result =
    IF (
        SkipConversion,
        SELECTEDMEASUREFORMATSTRING (),
        CurrencyFormat
    )
RETURN Result

Using the previous format string expression, we get the following result.

Currency conversion should be applied to the imported data whenever possible, thus reducing the complexity of the calculations required at query time. When it is necessary to select the reporting currency at query time, pre-calculating all the reporting currencies could be impractical, so a dynamic currency conversion can be implemented in a DAX measure. You can also implement the currency conversion in a calculation group in order to avoid the proliferation of measures (with and without currency conversion) and to dynamically modify the format string of a measure.

This article describes the types of relationships available in Power BI and Analysis Services, clarifying the differences in cardinality and filter propagation of physical relationships.

Power BI and Analysis Services rely on a semantic model based on the Tabular Object Model (TOM). Even though there are different tools to manipulate these models – we suggest using Tabular Editor – the underlying concepts are identical because the engine and the data model are the same: we call it the Tabular model. This article is based on a section of The Definitive Guide to DAX (second edition) and on the Different types of many-to-many relationships in Power BI video.

In a Tabular model, a physical relationship connects two tables through an equivalence over a single column. Under certain conditions that we will be describing later, a relationship can define a constraint on the content of a table, even though such a constraint is different from the “foreign key constraint” many SQL developers are used to. However, the purpose of a relationship in a Tabular model is to transfer a filter while querying the model. Because transferring a filter is also possible by writing DAX code, we define two categories of relationships:

- **Virtual relationships**: these are relationships that are not defined in the data model but can be described in the logical model. Virtual relationships are implemented through DAX expressions that transfer the filter context from one table to another. A virtual relationship can rely on more than one column and can be related to a logical expression that is not just a column identity. Virtual relationships can be used to define multi-column relationships and relationships based on a range.
- **Physical relationships**: these are the relationships defined in a Tabular model. The engine automatically propagates the filter context in a query according to the filter direction of the active relationship. Inactive relationships are ignored, but there are DAX functions that can manipulate the state and the filter direction of physical relationships: USERELATIONSHIP and CROSSFILTER.
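For completeness, this is the typical pattern to activate an inactive physical relationship within a single measure; the *Sales[Delivery Date]* column and the measure name are assumptions used only for illustration:

Delivered Amount :=
CALCULATE (
    [Sales Amount],
    USERELATIONSHIP ( Sales[Delivery Date], 'Date'[Date] )  -- activate the inactive relationship
)

USERELATIONSHIP only affects the evaluation of the expression in CALCULATE; outside of it, the relationship marked as active in the model is used as usual.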

In this article we first show virtual relationships. We then discuss in detail the different types of physical relationships, also introducing the concept of weak relationships. Finally, we introduce many-to-many relationships, describing the different kinds of many-to-many relationships available in a model and in Power BI, thus clarifying the differences between the available options and briefly discussing performance implications.

A virtual relationship is a filter applied to the filter context in a CALCULATE or CALCULATETABLE function. In order to propagate a filter from A to B, you have to read the values active in the filter context on A and apply a corresponding filter to B.

For example, imagine having to transfer the filter from *Customer* to *Sales*. The optimal way to do this is by using TREATAS:

CALCULATE (
    [measure],
    TREATAS (
        VALUES ( Customer[CustomerKey] ),  -- Read filter from Customer
        Sales[CustomerKey]                 -- Apply filter to Sales
    )
)

The TREATAS function can also implement a logical relationship based on multiple columns. For example, the following code transfers the filter from a *Sales* table over to a *Delivery* table using *Order Number* and *Order Line Number*:

CALCULATE (
    [measure],
    TREATAS (
        SUMMARIZE (
            Sales,
            Sales[Order Number],       -- Read filter from Sales
            Sales[Order Line Number]   -- based on multiple columns
        ),
        Delivery[Order Number],        -- Apply filter to Delivery
        Delivery[Order Line Number]    -- specifying corresponding columns
    )
)

**NOTE**: creating a virtual relationship based on multiple columns is not optimal from a performance standpoint. A physical, single-column relationship is orders of magnitude faster than a virtual relationship.

Because TREATAS is a relatively new function, other techniques are still commonly used to implement virtual relationships. We include these alternatives as a reference; remember that TREATAS is the preferred way to create a virtual relationship based on column equivalency.

The first alternative technique is using INTERSECT:

CALCULATE (
    [measure],
    INTERSECT (
        ALL ( Sales[CustomerKey] ),        -- Apply filter to Sales
        VALUES ( Customer[CustomerKey] )   -- Read filter from Customer
    )
)

The second alternative technique is using IN:

CALCULATE (
    [measure],
    Sales[CustomerKey] IN VALUES ( Customer[CustomerKey] )
)

Another alternative technique is using FILTER and CONTAINS, which is older and slower than INTERSECT:

CALCULATE (
    [measure],
    FILTER (
        ALL ( Sales[CustomerKey] ),           -- Apply filter to Sales
        CONTAINS (
            VALUES ( Customer[CustomerKey] ), -- Read filter from Customer
            Customer[CustomerKey],
            Sales[CustomerKey]
        )
    )
)

A virtual relationship can be based on a range, like the technique used to implement a dynamic segmentation:

CALCULATE (
    [measure],
    FILTER (
        ALL ( Sales[Net Price] ),   -- Apply filter to Sales
        NOT ISEMPTY (               -- filtering prices
            FILTER (                -- within the active ranges
                PriceRange,
                PriceRange[Min Price] <= Sales[Net Price]
                    && Sales[Net Price] < PriceRange[Max Price]
            )
        )
    )
)

A relationship can be *strong* or *weak*. In a **strong** relationship the engine knows that the one-side of the relationship contains unique values. If the engine cannot guarantee that the one-side of the relationship contains unique values for the key, then the relationship is **weak**.

A weak relationship is not used as part of table expansion. Power BI has supported composite models since 2018; in a composite model, it is possible to create tables in a model containing data in both Import mode (a copy of data from the data source is preloaded and cached in memory using the VertiPaq engine) and in DirectQuery mode (the data source is only accessed at query time).

A single data model can contain tables stored in VertiPaq and tables stored in DirectQuery. Moreover, tables in DirectQuery can originate from different data sources, generating several DirectQuery data islands.

In order to differentiate between data in VertiPaq and data in DirectQuery, we say that data is either in the *continent* (VertiPaq) or in the *islands* (DirectQuery data sources).

The VertiPaq store is nothing but another data island. We call it the continent only because it is the most frequently used data island.

A relationship links two tables. If both tables belong to the same island, then the relationship is an intra-island relationship. If the two tables belong to different islands, then it is a cross-island relationship. Cross-island relationships are always weak relationships. Therefore, table expansion never crosses islands.

Relationships have a cardinality, of which there are three types. The difference between them is both technical and semantic. Here we do not cover the reasoning behind those relationships because it would involve many data modeling digressions that are outside of the scope of this article. Instead, we need to cover the technical details of physical relationships and the impact they have on the DAX code.

These are the three types of relationship cardinality available:

- **One-to-many relationships:** This is the most common type of relationship cardinality. On the one-side of the relationship the column must have unique values; on the many-side the value can (and usually does) contain duplicates. Some client tools differentiate between one-to-many relationships and many-to-one relationships. Still, they are the same type of relationship. It all depends on the order of the tables: a one-to-many relationship between *Product* and *Sales* is the same as a many-to-one relationship between *Sales* and *Product*.
- **One-to-one relationships:** This is a rather uncommon type of relationship cardinality. On both sides of the relationship the columns need to have unique values. A more accurate name would be “zero-or-one”-to-“zero-or-one” relationship because the presence of a row in one table does not imply the presence of a corresponding row in the other table.
- **Many-to-many relationships:** On both sides of the relationship the columns can have duplicates. This feature was introduced in 2018, and unfortunately its name is somewhat confusing. Indeed, in common data modeling language “many-to-many” refers to a different kind of implementation, created by using pairs of one-to-many and many-to-one relationships. It is important to understand that in this scenario many-to-many does not refer to the many-to-many relationship, but instead to the many-to-many cardinality of the relationship.

In order to avoid ambiguity with the canonical terminology which uses many-to-many for a different kind of implementation, we use acronyms to describe the cardinality of a relationship:

- One-to-many relationships: we call them **SMR**, which stands for Single-Many-Relationship.
- One-to-one relationships: we use the acronym **SSR**, which stands for Single-Single-Relationship.
- Many-to-many relationships: we call them **MMR**, which stands for Many-Many-Relationship.

Another important detail is that an MMR relationship is always weak, regardless of whether the two tables belong to the same island or not. If the developer defines both sides of the relationship as the many-side, then the relationship is automatically treated as a weak relationship with no table expansion taking place.

In addition, each relationship has a cross-filter direction. The cross-filter direction is the direction used by the filter context to propagate its effect. The cross-filter can be set to one of two values:

- **Single:** The filter context is always propagated in one direction of the relationship and not the other way around. In a one-to-many relationship, the direction is always from the one-side of the relationship to the many-side. This is the standard and most desirable behavior.
- **Both:** The filter context is propagated in both directions of the relationship. This is also called a **bidirectional cross-filter** and sometimes just a bidirectional relationship. In a one-to-many relationship, the filter context still retains its feature of propagating from the one-side to the many-side, but it also propagates from the many-side to the one-side.

The cross-filter directions available depend on the type of relationship.

- In an **SMR** relationship one can always choose single or bidirectional.
- An **SSR** relationship always uses bidirectional filtering. Because both sides of the relationship are the one-side and there is no many-side, bidirectional filtering is the only option available.
- In an **MMR** relationship both sides are the many-side. This scenario is the opposite of the SSR relationship: both sides can be the source and the target of a filter context propagation. Thus, one can choose the cross-filter direction to be bidirectional, in which case the propagation always goes both ways. Or if the developer chooses single propagation, they also must choose which table to start the filter propagation from. As with all other relationships, single propagation is the best practice.

The following table summarizes the different types of relationships with the available cross-filter directions, their effect on filter context propagation, and the options for weak/strong relationship.

| Type of Relationship | Cross-filter Direction | Filter Context Propagation | Weak & Strong Type |
| --- | --- | --- | --- |
| SMR | Single | From one-side to many-side | Weak if cross-island, strong otherwise |
| SMR | Both | Bidirectional | Weak if cross-island, strong otherwise |
| SSR | Both | Bidirectional | Weak if cross-island, strong otherwise |
| MMR | Single | Must choose the source table | Always weak |
| MMR | Both | Bidirectional | Always weak |

When two tables are linked through a strong relationship, the table on the one-side might contain the additional blank row in case the relationship is invalid – VertiPaq Analyzer reports a Referential Integrity violation when this happens. Thus, if the many-side of a strong relationship contains values not present in the table on the one-side, then a blank row is appended to the table on the one-side. The additional blank row is never added to tables involved in a weak relationship.

A physical relationship can automatically propagate a filter in the filter context based on the filter propagation direction. Moreover, a relationship defines a unique constraint on columns that are on the one-side of a relationship.

A logical many-to-many relationship between two tables representing two different business entities – also known as dimension tables in dimensional modeling – cannot be implemented using a single physical relationship.

For example, consider the case of two tables, *Customer* and *Account* in a bank. Every customer can have multiple accounts, and every account can be owned by several customers. The relationship between the *Customer* and *Account* tables requires a third table, *AccountsCustomers*, which contains one row for each existing relationship between *Account* and *Customer*.

The logical many-to-many relationship between Customer and Account is implemented through two physical SMR relationships. In order to enable the propagation of the filter context from *Customer* to *Account*, a bidirectional filter must be activated in the data model on the relationship between *Customer* and *AccountsCustomers*.

However, in order to avoid ambiguity caused by bidirectional filters in the data model, it is considered best practice to enable the bidirectional filter only in DAX measures that require that filter propagation. For example, if the relationship between *Customer* and *AccountsCustomers* is defined with a single direction filter, the bidirectional filter can be activated using CROSSFILTER as in the following measure:

SumOfAmt :=
CALCULATE (
    SUM ( Transactions[Amount] ),
    CROSSFILTER ( AccountsCustomers[AccountKey], Accounts[AccountKey], BOTH )
)

It is important to note that a many-to-many relationship between business entities does not use any MMR relationship; the MMR relationship exists to solve a different modeling issue, related to different granularities.

A logical relationship can exist between two tables whose granularity is not compatible with a physical SMR relationship. For example, consider the following model where the granularity of the *Budget* table is defined at the *Customer[CountryRegion]* and *Product[Brand]* level. The *Customer* table has one row for each customer, so there could be multiple rows with the same value for the *CountryRegion* column. Similarly, there is one row for each *Product* and several rows can have the same value for the *Brand* column.

The cardinality of *Product* and *Customer* is the right one for the *Sales* table. In order to create a report that aggregates all the rows in *Budget* for a given product brand or customer country, one option is to apply two virtual relationships in a measure:

Budget Amount :=
CALCULATE (
    SUM ( Budget[Budget] ),
    TREATAS ( VALUES ( 'Product'[Brand] ), Budget[Brand] ),
    TREATAS ( VALUES ( Customer[CountryRegion] ), Budget[CountryRegion] )
)

However, any technique that transfers the filter using a virtual relationship implemented through a DAX expression might suffer a performance penalty compared to a solution based on physical relationships.

In order to use the more efficient SMR relationships, we can create a *CountryRegions* calculated table that contains all the values existing in either *Customer[CountryRegion]* or *Budget[CountryRegion]*, and a *Brands* calculated table containing all the values existing in either *Product[Brand]* or *Budget[Brand]*:

Countries =
DISTINCT (
    UNION (
        DISTINCT ( Customer[CountryRegion] ),
        DISTINCT ( Budget[CountryRegion] )
    )
)

Brands =
DISTINCT (
    UNION (
        DISTINCT ( Product[Brand] ),
        DISTINCT ( Budget[Brand] )
    )
)

The *CountryRegions* and *Brands* tables can be connected using simple SMR relationships to the *Customer*, *Budget*, and *Product* tables. By enabling the bidirectional filter on the relationship between *Product* and *Brands*, we materialize through two physical relationships the virtual relationship that transfers the filter from *Product* to *Budget* using the *Brand* column in both tables. In a similar way, *Customer* transfers the filter to *Budget* through the bidirectional filter between *Customer* and *CountryRegions*.

Although we recommend not to use bidirectional filters in a data model, this is one of the few cases where this practice is safe. Indeed, the calculated tables and two physical SMR relationships we created are just an artifact to implement a virtual relationship between two tables; the resulting virtual relationships transfer the filter in a single direction: *Customer* filters *Budget* and *Product* filters *Budget*. There are no ambiguities. It is also important to note that the *Brands* and *CountryRegions* calculated tables do not add any new information to the data model. They are simply a way to materialize a table whose cardinality allows the creation of two SMR relationships for each virtual relationship we need.
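If one prefers to keep the relationship between *Product* and *Brands* single-directional in the model, the same propagation can be activated only within a measure through CROSSFILTER, following the pattern shown earlier for *Customer* and *Account*. The measure name and the *Brands[Brand]* column reference below are assumptions based on the model just described:

Budget Amount Filtered :=
CALCULATE (
    SUM ( Budget[Budget] ),
    CROSSFILTER ( 'Product'[Brand], Brands[Brand], BOTH )  -- let Product filter Budget through Brands
)

This keeps the model free of bidirectional filters while still transferring the *Brand* filter to *Budget* where needed.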

The following picture shows the granularity of the three tables involved in the virtual relationship between *Product* and *Budget*.

The virtual relationship between *Product* and *Budget* does not have a one-side. We can say that such a virtual relationship has a many-to-many cardinality. The artifact we just created can be obtained using an MMR relationship with a single filter, as shown in the following model.

The relationship between *Product* and *Budget* must be defined with a many-to-many cardinality and a Single cross-filter direction. Power BI also displays a warning when you create an MMR relationship. It is important to keep the cross-filter direction as Single in order to avoid ambiguities in the data model.

Even though the MMR relationship is a single entity in the data model, its performance is not as good as the artifact that requires one calculated table and two SMR relationships. Indeed, there are no indexes created to support an MMR relationship in the storage engine. While an MMR relationship is faster than a virtual relationship implemented using TREATAS, it is not as good as an artifact based on two SMR relationships. The difference can be ignored when the cardinality of the column used in the relationship is in the range of 10 to 100 unique values. When there are thousands or more unique values in the column defining a relationship, you should consider creating the calculated table and the two SMR relationships instead of an MMR relationship.

The Tabular model allows users to create two different types of many-to-many relationships:

- The **many-to-many relationships between dimensions**: these are the “classic” many-to-many relationships between dimensions in a dimensional model. These relationships require a bridge table containing data coming from the data source. The bridge table holds the information that defines the existing relationships between the entities. Without the bridge table, the relationships cannot be established.
- The **relationships at different granularities**: these relationships use columns that do not correspond to the identity of the table, so the cardinality is “many” on both ends of the virtual relationship. A physical implementation of this type of relationship can use two SMR relationships connected to an intermediate table populated with the unique values of the column defining the relationship. The Tabular model enables users to create this relationship using an MMR relationship, which is always a weak relationship.

The “classic” many-to-many relationship is implemented using two SMR relationships in the order one-many/many-one. In order to implement a filter propagation from *Customer* to *Account*, the bidirectional filter must be enabled on the relationship connecting the *Account* table to the bridge table – *AccountsCustomers* in this example.

The relationship at different granularities is implemented using two SMR relationships in the order many-one/one-many, or by using a single MMR relationship in a Tabular model. In both cases, the virtual relationship should propagate the filter in a single direction. If the relationship is implemented using two SMR relationships, in order to propagate the filter from *Product* to *Budget*, the bidirectional filter must be enabled on the relationship connecting the Product table to the intermediate table – *Brands* in this example.

A single MMR relationship can simplify the creation of a relationship at different granularities. It is considered best practice to specify the correct direction of the relationship, which should be Single. By default, Power BI creates these relationships with a bidirectional filter propagation, which can create ambiguities in the data model.

There are different ways to implement a logical relationship in a Tabular model. DAX provides flexible techniques to implement any kind of virtual relationship between entities, but only physical relationships in the data model provide the best results in terms of performance. Filter propagation direction and granularity of the relationships are key concepts to create the proper physical implementation of a logical relationship.

The classic notion of “many-to-many” relationships in Dimensional modeling does not correspond to what is referred to as a “many-to-many cardinality relationship” in a Tabular model. The latter is just an alternative to an artifact – based on two physical relationships and a calculated table – that enables users to define a relationship between entities with a granularity different from the granularity of the tables. For this reason, when referring to a Tabular model it is better to use the acronyms SMR and MMR to identify Single-Many-Relationship and Many-Many-Relationship, respectively. The term “many-to-many” relationship should only be used to describe the logical relationship between dimensions, which is implemented through two SMR relationships in the physical model.

This article explains the behavior of LASTNONBLANK, LASTNONBLANKVALUE, and similar DAX functions, also providing patterns for performance optimization.

The DAX language can operate on aggregated values and on individual rows of tables in the data model, but it does not offer an immediate syntax to perform visual-level calculations over the cells visible in a report. The reason is that a DAX measure must work in any condition and in any report, which is why the notion of row context and filter context exists in the DAX language.

When you need to apply the notion of “first”, “previous”, “next”, “last” while iterating a table, you might have to struggle with these concepts in order to get the result you want. There are a few DAX functions that can simplify the syntax, but they must be used with caution in order to not hurt the query’s performance:

- FIRSTNONBLANK
- FIRSTNONBLANKVALUE
- LASTNONBLANK
- LASTNONBLANKVALUE

This article explains how these functions work, how to use them correctly, and how to apply optimizations resulting in DAX code that does not use these functions at all!

The Contoso model we use contains sales transactions where a customer has different orders on different dates. Each order can include multiple transactions and there can be multiple orders from the same customers on any one date. We analyze two scenarios:

- **Sales PD**: compare the *Sales Amount* of one day with the **previous day**.
- **Sales PO**: compare the *Sales Amount* of one order with the **previous order**.

Because these calculations are dynamic, by filtering one customer the comparison only occurs considering the days and orders where there are transactions for that filtered customer. The dynamic nature of this calculation makes it very expensive to pre-calculate such amounts.

The first example compares one day with the previous day available by filtering the transactions of one customer.

The challenge is to find the previous day, and then retrieve the value of that day. Because one day can have multiple orders for the same customer, the calculation must operate at the day granularity level.

The second example is a similar report where we have one row for each order instead of one row for each day. The following visualization only shows the first orders made by the customer.

In this case, the calculation must be done working at the order granularity level, which is still an aggregation of one or more transactions.

The **Sales PD** scenario requires implementing the following business logic in a measure:

- Retrieve the previous date when there are transactions in the current filter context.
- Evaluate Sales Amount filtering the date found in the previous step.

We can implement the first step using the LASTNONBLANK function as shown in the *Prev. Order Date v1* measure:

Prev. Order Date v1 :=
IF (
    NOT ISEMPTY ( Sales ),
    VAR FirstVisibleDate = MIN ( 'Date'[Date] )
    VAR PreviousDate =
        CALCULATE (
            LASTNONBLANK ( 'Date'[Date], [Sales Amount] ),
            'Date'[Date] < FirstVisibleDate
        )
    VAR Result =
        IF ( NOT ISEMPTY ( Sales ), PreviousDate )
    RETURN Result
)

The LASTNONBLANK function iterates all the values in the column provided as the first argument. For each value it evaluates the expression provided in the second argument, returning the largest value included in the first argument that generated a non-blank result as it was evaluating the expression provided in the second argument. In this case, the *Sales Amount* measure is evaluated for each date in the filter context. Because we need to retrieve the last date with a transaction made by the same customer filtered in the report, we have to modify the filter context for the *Date* table; otherwise, for each cell of the report we would only consider the date of the current line in the report. This is the reason why the LASTNONBLANK function is wrapped inside a CALCULATE function that filters only the dates before the date visible in the cell of the report:

CALCULATE (
    LASTNONBLANK ( 'Date'[Date], [Sales Amount] ),
    'Date'[Date] < FirstVisibleDate
)

When LASTNONBLANK is executed providing a column reference as the first argument, DAX implicitly rewrites the expression by retrieving the values using DISTINCT in a potential context transition. The previous code is internally rewritten as:

CALCULATE (
    LASTNONBLANK (
        CALCULATETABLE ( DISTINCT ( 'Date'[Date] ) ),
        [Sales Amount]
    ),
    'Date'[Date] < FirstVisibleDate
)

You can also provide a table as a first argument of LASTNONBLANK. Thus, the same expression could have been written as:

LASTNONBLANK (
    CALCULATETABLE (
        DISTINCT ( 'Date'[Date] ),
        'Date'[Date] < FirstVisibleDate
    ),
    [Sales Amount]
)

The FIRSTNONBLANK function is identical to LASTNONBLANK, with the only difference that it returns the smallest value instead of the largest value included in the first argument that generated a non-blank result when evaluating the expression provided in the second argument.

The following screenshot shows the result of the *Prev. Order Date v1* measure. As expected, this is simply the date of the previous line in the report.

However, if you give some thought to the work that must be done, LASTNONBLANK and FIRSTNONBLANK could potentially iterate a large number of rows. When we filter all the dates before a certain date, the number of rows considered could be in the order of thousands if you have a Date table with 10 years, and this must be repeated for every cell in the report. In order to avoid this unnecessary work, the *Prev. Order Date v1* measure offers an important optimization: it executes the search if and only if the current filter context has rows in the *Sales* table:

IF ( NOT ISEMPTY ( Sales ), ...

This means that in the previous report the LASTNONBLANK function is only executed 16 times instead of hundreds of times because of the several years available in the model. This way, we also avoid displaying a previous order date for dates when there are no transactions.

Now we can implement the second step of the calculation. The date obtained by the *Prev. Order Date v1* measure can be used as a filter to retrieve the *Sales Amount* value on that date. The *Sales Previous Date v1* measure retrieves the amount value by using the result of LASTNONBLANK as a filter to compute *Sales Amount*:

Sales Previous Date v1 :=
VAR FirstVisibleDate = MIN ( 'Date'[Date] )
VAR PreviousOrderDate =
    CALCULATETABLE (
        LASTNONBLANK ( 'Date'[Date], [Sales Amount] ),
        'Date'[Date] < FirstVisibleDate
    )
VAR Result =
    CALCULATE ( [Sales Amount], PreviousOrderDate )
RETURN Result

In this case the *Sales Previous Date v1* measure does not check whether there are *Sales* rows visible in the filter context. This is because the measure is hidden and only used by other measures, so it is more efficient to only implement such a condition in the measure shown to the user. This is the definition of the *Sales PD v1* and *Diff. PD v1* measures:

Sales PD v1 :=
IF ( NOT ISEMPTY ( Sales ), [Sales Previous Date v1] )

Diff. PD v1 :=
IF (
    NOT ISEMPTY ( Sales ),
    VAR CurrentAmount = [Sales Amount]
    VAR PreviousAmount = [Sales Previous Date v1]
    VAR Result = CurrentAmount - PreviousAmount
    RETURN Result
)

We have seen that the result provided by LASTNONBLANK and FIRSTNONBLANK is a table with just one column and one row that can be used as a filter in a CALCULATE function. In our example, we are evaluating *Sales Amount* by filtering only the date obtained by evaluating the same *Sales Amount* for all the dates passed to LASTNONBLANK. In other words, we evaluate *Sales Amount* twice for the same date. The LASTNONBLANKVALUE function can slightly simplify the code required.

The FIRSTNONBLANKVALUE and LASTNONBLANKVALUE functions have the same behavior as FIRSTNONBLANK and LASTNONBLANK, with the exception that the value returned is the result of the expression evaluated for the item that has been found by FIRSTNONBLANKVALUE or LASTNONBLANKVALUE.

We can imagine that this code:

CALCULATE (
    [Sales Amount],
    LASTNONBLANK ( 'Date'[Date], [Sales Amount] )
)

can be written this way:

LASTNONBLANKVALUE ( 'Date'[Date], [Sales Amount] )

In theory, the engine could reuse the value computed for *Sales Amount* thus avoiding the double evaluation, and resulting in faster execution. For sure, we can write a smaller amount of code. However, this new syntax does not magically solve all the performance issues. The cardinality of the iterator is the same, so the possible performance bottlenecks are similar. The advantage of reusing one measure is usually minimal.

We have a new definition of the measures in the report. These measures can be identified by the suffix v2 and return the same result as in the v1 version:

Sales Previous Date v2 :=
VAR FirstVisibleDate = MIN ( 'Date'[Date] )
VAR Result =
    CALCULATE (
        LASTNONBLANKVALUE ( 'Date'[Date], [Sales Amount] ),
        'Date'[Date] < FirstVisibleDate
    )
RETURN Result

Sales PD v2 :=
IF ( NOT ISEMPTY ( Sales ), [Sales Previous Date v2] )

Diff. PD v2 :=
IF (
    NOT ISEMPTY ( Sales ),
    VAR CurrentAmount = [Sales Amount]
    VAR PreviousAmount = [Sales Previous Date v2]
    VAR Result = CurrentAmount - PreviousAmount
    RETURN Result
)

Using LASTNONBLANK and LASTNONBLANKVALUE can provide the required result in a relatively easy way, but it might be costly. These functions are iterators and they evaluate within a row context the expression provided in the second argument; this oftentimes requires a context transition as is the case every time you evaluate a measure in an iterator. Therefore, there is an impact in execution time and in memory required at query time – because of the materialization required by the context transition.

These functions can be used without deeper knowledge of the data model, but the cost for this could be significant. In order to achieve another improvement, it is necessary to get rid of the unnecessary context transitions. This requires an implementation that does not use LASTNONBLANK and LASTNONBLANKVALUE at all!

Most of the time, we can achieve better performance by making a simple assumption: we do not need to evaluate *Sales Amount* for every date; we can simply check whether the *Sales* table has at least one row for that customer and date. If yes, we assume that *Sales Amount* returns a non-blank value. If you can make that assumption, then the following implementation results in a much faster execution that also requires a lower amount of memory at query time.

We can retrieve the date of the previous order by just applying a MAX aggregation over the *Sales[Order Date]* column. Because the Sales table only has rows for the existing orders, this implicitly provides the date of the previous order once we apply the proper filter context – just using the same filter for the previous date we also applied in v1 and v2 versions of the measures:

Prev. Order Date v3 :=
IF (
    NOT ISEMPTY ( Sales ),
    VAR FirstVisibleDate = MIN ( 'Date'[Date] )
    VAR PreviousDate =
        CALCULATE (
            MAX ( Sales[Order Date] ),
            'Date'[Date] < FirstVisibleDate
        )
    VAR Result =
        IF ( NOT ISEMPTY ( Sales ), PreviousDate )
    RETURN Result
)

In order to compute the *Sales Amount*, we leverage TREATAS to change the data lineage of the result of the PreviousDate variable. We apply this filter to compute *Sales Amount*. The advantage of this approach is the reduced materialization, which is very important when you consider how a measure can scale out with multiple concurrent users:

Sales Previous Date v3 :=
VAR FirstVisibleDate = MIN ( 'Date'[Date] )
VAR PreviousDate =
    CALCULATE (
        MAX ( Sales[Order Date] ),
        'Date'[Date] < FirstVisibleDate
    )
VAR FilterPreviousDate = TREATAS ( { PreviousDate }, 'Date'[Date] )
VAR Result = CALCULATE ( [Sales Amount], FilterPreviousDate )
RETURN Result

The structure of the definition of Sales PD v3 and Diff. PD v3 is identical to the v1 and v2 versions; the only difference is in the name of the internal measures referenced.

It is a common misconception that LASTNONBLANK and LASTNONBLANKVALUE can only be used with columns containing dates. While this is probably the most common scenario, there is no such restriction in DAX: you can use any data type for the column provided in the first argument, including strings. This can be useful to implement the **Sales PO** scenario we previously described. The report to obtain has one row for each order.

In this case, the first task is to retrieve the previous order number. The following implementation uses LASTNONBLANK and removes the filter from *Date*, in order to make sure the previous order is found even when it was placed on a different date:

Prev. Order Number v1 :=
IF (
    NOT ISEMPTY ( Sales ),
    VAR FirstVisibleOrder = MIN ( Sales[Order Number] )
    VAR PreviousOrder =
        CALCULATE (
            LASTNONBLANK ( DISTINCT ( Sales[Order Number] ), [Sales Amount] ),
            REMOVEFILTERS ( 'Date' ),
            Sales[Order Number] < FirstVisibleOrder
        )
    VAR Result = IF ( NOT ISEMPTY ( Sales ), PreviousOrder )
    RETURN Result
)

From a performance standpoint, we prefer the implementation that uses a simple MAX aggregation:

Prev. Order Number v3 :=
IF (
    NOT ISEMPTY ( Sales ),
    VAR FirstVisibleOrder = MIN ( Sales[Order Number] )
    VAR PreviousOrder =
        CALCULATE (
            MAX ( Sales[Order Number] ),
            REMOVEFILTERS ( 'Date' ),
            Sales[Order Number] < FirstVisibleOrder
        )
    VAR Result = IF ( NOT ISEMPTY ( Sales ), PreviousOrder )
    RETURN Result
)

In order to retrieve the Sales Amount of the previous order, the implementation based on LASTNONBLANKVALUE reduces the amount of code required:

Sales Previous Order v2 :=
VAR FirstVisibleOrder = MIN ( Sales[Order Number] )
VAR Result =
    CALCULATE (
        LASTNONBLANKVALUE ( DISTINCT ( Sales[Order Number] ), [Sales Amount] ),
        REMOVEFILTERS ( 'Date' ),
        Sales[Order Number] < FirstVisibleOrder
    )
RETURN Result

This time the optimized version is more verbose, but it can deliver an important performance improvement:

Sales Previous Order v3 :=
VAR FirstVisibleOrder = MIN ( Sales[Order Number] )
VAR PreviousOrder =
    CALCULATE (
        MAX ( Sales[Order Number] ),
        REMOVEFILTERS ( 'Date' ),
        Sales[Order Number] < FirstVisibleOrder
    )
VAR FilterPreviousOrder = TREATAS ( { PreviousOrder }, Sales[Order Number] )
VAR Result =
    CALCULATE (
        [Sales Amount],
        REMOVEFILTERS ( 'Date' ),
        FilterPreviousOrder
    )
RETURN Result

The difference in performance between the different solutions depends on the cardinality applied by LASTNONBLANK (*Date[Date]* and *Sales[Order Number]* in our two scenarios) and the complexity of the measure being evaluated (*Sales Amount*). By filtering a single customer as we did in the examples shown in this article, it is hard to measure a difference because the execution time to refresh the report is below 10-20ms. For this reason, we created a benchmark query by removing the filter by customer so that we have more valid dates to compute. The measures considered are *Sales PD* and *Sales PO*, related to the corresponding scenarios. For each measure we used three versions:

- **v1**: using LASTNONBLANK
- **v2**: using LASTNONBLANKVALUE
- **v3**: using custom DAX

In order to benchmark the **Sales PD** scenario, we used the following query:

EVALUATE
TOPN (
    501,
    SUMMARIZECOLUMNS (
        'Date'[Date],
        "Sales_Amount", 'Sales'[Sales Amount],
        "Sales_PD", 'Sales'[Sales PD v1] -- Change v1, v2, v3
    ),
    'Date'[Date], 1
)
ORDER BY 'Date'[Date]

The result obtained for *Sales PD v1* using LASTNONBLANK is the following one. The cost of the storage engine is minimal and most of the operations are executed in the formula engine. A more complex measure or a bigger *Sales* table would affect the storage engine cost (the *Sales* table used in this example only contains 100,000 rows).

The same benchmark for *Sales PD v2* using LASTNONBLANKVALUE displays an unexpectedly slower execution time. Thus, a simpler syntax does not necessarily translate to better performance.

Finally, the benchmark for *Sales PD v3* (using custom DAX expressions) provides the best performance in terms of execution time.

The custom DAX formula reduces the execution time by 20% compared to v1 (LASTNONBLANK) and by 35% compared to v2 (LASTNONBLANKVALUE).

For the **Sales PO** scenario, we used the following query to perform the benchmark:

EVALUATE
TOPN (
    501,
    SUMMARIZECOLUMNS (
        'Sales'[Order Number],
        'Date'[Date],
        __DS0FilterTable,
        "Sales_Amount", 'Sales'[Sales Amount],
        "Sales_PO", 'Sales'[Sales PO v1] -- Change v1, v2, v3
    ),
    'Sales'[Order Number], 1,
    'Date'[Date], 1
)
ORDER BY 'Sales'[Order Number], 'Date'[Date]

Because of the granularity of the *Sales[Order Number]* column, the execution is much slower in absolute terms compared to the previous scenario, showing a bigger difference between the different versions of the measures.

The result obtained for *Sales PO v1* using LASTNONBLANK is the following one.

The result obtained for *Sales PO v2* using LASTNONBLANKVALUE is the following one.

Finally, the Sales PO v3 measure provides the best execution time.

In this case the performance differences are only visible with a large granularity. The custom DAX formula reduces the execution time by 19% compared to v1 (LASTNONBLANK) and by 32% compared to v2 (LASTNONBLANKVALUE).

The differences shown might vary for different models and different queries. However, we do not expect to see cases where LASTNONBLANK or LASTNONBLANKVALUE is faster than the custom DAX solution. In some cases, the differences are too small to be perceived by report users.

The DAX functions FIRSTNONBLANK, FIRSTNONBLANKVALUE, LASTNONBLANK, and LASTNONBLANKVALUE are iterators that could materialize data at the granularity of the column provided as a first argument.

Most of the time, it is possible to get the same result by writing custom DAX code – code that might provide better performance and scalability thanks to a reduced level of materialization. In order to do so, it is necessary to make an assumption about the data model: the presence of transactions in the fact table (*Sales* in our example) is enough to include the iterated element, without having to evaluate a measure to check whether it returns a blank value.

While the difference in query performance may be small in simple reports, there could be a larger impact in complex reports.


DAX Formatter was created in 2013 to help users format their code based on the Rules for DAX Code Formatting published in October 2012. Because the main goal is to make the code immediately readable and recognizable, DAX Formatter purposely does not provide many options to customize the format. But we are now considering a small change.

Over the years, we only added three options:

- **Separators**: the choice between the **comma** (,) and **semicolon** (;) separators is automatic, but you can override this setting manually. This option is required to accommodate the different formatting applied by Excel and Power BI with certain locale settings.
- **Lines**: how much code to include within a line of code. When we publish DAX samples in books and articles, for educational purposes we favor the **short lines** setting, which tries not to include more than one argument in each line. The **long lines** setting allows short function calls to be formatted inline without breaking arguments off into different lines. We suggest considering long lines as a default in real-world projects.
- **Light / Dark mode**: you can switch the style of the website to dark mode, which changes your background to black.

All these settings are persisted in your cookies if you allow them, so you do not have to change the configuration every time you open www.daxformatter.com.

Defining the formatting rules required weeks of discussion and allowed the adoption of a standard way to write DAX, making it easier to read DAX code written by someone else. However, when we defined the rules, most of the DAX editors available used proportional fonts, which is an unusual choice for any programming or query language. Reading DAX was harder than necessary because the opening round bracket was too close to the last letter of the function name. Therefore, we included this rule:

- Always put a space before parentheses ‘(‘ and ‘)’

The (intentional) consequence of the rule is that we have a space between the function name and the opening round bracket:

CALCULATE (
    [measure],
    table[column] = 5
)
CALCULATE ( [measure], table[column] = 5 )

These days, most of the DAX editors (in Power BI, DAX Studio, and Tabular Editor) use non-proportional fonts, making this space unnecessary and not easy to include when writing code with IntelliSense. We are now considering changing the formatting rule into this:

- Always put a space before parentheses ‘(‘ and ‘)’, with the exception of the open bracket after a function name.

By adopting this rule we get this formatted code:

CALCULATE(
    [measure],
    table[column] = 5
)
CALCULATE( [measure], table[column] = 5 )

Should we adopt this change? We would like your feedback.

We are almost convinced of our decision, but we want to make sure we are considering all the elements at stake. For example, changing one of the rules might have an impact on people using version control and batch processes that automatically format the code at regular intervals or when there is a new release of a model. We invite you to try the new format and tell us why we should not change it. The main obstacle we have found so far is simply that we are used to reading code this way, which is not enough to stop the change.

You can start using this standard today. Both DAX Formatter and DAX Studio already provide an option to change this setting.

This is the new settings area available in DAX Formatter.

The new option is:

- **Spacing**: the **Best Practice** setting applies what we consider the best practice in the official formatting rules. You can choose to manually enforce the original rule (**A space after function**) or the proposed change (**No space after function**). At the time of writing, Best Practice corresponds to the original rule, but when we decide to apply the change to the rules, the behavior of Best Practice will change accordingly.

A similar option is available in the DAX Studio settings.

By default, the checkbox is unchecked, but future versions of DAX Studio might have the checkbox checked by default after we decide to apply the change to the rules.

It is now your turn to evaluate the change and provide your feedback.

Thanks in advance – Enjoy DAX!

This article describes how to write efficient DAX expressions that compute the compound interest of incremental investments made throughout the holding period, using the PRODUCTX function.

If you make an investment and hold it for a period of time, the future value of your investment depends on the rate of return over the investment holding period.

For instance, consider an investment of $100 in a simple debt instrument with an annual fixed interest rate of 10.0% and a five-year maturity, where all owed amounts (principal plus interest) are paid in one lump sum at the end of the five-year period. The significance of this type of debt instrument is tied to the fact that the interest income generated each year is effectively reinvested, or added to the amount that is owed, thereby enabling the interest income to **compound** over the holding period.

In fact, there is a simple math equation for determining the future value of such an instrument:

*FV = PV × (1 + r)^n*

or more specifically:

*Future Value = Present Value × (1 + Rate)^(number of periods/years)*

In our case:

*Future Value = $100 × (1 + 10%)^5 = $161.05*

In other words, if we paid $100 today (the “Present Value”), at the end of Year 5 we would receive $161.05 (the “Future Value”), generating $61.05 of total interest income.

If, on the other hand, we received the interest income from the debt instrument at the end of each year (i.e., the interest income is not reinvested and consequently not allowed to compound), we would receive $10 at the end of the first four years ( = 10% x $100 for each year) plus $110 at the end of the fifth year ($10 of interest income for the fifth year plus repayment of the initial $100 principal amount). Summarizing this second example, if we invested $100 today, we would receive $150 by the end of Year 5.

By effectively reinvesting the interest income in the first example, we are able to generate $11.05 more of total interest income ( = $61.05 – $50) than if we were to receive our interest income in a distribution at the end of each year, even though both examples accrue interest at the same rate. Of course, in the latter case, we receive the money sooner, which has its own considerations. In any event, these very simple cases demonstrate the power of **compounding**.
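The arithmetic of the two examples can be checked in a few lines. This is just the FV formula applied in Python, not DAX:

```python
principal = 100.0
rate = 0.10
years = 5

# Compound: interest is reinvested, FV = PV * (1 + r) ** n
fv_compound = principal * (1 + rate) ** years     # 161.05 (rounded)

# Simple: interest is paid out every year and never reinvested
fv_simple = principal + principal * rate * years  # 150.00

extra_interest = fv_compound - fv_simple          # the $11.05 advantage
print(round(fv_compound, 2), round(fv_simple, 2), round(extra_interest, 2))
```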

Coincidentally, both debt instrument examples are what is known as “bullet” loans, where the entire principal amount ($100) is repaid in one lump sum at maturity (at the end of Year 5). In the first example the interest income payments are deferred until maturity, thereby allowing the interest to compound over the holding period. In the second example, the interest income payments are made at the end of each year, which means that the amount of debt accruing interest each year is always the same ($100).

Now let us consider a slightly more complex investment with compounding interest where the interest rate differs year-to-year. Because the interest rate varies, you can’t use the simple formula above (or its FV function equivalent in Excel). Rather, you must effectively stack each year on top of the preceding year and calculate year-by-year. See the table below.

This scenario is straightforward using Excel, because you can simply take the value of the previous year (which is typically in the previous row) as the starting value for the next year to compute the next year’s interest income.

Performing the calculation of compound interest in DAX is challenging, because there is no way to reference the result value in the previous year as we can easily do in Excel. Moreover, in a real data model it is possible that incremental investments are made throughout the holding period. Accordingly, a report could look like the following scenario where an incremental investment is added on the first day of each year over the five-year period (different investment dates during each year would require a different calculation).

In this case, the value at the end of Year 2 is computed as the sum of $110.00 (value at the end of Year 1) and $80.00 (the value invested on day one of Year 2) increased by 9.0% (the corresponding interest rate for the period).
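As a sanity check of the numbers above, a short Python loop stacks each year on the preceding one. The amounts ($100 in Year 1, $80 in Year 2) and rates (10% and 9%) are taken from the article's example:

```python
# Year-by-year stacking: each year's opening balance is last year's
# closing value plus any new investment made on day one of the year.
investments = {2020: 100.0, 2021: 80.0}  # new money invested per year
rates = {2020: 0.10, 2021: 0.09}         # yearly interest rates

value = 0.0
for year in sorted(rates):
    value = (value + investments.get(year, 0.0)) * (1 + rates[year])

print(round(value, 2))  # value at the end of Year 2: (110 + 80) * 1.09
```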

The sample model we are using for this demo is the following.

The *Investments* table contains one row for each investment made over the years, whereas the *Rates* table contains the different interest rates on a yearly basis. *Date* is just a simple calendar table.

The main idea to solve the scenario is to split the larger problem into simpler problems. Focus on a year, for example 2022. In 2022 there is an investment of $120. Additionally, in 2022, there are also investments made earlier in 2020 and 2021. Each of these investments has been active for a different number of years: the $100 invested in 2020 have been active for 2 years in 2022, while the $80 invested in 2021 have only been active for one year in 2022.

If we focus on a couple of years – the beginning year and the end year of the investment – the calculation is straightforward: it is enough to multiply the investment amount by *(1+interest rate)* for every year when the investment is active. This produces the interests gained by the investment over the years when it is active.
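This per-investment reformulation is exactly what PRODUCTX computes: each invested amount is multiplied by the product of *(1 + rate)* over its active years, and the per-investment results are summed. A Python sketch (with hypothetical data and a hypothetical 8% rate for 2022, which the article does not state) shows that this matches the iterative year-by-year stacking:

```python
from math import prod

investments = {2020: 100.0, 2021: 80.0, 2022: 120.0}  # invested on day one
rates = {2020: 0.10, 2021: 0.09, 2022: 0.08}          # 2022 rate is assumed

def value_end_of_year(selected_year):
    # Sum, for each investment made so far, amount * product of (1 + rate)
    # over the years in which that investment has been active.
    return sum(
        amount * prod(1 + rates[y] for y in range(year, selected_year + 1))
        for year, amount in investments.items()
        if year <= selected_year
    )

def value_iterative(selected_year):
    # Iterative equivalent, stacking year over year as in a spreadsheet
    value = 0.0
    for y in sorted(rates):
        if y > selected_year:
            break
        value = (value + investments.get(y, 0.0)) * (1 + rates[y])
    return value

for y in (2020, 2021, 2022):
    assert abs(value_end_of_year(y) - value_iterative(y)) < 1e-9
print(round(value_end_of_year(2021), 2))
```

The two functions agree because multiplication distributes over the yearly additions; the product form is what lets DAX avoid referencing "the previous row".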

With that in mind, the formula to compute the *Value End of Year* measure can be written as follows, using the PRODUCTX function:

Value End of Year :=
VAR SelectedYear =
    SELECTEDVALUE (
        'Date'[Year],        -- Find the current year in the report
        MAX ( 'Date'[Year] ) -- default to the last year available
    )
VAR PreviousYears =          -- PreviousYears contains the
    FILTER (                 -- years up to the current one
        ALL ( 'Date'[Year] ),
        'Date'[Year] <= SelectedYear
    )
VAR PreviousInvestments =    -- PreviousInvestments contains
    ADDCOLUMNS (             -- the amount of all the investments
        PreviousYears,       -- made in previous years
        "@InvestedAmt", CALCULATE ( SUM ( Investments[Amount] ) )
    )
VAR Result =                 -- For each previous investment
    SUMX (                   -- calculate the compound interest
        PreviousInvestments, -- over the years and sum the results
        VAR InvestmentYear = 'Date'[Year]
        VAR InvestmentAmt = [@InvestedAmt]
        VAR YearsRateActive =
            FILTER (
                ALL ( Rates ),
                VAR YearRate = YEAR ( Rates[Date] )
                RETURN YearRate >= InvestmentYear
                    && YearRate <= SelectedYear
            )
        VAR CompundInterestRateForInvestment =
            PRODUCTX ( YearsRateActive, 1 + Rates[InterestRate] )
        RETURN InvestmentAmt * CompundInterestRateForInvestment
    )
RETURN Result

Although quite long, the formula should be fast enough in most scenarios. Indeed, its complexity does not depend on the size of the *Investments* table, because that table is scanned only once to perform the grouping by year while computing the *PreviousInvestments* variable.

The speed of the measure depends solely on the number of years: *PreviousInvestments* contains one row per year; for each of these years, the formula performs an iteration over all the years up to the selected year.

Therefore, the complexity is around N squared, where N is the number of years of the investment – in real-world scenarios, it should perform well. This approach is possible because **the investment is made on the first day of each year** – a more complex calculation is required if investment dates are different, but the principle is similar and the calculation over the years should be made at the year level in order to be efficient.

Anyway, the interesting part of this formula is not in its performance. The most relevant detail is that in order to solve the scenario, you need to rethink the algorithm. Trying to solve the formula by thinking in an iterative way – computing the value one year after the other – does not work well in DAX. Rethinking it in terms of evaluating the current value of each investment provides a straight path to the solution that is also more efficient.

The previous calculation is valid assuming that each investment is made on the first day of each year. If you have a more complex scenario where each investment can have a different date, then you can reuse the previous logic by assuming that the amount you have at the beginning of each year is the amount of the investments made in the previous year added to the interest income at the end of the previous year. This approach also handles capital expenditure reductions, which are recorded as negative transactions in the *Investments* table. The following screenshot shows an additional investment of $80.00 made on July 1, 2021 and a disinvestment of $120.00 made on January 1, 2022.

The following screenshot shows the expected result.

There are several approaches we can use to implement this solution. A simple one could be to create a measure that computes, at the end of each year, the accrued amount of the investments made in that year, and then reuse that measure using the logic of the previous formula. However, this approach results in a query plan with nested iterators, which is not ideal in DAX. Therefore, we implemented a different approach that guarantees a better query plan – though it requires a single, more verbose measure.

The following implementation of the *Value End of Year* measure calculates the sum of the investments at the day level only once. Most of the calculation takes place in the formula engine. However, because the complexity is around D + (N squared) where D is the number of days with investments and N is the number of years, we experienced much better performance with this verbose approach:

Value End of Year :=
VAR SelectedYear =
    SELECTEDVALUE (
        'Date'[Year],        -- Find the current year in the report
        MAX ( 'Date'[Year] ) -- default to the last year available
    )
VAR YearlyInterestRates =
    ADDCOLUMNS (
        CALCULATETABLE (
            ADDCOLUMNS (
                SUMMARIZE ( Rates, 'Date'[Year] ),
                "@InterestRate", CALCULATE ( SELECTEDVALUE ( Rates[InterestRate] ) )
            ),
            REMOVEFILTERS ( 'Date' )
        ),
        "@DailyRate",
            VAR DaysInYear = CALCULATE ( COUNTROWS ( 'Date' ) )
            VAR DailyRate = [@InterestRate] / DaysInYear
            RETURN DailyRate
    )
VAR LastDateInReport =
    CALCULATE ( MAX ( 'Date'[Date] ), ALLSELECTED ( 'Date' ) )
VAR InvestmentsByDate =
    CALCULATETABLE (
        ADDCOLUMNS (
            SUMMARIZE ( Investments, 'Date'[Date], 'Date'[Year] ),
            "@InvestmentAtDate", CALCULATE ( SUM ( Investments[Amount] ) )
        ),
        'Date'[Date] <= LastDateInReport
    )
VAR InvestmentsByDateWithRates =
    NATURALLEFTOUTERJOIN ( InvestmentsByDate, YearlyInterestRates )
VAR PreviousYears =              -- PreviousYears contains the
    SELECTCOLUMNS (              -- years INCLUDING the current one
        FILTER ( YearlyInterestRates, 'Date'[Year] <= SelectedYear ),
        "@Year", 'Date'[Year]
    )
VAR PreviousInvestments =        -- PreviousInvestments contains
    ADDCOLUMNS (                 -- for each year the amount of
        PreviousYears,           -- all the investments made in the
        "@AccruedAmt",           -- previous year with interest applied
            VAR CurrentYear = [@Year]
            VAR StartOfNextYear = DATE ( CurrentYear + 1, 1, 1 )
            VAR InvestmentsInCurrentYear =
                FILTER ( InvestmentsByDateWithRates, 'Date'[Year] = CurrentYear )
            VAR AccruedAmount =  -- For each investment within the year
                SUMX (           -- sum the amount and accrued interest
                    InvestmentsInCurrentYear,
                    VAR InvestedAmount = [@InvestmentAtDate]
                    VAR DaysInvestment = StartOfNextYear - 'Date'[Date]
                    VAR EffectiveRate = [@DailyRate] * DaysInvestment
                    VAR AccruedInterest = InvestedAmount * EffectiveRate
                    VAR Result = InvestedAmount + AccruedInterest
                    RETURN Result
                )
            RETURN AccruedAmount
    )
VAR Result =                     -- For each previous investment
    SUMX (                       -- calculate the compound interest
        PreviousInvestments,     -- over the years and sum the results
        VAR InvestmentYear = [@Year]
        VAR InvestmentAmt = [@AccruedAmt]
        VAR YearsRateActive =
            FILTER (
                YearlyInterestRates,
                'Date'[Year] > InvestmentYear
                    && 'Date'[Year] <= SelectedYear
            )
        VAR CompundInterestRateForInvestment =
            PRODUCTX (                    -- Compute the compound
                YearsRateActive,          -- interest for the
                1 + [@InterestRate]       -- previous years
            )                             -- or returns 1 if there
            + ISEMPTY ( YearsRateActive ) -- are no previous years
        RETURN InvestmentAmt * CompundInterestRateForInvestment
    )
RETURN Result
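The day-level accrual in this measure reduces to a simple formula: within its first year, an invested amount grows by *DailyRate × DaysInvestment*. A small Python check illustrates it, assuming the same day-count convention as the measure (days from the investment date to the start of the next year, daily rate as the yearly rate divided by the days in that year); the $80 / July 1, 2021 / 9% figures reuse the article's example, while the resulting value is only an illustration of the formula:

```python
from datetime import date

def accrued_end_of_year(amount, yearly_rate, investment_date):
    """Value of an investment at the end of its first calendar year."""
    start_of_next_year = date(investment_date.year + 1, 1, 1)
    # Daily rate as in the measure: yearly rate / days in the year
    days_in_year = (start_of_next_year - date(investment_date.year, 1, 1)).days
    daily_rate = yearly_rate / days_in_year
    days_invested = (start_of_next_year - investment_date).days
    return amount * (1 + daily_rate * days_invested)

# $80 invested on July 1, 2021 at a 9% yearly rate
print(round(accrued_end_of_year(80.0, 0.09, date(2021, 7, 1)), 2))
```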

The previous measure only works for a report at the year granularity. This also guarantees a better optimization. If the report needs to be available at different granularities such as quarter and month, the measure is more complex because it must compute the accrued interest on the last day of the period selected. The following is an example of the desired report.

The formula used to produce this report is longer and slower. We suggest only resorting to this approach when absolutely necessary, because it can produce slower reports:

Value End of Period :=
VAR LastDateSelected = MAX ( 'Date'[Date] )
VAR LastYearSelected = MAX ( 'Date'[Year] )
VAR YearlyInterestRates =
    ADDCOLUMNS (
        CALCULATETABLE (
            ADDCOLUMNS (
                SUMMARIZE ( Rates, 'Date'[Year] ),
                "@InterestRate", CALCULATE ( SELECTEDVALUE ( Rates[InterestRate] ) )
            ),
            REMOVEFILTERS ( 'Date' )
        ),
        "@DailyRate",
            VAR CurrentYear = 'Date'[Year]
            VAR DaysInYear =
                CALCULATE (
                    COUNTROWS ( 'Date' ),
                    REMOVEFILTERS ( 'Date' ),
                    'Date'[Year] = CurrentYear
                )
            VAR DailyRate = DIVIDE ( [@InterestRate], DaysInYear )
            RETURN DailyRate
    )
VAR LastDateInReport =
    CALCULATE ( MAX ( 'Date'[Date] ), ALLSELECTED ( 'Date' ) )
VAR InvestmentsByDate =
    CALCULATETABLE (
        ADDCOLUMNS (
            SUMMARIZE ( Investments, 'Date'[Date], 'Date'[Year] ),
            "@InvestmentAtDate", CALCULATE ( SUM ( Investments[Amount] ) )
        ),
        'Date'[Date] <= LastDateInReport
    )
VAR InvestmentsByDateWithRates =
    NATURALLEFTOUTERJOIN ( InvestmentsByDate, YearlyInterestRates )
VAR PreviousYears =              -- PreviousYears contains the
    SELECTCOLUMNS (              -- years BEFORE the current one
        FILTER ( YearlyInterestRates, 'Date'[Year] < LastYearSelected ),
        "@Year", 'Date'[Year]
    )
VAR PreviousInvestments =        -- PreviousInvestments contains
    ADDCOLUMNS (                 -- for each year the amount of
        PreviousYears,           -- all the investments made in the
        "@AccruedAmt",           -- previous year with interest applied
            VAR CurrentYear = [@Year]
            VAR StartOfNextYear = DATE ( CurrentYear + 1, 1, 1 )
            VAR InvestmentsInCurrentYear =
                FILTER ( InvestmentsByDateWithRates, 'Date'[Year] = CurrentYear )
            VAR AccruedAmount =  -- For each investment within the year
                SUMX (           -- sum the amount and accrued interest
                    InvestmentsInCurrentYear,
                    VAR InvestedAmount = [@InvestmentAtDate]
                    VAR DaysInvestment = StartOfNextYear - 'Date'[Date]
                    VAR EffectiveRate = [@DailyRate] * DaysInvestment
                    VAR AccruedInterest = InvestedAmount * EffectiveRate
                    VAR Result = InvestedAmount + AccruedInterest
                    RETURN Result
                )
            RETURN AccruedAmount
    )
VAR FilterLastYearInvestments =
    FILTER ( InvestmentsByDateWithRates, 'Date'[Year] = LastYearSelected )
VAR LastYearInvestments =
    SUMX (
        FILTER ( FilterLastYearInvestments, 'Date'[Date] <= LastDateSelected ),
        VAR InvestedAmount = [@InvestmentAtDate]
        VAR DaysInvestment = LastDateSelected - 'Date'[Date] + 1
        VAR EffectiveRate = [@DailyRate] * DaysInvestment
        VAR AccruedInterest = InvestedAmount * EffectiveRate
        VAR Result = InvestedAmount + AccruedInterest
        RETURN Result
    )
VAR DaysLastYear = LastDateSelected - DATE ( LastYearSelected, 1, 1 ) + 1
VAR ValuePreviousInvestments =   -- For each previous investment
    SUMX (                       -- calculate the compound interest
        PreviousInvestments,     -- over the years and sum the results
        VAR InvestmentYear = [@Year]
        VAR InvestmentAmt = [@AccruedAmt]
        VAR YearsRateActive =
            FILTER (
                YearlyInterestRates,
                'Date'[Year] > InvestmentYear
                    && 'Date'[Year] <= LastYearSelected
            )
        VAR CompundInterestRateForInvestment =
            PRODUCTX (                    -- Compute the compound
                YearsRateActive,          -- interest for the previous years
                VAR YearRate = 'Date'[Year]
                VAR InterestPreviousYears =
                    [@InterestRate] * ( YearRate <> LastYearSelected )
                VAR InterestLastYear =
                    [@DailyRate] * DaysLastYear * ( YearRate = LastYearSelected )
                VAR Result = 1 + InterestPreviousYears + InterestLastYear
                RETURN Result
            )                             -- or returns 1 if there
            + ISEMPTY ( YearsRateActive ) -- are no previous years
        VAR FinalInvestmentValue = InvestmentAmt * CompundInterestRateForInvestment
        RETURN FinalInvestmentValue
    )
VAR Result = ValuePreviousInvestments + LastYearInvestments
RETURN Result

The first part of this article presented the overall concepts of dealing with compound interest. We then presented an efficient first DAX approach in computing the compound interest for incremental investments made throughout the holding period. In order to achieve high performance, we moved the calculation logic to the year level. This required an assumption that all investments are made on the first day of each year exclusively.

Later in the article, we showed more complete calculations where every investment can be made at any time and the report can have different granularities. The fundamental idea is the same: compound interest is always calculated at the year level, whereas the value of each investment at the end of its first year is computed at the day granularity. This approach is more verbose, but way more efficient from a computational point of view.

*Special thanks to our friend **Rick Scanlon, a partner at Innovation Endeavors** who helped us write out the definition of compound interest and reviewed the business logic presented in this article.*

This article describes the reasons why an Excel pivot table may be slow when using the Analyze in Excel feature of Power BI.

Power BI uses the Analysis Services Tabular engine to process queries coming from reports. When you create a Power BI report, the query sent to the engine uses the DAX language, which usually produces more optimized query plans. However, the Analysis Services engine is also able to understand MDX queries sent to a Tabular model. Excel is a client tool that can connect to an Analysis Services database sending queries in MDX.

Thanks to the bilingual feature (DAX/MDX) in Analysis Services, Microsoft quickly introduced a new server technology in the market: the Tabular model, based on DAX. This was done by leveraging tools that existed for another technology, the Multidimensional model based on MDX. Many optimizations were introduced for DAX queries over the last few years, in order to improve the Power BI user experience; however, sometimes there are no equivalent optimizations for MDX queries, resulting in Excel reports that run slower than equivalent Power BI reports. This article describes what is causing many of these differences; it also presents possible workarounds and highlights features that Microsoft might want to make a priority, in order to improve the experience for Excel users connected to Power BI datasets or Analysis Services Tabular databases.

Power BI generates one or more DAX queries for a single matrix in a report. For example, the following matrix displays a few measures grouped by customer *Country* and sorted by *Sales Amount*.

The measures used in this report are straightforward:

Quantity # := SUM ( Sales[Quantity] )
Sales Amount := SUMX ( Sales, Sales[Quantity] * Sales[Net Price] )
Total Cost := SUMX ( Sales, Sales[Quantity] * Sales[Unit Cost] )
Margin := [Sales Amount] - [Total Cost]
Margin % := DIVIDE ( [Margin], [Sales Amount] )

The single query generated by the matrix is the following one:

EVALUATE
TOPN (
    502,
    SUMMARIZECOLUMNS (
        ROLLUPADDISSUBTOTAL ( 'Customer'[Country], "IsGrandTotalRowTotal" ),
        "Sales_Amount", 'Sales'[Sales Amount],
        "Quantity__", 'Sales'[Quantity #],
        "Total_Cost", 'Sales'[Total Cost],
        "Margin__", 'Sales'[Margin %]
    ),
    [IsGrandTotalRowTotal], 0,
    [Sales_Amount], 0,
    'Customer'[Country], 1
)
ORDER BY
    [IsGrandTotalRowTotal] DESC,
    [Sales_Amount] DESC,
    'Customer'[Country]

The execution of that query in DAX Studio with a cold cache shows that the execution is fast and only requires one Storage Engine (SE) query.

This is the performance expected of a report in Power BI, even though we are querying a model that has one million rows in the *Sales* table. When we try to recreate the same report in Excel, we can see important differences in the way the query is executed. Typically, you publish the Power BI report and then create an Excel report using the Analyze in Excel feature. However, this would make it impossible to investigate the query plans; we can thus simulate the same behavior by connecting Excel to Analysis Services through the diagnostic port provided by Power BI Desktop, which is shown by DAX Studio when it is connected to Power BI Desktop.

We create the following report using a PivotTable in Excel.

Excel generates a single MDX query to populate a PivotTable, no matter how complex. In this case, the MDX query is relatively simple – the sort order is not included in the query because Excel sorts the result it receives:

SELECT
    {
        [Measures].[Quantity #],
        [Measures].[Sales Amount],
        [Measures].[Total Cost],
        [Measures].[Margin %]
    }
    DIMENSION PROPERTIES PARENT_UNIQUE_NAME, HIERARCHY_UNIQUE_NAME ON COLUMNS,
    NON EMPTY Hierarchize (
        { DrilldownLevel ( { [Customer].[Country].[All] },,, INCLUDE_CALC_MEMBERS ) }
    )
    DIMENSION PROPERTIES PARENT_UNIQUE_NAME, HIERARCHY_UNIQUE_NAME ON ROWS
FROM [Model]
CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

The execution of this MDX query in DAX Studio with a cold cache shows that the execution is still fast, though slower than the previous DAX version. However, the most important difference is the increase in the number of SE queries – we now have 8 instead of 1.

The reason why the MDX query requires a larger number of SE queries compared to the equivalent DAX query is that DAX benefits from an optimization called **“fusion”**, which combines multiple aggregations within the same storage engine query. The reports are identical and require the aggregations of the *Quantity #*, *Sales Amount*, and *Total Cost* measures. The following is the xmSQL code of the single SE query required by the DAX version:

WITH
    $Expr0 := ( PFCAST ( 'Analytics Sales'[Quantity] AS INT ) * PFCAST ( 'Analytics Sales'[Unit Cost] AS INT ) ),
    $Expr1 := ( PFCAST ( 'Analytics Sales'[Quantity] AS INT ) * PFCAST ( 'Analytics Sales'[Net Price] AS INT ) )
SELECT
    'Analytics Customer'[Country],
    SUM ( @$Expr0 ),
    SUM ( 'Analytics Sales'[Quantity] ),
    SUM ( @$Expr1 )
FROM 'Analytics Sales'
    LEFT OUTER JOIN 'Analytics Customer'
        ON 'Analytics Sales'[CustomerKey] = 'Analytics Customer'[CustomerKey];

The query granularity is at the *Country* level, and the DAX engine computes the grand total by summing the value obtained for each country. Indeed, the measures used in the report display a purely additive behavior – they are just SUM / SUMX functions.
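Based on the xmSQL expressions above, the underlying measures are likely defined along these lines (a sketch, since the article does not list the fast versions explicitly):

```dax
Quantity # := SUM ( Sales[Quantity] )
Sales Amount := SUMX ( Sales, Sales[Quantity] * Sales[Net Price] )
Total Cost := SUMX ( Sales, Sales[Quantity] * Sales[Unit Cost] )
Margin := [Sales Amount] - [Total Cost]
Margin % := DIVIDE ( [Margin], [Sales Amount] )
```

Because every aggregation is a plain SUM or a SUMX over simple column products, the engine can compute the grand total by aggregating the per-country results, and fusion can merge all of them into a single SE query.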

When the report runs in MDX, every measure requires its own SE query. This is the SE query generated for the *Sales Amount* measure:

WITH
    $Expr0 := ( PFCAST ( 'Analytics Sales'[Quantity] AS INT ) * PFCAST ( 'Analytics Sales'[Net Price] AS INT ) )
SELECT SUM ( @$Expr0 )
FROM 'Analytics Sales';

This is the SE query generated for the *Total Cost* measure:

WITH
    $Expr0 := ( PFCAST ( 'Analytics Sales'[Quantity] AS INT ) * PFCAST ( 'Analytics Sales'[Unit Cost] AS INT ) )
SELECT SUM ( @$Expr0 )
FROM 'Analytics Sales';

Although the *Margin* measure used by *Margin %* is just a difference between *Sales Amount* and *Total Cost*, there is a specific SE query to compute *Margin*:

WITH
    $Expr0 := ( PFCAST ( 'Analytics Sales'[Quantity] AS INT ) * PFCAST ( 'Analytics Sales'[Unit Cost] AS INT ) ),
    $Expr1 := ( PFCAST ( 'Analytics Sales'[Quantity] AS INT ) * PFCAST ( 'Analytics Sales'[Net Price] AS INT ) )
SELECT
    SUM ( @$Expr0 ),
    SUM ( @$Expr1 )
FROM 'Analytics Sales';

Moreover, the SE queries are also duplicated, because each SE query runs both at the *Country* granularity level and at the grand total level. For this reason, the queries in lines 10 to 16 of the last screenshot look like duplicated versions of the queries in lines 2 to 8. The difference is visible in the screenshot comparing line 8 with line 16: the former sums *Quantity* grouped by *Country*, whereas the latter computes the same aggregation without any grouping, for the grand total.

The impact of the missing “fusion” optimization is not clearly visible in terms of perceived execution time when:

- The measures are trivial.
- There are only a few million rows in the fact table in the model.
- The report does not require more than a few dozen cells in the result.

When one or more of these conditions is no longer true, the difference in performance is much greater.

Consider for example this different, slower version of the measures, where *Slow Amount* applies a 5% discount on transactions with a *Net Price* greater than 100. The code is intentionally inefficient in order to highlight the problem caused by the absence of fusion in MDX queries:

Quantity # := SUM ( Sales[Quantity] )

Slow Amount :=
SUMX (
    Sales,
    Sales[Quantity]
        * IF ( Sales[Net Price] > 100, Sales[Net Price] * 0.95, Sales[Net Price] )
)

Total Cost := SUMX ( Sales, Sales[Quantity] * Sales[Unit Cost] )

Slow Margin := [Slow Amount] - [Total Cost]

Slow Margin % := DIVIDE ( [Slow Margin], [Slow Amount] )

The layout of the reports and the structure of the DAX and MDX queries are the same as the ones we have seen before, simply replacing the measures with their slower versions. However, the query plans are different. The DAX request now generates two SE queries, because the measures are no longer guaranteed to be fully aggregatable. Therefore, there is one SE query at the *Country* granularity and another one for the grand total, even though each SE query still computes all the measures thanks to the fusion optimization. The two SE queries of the DAX measures are visible in the next screenshot.

The presence of the IF condition in the iterator generates a *CallbackDataID*, which prevents the result of the SE query from being cached. While this is of limited importance for the DAX query executed by a single user, it has a bigger impact on the MDX version of the query. This is shown in the next screenshot.

All the SE queries in bold include a *CallbackDataID* and do not take advantage of the SE cache. In this simple example, the MDX version of the report is twice as slow as the DAX version (126 ms instead of 55 ms), but the effect produced by the increased number of SE queries can be much larger in more complex reports.

In order to improve the performance of an Excel report based on a PivotTable, consider the following best practices:

- Reducing the number of measures included in a PivotTable.
- Optimizing the performance of the measures used in a PivotTable.
- Avoiding any *CallbackDataID*, especially in measures that are referenced multiple times in the same report.

A better solution would be an MDX implementation of the “fusion” optimization already available for DAX queries. Please vote for this suggestion on ideas.powerbi.com if this optimization is important in your scenarios, in order to make Microsoft aware of this requirement; if you work in a larger corporation, please also involve your Microsoft account team.

]]>

Microsoft released 13 new DAX functions in 2019. The first 4 functions are related to the calculation groups feature, which is currently available only in Azure Analysis Services and Analysis Services 2019:

- ISSELECTEDMEASURE: Returns true if one of the specified measures is currently being evaluated.
- SELECTEDMEASUREFORMATSTRING: Returns the format string for the measure that is currently being evaluated.
- SELECTEDMEASURE: Returns the measure that is currently being evaluated.
- SELECTEDMEASURENAME: Returns the name of the measure that is currently being evaluated.

Chances are, you have never used the functions above. Calculation groups are not available in Power BI yet. They should become available once the Power BI XMLA endpoint works in read-write mode, but adoption will still be limited at that point. The real game changer will be the availability of calculation groups in Power BI Desktop. The limited adoption of calculation groups is my biggest disappointment of 2019. But sooner or later this will happen – more about this later.

There is nothing groundbreaking about the other 9 functions: they provide a new syntax for results that could already be achieved in other ways, or they cover corner cases that should not have a significant impact.

- DISTINCTCOUNTNOBLANK: This is like DISTINCTCOUNT but ignores any blank value.
- FIRSTNONBLANKVALUE: Returns the first non-blank value evaluated by an expression iterating the unique values in a column.
- LASTNONBLANKVALUE: Returns the last non-blank value evaluated by an expression iterating the unique values in a column.
- QUARTER: Returns the number of the quarter from a date.
- CONVERT: Converts an expression to a specific data type.
- REMOVEFILTERS: Removes filters like ALL, but it can only be used as a CALCULATE modifier and not as a table function.
- ALLCROSSFILTERED: Removes all the filters propagated over any kind of relationship, and not only those obtained through the expanded table. It can only be used as a CALCULATE modifier, so REMOVE probably would have been a better prefix than ALL. Luckily, this function is rarely required.
- CONTAINSSTRING: Finds a string inside another string using limited wildcards; not case-sensitive.
- CONTAINSSTRINGEXACT: Finds a string inside another string using a case-sensitive comparison; no wildcards allowed.
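As an example of how these functions mostly shorten existing patterns, DISTINCTCOUNTNOBLANK can be expressed with the syntax that was already available. The following sketch assumes a hypothetical *Customer[City]* column:

```dax
-- Shorter syntax introduced in 2019
Cities := DISTINCTCOUNTNOBLANK ( Customer[City] )

-- Equivalent measure using the pre-existing syntax:
-- DISTINCTCOUNT includes the blank in its count, so we filter it out
Cities Classic :=
CALCULATE (
    DISTINCTCOUNT ( Customer[City] ),
    NOT ISBLANK ( Customer[City] )
)
```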

This year at SQLBI we delivered 34 public courses on three continents, plus many sessions in conferences, SQL Saturdays, and other community events.

We released the second edition of The Definitive Guide to DAX in August. For several reasons, we experienced delays in delivering the book outside of North America, especially in Europe; the problem seems to be solved by now. You can find more details about the content of the book and the “behind the scenes” on the blog post I wrote when the book was published.

We also launched the second edition of our Mastering DAX video course. New content, new demos, and many new exercises to practice on. Sometimes I receive requests about an advanced course… well, Mastering DAX is advanced, indeed! If you think we missed something, let us know. Thanks to the new organization, we made it possible to add modules over time. And if you are new to DAX, just enroll in the free Introducing DAX video course and only move on to Mastering DAX after you have completed it.

We continued to improve DAX Guide thanks to the contributions of many readers – if your feedback has not been applied yet… don’t worry, we didn’t lose it, it’s in the queue.

Finally, our OKViz spinoff released the first commercial custom visual, Smart Filter Pro. There have been many new versions and improvements of this visual, whereas the work on other new visuals will only be visible in 2020…

I’ve seen the future, and it’s bright – but unfortunately, it’s under NDA!

Well, I need to explain this. Daniele Perilli and I are now part of the Power BI contributor program. We will contribute to some Power BI features. We are not alone. And we all work for the community. Now, use your imagination, dream up something good and imagine it happening.

I know, you want to know more. Well, usually big conferences are used to make important announcements. The Microsoft Business Applications Summit in April should have good news. But if you can go to London for SQLBits 2020 a few weeks before, you might get a nice preview.

Besides, here is what we will do in 2020:

- We will add a Calculation Groups module to the Mastering DAX video course. All the enrolled students will receive it for free.
- We will release a new version of DAX Patterns. We started working on it in 2019 and we will complete the work in 2020. All the existing patterns will be refreshed with Power BI examples, more optimized DAX code, and we will also add new patterns. We wanted to release it in 2019, but the work on The Definitive Guide to DAX took longer than expected, so we had to delay the work on the patterns.
- OKViz will release new custom visuals.
- We will keep on producing content for SQLBI and tools for the community.

If you want to receive updates promptly without missing any news, go ahead and register to our newsletter.

Happy 2020!

]]>

Like many languages, DAX enables you to obtain the same result through different techniques. However, the same semantics may produce different query plans, so knowing multiple techniques is useful when you want to optimize the performance of a slower expression.

**UPDATE 2020-02-27**: the COALESCE function is now available in DAX; check product version compatibility on DAX Guide. You no longer need to implement the alternative approach described in this article if you have a product supporting COALESCE. However, this article is still relevant as a way to implement an alternative to NULLIF, which is not available in DAX yet.
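Where the native function is supported, COALESCE in DAX uses the same syntax as its SQL counterpart. For example, the following measure (the measure name is illustrative) returns zero instead of a blank when there are no sales:

```dax
Sales Amount (zero) := COALESCE ( [Sales Amount], 0 )
```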

**DISCLAIMER:** In this article, you will see a technique where DIVIDE can replace other conditional statements. Though the resulting code is less readable, sometimes there is a performance advantage to using DIVIDE instead of IF or SWITCH. We do not want to describe the technical reasons why DIVIDE could be more efficient. This depends on several factors. Therefore, you should never assume that DIVIDE is faster than IF or SWITCH; you also should only consider this different way of expressing the same semantic when the performance advantage is measurable and consistent across different queries and tests. Moreover, the DAX engine evolves over time, so any analysis of the differences in query plans could be obsolete very soon. Therefore, do your homework before applying any of these techniques in production, and read Understanding eager vs. strict evaluation in DAX for more details about query plans evaluating expressions that could produce blank values.

The NULLIF function in T-SQL returns a null value if the two specified expressions are equal. The syntax is:

NULLIF ( <expression1>, <expression2> )

NULLIF returns *<expression1>* if the two expressions are not equal. If the expressions are equal, NULLIF returns a null value of the type of *<expression1>*. This is an equivalent DAX code:

VAR exp1 = <expression1>
VAR exp2 = <expression2>
RETURN
    IF ( exp1 == exp2, BLANK (), exp1 )

A better version in DAX avoids the use of IF:

VAR exp1 = <expression1>
VAR exp2 = <expression2>
RETURN
    DIVIDE ( exp1, NOT ( exp1 == exp2 ) )

Using DIVIDE instead of an IF can provide a better execution plan, although this is not guaranteed. In general, whenever we have an IF statement that uses only two arguments and the result is a numeric value, the DIVIDE syntax can provide an alternative that is less readable but sometimes faster. The reason for that is described in the following section.

Consider the following DAX pattern, where *<expression>* returns a numeric value:

IF ( <condition>, <expression> )

You can get the same result by writing:

DIVIDE ( <expression>, <condition> )

The result of a logical condition is converted to 0 or 1 in DAX. If the condition is false, then the denominator of DIVIDE is 0 and the result is BLANK. If the condition is true, then the denominator of DIVIDE is 1 and the result is identical to *<expression>*.
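For example, a measure showing a year-to-date value only when a single year is selected could be written in either of these two equivalent ways (the measure name and its base expression are illustrative):

```dax
-- Version using IF: blank when more than one year is visible
Sales YTD :=
IF (
    HASONEVALUE ( 'Date'[Year] ),
    TOTALYTD ( [Sales Amount], 'Date'[Date] )
)

-- Version using DIVIDE: HASONEVALUE becomes 1 (keep) or 0 (blank)
Sales YTD :=
DIVIDE (
    TOTALYTD ( [Sales Amount], 'Date'[Date] ),
    HASONEVALUE ( 'Date'[Year] )
)
```

Remember the earlier disclaimer: the DIVIDE version is less readable and is only worth considering when a measurable, consistent performance advantage exists.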

The COALESCE function in ANSI SQL returns the current value of the first expression that does not evaluate to NULL. The syntax is:

COALESCE ( <expression1>, <expression2>, <expression3>, … )

We can rewrite this using IF in DAX this way:

VAR exp1 = <expression1>
RETURN
    IF (
        NOT ISBLANK ( exp1 ),
        exp1,
        VAR exp2 = <expression2>
        RETURN
            IF (
                NOT ISBLANK ( exp2 ),
                exp2,
                VAR exp3 = <expression3>
                RETURN
                    IF ( NOT ISBLANK ( exp3 ), exp3, … )
            )
    )

A more readable but potentially less efficient way to write the same code is the following:

VAR exp1 = <expression1>
VAR exp2 = <expression2>
VAR exp3 = <expression3>
…
RETURN
    SWITCH (
        TRUE,
        NOT ISBLANK ( exp1 ), exp1,
        NOT ISBLANK ( exp2 ), exp2,
        NOT ISBLANK ( exp3 ), exp3,
        …
    )

The code using IF and SWITCH works for expressions of any data type. If you are handling numeric expressions and you never want to return a blank, then you can consider this alternative option:

VAR exp1 = <expression1>
VAR exp2 = <expression2>
VAR exp3 = <expression3>
…
RETURN
    ( NOT ISBLANK ( exp1 ) ) * exp1
        + ISBLANK ( exp1 ) * ( NOT ISBLANK ( exp2 ) ) * exp2
        + ISBLANK ( exp1 ) * ISBLANK ( exp2 ) * ( NOT ISBLANK ( exp3 ) ) * exp3
        …

In this case, using the multiplication over the result of ISBLANK produces a similar effect to what we described using DIVIDE. The difference is that, because we are summing different expressions, the result cannot be blank. The propagation of blank in the result can be obtained by wrapping the expression in a DIVIDE function, as in the following expression:

VAR exp1 = <expression1>
VAR exp2 = <expression2>
VAR exp3 = <expression3>
…
RETURN
    DIVIDE (
        ( NOT ISBLANK ( exp1 ) ) * exp1
            + ISBLANK ( exp1 ) * ( NOT ISBLANK ( exp2 ) ) * exp2
            + ISBLANK ( exp1 ) * ISBLANK ( exp2 ) * ( NOT ISBLANK ( exp3 ) ) * exp3
            …,
        1 - ISBLANK ( exp1 ) * ISBLANK ( exp2 ) * ISBLANK ( exp3 ) * …
    )

Whenever all the expressions are blank, the denominator of DIVIDE is 0 and the result is blank.

The NULLIF and COALESCE functions available in different versions of SQL have equivalent expressions in DAX, and there are many possible alternatives. Small details in the syntax might have different semantic implications, producing different results. The best practice is to use the simplest possible syntax, evaluating more complex DAX expressions only when optimal performance is at stake, and making sure to validate that an alternative syntax is faster through a proper benchmark.

]]>

This article describes how you can create a comparison with the previous time period in a visualization, regardless of whether the time periods are consecutive or not.

Time intelligence calculations in DAX are usually created considering consecutive periods in any type of comparisons. This approach might not work well when the requirement is to compare the differences between a selection of non-consecutive periods. In that case, the “previous” element in a visualization might not correspond to the previous element in the data model.

For example, consider the following report where the slicer selects an arbitrary set of months within a year: March, May, August, and October 2008.

The *Sales Diff PM* and *% Sales Diff PM* measures provide the difference between the *Sales Amount* of the month displayed in the row and the month displayed in the previous row of the matrix visualization. These two measures are based on the *Sales PM* measure, which returns the Sales Amount of the previous selected month in the same visualization.

Sales Diff PM :=
VAR SalesCurrentMonth = [Sales Amount]
VAR SalesPreviousMonth = [Sales PM]
VAR Result =
    DIVIDE (
        SalesCurrentMonth - SalesPreviousMonth,
        ( NOT ISBLANK ( SalesCurrentMonth ) ) * ( NOT ISBLANK ( SalesPreviousMonth ) )
    )
RETURN Result

% Sales Diff PM := DIVIDE ( [Sales Diff PM], [Sales PM] )

The main goal of this article is to describe how to write the *Sales PM* measure of this example. From a DAX standpoint, the previous row of the matrix is not a concept that can be directly expressed in a formula. In order to author a measure that can do so, we have to start with an assumption: each row displays a month, and the months are sorted according to their natural sort order (January, February, March, and so on). However, the previous month in the visualization is not necessarily the previous month in the calendar. In the example we are considering, the selection made on the slicer shows just a few months. We should therefore redefine the concept of “previous month” as “previous month in the selection made outside of the matrix”.

Now we can call upon a Power BI concept with a close enough representation in DAX: the ALLSELECTED modifier allows a CALCULATE function to retrieve the filter context defined outside of the visual, which in this case is the selection made on filters and other visuals on the same report page.

Knowing the current month of a cell in the visualization, the previous month is the maximum month number available in the filter context provided by ALLSELECTED excluding the current and following months. This evaluation is made by the *PreviousYearMonth* variable in the *Sales PM* measure. The *PreviousYearMonth* variable is used to filter the *Year Month Number* in the CALCULATE function that evaluates *Sales Amount* for the previous selected month:

Sales PM :=
VAR CurrentYearMonth = SELECTEDVALUE ( 'Date'[Year Month Number] )
VAR PreviousYearMonth =
    CALCULATE (
        MAX ( 'Date'[Year Month Number] ),
        ALLSELECTED ( 'Date' ),
        KEEPFILTERS ( 'Date'[Year Month Number] < CurrentYearMonth )
    )
VAR Result =
    CALCULATE (
        [Sales Amount],
        'Date'[Year Month Number] = PreviousYearMonth,
        REMOVEFILTERS ( 'Date' )
    )
RETURN Result

The technique shown in this article can be used whenever you need to retrieve a “previous item” displayed in a Power BI visualization where non-consecutive items are filtered and the requirement is to consider strictly the items that are selected and visible.

]]>

A calculation group is like a table in the Tabular model, and a calculation item is a value in a column of the table defined for the calculation group. In order to sort the calculation items, there is an additional column called Ordinal used to manage the “Sort by Column” behavior, like any other column of a Tabular model. Even though the Ordinal column is not mandatory, it is required in order to make the custom ordering work: without the Ordinal column, the calculation items are displayed in alphabetical order regardless of the Ordinal property of each calculation item.

Tabular Editor automatically creates the Ordinal column in a new calculation group. If you don’t have the Ordinal column in the calculation group because you created the model using a previous version of Tabular Editor, you can fix the model using Tabular Editor 2.9.1 by doing the following:

- Right-click the calc group table > Create New > Data Column
- Set the Source Column property to “Ordinal”
- Set the Data Type property to “Integer / Whole Number (int64)”
- Optionally make the column Ordinal invisible (right-click / Make Invisible)

- Click the column of the corresponding calculation item
- Set the Sort By Column property to “Ordinal”

After that, you can set the Ordinal property of the calculation items by directly assigning a value to that property for each calculation item, or by rearranging the calculation items in the calculation group using drag &amp; drop. This way, Tabular Editor automatically adjusts the Ordinal property so that it reflects the desired order of the column.

]]>