<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>SQLBI</title>
	
	<link>http://www.sqlbi.com</link>
	<description>We love Business Intelligence with Microsoft SQL Server</description>
	<lastBuildDate>Sat, 12 May 2012 14:15:58 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/sqlbi_blog" /><feedburner:info uri="sqlbi_blog" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Order of Evaluation in CALCULATE Parameters</title>
		<link>http://feedproxy.google.com/~r/sqlbi_blog/~3/5syE8WlAVrI/</link>
		<comments>http://www.sqlbi.com/articles/order-of-evaluation-in-calculate-parameters/#comments</comments>
		<pubDate>Wed, 02 May 2012 04:59:10 +0000</pubDate>
		<dc:creator>Marco Russo</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[BISM]]></category>
		<category><![CDATA[DAX]]></category>
		<category><![CDATA[PowerPivot]]></category>
		<category><![CDATA[Tabular]]></category>

		<guid isPermaLink="false">http://www.sqlbi.com/?p=923</guid>
		<description><![CDATA[DAX is the new language used by PowerPivot and Analysis Services in Tabular mode and it resembles the syntax of&#8230; <a href="http://www.sqlbi.com/articles/order-of-evaluation-in-calculate-parameters/" class="more-link">Learn More &#187;</a>]]></description>
			<content:encoded><![CDATA[<p>DAX is the new language used by PowerPivot and Analysis Services in Tabular mode and it resembles the syntax of Excel formula and it can be considered a functional language. You do not have iterative statements, but you can run iterative functions like, for example, SUMX and FILTER. The most important functions in DAX are CALCULATE and CALCULATETABLE, and these functions are also the most complex to master, because of the many operations they do. <span id="more-923"></span>In order to fully understand them, you also have to well understand evaluation contexts (row context and filter context).<br />
The general idea is that these functions transform a row context (if exists) into a filter context, which is automatically propagated to related tables, then modify the filter context according to the parameters passed after the first one, and finally evaluate the expression passed as first parameter in the resulting modified filter context.</p>
<p>If you read the previous description carefully, you will discover one behavior that is not always intuitive and can be the source of confusion when you start working with DAX. The order of evaluation of the parameters of a function is usually the same as the order of the parameter: the first parameter is evaluated, then the second, then the third, and so on. This is always the case for most of the DAX functions, but not for CALCULATE and CALCULATETABLE. In these functions, the first parameter is evaluated only after all the others have been evaluated. If you come from a C# background, you can think to the first parameter as a C# callback function, which will be called only later, when its result will be really required.</p>
<p>Thus, when you write:</p>
<pre class="brush: dax; highlight: [3]; title: ; toolbar: false; notranslate">CALCULATE(
    [Measure],
    Customer[Country] = &quot;Italy&quot;
)
</pre>
<p>The FILTER statement is executed first, and then the [Measure] is executed in a filter context where the Customers visible are only those from Italy (assuming Italy is active in the filter context of the caller of the formula).</p>
<p>This seems pretty intuitive, but things are harder when you have nested CALCULATE statements. Consider the following example:</p>
<pre class="brush: dax; highlight: [4,6]; title: ; toolbar: false; notranslate">CALCULATE(
    CALCULATE(
        [Measure],
        Customer[Country] = &quot;Italy&quot;
    ),
    ALL( Customer[Country] )
)
</pre>
<p>In this case, the ALL( Customer[Country] ) is executed before the inner CALCULATE statement, so the filter context removes any existing filter existing on the Country column of the Customer table and then applies a filter to that column that has to be equal to Italy. From a functional point of view, the only difference with the previous CALCULATE formula is that Italy will be the only country selected in evaluating [Measure] <i>regardless of any filter on Country existing in the filter context of the caller</i>.</p>
<p>Now consider this other example:</p>
<pre class="brush: dax; highlight: [4,6]; title: ; toolbar: false; notranslate">CALCULATE(
    CALCULATE(
        [Measure],
        ALL( Customer[Country] )
    ),
    Customer[Country] = &quot;Italy&quot;
)
</pre>
<p>The outer filter over Italy is executed first, and then the ALL( Customer[Country] ) removes any of the effects of the external filter, resulting in a [Measure] that will be evaluated in a filter context that has removed any filter over the Country column in the Customer table.</p>
<p>The following example calculates the number of Italian customers who bought something before 2012. Again, the outer filter over Italy is executed first and it applies its effects to the FILTER function, which is executed in the expression of the outer CALCULATE. The inner CALCULATE is executed for each customer and returns the sales of that customer before 2012.</p>
<pre class="brush: dax; highlight: [3,4,5,6,7,8,9,11]; title: ; toolbar: false; notranslate">CALCULATE(
    COUNTROWS(
        FILTER(
            Customer,
            CALCULATE(
                SUM( Sales[Amount] ),
                YEAR( Sales[Date] ) &lt; 2012
            ) &gt; 0
        )
    )
    Customer[Country] = &quot;Italy&quot;
)
</pre>
<p>A possible mistake at this point is to assume that an inversion in evaluation order happens, whereas all the filter parameters of a CALCULATE are executed independently from each other. In the next expression, the result is the same (Italian customers who bought something before 2012), but the FILTER operates an iteration over all the customers, and not only the Italian ones, because it is executed in parallel with the filter over Italy.</p>
<pre class="brush: dax; highlight: [10]; title: ; toolbar: false; notranslate">CALCULATE(
    COUNTROWS( Customer ),
    FILTER(
        Customer,
        CALCULATE(
            SUM( Sales[Amount] ),
            YEAR( Sales[Date] ) &lt; 2012
        ) &gt; 0
    )
    Customer[Country] = &quot;Italy&quot;
)
</pre>
<p>By using a nested CALCULATE, we force the execution of the filter over Italy before anything else and then this filter is applied to the FILTER statement, which calculates the sales only for Italian customers. In this case the result will be the same, but you might observe different performances between the two solutions (the next nested CALCULATE faster than the previous independent filters), because of the different algorithm that we implemented with the different syntax (even if the results will be the same).</p>
<pre class="brush: dax; highlight: [4,5,6,7,8,9,10,12]; title: ; toolbar: false; notranslate">CALCULATE(
    CALCULATE(
        COUNTROWS( Customer ),
        FILTER(
            Customer,
            CALCULATE(
                SUM( Sales[Amount] ),
                YEAR( Sales[Date] ) &lt; 2012
            ) &gt; 0
        )
    )
    Customer[Country] = &quot;Italy&quot;
)
</pre>
<p>The conclusion is that the order of execution of CALCULATE and CALCULATETABLE parameters is different from other DAX functions and requires you to correctly understand side effects of the filters over the calculation of the complete expression.</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=5syE8WlAVrI:zN4iVm7GHgU:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=5syE8WlAVrI:zN4iVm7GHgU:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=5syE8WlAVrI:zN4iVm7GHgU:WJiHqkZLMAk"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?i=5syE8WlAVrI:zN4iVm7GHgU:WJiHqkZLMAk" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/sqlbi_blog/~4/5syE8WlAVrI" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.sqlbi.com/articles/order-of-evaluation-in-calculate-parameters/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.sqlbi.com/articles/order-of-evaluation-in-calculate-parameters/</feedburner:origLink></item>
		<item>
		<title>Clever Hierarchy Handling in DAX</title>
		<link>http://feedproxy.google.com/~r/sqlbi_blog/~3/PMJph7LXY9Q/</link>
		<comments>http://www.sqlbi.com/articles/clever-hierarchy-handling-in-dax/#comments</comments>
		<pubDate>Sun, 01 Apr 2012 15:11:00 +0000</pubDate>
		<dc:creator>Alberto Ferrari</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[BISM]]></category>
		<category><![CDATA[DAX]]></category>
		<category><![CDATA[SSAS]]></category>
		<category><![CDATA[Tabular]]></category>

		<guid isPermaLink="false">http://www.sqlbi.com/articles/clever-hierarchy-handling-in-dax/</guid>
		<description><![CDATA[Hierarchy handling in DAX is not very easy, due to the fact that hierarchies, unlike it was in MDX, are&#8230; <a href="http://www.sqlbi.com/articles/clever-hierarchy-handling-in-dax/" class="more-link">Learn More &#187;</a>]]></description>
			<content:encoded><![CDATA[<p>Hierarchy handling in DAX is not very easy, due to the fact that hierarchies, unlike it was in MDX, are not first-class citizens in the DAX world. While hierarchies can be easily defined in the data model, there are no DAX functions that let you access, for example, the parent of the CurrentMember. Well, to tell the truth, there is no concept of CurrentMember in DAX either. <span id="more-907"></span>That said, there are several scenarios where it might be useful to handle hierarchies in DAX and, in this post, I am going to show some techniques to make them work in a clever way, starting from the basics until we reach a good level of hierarchy understanding in DAX.</p>
<p>Let us start with the business scenario. Using AdventureWorks, you can easily build a PivotTable like the following one, creating some measures that compute the ratio of</p>
<ul>
<li>a single product against its subcategory</li>
<li>a single subcategory over its category</li>
<li>a single category over the whole of the products</li>
</ul>
<p>In MDX, we would have simply said the “ratio to parent”. The general pattern of these formulas is pretty easy, in the following piece of code you can see the code of RatioToSubcategory:</p>
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate">
RatioToSubcategory =
SUM (Sales[SalesAmount])
/
CALCULATE (
    SUM (Sales[SalesAmount]),
    ALL (Products; Products[ProductName]
)
</pre>
<p>For the RatioToSubcategory you divide the sales amount by the total sales amount of the subcategory, by simply freeing the constraint over the product name using ALL. The other two formulas follow the same pattern and are not worth showing. If you put these measures in a PivotTable, you get a result like this:</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image14.png"><img style="border-width: 0px; padding-top: 0px; padding-right: 0px; padding-left: 0px; display: inline; background-image: none;" title="image" src="http://www.sqlbi.com/wp-content/uploads/image_thumb14.png" alt="image" width="561" height="444" border="0" /></a></p>
<p>While it looks nice, there are several issues with this report:</p>
<ul>
<li>The RatioToSubcategory makes sense only at the product level, there is no point to show 100% at the subcategory level and, worse, the number at the category level makes no sense at all, because it contains a wrong number.</li>
<li>It would be really nice to show all these numbers in a single column, instead of having three columns, two of which need always to be cleared.</li>
</ul>
<p>I have highlighted with the yellow color the useful information, all other numbers should be hidden, because they have no meaning.</p>
<p>Luckily, in SQL 2012, DAX has been enriched with the ISFILTERED function, which lets you detect whether a column has been filtered or not. ISFILTERED is your best friend when it comes to detect which level of a hierarchy you are browsing. For example, to detect whether you are at the product level, it is enough to check whether the product name has been filtered and make your formula behave accordingly. For example, you can make a smarter version of the RatioToSubcategory that automatically detects whether you are browsing the hierarchy at the product level and BLANKs the value if this is not the case. The updated code of RatioToSubcategory is the following:</p>
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate">
IF (
    ISFILTERED (Products[ProductName]),
    SUM (Sales[SalesAmount])
    /
    CALCULATE (
        SUM (Sales[SalesAmount]),
        ALL (Products[ProductName])
    )
)
</pre>
<p>At the product level, everything works fine but, at the category and subcategory level, we need to make more complex tests in order to check whether we are exactly at the correct level. For example, to check whether we are showing some data at the subcategory level, we need to verify that there is a filter on the subcategory and, at the same time, there is no filter on product name (otherwise, if there is a filter on the product name, we are at the product level). Thus, the code is a little bit tricky:</p>
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate">
IF (
    ISFILTERED (Products[Subcategory]) &amp;&amp; NOT (ISFILTERED (Products[ProductName])),
    SUM (Sales[SalesAmount])
    /
    CALCULATE (
        SUM (Sales[SalesAmount]),
        ALL (Products[Subcategory])
    )
) </pre>
<p>Making this update to all of our measures, we get this nicer results, where useless values disappeared:</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image15.png"><img style="border-width: 0px; padding-top: 0px; padding-right: 0px; padding-left: 0px; display: inline; background-image: none;" title="image" src="http://www.sqlbi.com/wp-content/uploads/image_thumb15.png" alt="image" width="582" height="296" border="0" /></a></p>
<p>Not exactly sexy, but at least it does not contain any wrong value. In order to make it nicer, we can now combine everything in a single measure, which shows the percentage over the parent in a single column. The code is a bit longer, but it does its job:</p>
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate">
RatioToParent :=
IF (
    ISFILTERED (Products[ProductName]),
    SUM (Sales[SalesAmount])
    /
    CALCULATE (
        SUM (Sales[SalesAmount]),
        ALL (Products[ProductName])
    ),
    IF (
        ISFILTERED (Products[Subcategory]),
        SUM (Sales[SalesAmount])
        /
        CALCULATE (
            SUM (Sales[SalesAmount]),
            ALL (Products[Subcategory])
        ),
        IF (
            ISFILTERED (Products[Category]),
            SUM (Sales[SalesAmount])
            /
            CALCULATE (
                SUM (Sales[SalesAmount]),
                ALL (Products[Category])
            )
        )
    )
)
</pre>
<p>And the result looks what we wanted from the beginning. In the following picture you can see all the measures together even if the RatioToParent will be the only visible measure in a real-world report.</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image16.png"><img style="border-width: 0px; padding-top: 0px; padding-right: 0px; padding-left: 0px; display: inline; background-image: none;" title="image" src="http://www.sqlbi.com/wp-content/uploads/image_thumb16.png" alt="image" width="645" height="285" border="0" /></a></p>
<p>As nice as it looks, this formula has a major drawback. If you add a slicer on ProductName to the PivotTable, for example, the formula stops working and shows meaningless values. In the next figure you can see that a slicer which selected “Mountain Bottle Cage” and “Road Bottle Cage” transformed the numbers in a hard-to-understand way:</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image17.png"><img style="border-width: 0px; padding-top: 0px; padding-right: 0px; padding-left: 0px; display: inline; background-image: none;" title="image" src="http://www.sqlbi.com/wp-content/uploads/image_thumb17.png" alt="image" width="919" height="332" border="0" /></a></p>
<p>The problem is that the formula, in order to detect the hierarchy level, in reality checks for the existence of a filter on the ProductName column. Due to the presence of the slicer, the ProductName is always filtered. In fact, as you can easily note from the figure, the RatioToSubcategory column always contain a value, while RatioToCategory and RatioToAll are always empty. I leave as an interesting exercise to the reader the explanation of what numbers are shown at the category and subcategory level: they have a meaning but I bet that no user will be happy with what their represent.</p>
<p>How do we solve this issue? Well, it is pretty easy, indeed, even if not very intuitive. The problem is that we are leaving the ability to the user to filter the columns on our hierarchy and this interacts with our hierarchy detection system. A possible solution would be to hide the columns in the hierarchy but this turns out to be too restrictive: users like slicers, and they have good reason to set filters. But, thinking carefully, users want to filter product names, not necessarily the same product names we are using on the hierarchy. Thus, we are searching for a way to let users filter names without filtering our hierarchy. You can reach this goal duplicating all of the columns you use in the hierarchy and hiding the ones in the hierarchy.</p>
<p>In the next figure, you can see the new data model. Category, Subcategory and ProductName have been copied in new columns, called HCategory, HSubcategory and HProductName. These last columns have been used to create the hierarchy and have been hidden from the client tools, so that the user has no way to use them in slicers. The only way to put a filter on these columns is to use the hierarchy. On the other hand, the user can still filter the product names using the original columns.</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image18.png"><img style="border-width: 0px; padding-top: 0px; padding-right: 0px; padding-left: 0px; display: inline; background-image: none;" title="image" src="http://www.sqlbi.com/wp-content/uploads/image_thumb18.png" alt="image" width="413" height="391" border="0" /></a></p>
<p>It is important to note that, from the calculation point of view, filtering either Category or HCategory leads to the same result. From the filter context point of view, on the other hand, there is a big difference between filtering one or the other, and we take advantage of this.</p>
<p>The formula needed to be updated, of course:</p>
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate">
RatioToParent :=
IF (
    ISFILTERED (Products[HProductName]),
    SUM (Sales[SalesAmount])
    /
    CALCULATE (
        SUM (Sales[SalesAmount]),
        ALL (Products[HProductName])
    ),
    IF (
        ISFILTERED (Products[HSubcategory]),
        SUM (Sales[SalesAmount])
        /
        CALCULATE (
            SUM (Sales[SalesAmount]),
            ALL (Products[HSubcategory])
        ),
        IF (
            ISFILTERED (Products[HCategory]),
            SUM (Sales[SalesAmount])
            /
            CALCULATE (
                SUM (Sales[SalesAmount]),
                ALL (Products[HCategory])
            )
        )
    )
)
</pre>
<p>Now, the same PivotTable as before, with the new formula, shows meaningful values:</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image19.png"><img style="border: 0px currentcolor; padding-top: 0px; padding-right: 0px; padding-left: 0px; display: inline; background-image: none;" title="image" src="http://www.sqlbi.com/wp-content/uploads/image_thumb19.png" alt="image" width="674" height="340" border="0" /></a></p>
<p>The formula now checks the HProductName column, which is invisible to the user but still available in DAX code and does the same for HCategory and HSubcategory. If you don’t tell the user about this trick, he will never see any difference between the column names in the hierarchy and the ones available for filtering, leading to a very nice user experience.</p>
<p>The only drawback is that this formula will work if and only if the user uses the hierarchy on rows or columns, othwerwise, if he places the category column on rows, the formula will not work, since placing the category on rows does not create any filter on the HCategory column. This is not usually an issue, it is just something to tell users to be aware of.</p>
<p>At this point, you might guess why I did not use the Excel built-in function to obtain the same result. The reason is quite simple: while Excel makes it easy to compute the ratio over parent, this technique has a much greater flexibility, because it defines the logic in the formula and does not require any user intervention to work. Moreover, the same technique of duplicating columns in hierarchies can prove useful in many different scenario, not natively handled by Excel.</p>
<p>As it often happens, Tabular does not have all of the features of Multidimensional but, by means of playing with filter contexts, you have the option to reproduce the same behavior of Multidimensional, still experiencing the tremendous power of Vertipaq (ops, of the xVelocity in-memory Analytics Engine).</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=PMJph7LXY9Q:LFqr7PYRlA4:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=PMJph7LXY9Q:LFqr7PYRlA4:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=PMJph7LXY9Q:LFqr7PYRlA4:WJiHqkZLMAk"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?i=PMJph7LXY9Q:LFqr7PYRlA4:WJiHqkZLMAk" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/sqlbi_blog/~4/PMJph7LXY9Q" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.sqlbi.com/articles/clever-hierarchy-handling-in-dax/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		<feedburner:origLink>http://www.sqlbi.com/articles/clever-hierarchy-handling-in-dax/</feedburner:origLink></item>
		<item>
		<title>Incremental Processing in Tabular Using Process Add</title>
		<link>http://feedproxy.google.com/~r/sqlbi_blog/~3/Z1fds9GOlvU/</link>
		<comments>http://www.sqlbi.com/articles/incremental-processing-in-tabular-using-process-add/#comments</comments>
		<pubDate>Sat, 11 Feb 2012 05:31:36 +0000</pubDate>
		<dc:creator>Marco Russo</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[AMO]]></category>
		<category><![CDATA[Process]]></category>
		<category><![CDATA[SSAS]]></category>
		<category><![CDATA[Tabular]]></category>

		<guid isPermaLink="false">http://www.sqlbi.com/?p=760</guid>
		<description><![CDATA[In Analysis Services 2012 you can process a table in a Tabular model by several ways: you can process the&#8230; <a href="http://www.sqlbi.com/articles/incremental-processing-in-tabular-using-process-add/" class="more-link">Learn More &#187;</a>]]></description>
			<content:encoded><![CDATA[<p>In Analysis Services 2012 you can process a table in a Tabular model by several ways: you can process the whole table, you can split the table in several partitions and process a single partition, you can merge partitions and you can incrementally process a single partition by using ProcessAdd, which is the topic of this article.</p>
<p><span id="more-760"></span>
<p>The ProcessAdd command allows you to add only a few rows to an existing partition and to do that you have to specify the query that reads data from the data source, applying the necessary WHERE condition in a SQL query or by using any other SQL statement to this purpose. The ProcessAdd command available in Process Partition(s) dialog box in SQL Server Management Studio (SSMS) does not allow you to specify a custom query for the process operation, in order to filter only new rows that have to be added to the partition. However, this is not a big issue: in fact, if you need to use ProcessAdd, probably you need to automate that command in a batch process. Thus, a programmatic approach is required. A XMLA script command is required and you will see how to programmatically obtain it by using AMO and PowerShell.</p>
<p>It is out of scope of this article describing how you should define the SQL command that only returns the new rows to be added to the table in a Tabular model. Remember that it is your responsibility avoiding duplicate rows in the destination table. There is no automatic detection of duplicates and if the table does not have unique columns you would obtain row duplicates in your table as a result, otherwise the process operation will stop with an error if a unique condition for a column is violated by loading new data.</p>
<p>Technically, when a ProcessAdd runs internally Analysis Services creates a new partition, process the whole partition and then merge it to the target partition (the one on which ProcessAdd command has been executed). You can use this same approach by using separate operations, but ProcessAdd can be more optimized for this specific activity.</p>
<p>In the following sections you will see how to execute and automate ProcessAdd by using different tools.</p>
<h1>ProcessAdd with XMLA Script</h1>
<p>In order to process a table partition in Tabular, you have to issue a process command to a partition of a measure group in the corresponding Multidimensional model that Analysis Services publish in order to make it queryable by any existing OLAP client tool. The Batch element contains a Process command, which specify the target partition / measure group / cube / database, followed by a Bindings element that replaces the existing query binding on the partition with a different query, which will be used just for this process command.</p>
<pre class="brush: xml; highlight: [20,21,22]; title: ; toolbar: false; notranslate">&lt;Batch xmlns=&quot;http://schemas.microsoft.com/analysisservices/2003/engine&quot;&gt;
    &lt;Process&gt;
        &lt;Object&gt;
            &lt;DatabaseID&gt;AdventureWorks Tabular Model SQL 2012&lt;/DatabaseID&gt;
            &lt;CubeID&gt;Model&lt;/CubeID&gt;
            &lt;MeasureGroupID&gt;Internet Sales_78de3956-70d9-429f-9857-c407f7902f1e&lt;/MeasureGroupID&gt;
            &lt;PartitionID&gt;Internet Sales_797b5664-d3d8-441f-ab29-b3cc76cdc1ff&lt;/PartitionID&gt;
        &lt;/Object&gt;
        &lt;Type&gt;ProcessAdd&lt;/Type&gt;
        &lt;WriteBackTableCreation&gt;UseExisting&lt;/WriteBackTableCreation&gt;
    &lt;/Process&gt;
    &lt;Bindings&gt;
        &lt;Binding&gt;
            &lt;DatabaseID&gt;AdventureWorks Tabular Model SQL 2012&lt;/DatabaseID&gt;
            &lt;CubeID&gt;Model&lt;/CubeID&gt;
            &lt;MeasureGroupID&gt;Internet Sales_78de3956-70d9-429f-9857-c407f7902f1e&lt;/MeasureGroupID&gt;
            &lt;PartitionID&gt;Internet Sales_797b5664-d3d8-441f-ab29-b3cc76cdc1ff&lt;/PartitionID&gt;
            &lt;Source xsi:type=&quot;QueryBinding&quot; xmlns:xsi=&quot;http://www.w3.org/2001/XMLSchema-instance&quot; &gt;
                &lt;DataSourceID&gt;d58c09be-3cc9-4688-8d08-026c407f5ad7&lt;/DataSourceID&gt;
                &lt;QueryDefinition&gt;
                    SELECT * FROM FactInternetSales WHERE OrderDateKey &amp;gt;= 20120215
                &lt;/QueryDefinition&gt;
            &lt;/Source&gt;
        &lt;/Binding&gt;
    &lt;/Bindings&gt;
&lt;/Batch&gt;
</pre>
<p>The names that you can see in MeasureGroupID, PartitionID and DataSourceID elements contain a GUID in the name that has been automatically generated by the editor when you created the data model. In order to find the right values you have to investigate the object properties in SQL Server Management Studio, otherwise you have to use the AMO or PowerShell approach that are described later in this article, which automatically find the right name starting from the display name of the object.</p>
<p>The QueryDefinition element contains the query that will be used by the ProcessAdd operation. You can use any valid SQL statement here that will produce the same columns required by the partition that you are going to incrementally update. In the example a SELECT statement has been used, filtering all the orders having an order date greater than or equal to February 15<sup>th</sup> 2012. It is up to you to define a safe filter condition and to avoid loading the same rows multiple times (for example, with this filter condition if an order has a date of February 16<sup>th</sup> and the day after you filter orders greater than or equal to February 16<sup>th</sup>, that order row will be loaded twice).</p>
<h1>ProcessAdd with Integration Services</h1>
<p>The XMLA script described in the previous section can be generated by using the Analysis Services Processing Task component in Integration Services. Cathy Dumas described how to use this component in a <a href="http://cathydumas.com/2012/02/14/doing-a-processadd-in-ssis/">step-by-step article in her blog</a>. This is probably the simplest user interface that makes you able to obtain the corresponding XMLA Script for a Process Add. After you correctly defined the task in Integration Services, you can capture the XMLA Script command by running the task when a SQL Profiler trace is active on the Analysis Services server. The Command Begin event class contains in TextData property the XMLA batch command that you need. Getting XMLA in this way might be useful if you just need to replace a parameter in the SQL query that is specified in the QueryDefinition element of the XMLA script.</p>
<h1>ProcessAdd with AMO</h1>
<p>Using Analysis Management Objects (AMO) you can generate the same Process Add command you have seen in XMLA Script without worrying about internal GUIDs. You can read a description of the usage of AMO commands in a <a href="http://cathydumas.com/2012/01/23/doing-a-processadd-in-amo/">blog post written by Cathy Dumas</a>. The XMLA script you have seen before in this article can be obtained and executed by using the following C# code (you can capture the XMLA Script by using the <a href="http://technet.microsoft.com/en-us/library/ms345082(v=sql.110).aspx#CaptureLog">CaptureLog class and CaptureXML attribute</a>).</p>
<pre class="brush: cpp; title: ; toolbar: false; notranslate">namespace AmoAutomation {
    class Program {
        static void Main(string[] args) {
            Server server = new Server();
            server.Connect(@&quot;localhostTABULAR&quot;);
            Database db = server.Databases[&quot;AdventureWorks Tabular Model SQL 2012&quot;];
            DataSourceView dsv = db.DataSourceViews.GetByName(&quot;Sandbox&quot;);
            Cube cube = db.Cubes.GetByName(&quot;Model&quot;);
            MeasureGroup measureGroup = cube.MeasureGroups.GetByName(&quot;Internet Sales&quot;);
            Partition partition = measureGroup.Partitions.GetByName(&quot;Internet Sales&quot;);
            partition.Process(
                ProcessType.ProcessAdd,
                new QueryBinding(
                        dsv.DataSourceID,
                        &quot;SELECT * FROM FactInternetSales WHERE OrderDateKey &gt;= 20120215&quot;));
            server.Disconnect();
        }
    }
}
</pre>
<h1>ProcessAdd with PowerShell</h1>
<p>Once you know how to create the desired process command with AMO, you can easily translate that code in a PowerShell script. For example, the AMO code you have seen in the previous section can be translated into the following PowerShell script:</p>
<pre class="brush: cpp; title: ; toolbar: false; notranslate">[Reflection.Assembly]::LoadWithPartialName(&quot;Microsoft.AnalysisServices&quot;)
$server = New-Object Microsoft.AnalysisServices.Server
$server.connect(&quot;localhostK12&quot;)
$db = $server.Databases.Item(&quot;AdventureWorks Tabular Model SQL 2012&quot;)
$dsv = $db.DataSourceViews.GetByName(&quot;Sandbox&quot;)
$cube = $db.Cubes.GetByName(&quot;Model&quot;)
$measureGroup = $cube.MeasureGroups.GetByName(&quot;Internet Sales&quot;)
$partition = $measureGroup.Partitions.GetByName(&quot;Internet Sales&quot;)
$queryBinding = New-Object Microsoft.AnalysisServices.QueryBinding( $dsv.DataSourceID, &quot;SELECT * FROM FactInternetSales WHERE OrderDateKey &gt;= 20120215&quot; )
$partition.Process( &quot;ProcessAdd&quot;, $queryBinding )
$server.Disconnect()
</pre>
<h1>Conclusion</h1>
<p>Incremental processing of tables in a Tabular model is possible but it requires using commands on Multidimensional entities. This is because by now an API over the real Tabular model is not available. This article showed several ways to execute an incremental ProcessAdd command on a table in a Tabular model, providing a temporary query binding that identifies only the rows to load in the ProcessAdd batch, without changing underlying Tabular structure or views in SQL Server. You can use the same pattern, choosing the technique that better adapts to your periodic process batch operation.</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=Z1fds9GOlvU:IJ5U8rIg_gY:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=Z1fds9GOlvU:IJ5U8rIg_gY:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=Z1fds9GOlvU:IJ5U8rIg_gY:WJiHqkZLMAk"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?i=Z1fds9GOlvU:IJ5U8rIg_gY:WJiHqkZLMAk" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/sqlbi_blog/~4/Z1fds9GOlvU" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.sqlbi.com/articles/incremental-processing-in-tabular-using-process-add/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://www.sqlbi.com/articles/incremental-processing-in-tabular-using-process-add/</feedburner:origLink></item>
		<item>
		<title>Memory Settings in Tabular Instances of Analysis Services</title>
		<link>http://feedproxy.google.com/~r/sqlbi_blog/~3/kv8uTtRx5XE/</link>
		<comments>http://www.sqlbi.com/articles/memory-settings-in-tabular-instances-of-analysis-services/#comments</comments>
		<pubDate>Sat, 28 Jan 2012 11:36:20 +0000</pubDate>
		<dc:creator>Marco Russo</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[SSAS]]></category>
		<category><![CDATA[Tabular]]></category>
		<category><![CDATA[VertiPaq]]></category>

		<guid isPermaLink="false">http://www.sqlbi.com/?p=747</guid>
		<description><![CDATA[Memory settings in Analysis Services might be an important tuning of a correct setup, both for production and developer machines.&#8230; <a href="http://www.sqlbi.com/articles/memory-settings-in-tabular-instances-of-analysis-services/" class="more-link">Learn More &#187;</a>]]></description>
			<content:encoded><![CDATA[<p>Memory settings in Analysis Services might be an important tuning of a correct setup, both for production and developer machines. Analysis Services 2012 can be installed in different ways and the new Tabular instance has new memory settings that are important to know.</p>
<p><span id="more-747"></span>
<p>I expect that many servers (and developer’s workstations) will have at least two instances: one for BISM Multidimensional and the other for BISM Tabular. A Multidimensional has all the memory settings that were already in previous versions of Analysis Services. You can find more info about <a href="http://msdn.microsoft.com/en-us/library/ms174514(v=SQL.110).aspx">Memory Properties</a> on MSDN Books On Line. In case you have more instances running on the same machine, you might want to consider altering the default settings, especially on a production machine where the two instances are used concurrently (it might be less important on a developer workstation where only one of the two instances is used at a particular time).</p>
<p>The more important settings (including their default value) are:</p>
<ul>
<li>Memory / HardMemoryLimit – 0</li>
<li>Memory / LowMemoryLimit – 65</li>
<li>Memory / TotalMemoryLimit – 80</li>
</ul>
<p>A Tabular instance shares these same settings and adds new properties that are specific to the Vertipaq engine:</p>
<ul>
<li>Memory / VertiPaqMemoryLimit – 60</li>
<li>Memory / VertiPaqPagingPolicy – 0</li>
</ul>
<p>The value of <b>VertiPaqPagingPolicy</b> defines how paging works with Tabular.</p>
<ul>
<li>Mode 0 – Locked – Lock all VertiPaq data in memory</li>
<li>Mode 1 – Allow system paging – Lock only hash dictionaries in memory, and allow VertiPaq data to exceed total physical memory</li>
</ul>
<p>In case you choose mode 0, the <b>VertiPaqMemoryLimit</b> defines the total amount of memory VertiPaq is allowed to lock in the working set (and hence the total that can be used for in-memory databases – remember that the Analysis Services service might use more memory for other reasons). In mode 1, it defines a limit for the physical memory that is used by VertiPaq, allowing paging for the remaining memory (virtual committed memory) above this limit.</p>
<p>The VertiPaqPagingPolicy setting provides a way to prevent VertiPaq data from interacting badly with the memory cleaning subsystem. In mode 1, it causes the cleaner subsystem to ignore memory allocated for VertiPaq data beyond VertiPaqMemoryLimit when calculating the price of memory.  In this mode, the server’s total memory usage will be permitted to exceed physical memory, constrained primarily by total virtual memory, paging data out to system pagefile.</p>
<p>If you want to reduce memory for an instance of Analysis Services, it makes sense to set VertiPaqMemoryLimit to a number that is lower than LowMemoryLimit.</p>
<p>Now, some more details about other memory settings in Tabular and their interaction with VertiPaqPagingPolicy and VertiPaqMemoryLimit regarding to cache eviction in VertiPaq. A big thank you to Brad Daniels and Akshai Mirchandani for sharing many details you can find in this article.</p>
<p><b>HardMemoryLimit</b> is the maximum memory the system can ever have. If SSAS exceed the hard memory limit, the system will aggressively kill active sessions in order to reduce memory usage. Sessions killed for this reason will receive an error about being cancelled due to memory pressure. In PowerPivot or BISM mode with a VertiPaqPagingPolicy other than 0, it is also the limit for the maximum working set of the process. If HardMemoryLimit is set to 0, it will use a default value of midway between the high memory limit and total physical memory (or total virtual address space, if you’re on a 32 bit machine where physical memory exceeds virtual memory).
<p><b>TotalMemoryLimit</b> is the high memory limit – it is used to calculate the memory price for normal cache eviction and memory cleaning. If it exceeds the total memory limit, the memory manager will evict all cached data which is not currently in use. TotalMemoryLimit must always be less than HardMemoryLimit.
<p><b>LowMemoryLimit</b> is the point where the system will begin cleaning memory out of caches. As memory usage increases above the low memory limit, SSAS get more aggressive about evicting cached data until it hits the high/total memory limit, at which point it evicts everything that isn’t pinned.
<p>Since cache eviction decisions and hard limit enforcement are normally done based on the process’s total memory usage, it has been necessary to change that calculation when allowing databases to exceed physical memory. Therefore, when VertiPaqPagingPolicy is nonzero, indicating that memory can grow beyond total physical, the system will track the total memory used by VertiPaq as a separate quantity (as reported in the MemoryVertiPaq* counters), and if the total memory used by VertiPaq exceeds the VertiPaqMemoryLimit, the memory used by VertiPaq in excess of the limit will be ignored for the purposes of determining what to evict.</p>
<p>
For example, say VertiPaqMemoryLimit is 100GB, LowMemoryLimit is 110GB, and TotalMemoryLimit is 120GB.&nbsp; Now assume that VertiPaq data structures are using 210GB of memory, and the process’ total memory usage is 215GB.&nbsp; This number is well above the TotalMemoryLimit (and probably above HardMemoryLimit), so ignoring VertiPaqMemoryLimit, the cleaning would be very aggressively, and would kill sessions. However, when PagingPolicy is 1, the memory used by VertiPaq in excess of the limit is ignored for the purpose of computing memory pressure. This means that the number used is &lt;Total Memory&gt; (215GB) – &lt;Total VertiPaq Memory&gt;(210GB) + &lt;VertiPaqMemoryLimit&gt;(100GB) = 105GB, which is below the LowMemoryLimit, so the cache is not cleaned at all.</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=kv8uTtRx5XE:aXH6NZTdkJ8:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=kv8uTtRx5XE:aXH6NZTdkJ8:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=kv8uTtRx5XE:aXH6NZTdkJ8:WJiHqkZLMAk"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?i=kv8uTtRx5XE:aXH6NZTdkJ8:WJiHqkZLMAk" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/sqlbi_blog/~4/kv8uTtRx5XE" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.sqlbi.com/articles/memory-settings-in-tabular-instances-of-analysis-services/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		<feedburner:origLink>http://www.sqlbi.com/articles/memory-settings-in-tabular-instances-of-analysis-services/</feedburner:origLink></item>
		<item>
		<title>Optimizing High Cardinality Columns in VertiPaq</title>
		<link>http://feedproxy.google.com/~r/sqlbi_blog/~3/pcif0NBg4Z8/</link>
		<comments>http://www.sqlbi.com/articles/optimizing-high-cardinality-columns-in-vertipaq/#comments</comments>
		<pubDate>Wed, 25 Jan 2012 06:45:00 +0000</pubDate>
		<dc:creator>Marco Russo</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[PowerPivot]]></category>
		<category><![CDATA[Tabular]]></category>
		<category><![CDATA[VertiPaq]]></category>

		<guid isPermaLink="false">http://www.sqlbi.com/?p=737</guid>
		<description><![CDATA[VertiPaq is the internal column-based database engine used by PowerPivot and BISM Tabular models. High cardinality columns might be the&#8230; <a href="http://www.sqlbi.com/articles/optimizing-high-cardinality-columns-in-vertipaq/" class="more-link">Learn More &#187;</a>]]></description>
			<content:encoded><![CDATA[<p>VertiPaq is the internal column-based database engine used by PowerPivot and BISM Tabular models. High cardinality columns might be the more expensive parts of a table. If you cannot remove a high cardinality column from a table, by using the techniques described in this article you might optimize its storage saving up to more than 90% of original space.</p>
<p><span id="more-737"></span>
<p>Because of its nature, in VertiPaq every table is stored by column instead than by row. For each column it creates a dictionary of distinct values and a sort of bitmap index that references the dictionary. Such a bitmap index can be highly compressed and high cardinality columns might have a dictionary cost that represents more than 90% of the cost for the column. You can save this space by splitting the column in two or more columns with a smaller number of distinct values. The resulting dictionary will be a fraction of the original one and you will be able to obtain the original value by combining the two values. You can use different techniques depending on the data type of the column you want to reduce. The following examples are based on a hypothetical SQL Server data source, you can easily adapt them to other data sources but you have to remember that these transformations have to be applied before data is imported by VertiPaq: creating calculated columns in a table would not save the space required by the original column.</p>
<p><strong>WARNING:</strong> By splitting a column in more column, you lose the ability to obtain a DISTINCTCOUNT calculation over a single column. You can obtain the same result by using COUNTROWS(SUMMARIZE(table,col1,col2,&#8230;,colN)) but performance can be much slower and there will be a high memory pressure at query time, especially when a large number of rows is involved in calculation.</p>
<h1>DATETIME Columns</h1>
<p>If you have a DATETIME column containing a sort of timestamp of an event (i.e. both date and time), it is more efficient to split it into two columns, one for the date and one for the time. You might use DATE and TIME data type in SQL Server, but in VertiPaq you will use always the same Date data type. The date column will have always the same time and the time column will have always the same Date. In this way you will have a maximum number of rows for date which is 365 multiplied by the number of years stored, and a maximum number of rows for time that depends on time granularity (for example, you have 86,400 seconds per day).
<p>For example, in SQL Server you might have a EventLog DATETIME column extracted in this way:
<pre class="brush: sql; title: ; toolbar: false; notranslate">SELECT
    EventLog,
    Action,
    Value
FROM Log</pre>
<p>You can split the EventLog column into two columns in this way:
<pre class="brush: sql; highlight: [2,3]; title: ; toolbar: false; notranslate">SELECT
    CAST( EventLog AS DATE ) AS EventDate,
    CAST( EventLog AS TIME( 0 ) ) AS EventTime,
    Action,
    Value
FROM Log</pre>
<p>This is an important optimization for any column of DATETIME data type. You can avoid importing the TIME column in VertiPaq in case you are not interested in this part. In the sample query above, the EventTime is rounded down to the second, so that you will not have more than 86,400 distinct values in that column.<br />
<h1>INT and BIGINT Columns</h1>
<p>A 32-bit or 64-bit integer value in SQL Server is stored in VertiPaq by using the same 64-bit integer format. The size used by VertiPaq is not relevant, because the actual storage cost is related to the dictionary, which depends on the number of distinct values. If you have a column with a large number of distinct values in an integer column, you can split it into two or more columns by using a smaller range of values. A typical scenario for this need is the transaction ID in a fact table, which might be required in order to retrieve a single transaction. For example, imagine you have a table with a TransactionID using all the numbers from 1 to 100,000,000. You have 100 millions of distinct values in the TransactionID dictionary.
<pre class="brush: sql; highlight: [2]; title: ; toolbar: false; notranslate">SELECT
    TransactionID,
    Quantity,
    Price
FROM Fact</pre>
<p>You can split this number into two number ranging from 1 to 10,000.
<pre class="brush: sql; highlight: [2,3]; title: ; toolbar: false; notranslate">SELECT
    TransactionID / 10000 AS TransactionHighID,
    TransactionID % 10000 AS TransactionLowID,
    Quantity,
    Price
FROM Fact</pre>
<p>As you can see, the TransactionID is no longer imported and you will be able to obtain it by using the following DAX measure:
<pre class="brush: dax; title: ; toolbar: false; notranslate">Fact[TransactionID] :=
IFERROR(
    VALUES( Fact[TransactionHighID] ) * 10000 + VALUES( Fact[TransactionLowID] ),
    BLANK()
)</pre>
<p>In case you want to apply a filter over a table, you have to split the TransactionID you are looking for into two parts (just replace &lt;TransactionID&gt; with the actual value in the following syntax):
<pre class="brush: dax; highlight: [3,4]; title: ; toolbar: false; notranslate">CALCULATETABLE(
    Fact,
    Fact[TransactionHighID] = INT( @TransactionID / 10000 ),
    Fact[TransactionLowID] = MOD( @TransactionID, 10000 )
)</pre>
<p>The original cost of the TransactionID column for 100 million values is near to 3GB in VertiPaq, whereas the splitted version requires less than 200MB. This is more than 90% saving!
<p>It is possible to further reduce this number by splitting the original column in more than two columns. For example, you can obtain the 3 column version by using this SQL syntax:
<pre class="brush: sql; highlight: [2,3,4]; title: ; toolbar: false; notranslate">SELECT
    TransactionID / 1000000 AS TransactionHighID,
    (TransactionID / 1000) % 1000 AS TransactionMidID,
    TransactionID % 1000 AS TransactionLowID,
    Quantity,
    Price
FROM Fact</pre>
<p>And you can also go further with the same technique, up to one column per each digit. We tested the split up to 8 columns and it is interesting to consider no only the storage cost, but also the processing time. The following table shows the results measure on a 8-core server.<br />
<table border="0" cellspacing="0" cellpadding="2" width="435">
<tbody>
<tr>
<td valign="top" width="158"><strong>Number of Columns</strong></td>
<p><strong><br />
</strong>
<td valign="top" width="107"><strong>Process Time</strong></td>
<p><strong><br />
</strong>
<td valign="top" width="88"><strong>Cores Used</strong></td>
<td valign="top" width="80"><strong>Disk Size</strong></td>
</tr>
<tr>
<td valign="top" width="158">1 (original)</td>
<td valign="top" width="107">02:48</td>
<td valign="top" width="88">1</td>
<td valign="top" width="80">
<p align="right">2,811 MB</p>
</td>
</tr>
<tr>
<td valign="top" width="158">2</td>
<td valign="top" width="107">03:21</td>
<td valign="top" width="88">up to 8</td>
<td valign="top" width="80">
<p align="right">191 MB</p>
</td>
</tr>
<tr>
<td valign="top" width="158">3</td>
<td valign="top" width="107">03:49</td>
<td valign="top" width="88">up to 8</td>
<td valign="top" width="80">
<p align="right">129 MB</p>
</td>
</tr>
<tr>
<td valign="top" width="158">4</td>
<td valign="top" width="107">04:01</td>
<td valign="top" width="88">up to 8</td>
<td valign="top" width="80">
<p align="right">97 MB</p>
</td>
</tr>
<tr>
<td valign="top" width="158">8</td>
<td valign="top" width="107">05:32</td>
<td valign="top" width="88">up to 8</td>
<td valign="top" width="80">
<p align="right">105 MB</p>
</td>
</tr>
</tbody>
</table>
<p>As you can see, the processing of a single column is a single-thread operation. Processing the multiple-column version requires more resources and only having multiple cores you will see similar execution times. With a smaller number of available cores, processing would have required longer processing times. However, more complex evaluations should be done in case partitioning is involved and this is out of the scope for this article.<br />
<h1>String Columns</h1>
<p>You can split a string columns by using the same technique you have seen for INT and BIGINT columns. The only difference is that strings have to be split by using string functions. This is a useful technique in case the TransactionID in your fact table contains non-numeric characters. For example, you can split an alphanumeric TransactionID Column with fixed 10-character length in this way:
<pre class="brush: sql; highlight: [2,3]; title: ; toolbar: false; notranslate">SELECT
    LEFT( TransactionID, 5 ) AS TransactionHighID,
    SUBSTRING( TransactionID, 6, LEN( TransactionID ) - 5 ) AS TransactionLowID,
    Quantity,
    Price
FROM Fact
</pre>
<p>The split algorithm for a string column should consider the distribution of the final result, in order to populate two smaller dictionaries in order to obtain a good space saving.<br />
<h1>Conclusion</h1>
<p>High cardinality columns in PowerPivot and BISM Tabular models can be particularly expensive. The best practice is to remove them from the model, especially when these columns are not relevant for data analysis, such as a GUID or timestamp of a SQL Server table. However, whenever the information they contain is required, you can optimize these columns by splitting the value in two or more columns with a smaller number of distinct values. This will require some more effort when accessing the column value, but the saving can be so high in large tables that it could definitely worth the effort.</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=pcif0NBg4Z8:0yxxbdHbnGk:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=pcif0NBg4Z8:0yxxbdHbnGk:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=pcif0NBg4Z8:0yxxbdHbnGk:WJiHqkZLMAk"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?i=pcif0NBg4Z8:0yxxbdHbnGk:WJiHqkZLMAk" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/sqlbi_blog/~4/pcif0NBg4Z8" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.sqlbi.com/articles/optimizing-high-cardinality-columns-in-vertipaq/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://www.sqlbi.com/articles/optimizing-high-cardinality-columns-in-vertipaq/</feedburner:origLink></item>
		<item>
		<title>Budget and Other Data at Different Granularities in PowerPivot</title>
		<link>http://feedproxy.google.com/~r/sqlbi_blog/~3/qlFFdZ7A2ns/</link>
		<comments>http://www.sqlbi.com/articles/budget-and-other-data-at-different-granularities-in-powerpivot/#comments</comments>
		<pubDate>Sat, 07 Jan 2012 12:02:23 +0000</pubDate>
		<dc:creator>Marco Russo</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[BISM]]></category>
		<category><![CDATA[DAX]]></category>
		<category><![CDATA[PowerPivot]]></category>
		<category><![CDATA[Tabular]]></category>

		<guid isPermaLink="false">http://www.sqlbi.com/?p=710</guid>
		<description><![CDATA[PowerPivot and BISM Tabular handle relationships between tables by using a single column that requires to be a unique column&#8230; <a href="http://www.sqlbi.com/articles/budget-and-other-data-at-different-granularities-in-powerpivot/" class="more-link">Learn More &#187;</a>]]></description>
			<content:encoded><![CDATA[<p>PowerPivot and BISM Tabular handle relationships between tables by using a single column that requires to be a unique column in the lookup table. In this article you will see how to handle relationships between tables that have different granularities, which is a typical scenario whenever you have a monthly budget table and a daily level sales table.</p>
<p><span id="more-710"></span>
<p>The starting point is a <a href="http://www.sqlbi.com/wp-content/uploads/Budget-PowerPivot2.zip">PowerPivot workbook</a> containing sales divided by Product, Date and Territory (it is an extraction of AdventureWorks sample database). The budget will be assigned to Product Category, Year and Month. Thus, you do not have the same granularity between Sales and Budget, because Territory is not present in budget and both Dates and Product tables are referenced at different granularities.</p>
<p>This is the <strong>Product</strong> table.</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image2.png"><img style="display: inline; background-image: none;" title="Product table" border="0" alt="Product table" src="http://www.sqlbi.com/wp-content/uploads/image_thumb2.png" width="888" height="158"></a></p>
<p>This is the <strong>Dates</strong> table.</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image3.png"><img style="display: inline; background-image: none;" title="Dates table" border="0" alt="Dates table" src="http://www.sqlbi.com/wp-content/uploads/image_thumb3.png" width="611" height="216"></a></p>
<p>This is the <strong>Territory</strong> table.</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image4.png"><img style="display: inline; background-image: none;" title="Territory table" border="0" alt="Territory table" src="http://www.sqlbi.com/wp-content/uploads/image_thumb4.png" width="498" height="268"></a></p>
<p>And this is the <strong>Sales</strong> table.</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image5.png"><img style="display: inline; background-image: none;" title="Sales table" border="0" alt="Sales table" src="http://www.sqlbi.com/wp-content/uploads/image_thumb5.png" width="474" height="213"></a></p>
<p>As you can see, every row in the Sales table is related to a product to a single date, other than to a single territory. You can define the following relationships.</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/SNAGHTMLb0b444.png"><img style="margin: 0px; display: inline; background-image: none;" title="SNAGHTMLb0b444" border="0" alt="SNAGHTMLb0b444" src="http://www.sqlbi.com/wp-content/uploads/SNAGHTMLb0b444_thumb.png" width="244" height="160"></a></p>
<p>After you created the required relationships, you can browse data in a PivotTable. For instance, you can see OrderQuantity value grouped by Product Category and Month in the following screenshot.</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image6.png"><img style="display: inline; background-image: none;" title="OrderQuantity by Product Category and Month" border="0" alt="OrderQuantity by Product Category and Month" src="http://www.sqlbi.com/wp-content/uploads/image_thumb6.png" width="278" height="565"></a></p>
<p>In an Excel table you define the <strong>Budget</strong> table with Category, Year, Month and Budget columns (the Budget value will be compared to OrderQuantity).</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image7.png"><img style="display: inline; background-image: none;" title="Budget table" border="0" alt="Budget table" src="http://www.sqlbi.com/wp-content/uploads/image_thumb7.png" width="287" height="504"></a></p>
<p>At this point you cannot create a relationship between Budget and Dates or Product tables. The identity key in Dates table is the Date and in Product table is the Product, whereas you have a budget defined at an higher hierarchical level. But we do not have the notion of hierarchies in PowerPivot and BISM Tabular. An Analysis Services developer at this point might argue that a Multidimensional model can define relationships between fact and dimensions at different granularities, but this feature is not available here because the notion of attribute relationships inside a dimension simply doesn’t exist in Tabular. Thus, we have to rely on some DAX measure in order to make the comparison between OrderQuantity and Budget in a the same PivotTable, possibly without the need of modifying the data we already imported. In fact, a common solution is to create a table derived by Sales that group data by Product Category, Year and Month, resulting in a table that has the same granularity of the Budget one. However, this solution would require some form of ETL and is less flexible than the one proposed in this article.</p>
<p>If you consider the existing data, you might create a relationship between Budget[Category] and Product[Category] columns. This is not physically possible, but we will use this “virtual” relationship later in some DAX expression. A similar “virtual” relationship exists between Budget and Dates table, by using Year and Month columns. In order to to simplify the following DAX expression, it is better having a single column that allows a “virtual” relationship between these tables. You can create the YearMonth column in the Dates table by using the following DAX expression.</p>
<p>
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; toolbar: false; notranslate">=Dates[Year]*100+Dates[MonthNumber]</pre>
</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image8.png"><img style="display: inline; background-image: none;" title="YearMonth definition in Dates table" border="0" alt="YearMonth definition in Dates table" src="http://www.sqlbi.com/wp-content/uploads/image_thumb8.png" width="766" height="111"></a></p>
<p>And then you can create the YearMonth column in the Budget table in this way.</p>
<p>
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; toolbar: false; notranslate">=Budget[Year]*100+Budget[Month]</pre>
</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image9.png"><img style="display: inline; background-image: none;" title="image" border="0" alt="image" src="http://www.sqlbi.com/wp-content/uploads/image_thumb9.png" width="487" height="116"></a></p>
<p>Now you can write the <strong>BudgetCalc</strong> measure by using the following DAX expression.</p>
<p>
<pre class="brush: dax; title: ; toolbar: false; notranslate">BudgetCalc :=
CALCULATE(
    SUM( Budget[Budget] ),
    FILTER( ALL( Budget[YearMonth] ),
            COUNTROWS( FILTER( VALUES( Dates[YearMonth] ), Dates[YearMonth] = Budget[YearMonth] ) ) &gt; 0 ),
    FILTER( ALL( Budget[Category] ),
            COUNTROWS( FILTER( VALUES( Product[category] ), Product[Category] = Budget[Category] ) ) &gt; 0 )
)</pre>
</p>
<p>The CALCULATE statement calculates the sum of Budget values for all the corresponding rows in Budget table that have the same Year, Month and Product Category. Year and Month selection are handled by using the single YearMonth column, which simplify the writing of the required filter condition. In the new version of PowerPivot (SQL Server 2012) and in BISM Tabular, you can use a simpler syntax by using the new CONTAINS functions, such as in the following example.</p>
<p>
<pre class="brush: dax; title: ; toolbar: false; notranslate">BudgetCalc :=
CALCULATE(
    SUM( Budget[Budget] ),
    FILTER( ALL( Budget[YearMonth] ),
            CONTAINS( VALUES( Dates[YearMonth] ), Dates[YearMonth], Budget[YearMonth] ) ),
    FILTER( ALL( Budget[Category] ),
            CONTAINS( VALUES( Product[Category] ), Product[Category], Budget[Category] ) )
)
</pre>
</p>
<p>At this point you can add the BudgetCalc measure to the previous PivotTable obtaining a comparison between OrderQuantity and Budget values.</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image10.png"><img style="display: inline; background-image: none;" title="BudgetCalc result browsed in a PivotTable" border="0" alt="BudgetCalc result browsed in a PivotTable" src="http://www.sqlbi.com/wp-content/uploads/image_thumb10.png" width="354" height="565"></a></p>
<p>However, you have some issue if you drill down data at the Subcategory level: the BudgetCalc is repeated for each Subcategory and this might misinterpreted by reading the result.</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image11.png"><img style="display: inline; background-image: none;" title="image" border="0" alt="image" src="http://www.sqlbi.com/wp-content/uploads/image_thumb11.png" width="383" height="236"></a></p>
<p>The same problem arises if you try to drilldown data by any Territory attribute (such as Region, Country, or Group), and by Day or Date attributes of the Dates table.</p>
<p>In order to avoid this issue, it would be better to display a BLANK value for budget whenever you are navigating the PivotTable at a detail level that is not represented in the Budget table. In a Multidimensional model (formerly known as UDM in previous versions of Analysis Services), the IgnoreUnrelatedDimensions property helps you in obtaining this behavior. In order to do that in a Tabular model, you can create an <strong>IsBudgetValid</strong> measure that will return TRUE if the current selection in the PivotTable has a valid correspondent budget definition, FALSE otherwise.</p>
<p>
<pre class="brush: dax; title: ; toolbar: false; notranslate">IsBudgetValid :=
(
    COUNTROWS( Sales )
=
    CALCULATE( COUNTROWS( Sales ),
               ALL( Sales ),
               VALUES( Dates[YearMonth] ),
               VALUES( Product[Category] )
    )
)
</pre>
<p>The IsBudgetValid measure checks that the number of rows that are active in the Sales table corresponds to the number of rows that would be active by removing from the filter context any column that is not part of the “virtual” relationship with the Budget table. Because we have just Category and YearMonth “virtual” relationships from Budget to Product and Dates table, the CALCULATE used to make the comparison replaces the filter context on Sales by removing the existing one using ALL and then inserting the filters on Category and YearMonth by using the VALUES function. In order to maintain this formula, you just have to insert a VALUES call for every column used as “virtual” relationship in the Budget table, without worrying about other changes in other part of the model.</p>
<p>As you can see in the following screenshot, the IsBudgetValid measure is FALSE for every row containing a wrong budget measure.</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image12.png"><img style="display: inline; background-image: none;" title="image" border="0" alt="image" src="http://www.sqlbi.com/wp-content/uploads/image_thumb12.png" width="477" height="239"></a></p>
<p>As a side note, a much more complex alternative is to count the active rows for every dimension (every table but Sales), but this approach would require much more work to maintain the measure if other tables was added to the model.</p>
<p>
<pre class="brush: dax; title: ; toolbar: false; notranslate">IsBudgetValidAlternative :=
(
    COUNTROWS( Product )
    = CALCULATE( COUNTROWS( Product ),
                 ALLEXCEPT( Product, Product[Category] ) )
) &amp;&amp; (
    COUNTROWS( Dates )
    = CALCULATE( COUNTROWS( Dates ),
                 VALUES( Dates[YearMonth] ),
                 ALL( Dates ) )
) &amp;&amp; (
    COUNTROWS( Territory )
    = COUNTROWS( ALL( Territory ) )
) </pre>
</p>
<p>The IsBudgetValidAlternative measure checks that the number of rows that are active in every table of the PowerPivot workbook corresponds to the number of rows in the same table that corresponds to the granularity of the Budget table in the same filter context. For example, the number of rows in Product table is calculated in a CALCULATE function that alters the filter context in the Product table by using the ALLEXCEPT function, removing any filter other than the Category column, which is the only reference to the Product table from the Budget table. Such a number is compared with the number of rows active in Product obtained by calling a simple COUNTROWS. The same process has to be repeated for every table related to Sales, with a difference for Dates table caused by a particular handling of such table by PowerPivot engine. Finally, all unrelated tables should be tested by comparing the number of rows in the filter context with the number of rows in the whole table. As you can understand, this approach is much more expensive to write and to maintain.</p>
<p>Finally, you can create the <strong>Budget</strong> measure that shows the BudgetCalc value only when IsBudgetValid returns TRUE.</p>
<p>
<pre class="brush: dax; gutter: false; title: ; toolbar: false; notranslate">Budget :=
IF( [IsBudgetValid], [BudgetCalc], BLANK() )</pre>
</p>
<p>The following screenshot show the Budget value browser by Month, Category and Model: as you can see, at the Model level the Budget is blank because the comparison cannot be made.</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/image13.png"><img style="display: inline; background-image: none;" title="image" border="0" alt="image" src="http://www.sqlbi.com/wp-content/uploads/image_thumb13.png" width="388" height="298"></a></p>
<p>In this article you have seen how to create DAX measures that compare data stored in tables at different granularities. You should use regular relationships in the PowerPivot or BISM Tabular model for the table with the higher granularity, and you can use DAX measures in order to display values from other tables that store data at an higher level of granularity. In the DAX expression you can define condition that produces the same effect of a “virtual” relationship defined using columns in a table that are not unique.</p>
<p><strong>Acknowledgment</strong>: I want to thank Gerhard Brückl for inspiring me in writing this article.</p>
<p>Download demo file: <a href="http://www.sqlbi.com/wp-content/uploads/Budget-PowerPivot2.zip">Budget-PowerPivot2.zip</a></p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=qlFFdZ7A2ns:4n3WTUhgBpk:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=qlFFdZ7A2ns:4n3WTUhgBpk:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=qlFFdZ7A2ns:4n3WTUhgBpk:WJiHqkZLMAk"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?i=qlFFdZ7A2ns:4n3WTUhgBpk:WJiHqkZLMAk" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/sqlbi_blog/~4/qlFFdZ7A2ns" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.sqlbi.com/articles/budget-and-other-data-at-different-granularities-in-powerpivot/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.sqlbi.com/articles/budget-and-other-data-at-different-granularities-in-powerpivot/</feedburner:origLink></item>
		<item>
		<title>Execute DAX queries through OLE DB and ADOMD.NET</title>
		<link>http://feedproxy.google.com/~r/sqlbi_blog/~3/OX5Vg2nOJ9M/</link>
		<comments>http://www.sqlbi.com/articles/execute-dax-queries-through-ole-db-and-adomd-net/#comments</comments>
		<pubDate>Thu, 05 Jan 2012 13:05:43 +0000</pubDate>
		<dc:creator>Marco Russo</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[BISM]]></category>
		<category><![CDATA[C#]]></category>
		<category><![CDATA[DAX]]></category>
		<category><![CDATA[MDX]]></category>
		<category><![CDATA[Tabular]]></category>

		<guid isPermaLink="false">http://www.sqlbi.com/?p=664</guid>
		<description><![CDATA[If you want to execute a DAX query from your code to a Tabular model, you might think that an&#8230; <a href="http://www.sqlbi.com/articles/execute-dax-queries-through-ole-db-and-adomd-net/" class="more-link">Learn More &#187;</a>]]></description>
			<content:encoded><![CDATA[<p>If you want to execute a DAX query from your code to a Tabular model, you might think that an OLE DB connection should be the right way to get your data. It works, but if you your code is written in .NET a better solution is using the ADOMD.NET library. In this article you will see why ADOMD.NET is better in this scenario.</p>
<p><span id="more-664"></span>
<p>The OLE DB MSOLAP provider can be used to send MDX, DAX and DMX queries to an Analysis Services database. The result can be only a scalar or a table, so you cannot use an MDX query returning results on more than two axes. When you use a DAX query, the result will always fit well in a table, so you can iterate a DataReader to get the result. The following is a simple program you can write in C# to read the result of a DAX query (in this example the DAX query is sent to the Adventure Works DW Tabular model, you can replace the query with any valid DAX query for your model).</p>
<p>
<pre class="brush: cpp; title: ; toolbar: false; notranslate">using System;
using System.Data.OleDb;

class Program {
    static void Main(string[] args) {
        string connectionString =
@&quot;Provider=MSOLAP;Data Source=localhost;Catalog=AdventureWorks Tabular Model SQL 2012&quot;;
        string queryString = @&quot;
EVALUATE
CALCULATETABLE(
    VALUES('Product Subcategory'[Product Subcategory Name]),
    'Product Category'[Product Category Name] = &quot;&quot;Bikes&quot;&quot; )
&quot;;
        using (var connection = new OleDbConnection(connectionString)) {
            connection.Open();
            using (var command = new OleDbCommand(queryString, connection)) {
                using (var reader = command.ExecuteReader()) {
                    while (reader.Read()) {
                        Console.WriteLine(reader[0]);
                    }
                }
            }
        }

    }
}
</pre>
</p>
<p>The result returned in the DataReader is correctly typed (which is important if you have to manipulate the value of a number returned from a query without worrying about number formatting). The issue with OLE DB MSOLAP provider is related to the parameters. In fact, if you want to pass a parameter to the previous query, you will get an error. For example, consider the following example, in which the modified/added lines have been highlighted.</p>
<p>
<pre class="brush: cpp; highlight: [12,17]; title: ; toolbar: false; notranslate">using System;
using System.Data.OleDb;

class Program {
    static void Main() {
        string connectionString =
@&quot;Provider=MSOLAP;Data Source=localhost;Catalog=AdventureWorks Tabular Model SQL 2012&quot;;
        string queryString = @&quot;
EVALUATE
CALCULATETABLE(
    VALUES('Product Subcategory'[Product Subcategory Name]),
    'Product Category'[Product Category Name] = @Category )
&quot;;
        using (var connection = new OleDbConnection(connectionString)) {
            connection.Open();
            using (var command = new OleDbCommand(queryString, connection)) {
                command.Parameters.AddWithValue(&quot;@Category&quot;, &quot;Bikes&quot;);
                using (var reader = command.ExecuteReader()) {
                    while (reader.Read()) {
                        Console.WriteLine(reader[0]);
                    }
                }
            }
        }
    }
}
</pre>
</p>
<p>As you can see, the constant “Bikes” have been replaced by the @Category parameter in the DAX query and the AddWithValue function called on command.Parameters add the corresponding parameter to the OleDbCommand object. However, if you execute this code you get an error as a result.</p>
<p>
<pre class="brush: xml; gutter: false; title: ; toolbar: false; notranslate">Unhandled Exception: System.Data.OleDb.OleDbException: The following system error occurred:  The parameter is incorrect.
Error Code = 0x80070057, External Code = 0x00000000:.
   at System.Data.OleDb.OleDbCommand.ProcessResults(OleDbHResult hr)
   at System.Data.OleDb.OleDbCommand.ApplyParameterBindings(ICommandWithParameters commandWithParameters, tagDBPARAMBINDINFO[] bindInfo)
   at System.Data.OleDb.OleDbCommand.CreateAccessor()
   at System.Data.OleDb.OleDbCommand.InitializeCommand(CommandBehavior behavior, Boolean throwifnotsupported)
   at System.Data.OleDb.OleDbCommand.ExecuteCommand(CommandBehavior behavior, Object&amp;amp; executeResult)
   at System.Data.OleDb.OleDbCommand.ExecuteReaderInternal(CommandBehavior behavior, String method)
   at System.Data.OleDb.OleDbCommand.ExecuteReader(CommandBehavior behavior)
   at System.Data.OleDb.OleDbCommand.ExecuteReader()
   at Program.Main() in C:DevsrcQueryTabularProgram.cs:line 17</pre>
</p>
<p>The problem is related to a problem in System.Data.OleDb assembly in .NET, as described in a <a href="https://connect.microsoft.com/SQLServer/feedback/details/251601/run-mdx-query-with-parameters-using-oledb">Connect bug</a> that has never been fixed despite the “Fixed” state of the issue. The only possible workaround is using ADOMD.NET instead of OLE DB. It might seem a strange approach, because MD means MultiDimensional and a query in DAX is all but Multidimensional. However, because ADOMD.NET allows you to put parameters in an AdomdCommand and to read the result in a DataReader, you can easily translate the previous example in the following one, which uses ADOMD.NET (which is a separate download, look for it in the <a href="http://it.bing.com/search?q=sql+server+feature+pack">more recent SQL Server Feature Pack</a>) instead of OLE DB.</p>
<p>
<pre class="brush: cpp; highlight: [2,14,17,18]; title: ; toolbar: false; notranslate">using System;
using Microsoft.AnalysisServices.AdomdClient;

class Program {
    static void Main(string[] args) {
        string connectionString =
@&quot;Provider=MSOLAP;Data Source=localhost;Catalog=AdventureWorks Tabular Model SQL 2012&quot;;
        string queryString = @&quot;
EVALUATE
CALCULATETABLE(
    VALUES('Product Subcategory'[Product Subcategory Name]),
    'Product Category'[Product Category Name] = @Category )
&quot;;
        AdomdConnection connection = new AdomdConnection();
        connection.ConnectionString = connectionString;
        connection.Open();
        AdomdCommand cmd = new AdomdCommand(queryString);
        cmd.Parameters.Add(&quot;Category&quot;, &quot;Bikes&quot;);
        cmd.Connection = connection;
        using (var reader = cmd.ExecuteReader()) {
            while (reader.Read()) {
                Console.WriteLine(reader[0]);
            }
        }
    }
}</pre>
</p>
<p>In conclusion, you should always consider using ADOMD.NET whenever you want to query an Analysis Services database from your own .NET code, regardless your query is written in DAX, MDX od DMX.</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=OX5Vg2nOJ9M:C2nWOmQI7RA:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=OX5Vg2nOJ9M:C2nWOmQI7RA:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=OX5Vg2nOJ9M:C2nWOmQI7RA:WJiHqkZLMAk"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?i=OX5Vg2nOJ9M:C2nWOmQI7RA:WJiHqkZLMAk" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/sqlbi_blog/~4/OX5Vg2nOJ9M" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.sqlbi.com/articles/execute-dax-queries-through-ole-db-and-adomd-net/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.sqlbi.com/articles/execute-dax-queries-through-ole-db-and-adomd-net/</feedburner:origLink></item>
		<item>
		<title>Understanding Circular Dependencies in Tabular and PowerPivot</title>
		<link>http://feedproxy.google.com/~r/sqlbi_blog/~3/YfrEfEkAPqQ/</link>
		<comments>http://www.sqlbi.com/articles/understanding-circular-dependencies/#comments</comments>
		<pubDate>Tue, 03 Jan 2012 08:00:00 +0000</pubDate>
		<dc:creator>Alberto Ferrari</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[BISM]]></category>
		<category><![CDATA[DAX]]></category>
		<category><![CDATA[Tabular]]></category>

		<guid isPermaLink="false">http://www.sqlbi.com/articles/understanding-circular-dependencies/</guid>
		<description><![CDATA[When you design a data model for Tabular you should pay attention to a though topic, which is that of&#8230; <a href="http://www.sqlbi.com/articles/understanding-circular-dependencies/" class="more-link">Learn More &#187;</a>]]></description>
			<content:encoded><![CDATA[<p>When you design a data model for Tabular you should pay attention to a though topic, which is that of circular dependencies in formulas. It is very important to learn how to handle circular dependencies now because in SQL 2012 (and in PowerPivot 2012) there will be a stronger validation of circular dependencies. Some of the checks described in this article have been introduced with the release RC0 of SQL Server 2012 and were not present before. <span id="more-630"></span>Thus, you might author formulas in previous versions of SQL 2012 (or PowerPivot) and end up with a non-working solution when the final bits will be available, due to these additional (and perfectly understandable) constraints.<br />
Before to start speaking about circular dependencies, it is worth to introduce simple, linear dependencies. Let us look at an example with the following calculated column on the classical AdventureWorks DimProduct table</p>
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate"> DimProduct[Profit] := DimProduct[ListPrice] - DimProduct[StandardCost] </pre>
<p>The formula depends on two columns. In such a case, we say that the column Profit depends from ListPrice and StandardCost. You might then create a new column, like ProfitPct with the formula:</p>
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate"> DimProduct[ProfitPct] := DimProduct[Profit] / DimProduct[ListPrice] </pre>
<p>Now, ProfitPct depends from Profit and ListPrice. Thus, when Analysis Services will need to compute the calculated columns, it knows that ProfitPct will need to be computed only after Profit has been calculated and stored. Otherwise, it will not be able to recover a valid value for the formula.</p>
<p>Linear dependency is not something you should normally worry about. It is used internally by the SSAS engine to detect the correct order of computation of calculated columns during the processing of the database. On a normal Tabular data model, with many calculated columns, the dependency of calculations turns into a complex graph which, again, SSAS handles gracefully.</p>
<p>Circular dependency, on the other hand, is a situation that happens when a loop appears in this graph. For example, a clear situation where circular dependency appears is if you try to modify the definition of Profit to this formula:</p>
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate"> DimProduct[Profit] := DimProduct[ProfitPct] * DimProduct[StandardCost] </pre>
<p>Because ProfitPct depends on Profit and, in this new formula, Profit depends on ProfitPct, SSDT refuses to modify the formula and shows the error “<em>A circular dependency was detected</em>”</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/F11PV13.png"><img style="border: 0px currentColor; display: inline; background-image: none;" title="Circular Dependency Error" src="http://www.sqlbi.com/wp-content/uploads/F11PV13_thumb.png" alt="Circular Dependency Error" width="570" height="109" border="0" /></a></p>
<p>All this describes what circular dependencies are from the point of view of columns, i.e. you have detected the existence of a dependency looking at the expression, without paying attention to the table structure. Nevertheless, there is a more subtle and complex type of dependency which is introduced by the usage of CALCULATE and/or filters inside any expression. Let us see the topic with an example, starting from a subset of columns of DimProduct in AdventureWorks:</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/F11PV14.png"><img style="border: 0px currentColor; display: inline; background-image: none;" title="Starting data model" src="http://www.sqlbi.com/wp-content/uploads/F11PV14_thumb.png" alt="Starting data model" width="423" height="194" border="0" /></a></p>
<p>We are interested in understanding the dependency list for a new calculated column which makes use of the CALCULATE function, like the following one:</p>
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate"> SumOfListPrice := CALCULATE (SUM (DimProduct[ListPrice])) </pre>
<p>At first glance, it may seem that the column depends from ListPrice only, as this is the only column used in the formula. Nevertheless, by using CALCULATE, we are asking to transform the current row context into a filter context and this modifies the dependency list. If we expand the meaning of the CALCULATE call, the formula really says:</p>
<blockquote><p>Sum the value of ListPrice for all the row in the DimProduct table which have the same value for ProductKey, ProductAlternateKey, StandardCost and ListPrice.</p></blockquote>
<p>Reading the formula in this way, it is now clear that the formula depends on all of the columns of DimProduct because, the newly introduced filter context will filter all the columns of the table, thus depending on their value. Nevertheless, creating the column, everything goes fine and you get this nice result</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/F11PV15.png"><img style="border: 0px currentColor; display: inline; background-image: none;" title="Data model with first column defined" src="http://www.sqlbi.com/wp-content/uploads/F11PV15_thumb.png" alt="Data model with first column defined" width="583" height="160" border="0" /></a></p>
<p>Now, we might try to define a new calculated column, using the very same formula, in the same table. Thus, we try to add NewSumOfListPrice with the following formula, which is identical to the previous one.</p>
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate"> NewSumOfListPrice := CALCULATE (SUM (DimProduct[ListPrice])) </pre>
<p>Surprisingly, SSDT refuses to create this new formula and returns an error</p>
<p><a href="http://www.sqlbi.com/wp-content/uploads/F11PV16.png"><img style="border: 0px currentColor; display: inline; background-image: none;" title="Circular dependency error" src="http://www.sqlbi.com/wp-content/uploads/F11PV16_thumb.png" alt="Circular dependency error" width="608" height="116" border="0" /></a></p>
<p>Analysis Services has detected a circular dependency in the formula, which was not detected before. And, because we did not change anything in the formula, the error seems a very strange one. Something has changed indeed, and it is the number of columns in the table. If we were able to add the NewSumOfListPrice to the table, we would reach a situation where the two formulas have these meanings:</p>
<ul>
<li><strong>SumOfListPrice</strong>: “Sum the value of ListPrice for all the row in the DimProduct table which have the same value for ProductKey, ProductAlternateKey, StandardCost, ListPrice and <strong>NewSumOfListPrice</strong>”.</li>
<li><strong>NewSumOfListPrice</strong>: “Sum the value of ListPrice for all the row in the DimProduct table which have the same value for ProductKey, ProductAlternateKey, StandardCost, ListPrice and <strong>SumOfListPrice</strong>”.</li>
</ul>
<p>Having added the calculated column, these columns become part of the filter context introduced by CALCULATE and, as a consequence, they are part of the dependency list. Reading the previous definition, it is clear that there is a circular dependency between the two formulas and this is the reason why Analysis Services refuses to allow us to create the NewSumOfListPrice column.</p>
<p>Understanding this error is not very easy. But, on the other hand, finding a solution is pretty straightforward, even if not very intuitive. The problem is that any calculated column containing CALCULATE (or a call to any measure, which adds an automatic CALCULATE) creates a dependency from all of the columns of the table.</p>
<p>The scenario would be different if the table had a row identifier (a primary key, in SQL terms). If the table has a column which acts as a row identifier, then all columns containing a CALCULATE could depend from the row identifier, thus reducing their dependency list to a single column which, by the way, is not a calculated one. Guess what? Tabular is smart enough to behave this way!</p>
<p>In the DimProduct table there is such a column: it is ProductKey. In order to mark the ProductKey as a row identifier you have two options:</p>
<ol>
<li>You can create a relationship from any table into DimProduct using ProductKey as the destination column. Performing this operation will ensure that ProductKey is a unique value for DimProduct.</li>
<li>You can manually set the property of Row Identifier for ProductKey to TRUE using the Properties window inside SSDT or the corresponding feature in PowerPivot</li>
</ol>
<p>One of these operations will make the engine learn that the table has a row identifier and, in such a scenario, you will be able to define the NewSumOfListPrice column avoiding circular dependency.</p>
<p>As a final advice, it is always a good idea to set the Row Identifier property of a table if such a row exists in the data model, because the Vertipaq engine will make use of this information to optimize all calculations. Nevertheless, row identifiers occupy space in the data model and their memory usage is pretty high because they have the maximum number of distinct values (a different value for each row). Thus, if a row identifier is not needed inside a table (for example for fact tables), it is always a good idea to avoid loading it inside the Tabular data model. Then, in case you face the circular dependency issue, it might be necessary to add the row identifier column to the table, so that the problem is solved.</p>
<p>There are other techniques to avoid circular dependencies. For example, using ALLEXCEPT to remove the calculated columns from the set of columns that become part of the dependency list is a viable option, but it makes all formulas more complicated. On the other hand, using ALLEXCEPT might be useful for very big tables, where the addition of a row identifier would cause memory footprint to grow too much.</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=YfrEfEkAPqQ:ahs_5ZJuhpQ:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=YfrEfEkAPqQ:ahs_5ZJuhpQ:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=YfrEfEkAPqQ:ahs_5ZJuhpQ:WJiHqkZLMAk"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?i=YfrEfEkAPqQ:ahs_5ZJuhpQ:WJiHqkZLMAk" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/sqlbi_blog/~4/YfrEfEkAPqQ" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.sqlbi.com/articles/understanding-circular-dependencies/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.sqlbi.com/articles/understanding-circular-dependencies/</feedburner:origLink></item>
		<item>
		<title>From SQL to DAX: String Comparison</title>
		<link>http://feedproxy.google.com/~r/sqlbi_blog/~3/xIF1ObcMKH8/</link>
		<comments>http://www.sqlbi.com/articles/from-sql-to-dax-string-comparison/#comments</comments>
		<pubDate>Thu, 29 Dec 2011 06:34:22 +0000</pubDate>
		<dc:creator>Marco Russo</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[DAX]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://www.sqlbi.com/?p=588</guid>
		<description><![CDATA[In DAX string comparison requires you more attention than in SQL, for several reasons: DAX doesn’t offer the same set&#8230; <a href="http://www.sqlbi.com/articles/from-sql-to-dax-string-comparison/" class="more-link">Learn More &#187;</a>]]></description>
			<content:encoded><![CDATA[<p>In DAX string comparison requires you more attention than in SQL, for several reasons: DAX doesn’t offer the same set of features you have in SQL, a few text comparison functions in DAX are only case-sensitive and others only case-insensitive, and performance might have a relevant impact in your query according to the comparison technique you use.</p>
<p><span id="more-588"></span>
<p>If you compare two strings by using common operators (=, &lt;&gt;, &lt;, &gt;, &lt;=, &gt;=) the string comparison can be case-insensitive, according to the collation setting of your Analysis Services instance (for PowerPivot <a href="http://powerpivotgeek.com/2010/03/24/collation-localization-and-globalization/">it depends on the workbook</a>). However, functions like FIND and SUBSTITUTE are always case-sensitive, whereas SEARCH is always case-insensitive. FIND can be 10% faster than SEARCH just because it is case-sensitive.</p>
<p>One issue you might have if you come from SQL is the different operator you have to perform somewhat similar to a regular expression. In fact, the LIKE operators in SQL has a similar correspondent function in DAX: the <a href="http://technet.microsoft.com/en-us/library/ee634235(SQL.110).aspx">SEARCH</a> function has a slightly different semantic because it returns the position found instead of a True/False value. Moreover, it has different wildcard characters, as you will see later in this article.</p>
<p>For example, consider the following syntax in SQL:</p>
<p>
<pre class="brush: sql; gutter: false; title: ; notranslate">Name LIKE '%SQLBI%'</pre>
</p>
<p>In Tabular and PowerPivot v2 (for SQL Server 2012) you can write the following equivalent syntax:</p>
<p>
<pre class="brush: dax; gutter: false; title: ; notranslate">SEARCH( &quot;SQLBI&quot;, Table[Name], 1, 0 ) &gt; 0</pre>
</p>
<p>The condition above returns 0 if SQLBI is not found in the Name column of Table. The search starts at first character (the third parameter) and in case of no match you get 0 as a result (defined by the fourth parameter). If you are using PowerPivot for SQL Server 2008 R2 (or PowerPivot v1), you have to wrap SEARCH into an IFERROR because the fourth parameter have been introduced only in SQL Server 2012 release and an error is thrown in PowerPivot version 1.</p>
<p>
<pre class="brush: dax; gutter: false; title: ; notranslate">IFERROR( SEARCH( &quot;*SQLBI*&quot;, Table[Name], 1 ), 0 ) &gt; 0</pre>
</p>
<p>Unfortunately, the presence of IFERROR has a big impact on performance and you can observe a 4x slower execution times with this syntax. Thus, avoid IFERROR if possible.</p>
<p>In case you need to define a more complex filter, you have to adapt to the different wildcard characters. Instead of using % and _ you would use in the SQL LIKE operator, you have to use * and ? in the DAX SEARCH function.</p>
<p>For example, consider the following condition in SQL:</p>
<p>
<pre class="brush: sql; gutter: false; title: ; notranslate">Name LIKE '%SQLBI%Methodology%at%work%'</pre>
</p>
<p>The correspondent syntax in DAX is:</p>
<p>
<pre class="brush: dax; gutter: false; title: ; notranslate">SEARCH( &quot;SQLBI*Methodology*at*work&quot;, Table[Name], 1, 0 ) &gt; 0</pre>
</p>
<p>However, if you change the SQL condition to:</p>
<p>
<pre class="brush: sql; gutter: false; title: ; notranslate">Name LIKE '%SQLBI%Methodology%at%work'</pre>
</p>
<p>you do not have an equivalent syntax in DAX, because you cannot check that the string ends with “work” based only on the return value of the SEARCH call. Moreover, if you want to improve performance, you should avoid SEARCH whenever possible. For example, using LEFT instead of SEARCH in order to check whether a string begins with a particular text might improve performance of a 5x-10x factor. Here are a few hints in order to translate LIKE in the best pattern.</p>
<table border="0" cellspacing="0" cellpadding="2" width="630">
<tbody>
<tr>
<td valign="top" width="246"><strong>SQL</strong></td>
<p><strong><br /></strong></p>
<td valign="top" width="382"><strong>DAX</strong></td>
</p>
</tr>
<tr>
<td valign="top" width="246">
<pre class="brush: sql; class-name: noborder; gutter: false; title: ; notranslate">Name LIKE 'SQLBI'</pre>
</td>
<td valign="top" width="382">
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate">Table[Name] = &quot;SQLBI&quot;</pre>
</td>
</tr>
<tr>
<td valign="top" width="246">
<pre class="brush: sql; class-name: noborder; gutter: false; title: ; notranslate">Name LIKE 'SQLBI%'</pre>
</td>
<td valign="top" width="382">
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate">LEFT( Table[Name], 5 ) = &quot;SQLBI&quot;</pre>
</td>
</tr>
<tr>
<td valign="top" width="246">
<pre class="brush: sql; class-name: noborder; gutter: false; title: ; notranslate">Name LIKE '%SQLBI'</pre>
</td>
<td valign="top" width="382">
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate">RIGHT( Table[Name], 5 ) = &quot;SQLBI&quot;</pre>
</td>
</tr>
<tr>
<td valign="top" width="246">
<pre class="brush: sql; class-name: noborder; gutter: false; title: ; notranslate">Name LIKE '%SQLBI%'</pre>
</td>
<td valign="top" width="382">
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate">SEARCH( &quot;SQLBI&quot;, Table[Name], 1, 0 ) &gt; 0</pre>
<p>
for PowerPivot v1:</p>
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate">IFERROR( SEARCH( &quot;SQLBI&quot;, Table[Name], 1 ), 0 ) &gt; 0</pre>
</td>
</tr>
<tr>
<td valign="top" width="246">
<pre class="brush: sql; class-name: noborder; gutter: false; title: ; notranslate">Name LIKE 'SQLBI%Methodology'</pre>
</td>
<td valign="top" width="382">
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate">LEFT( Table[Name], 5 ) = &quot;SQLBI&quot;
&amp;&amp; RIGHT( Table[Name], 11 ) = &quot;Methodology&quot;</pre>
</p>
</td>
</tr>
<tr>
<td valign="top" width="246">
<pre class="brush: sql; class-name: noborder; gutter: false; title: ; notranslate">Name LIKE 'SQLBI%Methodology%'</pre>
</td>
<td valign="top" width="382">
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate">LEFT( Table[Name], 5 ) = &quot;SQLBI&quot;
&amp;&amp; SEARCH( &quot;Methodology&quot;, Table[Name], 1, 0 ) &gt; 0</pre>
</p>
</td>
</tr>
<tr>
<td valign="top" width="246">
<pre class="brush: sql; class-name: noborder; gutter: false; title: ; notranslate">Name LIKE '%SQLBI%Methodology%'</pre>
</td>
<td valign="top" width="382">
<pre class="brush: dax; class-name: noborder; gutter: false; title: ; notranslate">SEARCH( &quot;SQLBI*Methodology&quot;, Table[Name], 1, 0 ) &gt; 0</pre>
</td>
</tr>
</tbody>
</table>
<p>In conclusion, in DAX it is better to avoid the use of SEARCH unless you really need to search a pattern using wildcards, and it is highly suggested to avoid using IFERROR for performance reasons (you pay the penalty for every error raised – using IFERROR when the errors are really rare is fine).</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=xIF1ObcMKH8:DPvzBckpwnI:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=xIF1ObcMKH8:DPvzBckpwnI:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=xIF1ObcMKH8:DPvzBckpwnI:WJiHqkZLMAk"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?i=xIF1ObcMKH8:DPvzBckpwnI:WJiHqkZLMAk" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/sqlbi_blog/~4/xIF1ObcMKH8" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.sqlbi.com/articles/from-sql-to-dax-string-comparison/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.sqlbi.com/articles/from-sql-to-dax-string-comparison/</feedburner:origLink></item>
		<item>
		<title>From SQL to DAX: Projection</title>
		<link>http://feedproxy.google.com/~r/sqlbi_blog/~3/CMLiV2O-sOI/</link>
		<comments>http://www.sqlbi.com/articles/from-sql-to-dax-projection/#comments</comments>
		<pubDate>Tue, 27 Dec 2011 17:33:00 +0000</pubDate>
		<dc:creator>Marco Russo</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[DAX]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://www.sqlbi.com/?p=581</guid>
		<description><![CDATA[Using DAX as a query language you will lack a fundamental feature like projection. In fact, DAX allows you add&#8230; <a href="http://www.sqlbi.com/articles/from-sql-to-dax-projection/" class="more-link">Learn More &#187;</a>]]></description>
			<content:encoded><![CDATA[<p>Using DAX as a query language you will lack a fundamental feature like projection. In fact, DAX allows you add columns to an existing table, but you cannot simply remove some column from a table. In order to do that, you have to use SUMMARIZE, or you can use ADDCOLUMNS starting from a column that you want to include in the query output and that has to be unique for each row of the result.</p>
<p><span id="more-581"></span>
<p>The solution based on SUMMARIZE is better from a performance point of view, but it is interesting also to look at a possible solution based on ADDCOLUMNS for educational purposes.</p>
<p>Let’s start with the classical SELECT in SQL:</p>
<pre class="brush: sql; title: ; toolbar: false; notranslate">SELECT *
FROM Product</pre>
<p>It corresponds to this DAX query:</p>
<pre class="brush: dax; title: ; toolbar: false; notranslate">EVALUATE Product</pre>
<p>Now, a common projection consists in selecting just a few columns from the source table. For example, the following SQL query only gets 3 rows from Product table.</p>
<pre class="brush: sql; title: ; toolbar: false; notranslate">SELECT [Product Id], [Product Name], [List Price]
FROM Product</pre>
<p>In DAX you can obtain the same result in two ways. You can leverage on the uniqueness of the Product Id column, which uniquely identify any row in the Product table, and you can query such column adding the other columns by using the ADDCOLUMN function. In order to retrieve the corresponding value for each product, you can use CALCULATE and VALUES (so that you will get an error in case Product Id wouldn’t be a unique key of the table).</p>
<pre class="brush: dax; title: ; toolbar: false; notranslate">EVALUATE
ADDCOLUMNS(
    DISTINCT( Product[Product Id] ),
    &quot;Product Name&quot;, CALCULATE( VALUES( Product[Product Name] ) ),
    &quot;List Price&quot;, CALCULATE( VALUES( Product[List Price] ) )
)
</pre>
<p>The technique based on ADDCOLUMNS allows you to rename the columns after the first one. However, from the point of view of performance and easiness of read, a better solution is the one based on SUMMARIZE, even if you lose the ability to rename columns in this way.</p>
<pre class="brush: dax; title: ; toolbar: false; notranslate">EVALUATE
SUMMARIZE(
    Product,
    Product[Product Id],
    Product[Product Name],
    Product[List Price]
)
</pre>
<p>Technically, the solution based on SUMMARIZE is equivalent to a SELECT DISTINCT statement like the following one:
<pre class="brush: sql; title: ; toolbar: false; notranslate">SELECT DISTINCT [Product Id], [Product Name], [List Price]
FROM Product
</pre>
<p>In general, I always suggest using SUMMARIZE in order to perform a projection over a table, because it is easier to read and it has better performance than other options you have in DAX.</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=CMLiV2O-sOI:uPmGd7ysKWQ:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=CMLiV2O-sOI:uPmGd7ysKWQ:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/sqlbi_blog?a=CMLiV2O-sOI:uPmGd7ysKWQ:WJiHqkZLMAk"><img src="http://feeds.feedburner.com/~ff/sqlbi_blog?i=CMLiV2O-sOI:uPmGd7ysKWQ:WJiHqkZLMAk" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/sqlbi_blog/~4/CMLiV2O-sOI" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.sqlbi.com/articles/from-sql-to-dax-projection/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.sqlbi.com/articles/from-sql-to-dax-projection/</feedburner:origLink></item>
	</channel>
</rss>

