<?xml version="1.0" encoding="UTF-8" standalone="no"?><rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:blogger="http://schemas.google.com/blogger/2008" xmlns:gd="http://schemas.google.com/g/2005" xmlns:georss="http://www.georss.org/georss" xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/" xmlns:thr="http://purl.org/syndication/thread/1.0" version="2.0"><channel><atom:id>tag:blogger.com,1999:blog-36696550</atom:id><lastBuildDate>Mon, 02 Sep 2024 06:29:29 +0000</lastBuildDate><category>Analytics</category><category>Efficiency</category><category>Power BI</category><category>Qlik</category><category>ETL</category><category>Filter</category><category>Kettle</category><category>List Box</category><category>Pentaho Data Integration</category><category>Power of Gray</category><category>PowerShell</category><category>SSIS</category><category>Slicer</category><category>andrew maguire; cftc; silver manipulation; wall street journal; wsj; citizen journalism</category><category>data governance metadata</category><title>HepburnData: Essential Complexity</title><description>A blog dedicated to exploring The Information Age through Enterprise Architecture and Data Management</description><link>http://hepburndata.blogspot.com/</link><managingEditor>noreply@blogger.com (Neil Hepburn)</managingEditor><generator>Blogger</generator><openSearch:totalResults>33</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-8548149410246773191</guid><pubDate>Sun, 23 Oct 2022 20:20:00 +0000</pubDate><atom:updated>2022-10-23T16:20:22.942-04:00</atom:updated><title>Happy Paths: Why I am Looking Forward to Azure Synapse Gen3</title><description>&lt;p&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb" style="caret-color: rgb(0, 0, 0);"&gt;Before starting I should mention 
that my main contract ends in December this year&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen" style="caret-color: rgb(0, 0, 0);"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen" style="caret-color: rgb(0, 0, 0);"&gt;(2022)&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb" style="caret-color: rgb(0, 0, 0);"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;and I am looking for contract work for the 2023 year. I am a lifelong learner and data &amp;amp; enterprise architect with 28 years of experience. I also love tinkering with new technologies.&lt;/span&gt;&lt;/p&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;I can be contacted at: &amp;nbsp;neilhepburnjob@gmail.com or on LinkedIn at: www.linkedin.com/in/costie&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;On with the article…&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;If you have been paying attention in the data and analytics space you may have noticed a shift towards a concept often referred to 
as&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-ldquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-ldquo"&gt;“Data&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;Lakehouse”. The technology essentially allows a structured database to be scaled with no limits by perfectly isolating compute from storage. This solves old problems and opens up a host of new possibilities, including AI and ML.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;If you think it’s a flash in the pan or something akin to the Data Lakes which have proliferated since the release of Hadoop going back to the late 2000s, then you should take another look at what is happening now.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;There are three technology trends that are converging for the purpose of solving some big problems that have plagued data management since, well, the invention of the DBMS going all the way back to Charles Bachman’s Integrated Data Store from the early 1960s.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" 
author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;What are these technology trends and what are these problems you ask?&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;The trends are:&lt;/span&gt;&lt;/div&gt;&lt;ol class="listtype-number listindent1 list-number1" start="1" style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Data Lakehouse&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(which&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;we have already mentioned)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Analytic Workspace&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Infrastructure as Code&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" 
author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;The challenges are mostly around bureaucracy and&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lsquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lsquo"&gt;‘&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lsquo"&gt;&lt;a class="attrlink" data-target-href="https://en.wikipedia.org/wiki/Happy_path" href="https://en.wikipedia.org/wiki/Happy_path" rel="noreferrer nofollow noopener" target="_blank"&gt;happy&lt;/a&gt;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://en.wikipedia.org/wiki/Happy_path" href="https://en.wikipedia.org/wiki/Happy_path" rel="noreferrer nofollow noopener" target="_blank"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;paths&lt;/a&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;’.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;The hard technical problems themselves have been solved since at least the early 2010s - anyone who genuinely needs&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-ldquo"&gt;&lt;/span&gt;&lt;span class=" 
author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-ldquo"&gt;“Big&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;Data” capabilities for whatever purpose can obtain these technologies to solve their problem. Companies like Google and Meta simply couldn’t exist if these problems hadn’t been solved.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;So what are the challenges I speak of, then?&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;It all comes down to a lack of&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lsquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lsquo"&gt;‘&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lsquo"&gt;&lt;a class="attrlink" data-target-href="https://en.wikipedia.org/wiki/Happy_path" href="https://en.wikipedia.org/wiki/Happy_path" rel="noreferrer nofollow noopener" target="_blank"&gt;happy&lt;/a&gt;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://en.wikipedia.org/wiki/Happy_path" href="https://en.wikipedia.org/wiki/Happy_path" 
rel="noreferrer nofollow noopener" target="_blank"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;paths&lt;/a&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;’&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(or&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;too many dilemmas or choices).&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;We need:&lt;/span&gt;&lt;/div&gt;&lt;ol class="listtype-number listindent1 list-number1" start="1" style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;A better&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lsquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lsquo"&gt;‘happy&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span 
class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;path’ for deploying Analytic Workspaces&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent2 list-number2" start="1" style="list-style-type: lower-latin;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Currently the act of deploying a new Analytic Workspace is at best a dreaded&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-ldquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-ldquo"&gt;“Change&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;Request” or is at worst a big one-off project&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;A better&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lsquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lsquo"&gt;‘happy&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;path’ for onboarding users into those Analytic Workspaces&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent2 list-number2" start="1" style="list-style-type: lower-latin;"&gt;&lt;li&gt;&lt;span class=" 
author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Currently it can be difficult and time-consuming to onboard a new user into a&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-ldquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-ldquo"&gt;“big&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;data” environment.&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent3 list-number3" start="1" style="list-style-type: lower-roman;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Often fiddly desktop software configurations are required, or the user has special needs, resulting in a long uphill trek. 
Those that can get to the top of the hill are often praised and seen as brave bureaucracy warriors - and a warning to others to avoid the same journey&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;A better&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lsquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lsquo"&gt;‘happy&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;path’ for securely obtaining, collaborating, and sharing data sets&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent2 list-number2" start="1" style="list-style-type: lower-latin;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;This might be the hardest challenge of all. Sure you might get some copy of the data. 
But good luck on getting it refreshed or better yet ensuring you are working with an authoritative source respected by the business data owners&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(who&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;are often disinclined to share their most precious assets and knowledge)&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;So why am I looking forward to Azure Synapse Gen3 then?&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Well, before I answer that question I should point out that there are plenty of other alternatives out there&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(including&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span 
class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;Azure Synapse Gen2).&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;I will list these alternatives and then explain what I think of them.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Oh, and I should point out - I have no idea what Azure Synapse Gen3 will actually entail. Everything I am writing here is based on pure speculation and conjecture. But my hypothesizing is informed by these factors:&lt;/span&gt;&lt;/div&gt;&lt;ol class="listtype-number listindent1 list-number1" start="1" style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Microsoft has a history of getting things right on their 3rd attempt and learning from their mistakes&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(and&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;the shortcomings of others)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Microsoft understands corporate governance&lt;/span&gt;&lt;span 
class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(i.e.&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;bureaucracy) in ways that other big players like Google and Amazon seem to lack&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(or&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;see as beneath them)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;All of the big players are aggressively investing in&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://www.forbes.com/sites/bernardmarr/2022/01/18/what-is-a-data-lakehouse-a-super-simple-explanation-for-anyone/?sh=7c3e601d6088" href="https://www.forbes.com/sites/bernardmarr/2022/01/18/what-is-a-data-lakehouse-a-super-simple-explanation-for-anyone/?sh=7c3e601d6088" rel="noreferrer nofollow noopener" target="_blank"&gt;Data Lakehouse&lt;/a&gt;&lt;/span&gt;&lt;span class=" 
author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;technologies:&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent2 list-number2" start="1" style="list-style-type: lower-latin;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Meta&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(Facebook)&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;is the pioneer here and has been investing in Apache Hive, Presto, and related open source technologies since 2010, and this stack continues to improve&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Amazon is investing in&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://hudi.apache.org" href="https://hudi.apache.org/" rel="noreferrer nofollow noopener" target="_blank"&gt;Apache Hudi&lt;/a&gt;&lt;/span&gt;&lt;span class=" 
author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(and&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;possibly&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://iceberg.apache.org" href="https://iceberg.apache.org/" rel="noreferrer nofollow noopener" target="_blank"&gt;Apache Iceberg&lt;/a&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Google is investing in&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://cloud.google.com/biglake" href="https://cloud.google.com/biglake" rel="noreferrer nofollow noopener" target="_blank"&gt;Google BigLake&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Snowflake is investing in, well&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="attrlink url 
author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://www.snowflake.com/en/" href="https://www.snowflake.com/en/" rel="noreferrer nofollow noopener" target="_blank"&gt;Snowflake&lt;/a&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(and&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://www.sigmacomputing.com" href="https://www.sigmacomputing.com/" rel="noreferrer nofollow noopener" target="_blank"&gt;Sigma&lt;/a&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;) and probably other stuff I’ve yet to be made aware of&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent3 list-number3" start="1" style="list-style-type: lower-roman;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;I won’t mention Oracle here except to say that I tend to think of Snowflake as Oracle 2.0&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb 
h-lparen"&gt;(I&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;have a lot of respect for Oracle - and many feelings of cognitive dissonance which extend to Snowflake)&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Microsoft&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(along&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;with&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://www.databricks.com" href="https://www.databricks.com/" rel="noreferrer nofollow noopener" target="_blank"&gt;Databricks&lt;/a&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;) has for quite some time been investing in Databricks’&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://delta.io" href="https://delta.io/" rel="noreferrer nofollow noopener" 
target="_blank"&gt;Delta Lake&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;I think it’s entirely possible that those other vendors will have something that eclipses Azure Synapse, rendering it obsolete. The possibility of disruption is always around the corner.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;But let me explain why Synapse Gen2 is quite impressive but also slightly lacking:&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;ol class="listtype-number listindent1 list-number1" start="1" style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;The Synapse Workspace is a browser-based environment. 
Once configured&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(along&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;with requisite&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://azure.microsoft.com/en-us/products/active-directory/#overview" href="https://azure.microsoft.com/en-us/products/active-directory/#overview" rel="noreferrer nofollow noopener" target="_blank"&gt;AAD&lt;/a&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;security groups), users may be onboarded to a single Synapse Workspace by simply adding them to a single AAD group.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;In one simple request, a Synapse Gen2 Workspace gives the user:&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent2 list-number2" start="1" style="list-style-type: lower-latin;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Access to an&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="attrlink url 
author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://www.techtarget.com/searchdatamanagement/definition/MPP-database-massively-parallel-processing-database" href="https://www.techtarget.com/searchdatamanagement/definition/MPP-database-massively-parallel-processing-database" rel="noreferrer nofollow noopener" target="_blank"&gt;MPP&lt;/a&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://en.wikipedia.org/wiki/Relational_database#RDBMS" href="https://en.wikipedia.org/wiki/Relational_database#RDBMS" rel="noreferrer nofollow noopener" target="_blank"&gt;RDBMS&lt;/a&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(i.e.&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;a super powerful SQL database optimized for analytical workloads)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Access to a Data Lake and Data Lakehouse&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" 
author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(Delta&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;Lake)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Access to Azure Data Factory&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(EL/TL)&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;including Data Flow&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(ETL)&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;and Wrangling Dataflows&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(business&lt;/span&gt;&lt;span class=" 
author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;Data Prep based on Power BI’s Power Query&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lsquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lsquo"&gt;‘M’&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;language)&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent3 list-number3" start="1" style="list-style-type: lower-roman;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;These are truly best-of-breed tools which come with a&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-ldquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-ldquo"&gt;“deep&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;bench”&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Access to Spark Python Notebooks along with horizontally scalable clusters which can be scaled to virtually any size&lt;/span&gt;&lt;span class=" 
author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(assuming&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;you have the $$$)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Access to Power BI&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent3 list-number3" start="1" style="list-style-type: lower-roman;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Another best-of-breed BI tool&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(the&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;only tool that is better is Qlik Sense - as I have written about in the past. 
Both Power BI and Qlik share the more flexible&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-ldquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-ldquo"&gt;“linked&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;models”&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lbracket"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lbracket"&gt;[as&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;opposed to the cube-based models that Tableau and MicroStrategy rely on])&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Azure tools like Azure Data Factory, Azure SQL Database, and Azure Databricks all have a committed&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(some&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span 
class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;may say zealous - which is a positive here) developer base - that’s a good thing&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent2 list-number2" start="1" style="list-style-type: lower-latin;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Unless you have a bit of a religion going with your technology, you will find yourself bowled over by people who&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;i&gt;are&lt;/i&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;religious about their technology&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(in&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;the tech industry this has always been the case, but it’s more explicit&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://youtu.be/2N2LG_pkQnU" href="https://youtu.be/2N2LG_pkQnU" rel="noreferrer nofollow noopener" target="_blank"&gt;these 
days&lt;/a&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;.)&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;So what’s wrong with Azure Synapse Gen2 then?&lt;/span&gt;&lt;/div&gt;&lt;ol class="listtype-number listindent1 list-number1" start="1" style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Deploying a new Synapse Workspace is a bit complicated and requires a lot of decisions around whether to use the SQL database or the Delta Lake&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent2 list-number2" start="1" style="list-style-type: lower-latin;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Dilemmas are the enemy of the Happy Path&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Onboarding new users can be made easy if you have set up the&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://azure.microsoft.com/en-us/products/active-directory/#overview" href="https://azure.microsoft.com/en-us/products/active-directory/#overview" rel="noreferrer nofollow noopener" 
target="_blank"&gt;AAD&lt;/a&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;groups correctly, but I think there is room for improvement here. Again there are more choices than I think are necessary&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;It’s difficult to share data with external parties&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;It’s not obvious as to whether we should be using the&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-ldquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-ldquo"&gt;“serverless”&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;Data Lakehouse SQL database&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(i.e.&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;Delta Lake) or the more mature MPP Dedicated 
SQL Pool&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent2 list-number2" start="1" style="list-style-type: lower-latin;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;This is in my view the biggest challenge of all for Microsoft to solve and the one I have highest hopes for&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;So what am I expecting from Gen3 then?&lt;/span&gt;&lt;/div&gt;&lt;ol class="listtype-number listindent1 list-number1" start="1" style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Quickstart templates&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(maybe&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;as&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://learn.microsoft.com/en-us/cli/azure/" href="https://learn.microsoft.com/en-us/cli/azure/" rel="noreferrer nofollow 
noopener" target="_blank"&gt;Azure CLI&lt;/a&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;scripts or&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://www.terraform.io" href="https://www.terraform.io/" rel="noreferrer nofollow noopener" target="_blank"&gt;Hashicorp Terraform&lt;/a&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;scripts) for common Azure Synapse Workspace patterns&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent2 list-number2" start="1" style="list-style-type: lower-latin;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;One thing that would be great is to have as inputs the Data Lake folders and Delta Lake tables that users should have access to, along with the appropriate permissions&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Another thing would be some way of better managing all the AAD groups that need to be created to accommodate the various roles within the Workspace&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb 
h-lparen"&gt;(e.g.&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;Data Engineer, Data Scientist, etc).&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;A simpler onboarding experience for new users&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent2 list-number2" start="1" style="list-style-type: lower-latin;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;If we could do away with the need to install Power BI Desktop&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(and&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;any other lingering desktop software requirements) that would be great&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent3 list-number3" start="1" style="list-style-type: lower-roman;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Hey I like Windows - but many Data Scientists work on Macs these days&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;A better solution for sharing 
data&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(like&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;Databricks’ Delta Sharing)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;A single unified DBMS based on Delta Lake - no secondary copies of data in MPP Dedicated SQL Pools&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(“single&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;version of the truth”)&lt;/span&gt;&lt;/li&gt;&lt;ol class="listtype-number listindent2 list-number2" start="1" style="list-style-type: lower-latin;"&gt;&lt;li&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;This is the biggest challenge of all&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;On that last point, I have a feeling MS 
is already moving in this direction.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;I believe this because they have already built out something they are calling&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-ldquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-ldquo"&gt;“&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-ldquo"&gt;&lt;a class="attrlink" data-target-href="https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-performance-hyperspace?pivots=programming-language-python" href="https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-performance-hyperspace?pivots=programming-language-python" rel="noreferrer nofollow noopener" target="_blank"&gt;HyperSpace&lt;/a&gt;&lt;/span&gt;&lt;span class="attrlink url author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;a class="attrlink" data-target-href="https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-performance-hyperspace?pivots=programming-language-python" href="https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-performance-hyperspace?pivots=programming-language-python" rel="noreferrer nofollow noopener" target="_blank"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;Indexes&lt;/a&gt;&lt;/span&gt;&lt;span class=" 
author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;”.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Backing up a bit, in case you forgot what a Data Lakehouse is, it is basically the pure separation of compute from storage. At its core, everything must be managed through documents and trusted actors.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;It’s a great idea, but comes with some trade-offs.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Sure it’s possible for a single vendor like Databricks or Microsoft to ensure data consistency by coordinating within themselves. But I am a wee bit skeptical if we have achieved this goal when it comes to multiple vendors writing to the same table at the same time at high frequency. 
Yes it’s possible to take advantage of HDFS file locking and whatnot, but I have yet to see a good demonstration that is as&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-ldquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-ldquo"&gt;“bet&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;your business” as a traditional SQL RDBMS&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(like&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;Azure SQL Database or Oracle or PostgreSQL) system can provide.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;In a sense this has always been the dirty little secret of NoSQL databases: NoSQL DBMSs lack managed integrity controls and rely on trusted client applications to manage integrity for them. This gives better performance, but you can’t manage data as a separate concern. 
Similar challenges follow the Data Lakehouse.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;To repeat: One of the primary goals of data management is to be able to manage data&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;u&gt;as a separate concern&lt;/u&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;. So the Delta Lake needs to up its game and Microsoft&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;i&gt;is&lt;/i&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;doing this.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Those HyperSpace tables however kind of muddy the waters a bit.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Yes, they are indexes, and yes they will improve query performance.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); 
text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;But there is no guarantee that other vendors will maintain these indexes because they are not part of the core Delta Lake protocol.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Nevertheless they do point us in the right direction&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(assuming&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;there is only one choice).&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Competition is good, but dilemmas… not so much.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;But what about all those other technologies I just mentioned?&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" 
author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Well the honest answer is I have done&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;i&gt;some&lt;/i&gt;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;hands-on evaluation of those tools, and in the case of the open source Apache stack I have very much lived in that world for quite some time.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;I’ve also dabbled a bit with Snowflake and Sigma and am very impressed with their data sharing capabilities.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;Quite frankly I could see Snowflake winning this game&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(if&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;it is a winner-take-all game) based solely off their approach to data sharing.&lt;/span&gt;&lt;/div&gt;&lt;div 
style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;And for the record, I would be more than happy to work with Snowflake and Sigma.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;I suspect, though, that all five of the vendors I mentioned above will continue to push their own vision and tech stack, and again, as a realist I would be happy to work with any and all of these technology stacks. I would even love the opportunity to make them work together&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lparen"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lparen"&gt;(with&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;the appropriate&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb s-lsquo"&gt;&amp;nbsp;&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb h-lsquo"&gt;‘happy&lt;/span&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;paths’).&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: 
rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;As a reminder, I am currently seeking work for 2023, starting in January.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;My contract ends in December, and I would love to hear from you if you are curious about this stuff or want to hire me on contract.&lt;/span&gt;&lt;/div&gt;&lt;div style="caret-color: rgb(0, 0, 0); text-size-adjust: auto;"&gt;&lt;span class=" author-d-1gg9uz65z1iz85zgdz68zmqkz84zo2qowz80zsz86z4z75zz88zaxf4z73zz71zz71zpi6z68zjz86zwaydlb3z69zz76zrz72zb"&gt;I can be contacted at: &amp;nbsp;neilhepburnjob@gmail.com or on LinkedIn at: www.linkedin.com/in/costie&lt;/span&gt;&lt;/div&gt;</description><link>http://hepburndata.blogspot.com/2022/10/happy-paths-why-i-am-looking-forward-to.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-1955366479085637484</guid><pubDate>Sun, 18 Jul 2021 18:41:00 +0000</pubDate><atom:updated>2021-07-18T14:41:34.672-04:00</atom:updated><title>Cletus and Koriolis</title><description>&lt;h1 style="text-align: left;"&gt;Cletus and Koriolis (by Neil Hepburn)&lt;/h1&gt;&lt;h3 style="text-align: left;"&gt;Preface&lt;/h3&gt;&lt;div&gt;This is a story about truth, beauty, and reality.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;h3 style="text-align: left;"&gt;Cletus and Koriolis&lt;/h3&gt;&lt;div class="separator" style="clear: both;"&gt;A long long time ago, sometime after 1000 BCE, there was a Greek youth who went by the name of "Cletus". 
Cletus, like his father, was a hard-working and ambitious fisherman. But Cletus was also a fiercely independent thinker who was always searching for his own light - a curious autodidact who enjoyed tinkering and&amp;nbsp;figuring things out on his own as much as possible, even if that meant making some mistakes.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;One day while practicing some sailing maneuvers on his father's boat he sailed a bit too far from shore and was blown far from home all the way to the North African coast. After spotting land and a natural beach he moored his boat and swam ashore to find someone who could help him find his way back home. With a few fish in hand [for payment in kind] he eventually found an Egyptian merchant who luckily happened to speak a little Greek and was willing to help him. Cletus asked the man if he could tell him where the source of the gods was so he could get his bearings. Cletus lived in a part of the Mediterranean where the winds constantly blew from the west in an eastward direction. So naturally the source of these gods was in the west. Cletus assumed the merchant knew his meaning. The Egyptian man looked confused and pointed in a certain direction and said "This is where the gods come from. This is where the sun rises. It is on the other side of the Nile." In this moment Cletus realized that this was indeed the direction he wanted to go and realized that the sun generally did rise in the opposite direction of the wind. But since he was mostly focussed on the winds while sailing he had not paid as much attention to this fact about the sun rising in the east. Cletus even vaguely remembered his own father mentioning something about this. 
Cletus thanked the man and returned to his boat.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;After reflecting on the advice he had just received, Cletus could feel the hairs standing up on his back as it sank in that the sun was a much better way to navigate than using the winds. Maybe he should have listened to his father more. But by figuring this out on his own he felt a certain connection to this knowledge that resonated with him. He felt this knowledge had brought him closer to the gods.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;While sailing on his way back home Cletus noticed a person treading water in the middle of the sea.&amp;nbsp; As he sailed closer he could see it was a woman who appeared both frightened and exhausted. Cletus pulled the woman to safety and gave her some time to catch her breath and relax. After some time the woman began to speak in a raspy voice that Cletus could barely make out. She was Phoenician but was surprisingly fluent in Greek - much better than the Egyptian merchant. She asked Cletus where he was going and Cletus told her the name of his town. The woman recognized it immediately and said, "Don't worry, I will get you home. But first, I want to repay you. Would you be willing to stop off on a small island where I know there are some supplies and we can rest for the night? Tomorrow you will be home."&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Cletus considered her offer and thought of his worried father. As it started to rain Cletus - in the moment - made a snap decision: he would take the woman to the island after all and receive his gift. 
Cletus knew in his heart-of-hearts that the rain was just an excuse for his father; He could not resist the woman's offer.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Upon arriving at the island the woman quickly ran off to a dense thorny area not far from shore. In that moment Cletus thought he had been duped. Fortunately, his luck changed again and the rain had begun to subside with the sun coming back out. But it was late in the day and Cletus began making plans for camping on the island by himself. Then, out of the corner of his eye he noticed something: The woman was returning, and in her arms was a small basket. She brought Cletus the basket and opened it to reveal an assortment of biscuits, and dried dates and figs. At once Cletus felt his appetite sated and immediately his mood changed from feeling bitter to feeling thankful and even joyful. "Thank you" he said "I thought you had abandoned me." "Oh," said the woman, "I would never abandon someone who just saved my life. This food is not the gift I was referring to earlier. This is just from a cache we Phoenicians keep in times like these in case of emergency. Your gift is coming later tonight, but it's not what you think it is - as you can see, I am too old for that&amp;nbsp; - so keep your clothes on. What I am about to show you is much much better than any of your childish fantasies."&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Later, after the sun went down and twilight had passed into dusk the woman stood up from the camp fire and said. "Are you ready for your gift?"&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Cletus nodded yes. The woman then said, "What I am about to show you is perhaps the greatest piece of wisdom the Phoenicians have ever received. 
You must swear an oath to the gods that if you ever reveal this secret then you will be punished by Koriolis." Cletus had never heard of Koriolis. It sounded Greek but maybe it was Phoenician or perhaps Etruscan? In this moment it did not matter. Cletus had always wanted to partake in such a ceremony and quickly and solemnly swore the oath of secrecy to the woman and in witness to Koriolis.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;She then took Cletus by the hand and walked him away from the fire towards the sandy shore where the night sky could be seen in all its majestic glory.&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;The woman looked up and then over to Cletus and said "Please stand in this direction." Cletus stood as she said. She then told him to "turn a little to the left." He turned.&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;"Now just a bit to the right."&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;He turned again.&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;"Perfect. Now look directly up in front of you. Do you see that cluster of stars over there?"&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Cletus took a moment and pointed to the Big Dipper (Ursa Major).&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;"Do you mean that?"&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;"Yes, exactly. That is Big Bear."&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;"Big Bear?"&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;"Yes, that is just a way of remembering it. You can see its body is there and its head is there. 
But call it what you want."&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;The woman continued: "Now, follow a line in this direction" - she drew a line in the sky from the Big Dipper (Ursa Major) to the Little Dipper (Ursa Minor) - "and this will take you to Little Bear".&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Cletus was feeling both excited and impatient. He just wanted the woman to get to the point.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;"Now you see Little Bear is like Big Bear in reverse. Now that we have found Little Bear we are almost done. You just need to look for these two stars." The woman's finger moved back and forth so Cletus could follow.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;"Now the last thing you need to do is focus on those two stars and find the point halfway between them. That point in the sky will always point you north. If you move towards this point you will be going north and it will get colder and you may even find snow. If you move away from that point you will find heat and sand. Oh, and the higher in the sky Little Bear is the farther north you are. And when you go south, Little Bear will be lower in the sky."&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;The woman then explained that because of this Little Bear was more reliable than the sun and the moon or even the wind for navigation. It was what allowed the Phoenicians to navigate and trade at great distances.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;Cletus had heard about the Phoenicians' mastery of the seas, but until then did not realize that it was connected to this. He once again felt the hairs rise on his back, but this time it was more intense. He could barely catch his breath. 
His mind was raging with possibilities.&amp;nbsp; He not only felt closer to the gods but felt as though he was communing with them in this very moment as he was staring at Little Bear. He felt like a demigod; He felt like Hercules.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;The next day Cletus reaffirmed his oath to the woman (and Koriolis) and returned home to console his father.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Over the years Cletus used his new-found knowledge to fish and trade at greater distances. He became legendary among his Greek companions but refused to give up his secret. He knew that this secret was both a blessing and a curse but refused to betray his oath. He wasn't actually afraid of the Phoenicians but &lt;i&gt;was&lt;/i&gt; afraid of the gods, which kept him honest. It was also incredibly difficult to live with such a profound secret. This secret had become a curse but Cletus was still too afraid to break his oath for fear of retribution from Koriolis.&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;In his frustration he did something he had not done in a long time - he decided he was going to sail as far away as possible. Instead of sailing past the Pillars of Hercules and then north (which was where the Phoenicians often headed), he considered sailing west or east. The idea of sailing west was in many ways the most intriguing to him, but he had never heard of a single sailor - even a Phoenician - who had ever returned; rumour had it that there was some deadly current or capricious god that prevented return. 
Instead, he turned his attention to the east.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;He had heard many interesting stories about the far east and made up his mind to head in that direction. He decided to follow coastlines as opposed to sailing as directly as possible. This was because although he knew the direction he was moving in and what his north-south position was using Little Bear, he could only determine that he was &lt;i&gt;heading&lt;/i&gt; east and could not know by way of Little Bear how &lt;i&gt;far&lt;/i&gt; he had actually gone. Eventually after several months (and many interesting stops) he found himself in a very distant land with very unusual customs and languages unlike anything he had heard or seen before. He wondered if this was where the exotic shiny and smooth piece of cloth he had seen earlier on this journey had come from. The people in this land also looked different from anyone he had ever encountered.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;His arrival upon shore was met with bemusement. Indeed this was the first time anyone there had seen someone who looked like Cletus, and everyone was staring at him. Some were even touching his arms which in that moment he realized looked quite sweaty and hairy. Cletus, being who he was, just went with the flow and found himself being led towards what appeared to be some kind of temple where he was offered some food and drink. The food was also a bit unusual - more steamed and boiled than he was used to - but tasty and satisfying all the same.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Over the course of the next few days Cletus began to learn the local language as they learned his Greek. Eventually an old man with long white hair appeared holding a box about the size of a melon. 
The old man asked Cletus if he knew what was inside the box. Cletus responded by shaking his head. The old man said "But how did you get here then?". Cletus was stunned. He had no idea what the old man was talking about. The old man then asked plainly "How did you get here then? What is your secret?"&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Cletus froze. He was confused and frightened and didn't know what to do.&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Were the gods playing a trick on Cletus?&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Cletus took a moment to collect his thoughts. He then responded to the old man: "Lord, I do indeed have a secret I will admit. However, I had sworn an oath to the gods that I would not reveal this secret. But if you already know my secret, then tell me what you think it is and I will confirm."&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;The old man considered this and took Cletus to be an honest man. So the old man said to Cletus "All right then, I will tell you this: 'I have a tool that can point us to the true source of all energy.'" Cletus didn't entirely understand what he meant by this but asked the man if he could use this tool to point in the direction of this energy source. The old man then opened the box to look inside and then pointed in a northerly direction.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Cletus could feel his chest tighten. He never imagined this would ever happen. The old man clearly knew his secret or at least how to extract its bounty which was good enough for him. Cletus felt for the first time that he had been released from his oath. 
Oh how long he had waited for this.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Cletus then said to the man "I think you know my secret. But I will make it clear to you how I came to this place." They both waited until the sun had gone down and the sky had cleared. Cletus then revealed Little Bear and how he used it to navigate such great distances. Cletus made it clear that although he could point himself east or west, he could never be sure how &lt;i&gt;far&lt;/i&gt; east or west he had actually traveled. But Little Bear's guidance was enough to allow him to confidently make this long journey east and then back home again.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;The old man looked up and then down at Cletus and exclaimed "Thank you! Thank you!" Then the old man began to weep. Cletus was confused. After a few minutes the old man regained his composure and explained his emotional state: "Cletus, we have always wondered if this day would come. You have solved the mystery of the lodestone - a most powerful and mysterious tool. You have guided us to its true source. We always knew these lodestones were divine. But we never realized until now that it was pointing to this Little Bear in the sky. It all makes perfect sense now. Thank you Cletus."&lt;br /&gt;The old man then opened the box to reveal a small jar filled with water and a tiny needle floating on top. He then demonstrated that no matter what direction he turned the box the needle would always point in the same direction: back to the general direction of Little Bear.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Cletus could hardly believe his eyes. The hairs stood up on his back and shivers ran down his spine. 
In that moment Cletus knew he had just discovered something with immense power, but more importantly he had discovered a tool he could hold in his hands that would guide him to the source of the gods' power even during the day when Little Bear could not be seen. His hands were now shaking.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;After learning how to locate more lodestones Cletus returned home with his newfound compass and the knowledge of how to make more.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Upon returning home Cletus organized an assembly and proceeded to demonstrate and explain how the compass and Little Bear worked and also how he had come to discover them through an exchange of secrets in a distant land. The crowd was skeptical at first but was quickly convinced after a few demonstrations coupled with observations of Little Bear. This day would never be forgotten. It was the day that changed history. Henceforth dates were recorded as either "Before Cletus" or "After Cletus".&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;A new Age of Discovery commenced with many expeditions heading north - the most obvious direction given that this was where the source of the gods lay. While numerous sailors became lost or froze in the extreme cold, many others returned with exquisite treasures. Amber jewels were most highly prized but there were many other exotic items that also commanded people's attention. A sled with skis was particularly fascinating to those who lived in the hills. Most people just loved the smoked fish and delicious honey that quickly became new culinary staples. 
The economy was booming with trade.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;One day Cletus heard a story from another town that concerned him. A young sailor, after drinking too much wine during a local festival, blurted out that he thought the gods had erred by giving away their secrets and that he was happy for this mistake. What the sailor did not realize - having been at sea for so long - was that not only had the economy changed but people's core beliefs and daily rituals had too; there were new stories and ceremonies and festivals that paid tribute to Little Bear. Priests and philosophers seemed to be in agreement for once, with priests extolling the poetry and beauty of the compass and Little Bear and philosophers remarking on how elegant and simple the compass and Little Bear were. Harmony had been achieved and anyone disrupting this harmony was an unwelcome muckraker.&amp;nbsp; The young sailor who had too much to drink learned this the hard way and was drowned at sea for his blasphemy against the gods.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;As time passed Cletus began to suspect the gods were playing tricks on him. He wondered if the god Koriolis&amp;nbsp;was punishing him for revealing the secret of Little Bear. It wasn't Cletus' fault - he had, and still had, good reason for believing the old man already knew his secret. So Cletus began to wonder if Koriolis&amp;nbsp;was playing a more devious trick on him. He constantly thought back to that fateful day when he swore that oath to the Phoenician woman and Koriolis. He thought back to the earlier part of the day when he had made a connection between the winds blowing from the west and the sun rising from the east. He wondered if the gods had deliberately distracted him from these other phenomena. 
He thought back to the story of Jason and the Argonauts and how Jason's shipmates were nearly seduced and killed by the Sirens and their entrancing song. Perhaps, he thought, Little Bear and the compass - with their elegant symmetry - were a kind of elaborate siren song.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Over the next year Cletus returned to his trusty boat and set off on a new journey. He began travelling around with a special logbook and started taking notes about the wind and sun. In particular Cletus noted down wind direction in various locations. He also began measuring the shadow cast by the mast of his boat and noticed that its length, even at the sun's highest point, changed depending on how far north or south he happened to be. After several months of surveying, a new picture formed in Cletus's head and he returned home once again to relay his findings.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Being famous, Cletus was able to gather a much larger assembly than ever before. Most people were just glad to see him back in town after being away for so long. Cletus began.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;"Good people who have joined me on this fine day. I have once again been on an incredible journey. But this journey was different from my first journey - which was indeed the will of the gods. However, this journey came from my own will. 
I do not know if the gods will even approve of what I am about to tell you, but as you know the gods can be deceptive and I believe I have uncovered their greatest deception."&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Cletus took a deep breath and continued.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;"You see I have been measuring the height of shadows and direction of wind from the farthest points north I could sail - even hitting ice - to the hottest parts of Africa I could tolerate. I have come to believe that we are not sitting on a flat plane as it would appear but rather the world is more shaped like an egg or melon perhaps, and that this egg or melon is rotating in a single direction. That is why the sun rises and falls and why the wind blows as it does. With this new understanding, we may be able to find a way of going east and west with the same confidence we have traveling north and south. We may even be able to venture into the great western mystery."&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;A man in the crowd shouted back - "Well show us the proof then!"&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Cletus then held up a large round melon covered in knife-marked dashes, all in various directions.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;He then held up an orange and began to rotate the melon.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;He explained that this is why the farther north you go the longer the shadow and the farther south you go the shorter the shadow. He then explained that this would also cause the wind to blow in various directions, but that there was a pattern. 
He admitted, though, that he didn't understand exactly why the pattern was the way it was - only that it probably had something to do with the rotation of the world, or of the melon in his demonstration.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;The congregants were becoming restless. They could not understand what Cletus was saying and could not make these connections. One brave girl stepped forward and said in the most respectful voice: "Cletus, you are indeed a wise man. And we have benefited much from your wisdom. We want to believe you. But the gods have a purpose for us. We can see that purpose in Little Bear and the compass and behold its divine elegance and symmetry. But we are struggling to see the purpose or even beauty to what you are showing us here. Perhaps you are correct, but to our eyes this just looks like clutter and noise - not the divine elegance you once revealed to us.&amp;nbsp; The gods are beautiful, your pock-marked melon is not."&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Cletus felt hurt. He had expected his discovery to be embraced with the same curiosity&amp;nbsp;&lt;i&gt;he&lt;/i&gt;&amp;nbsp;felt years ago after realizing there was a connection between the wind and the sun. 
But as he looked at the melon in his hand - now beginning to rot under the heat of the sun - he realized in that moment Koriolis&amp;nbsp;had got the better of him.&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both;"&gt;Fin.&lt;/div&gt;</description><link>http://hepburndata.blogspot.com/2021/07/cletus-and-koriolis.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-3453507397854839951</guid><pubDate>Sun, 30 May 2021 19:30:00 +0000</pubDate><atom:updated>2021-05-30T15:31:00.579-04:00</atom:updated><title>Analytic Efficiency Part 3: Why Qlik's Associative Index is the Biggest Little Analytics Invention Over the Past 30 Years</title><description>&lt;p&gt;The headline of this post is a quote borrowed from the Canadian author&amp;nbsp;Witold Rybczynski who, when asked by the New York Times what the greatest invention over the past millennium was, responded that it was the screwdriver (I would have chosen the computer - but I'm not very good with home renovations), and went on to say that the Robertson screwdriver in particular was the "biggest little invention" over the past 100 years.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/Robertson_screw_ad.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="363" data-original-width="600" height="242" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/Robertson_screw_ad.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;p&gt;If you live outside Canada, or do not often handle screws, you may never have heard of the &lt;a 
href="https://www.thecanadianencyclopedia.ca/en/article/robertson-screwdriver-the-biggest-little-invention-of-the-20th-century-so-far-feature"&gt;Robertson screwdriver&lt;/a&gt; or the story behind it. I will get into the details of that story near the bottom of this post. But for now all you need to know is that the Robertson screwdriver is widely used in Canada and has only recently begun picking up traction in the US, even after being on the market for over 100 years. This is very odd since the Robertson screwdriver does the job of screwing much better than any other screwdriver out there.&lt;/p&gt;&lt;p&gt;Qlik, I would say, does a better job at analytics than any other tool out there, but like the Robertson it (or at least its underlying innovation) has failed to take over the world, which is also very odd.&lt;/p&gt;&lt;p&gt;I believe the reason for this can be summed up as a lack of connoisseurship. Most people cannot concisely explain what analytics is or how it differs from related concepts like statistics and 'data science'.&lt;/p&gt;&lt;p&gt;Analytics is in essence a game of '&lt;a href="https://en.wikipedia.org/wiki/Twenty_questions"&gt;20 questions&lt;/a&gt;', an endless stream of why why why, which can be boiled down to yes/no questions. While most analytic tools allow the analyst to interactively see the 'yes' answer to any given question, among the tools I have used only Qlik's Associative Index allows the analyst to get to the 'no' answer. This allows the analyst to keep asking questions without having to pivot to Data Wrangling and Data Prep work, and may thus sustain a "flow" state of mind.&lt;/p&gt;&lt;p&gt;It is for this reason I see Qlik's Associative Index as being like the Robertson screwdriver: They were both purpose built for their task, more so than their competitors' offerings. Why is it then that both of these tools have not done better?&lt;/p&gt;&lt;p&gt;At this point you might be a bit confused. 
I hope by the end of this article this will make a bit more sense to you, and in the process I also hope to share with you how I have come to understand the meaning and essence of what "analytics" actually is.&lt;/p&gt;&lt;p&gt;This is the third and final post on this topic. In this post I wish to discuss three sub-topics:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;What history can teach us about the nature of tools&lt;/li&gt;&lt;li&gt;The psychology of analytics – as I have seen it and experienced it&lt;/li&gt;&lt;li&gt;How #1 and #2 can be understood through the lens of Power BI vs. Qlik&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;h1 style="text-align: left;"&gt;What History Can Teach Us About the Nature of Tools&lt;/h1&gt;&lt;p&gt;I have published this as a standalone article &lt;a href="http://hepburndata.blogspot.com/2021/05/what-history-can-teach-us-about-nature.html"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Later in this post I will elaborate on my analogy to screwdrivers, but if you read this other article I published a few weeks back I hope to also convince you that Qlik is more like “WordStar” and the “Mongolian horse” whereas Power BI is more like “MS Word” and the “European horse”, respectively.&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;h1 style="text-align: left;"&gt;The Psychology of Analytics – In My Travels&lt;/h1&gt;&lt;p&gt;In the first article in this series, I explained how I see Analytics as a game of twenty questions - or even a debate -&amp;nbsp;that continually feeds back on itself: You ask a question which provides an answer that leads to another question and another answer and so on until you see no point in asking more questions. 
Is there a pattern to these questions that can help us choose a tool that fits these patterns?&lt;/p&gt;&lt;p&gt;The overarching pattern in analytics is that when asking and answering questions, we are constantly making comparisons to find what we’re looking for. The syllogism itself – which underpins analytical thinking – is the process of comparing the Major Premise (e.g. ‘All persons are mortal.’) to the Minor Premise (e.g. ‘Socrates is a person’) based on their middle term (e.g. ‘Person’) to see if they connect to trigger the conclusion.&amp;nbsp; All questions can be boiled down to ‘Yes’ or ‘No’ (this is known as the “law of excluded middle” in the so-called “&lt;a href="https://en.wikipedia.org/wiki/Law_of_thought"&gt;laws of thought&lt;/a&gt;”).&amp;nbsp; All computer code basically works like this.&lt;/p&gt;&lt;p&gt;However, unlike computer code – which tends to follow the logical trajectory of “true” statements, while usually ignoring false or negative conditions&amp;nbsp; – analytics is as concerned with negative outcomes as it is with positive ones.&lt;/p&gt;&lt;p&gt;The game of “20 Questions” is played out by making a series of comparisons where both the positive and negative answers guide us. 
Business Analytics is no different.&amp;nbsp;&lt;/p&gt;&lt;p&gt;Is there a pattern to the types of comparisons people make?&lt;/p&gt;&lt;p&gt;I have observed and participated in hundreds of analytical exercises – everything from quick ad hoc reporting to working alongside statisticians, to building predictive analytics systems, to building enterprise data warehouses – and having observed the types of questions business decision makers ask and answer (how they play the game of Analytics), I believe there are two main patterns which stand out and are important to understand.&lt;/p&gt;&lt;p&gt;These two main analytical patterns roughly align with management level:&lt;/p&gt;&lt;div&gt;&lt;div&gt;First, the “&lt;u&gt;Drill-down and Compare&lt;/u&gt;” or simply “&lt;b&gt;&lt;u&gt;Drill&lt;/u&gt;&lt;/b&gt;” pattern: Managers and above are generally interested in making comparisons between aggregates.&amp;nbsp; Typically, a manager will start with the “big rocks” (e.g. Total Revenue) and then begin drilling down by whatever dimension (e.g. time, location, product category, etc.). When drilling down, the user is making comparisons between groupings.&amp;nbsp; These comparisons are typically visualized in bar charts, line charts for trending, or pie or treemap charts for share-of-total.&amp;nbsp; Most analytics at this level is concerned with finding outliers (e.g. a spike in sales for a given date) and then drilling down to the details to learn more, although this may require a hand-off to other analysts, which we will discuss in a moment.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Managers are also interested in patterns, especially trends over time. Most of the time the manager analyst just wants to know that everything is running as usual – like a heartbeat monitor. If they look at the same report on a regular cadence (e.g. 
monthly), they will grow accustomed to how the charts should appear, so if there is a shift in the pattern they will notice it and begin asking follow-up questions. These follow-up questions often get handed over to other analysts.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Managers and directors are for this reason fond of interactive dashboards and self-serve reports, and this is where most money is spent with respect to Business Analytics and Data Warehousing. That spending goes to tools like Power BI, Qlik, and Tableau, as well as the underlying data warehousing and data prep tools from vendors like Oracle, Amazon, Google, Informatica, Databricks, Cloudera, Teradata, Microsoft and so forth. This BI+DW (Business Intelligence + Data Warehouse) spending accounts for a significant percentage of overall IT spend.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;While managers and directors appreciate interactive dashboards, executives and board members on the other hand often prefer more digested insights. People working at this level expect all questions to already be answered. If there is an outlier or pattern anomaly the report will usually provide an explanation upfront.&amp;nbsp; Tools used to provide these types of reports are often just presentation tools like MS PowerPoint or graphic design tools like Adobe Photoshop.&amp;nbsp; The costs associated with these reports are mostly for the people who develop these custom reports and infographics. Analytics at this level ultimately plays out in an executive board room with a presentation followed by a dialectic discussion and possibly debate.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Getting back to managers and directors, this is where most money is spent on analytical tools. As mentioned above, managers will routinely find outliers and pattern anomalies through a “Drill-down” analysis, but uncovering the deeper reason for the anomaly or change in pattern requires shifting gears. 
Managers will often either hand off the follow-up questions to other analysts or they themselves will switch gears and begin using… Excel.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Recalling our game of ‘twenty questions’, most tools do well when the answer to a question is ‘yes’, since we can keep drilling down into that selection. But when the answer is ‘no’ most of these tools run into a problem because they are not designed to examine the excluded values or ‘negative space’ as it were.&amp;nbsp;&lt;/div&gt;&lt;div&gt;What is negative space?&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;The poet &lt;a href="https://en.wikipedia.org/wiki/John_Keats"&gt;John Keats&lt;/a&gt;&amp;nbsp;coined a similar term “&lt;a href="https://en.wikipedia.org/wiki/Negative_capability"&gt;Negative Capability&lt;/a&gt;” to describe negative space as a source of mystique and wonder that poets and artists could explore and contemplate, often romantically.&amp;nbsp;&lt;/div&gt;&lt;div&gt;In the world of science, filling negative space drives all endeavours. It is the reason you keep hearing about “dark matter” and “dark energy” and why we are reminded that Einstein’s theory of relativity must ultimately be false (but like Newton's laws, is still useful).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Fortunately, however, the material nature of most organizations is recorded in structured datasets. These datasets are finite and therefore it is physically possible to probe the ‘negative space’ when asking questions. However, most analytical tools – and data warehouses – are not purpose built for this task. 
This means that humans are doing most of this work.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This brings me to the second pattern, the “&lt;u&gt;Contrast and Compare&lt;/u&gt;” or simply “&lt;b&gt;&lt;u&gt;Contrast&lt;/u&gt;&lt;/b&gt;” pattern: The Contrast pattern normally occurs following the Drill pattern and is concerned with exploring and contrasting negative space. The Contrast pattern tends to follow one or both of the following processes:&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;Isolate a relevant sub-population (e.g. Toronto and Ottawa) within a parent grouping (e.g. Ontario)&lt;/li&gt;&lt;li&gt;Contrast the sub-population to its complement: e.g. Average Home Price in Ottawa + Toronto vs. the rest of Ontario&lt;/li&gt;&lt;ol&gt;&lt;li&gt;Allows us to see the difference between housing prices in urban metropolitan regions versus the rest of Ontario&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;OR:&lt;/div&gt;&lt;div&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;Take two populations (e.g. Rentals during winter versus rentals during summer)&lt;/li&gt;&lt;li&gt;Determine what is in the first population that is not in the second and vice versa&lt;/li&gt;&lt;ol&gt;&lt;li&gt;Allows us to appropriately target summer-only customers or winter-only customers or all-season customers with specific and relevant ad campaigns&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;When it is easy to isolate sub-populations by simply filtering on a small number of fields then we need not go beyond the “Drill” pattern and many analysts may use dashboards in this way. They will only be able to see differences in the aggregations (e.g. total number of sales by Month), but this can still be revealing.&amp;nbsp; For example, a telecom analyst who suspects that precipitation (e.g. 
rain) might be causing a spike in call volumes might filter on all the known dates over the past 12 months on which it rained or snowed. The analyst would then filter on the complement – all days where it did NOT rain or snow – and then compare these sub-populations with respect to standard measures like Call Volume and Average Handle Time to understand the differences the change in weather pattern might generate. The answers to these questions could lead to cost saving improvements in staffing and revenue generating improvements in sales.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Another common variant of the “Contrast” pattern is when the analyst is handed two sets of data and simply told to find the deltas and then report on those deltas based on a comparison key.&amp;nbsp; For example, an analyst might be given two lists: A list of customers from ten years ago and a list of customers from the present day. The analyst might be asked to produce a report comparing the loyal customers who appear on both lists to the newer customers who are only on the present-day list and to the previous customers who have since left. The report could lead to a reduction of costly customer “churn”.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;There is another word – “Parallax” – that also describes this pattern. In optics, the parallax effect is when an object that is nearer to us appears to move faster than an object that is far away as our own vantage point shifts. By exploiting this property, it is possible to measure the distance to distant objects without traveling those distances. The philosopher/comedian Slavoj Žižek uses the word parallax to explain how “truth” can only be teased out through the types of comparisons that reveal differences. 
More practically, when we are making assessments about the world, like reading the news or researching a new car or other big-ticket purchase, we are attempting to find contradictions between perspectives and distill from these contradictions what is going on. For example, we might read some good reviews of a new TV and then one or two bad reviews and by focusing on the differences we might find out that the good reviewer overlooked something that the bad reviewer complained about, and we might go on to see that the bad reviewer was simply complaining about a bad installation experience they had. Or maybe the good reviewer only cared about price and was not as attuned to picture quality as the bad reviewer was.&amp;nbsp; It is from these contradictions, this parallax effect, this contrast, that we can uncover useful insights.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This yes/no approach might seem binary or simplistic, and to be sure Analytics can be wielded cynically (as it all too often is), but with enough questions you can begin to see a spectrum and other patterns emerging. With enough questions deep insights and epiphanies begin to emerge.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The visual that comes to mind for the Contrast pattern is the Venn diagram, but the Contrast process tends to be table-centric because the analyst is mostly relying on either Excel or possibly a Data Prep tool like SQL, Python, Power Query, or Alteryx to perform the detailed comparisons between sets.&lt;/div&gt;&lt;div&gt;What we see then is analysts spending most of their time in Data Prep tools (especially Excel) pulling sub-populations either through delta comparisons (often using the ‘VLOOKUP’ function in Excel) or by filtering on subsets of data and then using a delta comparison (again using VLOOKUP) to extract the complement of the sub-population they were filtering on (e.g. 
the complement of all dates on which there was precipitation).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Most Data Prep in the business world is done in Excel using the VLOOKUP function, but much of it is done in SQL as well, using the NOT EXISTS or NOT IN clause. Data Prep can also be done in more business-friendly data prep tools like Power Query (built into Power BI) which has an “anti-join” operation as part of its “merge” step.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It should be pointed out though that Data Prep is essentially programming and requires a shift in thinking. When doing Data Prep the analyst/developer can certainly enter a flow state, but they are no longer in the Analytical flow state playing 20 questions. Instead their mind is in the world of syntax and code structure – even using VLOOKUP may require a reorganization of columns which can be time consuming and distracting.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;An analogy I use to explain the relationship between Data Prep (which is more like programming) and Analytics (which is more like a game of 20 questions) is based on this quote commonly attributed to Abraham Lincoln: “If you give me six hours to chop down a tree I will spend four hours sharpening my axe.”&amp;nbsp; You can think of Data Prep as axe sharpening and analytics as wood chopping.&amp;nbsp; The ratio of 2:1 between Data Prep and Analytics would not be bad, but what I see is something closer to 5:1 or even 10:1. Ideally, we want to be spending more of our time chopping and less of it sharpening if possible. The reason is that with pure analysis (chopping) you can more easily achieve a flow state while also collaborating with others. 
Data Prep (sharpening) on the other hand is a more solitary activity and while you can attain a flow state your focus is directed at answering technical rather than business questions.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;This is the reason Qlik stands out: it is possible to contrast data sets interactively, because Qlik’s indexing technology – the Associative Engine – was built from the ground up to support the selection of excluded values. With every other Business Intelligence tool I have used, excluded values are an afterthought and not easily accessed.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Using our example from above, if we want to compare days with precipitation to days where there was no precipitation, in Qlik we can interactively select the complement of dates. We can even restrict our complement to be within a given subset (e.g. for a given year and region). From here we can contrast customer patterns for days when there was precipitation to days where there was not.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Furthermore, QlikView (not Qlik Sense unfortunately) allows users to save the subset of data that has been selected. This means that after selecting days with precipitation (or we could choose no precipitation), we can then reduce the dataset to what is associated to those dates. From there we could choose another dimension – say “temperature” – by which to partition the data in order to inspect that contrast. 
This would allow us to see the contrast between above-freezing and below-freezing days to see if a pattern emerges.&amp;nbsp; We could then go back up a level to the original dataset and segregate the original dataset using temperature to see if the same patterns hold as when contrasting within the precipitation subset.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;By contrasting these patterns with respect to precipitation and temperature, as an analyst we can begin to see the relationship between these dimensions and other dimensions. Depending on what differences emerge between the subsets, we can choose to partition, contrast, and then take a subsequent subset.&amp;nbsp; We can keep playing this game of twenty questions until we have answered our questions or at least satisfied our curiosity.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Alternatively, if we want to compare multiple Customer lists to see what customers are only in one and not in another, we can do so interactively, and again we can easily isolate subsets of customers and compare them looking for patterns.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It is of course possible to accomplish any of these tasks using other tools like Excel or Power BI (via Power Query) or SQL or Python, but one cannot do so interactively.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I should point out that while performing a “Contrast and compare” activity, on some occasions going back and “sharpening the axe” through Data Prep is required. 
But in many circumstances this is not necessary, and if the data has been well prepped you can ask and answer many questions interactively by leveraging the “excluded values” supported by Qlik’s Associative Index.&lt;/div&gt;&lt;div&gt;In my own experience I have leveraged this power to significant effect and there have been occasions where I was able to sustain a “flow state” for up to an hour of constant Q&amp;amp;A through a series of dimension selections, because I was able to easily contrast selections to one another and see what was exposed and what was hidden by doing so.&amp;nbsp; On these occasions I would often find “needles in haystacks” and the feeling was not unlike debugging a computer program, except that I felt like I was debugging the business. And debug I did. Some of these insights did lead to noticeable changes in how the business was run.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Taking the two patterns together – the “Drill” pattern and the “Contrast” pattern – there is an analogy that I find quite useful for thinking of these as part of a cycle.&lt;/div&gt;&lt;div&gt;Namely, there is a concept known as the &lt;a href="https://en.wikipedia.org/wiki/OODA_loop"&gt;OODA loop&lt;/a&gt; (credit goes to &lt;a href="https://ca.linkedin.com/in/alistaircroll"&gt;Alistair Croll&lt;/a&gt; for introducing me to this) which was developed by US Air Force Colonel &lt;a href="https://en.wikipedia.org/wiki/John_Boyd_(military_strategist)"&gt;John Boyd&lt;/a&gt;. The OODA loop was developed by Boyd for the purpose of fighter jet combat. 
Boyd realized that the rules of combat had changed with heavy supersonic jets and that a different mode of thinking was required to use them effectively in combat.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;OODA stands for:&lt;/div&gt;&lt;div&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;Observe&lt;/li&gt;&lt;li&gt;Orient&lt;/li&gt;&lt;li&gt;Decide&lt;/li&gt;&lt;li&gt;Act&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;In the context of fighter jet combat the pilot might “observe” an opponent and then, instead of engaging them in the moment, would begin “orienting” their plane and mindset such that the pilot has the best available information to “decide” and “act”. The decision might be to attack or just as likely to move to another defensive position.&amp;nbsp; If the pilot moves to a new position, they would repeat the OODA loop cycle until either defeating the opponent or retreating.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;The most important aspect of the OODA loop – and what distinguishes it from other business cycles like the “plan-do-check-act” lifecycle – is the emphasis on &lt;u&gt;orientation&lt;/u&gt;. 
It is the “orientation” step that brings to bear the subjective experience and talents of the pilot to maximize their strength.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Relating this back to the Analytics patterns I described above, I see the “Drill” pattern as similar to the “Observe” step and the “Contrast” pattern as similar to the “Orient” step.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It stands to reason that if we can make the Orient step – which is the bottleneck of the OODA loop – more efficient, then we can make analytics more efficient.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There was another tidbit of wisdom that Alistair Croll also mentioned when discussing the OODA loop which relates back to my earlier points regarding horses and word processors: Once you cross a threshold of efficiency the game itself can begin to change.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The example Croll used was the steam engine that James Watt designed.&amp;nbsp; Watt’s new steam engine design, developed in the late 1700s, is generally regarded as one of the key factors that catalyzed the industrial revolution during the 19th century. However, when it was first introduced some people speculated that it would cause a depression for coal miners because only a quarter of the coal was required to produce the same amount of “horse power” (a unit of measure that Watt also invented).&amp;nbsp; Instead, because Watt’s steam engine had crossed an efficiency threshold it could be used for all sorts of businesses that otherwise would not be able to afford the cost. 
As a result coal mining began to boom like never before.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The rest as they say is history.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;h1 style="text-align: left;"&gt;Difference Between Qlik and Power BI and Excel When Contrasting Data&lt;/h1&gt;&lt;div&gt;In this section I will be contrasting the tools themselves to show how they differ when used to contrast data.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In this example we will continue with the Call Centre data, although I have created a new fictional dataset with more attributes to compare.&amp;nbsp;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;h2 style="text-align: left;"&gt;Differences in Qlik, Power BI, and Excel Explained&lt;/h2&gt;&lt;div&gt;The main difference between Qlik and Power BI when it comes to contrasting data is this:&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Qlik can be more efficient for the user than Power BI (or Excel) for contrasting data because Qlik allows the user to contrast data sets interactively by clicking buttons which can be performed while in a “&lt;a href="https://en.wikipedia.org/wiki/Flow_(psychology)"&gt;flow&lt;/a&gt;” state of consciousness and focus on the business questions.&amp;nbsp;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Power BI on the other hand can do some of this under some circumstances but in other circumstances requires the user to build new charts and in some cases it may require a change to the data model itself. 
Because different types of solutions may be required – some taking longer than others – it’s harder to achieve a “flow” state of consciousness while playing the game of 20 questions.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Excel – through the VLOOKUP function – is more consistent in its approach to reconciliation than Power BI although it requires the Excel user to effectively perform some basic Data Prep. Because of this consistency users can get into a “flow” state and focus on business questions, but unlike Qlik – which keeps the analyst in a Business Analytics flow state – Excel forces the data analyst down to a Data Prep flow state, distracting from our game of 20 questions.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Let’s first explore these differences between tools using a very simple example and then move to a more realistic example in the next section.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For each of these demos we are comparing two lists [of calls] to find the differences between the two sets. 
We are using two pairs of datasets for this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;Related one-to-one&lt;/li&gt;&lt;li&gt;Related many-to-many&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Let’s see how the “contrast” exercise appears for each tool, starting with Qlik then Power BI and finally Excel.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_reconcile_calls_one_to_one.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="384" data-original-width="390" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_reconcile_calls_one_to_one.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_reconcile_calls_one_to_one_b.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="373" data-original-width="392" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_reconcile_calls_one_to_one_b.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In this example with Qlik we need only click the “select all” icon (highlighted in yellow and denoted by a double checkmark).&amp;nbsp; Doing so instantly reveals excluded values (3 missing from CCA and 3 missing from CCB).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If the lists were not so compact and instead had thousands (or millions) of values, we could click 
the “select excluded” button to instantly see the excluded values. Furthermore, we could simply select these excluded values to explore their distribution with respect to other metrics.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The following screen caps show that there is no difference when the relationship is many-to-many:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_reconcile_calls_many_to_many.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="372" data-original-width="425" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_reconcile_calls_many_to_many.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_reconcile_calls_many_to_many_b.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="367" data-original-width="415" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_reconcile_calls_many_to_many_b.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;As you can see, there is no difference in behaviour. The user doesn’t have to think and can follow the same “happy path”.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Although it is not necessary in this example, if we wanted to analyze the subset of calls that are excluded, in QlikView we can perform a “reduction” which effectively removes everything except our selections. 
The screen cap below shows how this option is selected and how the data appears following selection.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_reduce_data.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="59" data-original-width="580" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_reduce_data.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;Here is how it appears.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_reconcile_after_reduction.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="116" data-original-width="379" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_reconcile_after_reduction.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Now let’s see how this compares with Excel.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We use the VLOOKUP function which performs a lookup against another table and then retrieves a corresponding row value based on a match with the key column value.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Here is how we have implemented the function as used in the screencaps below:&lt;/div&gt;&lt;div&gt;=VLOOKUP(A4,CCB!$A$2:$B$15,1, FALSE)&amp;nbsp;&lt;/div&gt;&lt;div&gt;This shows how the one-to-one linked tables appear when contrasted in Excel using the VLOOKUP function:&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: 
center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/excel_reconcile_calls_one_to_one.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="391" data-original-width="618" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/excel_reconcile_calls_one_to_one.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here we go again with the many-to-many example:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/excel_reconcile_calls_many_to_many.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="429" data-original-width="544" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/excel_reconcile_calls_many_to_many.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Speaking for myself, I do not find the function intuitive to use, and if I haven’t used it for a while I will forget to hit ‘F4’ to make the second argument absolute so that it does not change from row to row when auto-filling the cells below. But most Excel jockeys have no problem with these quirks and can wield it relatively quickly. 
Given the initial difficulty with VLOOKUP and how popular it is, I’ve often thought of VLOOKUP as a kind of analytical rite-of-passage.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Regardless of how proficient you are at performing a VLOOKUP, it is essentially a Data Prep task, and you may need to reorder the columns in the data itself so that the key field sits in the first column of the lookup range before you can begin to use the function.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;With Qlik the user can interactively select excluded records with a single click (or maybe 2 at most), thereby preserving one’s working memory (we can keep track of about 7 things in our head at once – known as Miller’s magic number). This lets business questions guide the user’s selections (including selecting excluded calls), allowing for a “flow” state of consciousness and thus enabling optimal human performance.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Contrast this with Excel, where the user must change gears and switch into Data Prep mode. Even if only for a few seconds, this breaks the stream of consciousness pertaining to the business questions and forces the analyst to context switch to a Data Prep exercise.&amp;nbsp; If the Excel analyst is practiced and quick and can complete the VLOOKUP in under 10 seconds, they may be able to maintain their train of thought and return to a flow state. But every VLOOKUP and Data Prep task potentially breaks this flow.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The Qlik and Excel approaches are quite different: Qlik can perform the task interactively, maintaining business question-mode thinking, whereas Excel requires changing gears into Data Prep-mode thinking. But what they both have in common, and do not share with Power BI, is consistency. 
Namely, regardless of the data cardinality, whether the data is linked one-to-one or one-to-many or many-to-many, the operations the analyst performs are the same. It is this consistency in Excel that allows an experienced user to work almost subconsciously when performing the Data Prep tasks, thus keeping the train of thought on the business questions.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Having seen Qlik and Excel, let’s move to our third and final example: Power BI.&lt;/div&gt;&lt;div&gt;The two screen caps below each show a pair of Power BI Slicers.&amp;nbsp; The first screencap shows how the Slicers appear without user selections.&lt;/div&gt;&lt;div&gt;In the second screencap, we can see what happens when the user selects ‘(Blank)’: Power BI reveals the missing Call IDs from the other table.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In this example we can see the user need only click ‘(Blank)’ to see the missing values. In Power BI, any time there is a one-to-one or one-to-many relationship defined, the user can always click ‘(Blank)’ on the “one” side to see the missing values on the other side. 
Just like in Qlik the user need only click a single button to reveal the deltas.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_reconcile_calls_one_to_one.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="462" data-original-width="401" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_reconcile_calls_one_to_one.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_reconcile_calls_one_to_one_blank_selected.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="446" data-original-width="389" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_reconcile_calls_one_to_one_blank_selected.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;The one-to-one example is very similar to Qlik and can be performed with a single click, thus enabling the desired “flow” state of consciousness.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But what if the relationship is one-to-many or many-to-many? 
The differences between the relationships are shown below.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_reconcile_calls_model.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="608" data-original-width="798" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_reconcile_calls_model.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Due to the nature of Power BI’s Vertipaq indexing engine, columns that are in a table on the “many” side of any relationship will not allow users to simply select ‘(Blank)’ as we saw in the example above. Instead, the user must develop a Table Chart and configure it to show blank values.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The example below shows the delta between the two lists by building a simple Table Chart and then sorting the smallest values to the top to see the deltas (highlighted). 
You can also see that the Slicers no longer have the option of selecting ‘(Blank)’.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_reconcile_calls_many_to_many.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="386" data-original-width="800" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_reconcile_calls_many_to_many.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As noted above, this chart requires an extra configuration step that is not normally required: I had to check the “Show Items with No Data” option.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_reconcile_calls_many_to_many_config.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="614" data-original-width="592" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_reconcile_calls_many_to_many_config.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Reflecting on this, we can see that we need to follow a different and more complicated process when reconciling across relationships that have “many” cardinality on one or both sides.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;It should also be pointed out that it makes no difference whether the link direction is Both or Single, or which way the link points: the answer can only be calculated through a 
chart that we must build, which requires a more time-consuming dense calculation (explicitly evaluating all possible combinations) as opposed to a quick sparse calculation like the one we saw when we selected ‘(Blank)’ in the earlier example.&amp;nbsp; Qlik on the other hand is only making quick sparse calculations as it leverages its Bi-directional Associative Index (BAI) to directly pinpoint deltas.&amp;nbsp; This performance hit in Power BI adds another level of friction when it comes to reconciliations. But the main difference between the tools is that Qlik provides a &lt;u&gt;consistent&lt;/u&gt;, &lt;u&gt;quick&lt;/u&gt;, and &lt;u&gt;interactive&lt;/u&gt; reconciliation experience whereas Power BI requires changing gears to build tables, and then waiting additional time for those densely calculated tables to render.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;There is another approach we can take with Power BI: if we look to Power BI’s built-in Data Prep tool, Power Query, we can see a more consistent “&lt;a href="https://en.wikipedia.org/wiki/Happy_path"&gt;happy path&lt;/a&gt;” to reconciliation, much like Excel.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Namely, we can use Power Query’s &lt;a href="https://en.wikipedia.org/wiki/Relational_algebra#Antijoin_.28.E2.96.B7.29"&gt;ANTI-JOIN&lt;/a&gt; feature, which can be accessed through the “Merge” step with the Join Kind set to ‘Left Anti (rows only in first)’ as shown below.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerquery_merge_antijoin.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="629" data-original-width="699" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerquery_merge_antijoin.png" 
/&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;After completing both a ‘left anti’ and ‘right anti’, the screencap below shows the resulting delta datasets that were correctly revealed.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_reconcile_calls_antijoin_tables.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="178" data-original-width="670" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_reconcile_calls_antijoin_tables.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As it turns out, this ANTI-JOIN approach is closer to a ‘happy path’ than navigating Slicers and Tables in a pre-built Tabular Model, because the procedure is consistent: the cardinality and link direction of relationships are irrelevant, so we never have to perform different operations depending on the cardinality of relationships.&amp;nbsp; Furthermore, this approach to reconciliation can be easily automated and the resulting data can be selected and analyzed by the user. 
This is like how a Qlik user can “Select Excluded” values and analyze the complement set.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Thus, if your focus is on contrasting data and you need a surefire tool, it is better to use Power Query within Power BI as opposed to Power BI proper.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In summary, the name of the game is not only to meet business requirements (any Turing-complete software can do that); rather, the game of Analytics should be about putting the user in a “20 questions” flow state of consciousness for efficient business problem solving.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Both Excel and Power Query do provide a “happy path” when it comes to contrasting data, but that happy path requires context shifting from business analytical thinking to Data Prep tool thinking. Qlik on the other hand is well designed to perform interactive reconciliations at a record level, and with some training users can achieve a flow state of consciousness.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In the next and final section, we will compare Qlik and Power BI using a more realistic example.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;h4 style="text-align: left;"&gt;Data and App Documents&lt;/h4&gt;&lt;div&gt;Here is the &lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/data/reconcile/Call_Centre_Reconcile.xlsx"&gt;data&lt;/a&gt; we used. 
It also includes the Excel VLOOKUP examples.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here is the QlikView application &lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/app/Reconcile Calls.qvw"&gt;document&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;Here is the Power BI application &lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/app/Reconcile Calls.pbix"&gt;document&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;h2 style="text-align: left;"&gt;Differences in Qlik and Power BI in Practice&lt;/h2&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In this example we will analyze a telecom’s (e.g. Bell, Rogers, AT&amp;amp;T, Sprint, etc.) Call Centre data in order to answer the question: &lt;u&gt;During February 2020 which Customers placed calls to the Call Centre answered by a non-authorized Customer Service Representative and did not generate any Work Orders?&lt;/u&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We can break this question down into a series of yes/no questions:&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;b&gt;Question&lt;/b&gt;: Did the customer place a call in February 2020?&lt;/div&gt;&lt;div&gt;&lt;b&gt;Answer&lt;/b&gt;: Yes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Question&lt;/b&gt;: Was there an authorized Call Centre Representative on this call?&lt;/div&gt;&lt;div&gt;&lt;b&gt;Answer&lt;/b&gt;: No.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Question&lt;/b&gt;: Does the call have an associated Work Order?&lt;/div&gt;&lt;div&gt;&lt;b&gt;Answer&lt;/b&gt;: No.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;With Qlik and Power BI we can ask these questions for our entire population of linked customers, calls, and work orders.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;With Qlik we can answer these questions quickly and interactively while maintaining a 
“flow” state of mind, but with Power BI we must change gears and build and configure dense pivot tables, which leads to mental context switching.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Before getting into the heart of this example I should point out that I am using fictional data that I generated using a normal random number generator. This data set lacks the novelty we find in the real world. I must apologize that the “insights” revealed are not on the level we would get from a real-world data set. However, there is also a lesson here: Publicly available data sets tend to be relatively simple in structure (e.g. most open data sets are just denormalized “fat” tables), whereas real business systems produce large normalized models with multiple fact tables.&amp;nbsp; It is when we have multiple fact tables that both Power BI and Qlik really begin to shine, and this is also where we can see the most contrast. For privacy and confidentiality reasons most organizations understandably don’t want to reveal these data sets to the general public.&amp;nbsp; As a side note, this is one aspect I love about my job: I get to see these databases across so many industries in all their glory. But I cannot share those databases for obvious reasons.&amp;nbsp; It’s for this reason Microsoft often uses the “&lt;a href="https://docs.microsoft.com/en-us/sql/samples/adventureworks-install-configure?view=sql-server-ver15&amp;amp;tabs=ssms"&gt;Adventure Works&lt;/a&gt;” database for demonstration. But even that is a fictional dataset lacking real world novelty.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But if you can suggest any datasets that have multiple fact tables and that are useful for teaching, please let me know.&lt;/div&gt;&lt;div&gt;In the meantime I’m going to keep looking for, or possibly develop on my own, a real data set that can be used for teaching analytics. 
One thing I have found since I started writing this article is an App called “&lt;a href="https://apps.apple.com/gb/app/trueentropy/id1299321174"&gt;TrueEntropy&lt;/a&gt;” that generates random data using your Smart Phone’s camera. I haven’t had a chance to really dig into this yet, but it looks like an improvement over the standard random number generator I used for this example.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Sorry for all the preamble; let’s get into the examples now.&lt;/div&gt;&lt;div&gt;The data model we are going to be exploring is larger (in terms of the numbers of attributes and entities) than the data model we explored in parts &lt;a href="http://hepburndata.blogspot.com/2021/01/analytic-efficiency-part-1-power-bi-vs.html"&gt;one&lt;/a&gt; and &lt;a href="http://hepburndata.blogspot.com/2021/02/analytic-efficiency-part-2-user.html"&gt;two&lt;/a&gt;&amp;nbsp;of this three-part blog series.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;The original data model had two tables: MyCalls and MyWorkOrders.&amp;nbsp; This new data model extends this by adding in:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;MyCustomers: The customers who placed the calls and for whom the work orders were created&lt;/li&gt;&lt;li&gt;MasterCal: A master calendar allowing for drill down within a period hierarchy (i.e. Year -&amp;gt; Quarter -&amp;gt; Month -&amp;gt; Week -&amp;gt; Date)&lt;/li&gt;&lt;li&gt;MyCSRs: The Customer Service Representative (CSR) working for the telecom. 
The CSR answers customer calls and interacts with customers resulting in work orders for the customer&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As an aside: You may have noticed there is also a table called “WorkOrderTypeAND”, which I originally included to demonstrate a feature that is specific to QlikView: it allows the analyst to make ‘and’ based user selections (as opposed to ‘or’ based user selections) and is not found in Power BI (nor Qlik Sense).&amp;nbsp; While this feature does allow us to ask more questions (e.g. Who has Television AND NOT Internet), it is not inherent to the point I am making here. For this reason, I have since decided to leave that discussion out of this blog post, but if you’re curious about how it works you can download and experiment with the demo to see its functionality.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here is how the model appears in QlikView:&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/Qlik_Associative_Model.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="413" data-original-width="800" height="206" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/Qlik_Associative_Model.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here is how the same model (minus the ‘AND’ table) appears in Power BI:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/PowerBI_Tabular_Model.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="612" 
data-original-width="800" height="306" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/PowerBI_Tabular_Model.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now let’s get into the business questions.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;To recap, our main question is: &lt;u&gt;During February 2020 which Customers placed calls to the Call Centre without an authorized/known Customer Service Representative and did not generate any Work Orders?&lt;/u&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For context, in our example there was a botched Chatbot roll-out in February 2020 that resulted in customers (who learned of the flaw through social media) creating non-authorized Work Orders using a special promo code. We want to know who penetrated the vulnerability but did NOT take advantage of the flaw. When we find these Customers, we would like to thank them and speak with them to help uncover other security issues communicated through social media.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;To get there we need to find Calls with no CSR and from there find Customers with no Work Orders in February 2020.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;We are going to break this down into two exploratory (practice) questions to show how those can be answered, and then combine what we have learned to answer the third and final question: &lt;u&gt;During February 2020 which Customers placed calls to the Call Centre without an authorized/known Customer Service Representative and did not generate any Work Orders?&lt;/u&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;By breaking down our big question into smaller exploratory questions we can begin to get a sense of what is going on in the data before diving into our third and final 
question.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Our two exploratory questions are:&lt;/div&gt;&lt;div&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;What is the distribution of calls with no authorized CSR?&lt;/li&gt;&lt;li&gt;Which Customers did not have any Work Orders in February (of any year)?&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Question #1&lt;/b&gt;: What is the distribution of calls with no authorized CSR?&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;In Qlik, to answer this question we:&lt;/div&gt;&lt;div&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;Select [All] in “CSR Name”&amp;nbsp;&lt;/li&gt;&lt;li&gt;Select [Excluded] in “Call ID”&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_q1_custs_wo_csr.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="445" data-original-width="800" height="223" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_q1_custs_wo_csr.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;In Power BI, we simply:&lt;/div&gt;&lt;div&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;Select ‘(Blank)’ in “CSR Name”&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_q1_custs_wo_csr.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="470" data-original-width="800" height="235" 
src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_q1_custs_wo_csr.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As can be seen in the above example, both Qlik and Power BI allow for interactive selection here, with Power BI being better than Qlik in this instance given that you only need one click in PBI versus two clicks in Qlik.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now let’s move to our next question.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;b&gt;Question #2&lt;/b&gt;: Who are the customers with no Work Orders for February (of any year)?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In Qlik we can answer this question by making three selections in real time:&lt;/div&gt;&lt;div&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;Select ‘February’ in “Month”&lt;/li&gt;&lt;li&gt;Select [All] in “Work Order ID”&lt;/li&gt;&lt;li&gt;Select [Excluded] in “Customer Name”&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;The answer is revealed as ‘Noah Q. Moore’.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_custs_wo_feb_workorder.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="568" data-original-width="396" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_custs_wo_feb_workorder.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;In Power BI however, because of the one-to-many relationship, we cannot rely on Slicers alone. 
Instead we must take extra time to construct a Table (remembering to set the “Show items with no data” setting), wait for the table to render, then sort the table from smallest to largest (or we could have filtered on blank “Num Work Orders”). The resulting set cannot be selected as a Slicer selection, so the user is tightly limited in the follow-up questions they may ask.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_custs_wo_feb_workorder.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="794" data-original-width="345" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_custs_wo_feb_workorder.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Having answered both of our exploratory questions we are now ready to begin answering our third and final question.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;b&gt;Question #3&lt;/b&gt;: &lt;u&gt;During February 2020 which Customers placed calls to the Call Centre without an authorized/known Customer Service Representative and did not generate any Work Orders?&lt;/u&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Or stated differently: Who are my most honest customers, that is, the customers who, when calling without a legitimate CSR in February 2020, never created a Work Order?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In Qlik we can answer this question through 2 simple iterations:&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Iteration 1:&lt;/div&gt;&lt;div&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;Find all calls without a CSR. 
Two clicks:&lt;/li&gt;&lt;ol&gt;&lt;li&gt;Select [All] CSRs&lt;/li&gt;&lt;li&gt;Select [Excluded] Calls&lt;/li&gt;&lt;/ol&gt;&lt;li&gt;Filter on year and month:&lt;/li&gt;&lt;ol&gt;&lt;li&gt;Select on ‘Feb’ in “Month” and&amp;nbsp;&lt;/li&gt;&lt;li&gt;Select ‘2020’ in “Year”&lt;/li&gt;&lt;/ol&gt;&lt;li&gt;Isolate this filtered subset by selecting from main menu File-&amp;gt;Reduce Data-&amp;gt;Keep Possible Values&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_custs_feb_2020_no_csr_calls.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="603" data-original-width="372" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_custs_feb_2020_no_csr_calls.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Iteration 2 (using subset derived for ‘Possible Values’ in Iteration 1):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;Find all Customers with no Work Order&lt;/li&gt;&lt;ol&gt;&lt;li&gt;Select [All] in “Work Order ID”&lt;/li&gt;&lt;li&gt;Select [Excluded] in “Customer Name”&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_custs_feb_2020_no_csr_custs_wo_workorder.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="608" data-original-width="379" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_custs_feb_2020_no_csr_custs_wo_workorder.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;FINISHED. We have answered our question. 
The five customers are:&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;Brent Z. Skinner&lt;/li&gt;&lt;li&gt;Charlotte E. Snow&lt;/li&gt;&lt;li&gt;Hilel A. Maddox&lt;/li&gt;&lt;li&gt;Malachi Perez&lt;/li&gt;&lt;li&gt;Noah Q. Moore&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If we want to see these Customers in their original context (as opposed to in the isolated Dataset), we can refresh the data, which reverses the data reduction while keeping the selections.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_custs_feb_2020_no_csr_custs_wo_workorder_all_calls.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="675" data-original-width="381" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/qlik_custs_feb_2020_no_csr_custs_wo_workorder_all_calls.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;How does Power BI compare?&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Power BI can certainly answer this question but requires more Chart/Table building. In this scenario, it is not enough to simply list all the Customers and sort by Num Work Orders, because we need to be sure we are only seeing Customers with Calls. Even if we “Select All” Calls, our Table will continue to show Customers who did not place any calls in Feb 2020 because we have not explicitly filtered our Customer list by the number of Calls.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This means we must add a Visual Filter on the Table itself where “Num Calls” ‘is greater than 0’. 
Given that we are already filtering on “Num Calls” we can also filter on “Num Work Orders” ‘is empty’ to create a cleaner looking table.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_custs_feb_2020_no_csr_custs_wo_workorder.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="614" data-original-width="733" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/images/powerbi_custs_feb_2020_no_csr_custs_wo_workorder.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Even with our clean table, it is still not easily possible to convert this list into a Slicer selection and then explore and ask follow-up questions about these five (5) Customers further. Whereas in QlikView we have the Customers selected and can easily continue our game of ’20 Questions’.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Normally in this situation the Power BI developer would likely go up to the Data Prep level and use either Power Query or perhaps Excel to answer this question.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Going back to our sharpening and chopping analogy taken from Abraham Lincoln this means that in Qlik you are chopping more than sharpening and in Power BI you are sharpening more than chopping.&lt;/div&gt;&lt;div&gt;The ideal tool in my view behaves more like a musical instrument that you play like a piano.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;h4&gt;Data and App Documents&lt;/h4&gt;&lt;/div&gt;&lt;div&gt;Here are the data files:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ul style="text-align: left;"&gt;&lt;li&gt;&lt;a 
href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/data/MasterCal.csv"&gt;MasterCal&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/data/MyCalls.csv"&gt;MyCalls&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/data/MyCSRs.csv"&gt;MyCSRs&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/data/MyCustomers.csv"&gt;MyCustomers&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/data/MyWorkOrders.csv"&gt;MyWorkOrders&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here are the application documents:&lt;/div&gt;&lt;div&gt;&lt;ul style="text-align: left;"&gt;&lt;li&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/app/Analyze Fictional Call Centre Data v3.qvw"&gt;Qlik&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlki_p3/app/Analyze Fictional Call Centre Data v3.pbix"&gt;Power BI&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;&lt;h1 style="text-align: left;"&gt;Conclusions&lt;/h1&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Given that Analytics involves exploring both positive and &lt;u&gt;negative&lt;/u&gt; space and Qlik’s Associative Index explicitly supports “Excluded Values”, and given that most other BI tools - like Power BI – do not inherently support Excluded Values, why have Power BI and other tools like Tableau, for that matter, managed to outperform Qlik from a sales perspective?&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;More to the point, why is the ability to query “Excluded Values” – a core element of Analytics - not even brought up as a software requirement?&lt;/div&gt;&lt;div&gt;The simplest answer to this question is that the 
buyers of these tools lack &lt;u&gt;connoisseurship&lt;/u&gt;.&lt;/div&gt;&lt;div&gt;I can break this answer down into three specific problems:&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;IT tends to favour homogeneous technology platforms and to deal with as few vendors as possible. “One throat to choke” as they say.&lt;/li&gt;&lt;li&gt;The “drill” pattern tends to be visual in nature and is what most senior managers – and laypersons - can quickly grasp. These are the same people who control spending.&lt;/li&gt;&lt;ol&gt;&lt;li&gt;The “contrast” pattern on the other hand is more obscure in nature and does not lend itself to flashy sales demos&lt;/li&gt;&lt;/ol&gt;&lt;li&gt;The “contrast” pattern as I described it above is generally regarded as a Data Prep (as opposed to Business Intelligence/Data Viz) activity&lt;/li&gt;&lt;ol&gt;&lt;li&gt;Qlik’s answer to the “contrast” pattern defies traditional product categories&lt;/li&gt;&lt;ol&gt;&lt;li&gt;There are no recognizable competitors to Qlik that also have the ability to query “Excluded Values”&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The first problem of IT favouring homogeneous technology over “best of breed” vendors is not specific to Qlik and is a fairly well understood and discussed problem. I have nothing to add to that discussion here that has not already been said.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;The second problem of how Business Intelligence/Data Viz tools are often sold based on slick demos is also nothing new. Although I will add some colour to this: On more than one occasion I have seen a senior executive extol the virtues of a Business Intelligence tool for the simple reason that they used a report built with this tool. 
Even if that report is nothing more than a rendered image file, the executive will continue to lavish praise on the tool, which quickly creates buying consensus.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Oh, but you may say that there are research firms and consulting companies that executives can look to for guidance. Unfortunately, these companies tend to all be deeply compromised in one way or another. Without naming names, think of the most prestigious business &amp;amp; technology research company and advisory consulting company you can think of, and I can show you how they don’t hold a candle to &lt;a href="https://en.wikipedia.org/wiki/Consumer_Reports"&gt;Consumer Reports&lt;/a&gt;.&amp;nbsp; The main problem is these companies have deep relationships with the vendors they are supposed to referee.&amp;nbsp; We can see elements of this in journalism whereby many esteemed newspapers and media outlets will be more generous to certain institutions or individuals in order to maintain “access” to those institutions and individuals.&amp;nbsp; Consumer Reports on the other hand is entirely supported by a large consumer base. 
Even Consumer Reports is not perfect, but it is significantly more trustworthy than any of the research and consulting firms in the B2B world when it comes to evaluating technology.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;The third problem can best be understood through this famous quote from Henry Ford: “If I had asked people what they wanted, they would have said faster horses.”&lt;/div&gt;&lt;div&gt;Putting our analytical hat on, if we contrast Ford and Qlik in this light we can see that Ford has the clear upper hand:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;Their cars will work on any roadway (unlike the closed railway lines)&lt;/li&gt;&lt;li&gt;The purchaser of the product is also the user and a slick demo is not enough – the vehicle must pass a test drive&lt;/li&gt;&lt;li&gt;While Ford was an early manufacturer of automobiles, he was not the first or only manufacturer and the presence of competitors created consumer demand&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As a result, Ford was less of a “think different” underdog and more of a monopolistic bully.&amp;nbsp; There is another lesson to be learned from Ford, a lesson that many Canadians are aware of:&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;During the early days of automobile production Ford clashed with an inventor based here in Canada: P.L. Robertson, the inventor of the Robertson screwdriver, also known as the ‘square’ screwdriver. You can read the story &lt;a href="https://www.thomasnet.com/articles/hardware/robertson-screwdriver-history/"&gt;here&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;To quickly summarize, in 1908 the Canadian P.L. Robertson invented a new type of screw head and screwdriver that was square shaped, which allowed for maximal torque while minimizing stripping. 
This allowed the Robertson screws to be more securely fastened. Henry Ford realized the importance of this invention and demanded Robertson relinquish the patent rights to Ford for a single lump sum. Robertson refused and so Henry Ford went to a competitor, Henry Phillips, who had invented a similar but inferior screwdriver, the Phillips screwdriver. Phillips did maintain some patent rights though and thus the still inferior flat-head screwdriver continued and continues to proliferate.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;If we all switched to Robertson screws, we would need fewer screwdrivers and screws, and construction quality would improve.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And yet, if you asked an American they would at first be oblivious and then confused as to why the United States would be committed to a demonstrably inferior product. The simple answer is the same reason why we use Microsoft: Vendor lock-in through network effects. 
But the fact that Robertson remains in use also tells us that it is possible to survive and even thrive based on the merit of the tool itself.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;I come back to the word &lt;u&gt;connoisseurship&lt;/u&gt;, and for reasons of environment and circumstance connoisseurship differs among people and organizations.&amp;nbsp;&lt;/div&gt;&lt;div&gt;How does change come about then and why should anyone care?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For me the answer is simply emotion through experience: The experience of &lt;a href="https://en.wikipedia.org/wiki/Awe"&gt;awe&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/Wonder_(emotion)"&gt;wonderment&lt;/a&gt; and feelings of mastery and contentment from achieving flow states.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If you have ever discovered a powerful tool and have wielded it to great effect, then you will know the feelings I am talking about and you will yearn for them.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But if you have never had such a breakthrough and have not experienced awe and a sense of mastery then everything I am saying will sound like noise. But I think if you have learned to ride a bicycle or swim or ski down a hill, then maybe you can relate?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I have thus decided that my &lt;a href="https://en.wikipedia.org/wiki/Polaris"&gt;North Star&lt;/a&gt; going forward is to figure out ways I – Neil Hepburn - can bring about a feeling of awe and wonderment for Analytics. I may or may not use Qlik to help me here and have not decided. 
But I have no doubt that Qlik has helped me understand the essence and spirit of Analytics and I am going to challenge myself to see if I can bring about that experience of awe to others.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If this is something you are interested in hearing more about, you are welcome to drop me a line.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;</description><link>http://hepburndata.blogspot.com/2021/05/analytic-efficiency-part-3-why-qlik.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-3040458381596035240</guid><pubDate>Sat, 15 May 2021 18:20:00 +0000</pubDate><atom:updated>2021-05-15T14:20:59.838-04:00</atom:updated><title>What History Can Teach Us About the Nature of Tools</title><description>&lt;p&gt;A recent &lt;a href="https://www.sciencedaily.com/releases/2021/05/210510133204.htm"&gt;study&lt;/a&gt;&amp;nbsp; published out of The University of East Anglia has shown – what many have long suspected - that the human brain treats well designed tools (like a spoon or a hammer) as an extension of the human body. More importantly, this study shows us that well-designed tools are interpreted by the brain differently than poorly designed tools which the brain simply treats as an external object. 
For more context, this Twitter &lt;a href="https://twitter.com/provisionalidea/status/1392854515347513351"&gt;thread&lt;/a&gt; does a good job of summarizing the study’s findings.&lt;/p&gt;&lt;p&gt;To understand the implications of this insight we can look to history.&lt;/p&gt;&lt;p&gt;History teaches us that tools and systems with &lt;u&gt;consistency&lt;/u&gt;, &lt;u&gt;quickness&lt;/u&gt;, and &lt;u&gt;interactivity&lt;/u&gt; can be more easily mastered, and this tool mastery leads to psychological flow states of consciousness that can bring about rapid change.&lt;/p&gt;&lt;p&gt;This insight is relevant to our current moment because what I am seeing in the world of business data &amp;amp; analytics is a never-ending tension between tools that users love and that can produce great outcomes, and inferior tools which users are forced to use because of vendor lock-in and homogeneous architecture constraints.&amp;nbsp; I will return to this theme in an upcoming article (Part 3 of my Analytical Efficiency series), but in this article I will explore this topic head-on.&lt;/p&gt;&lt;p&gt;What does history teach us about tools and mastery?&amp;nbsp;&lt;/p&gt;&lt;p&gt;Let me tell you some stories …&lt;/p&gt;&lt;p&gt;During my undergrad years at university in the early-mid 1990s – majoring in Computer Science – the computer workstations we were provisioned all ran on a UNIX based OS (Solaris). Whether you used one of the newer “X-Window” terminals or one of the older text-based terminals you would have to learn one of the command lines and an associated text editor.&lt;/p&gt;&lt;p&gt;I recall being quite frustrated by this at first. I had my own PC, an 80386 SX. While I could use this PC for preparing assignments in WordPerfect - a popular “&lt;a href="https://en.wikipedia.org/wiki/WYSIWYG"&gt;WYSIWYG&lt;/a&gt;” (‘What you see is what you get’) word processor at the time - I could not use my home PC for preparing computer science work (i.e. 
writing code) because I needed to have access to shared tools and datasets. Since all the Comp Sci workstations ran on Solaris, a UNIX-savvy peer suggested I use a text editor called “&lt;a href="https://en.wikipedia.org/wiki/Vi"&gt;VI&lt;/a&gt;” (an abbreviation of “visual”).&amp;nbsp;&lt;/p&gt;&lt;p&gt;Using VI at first felt like psychological torture. There was nothing intuitive about it mainly because VI requires the user to toggle between two modes: “Insert mode” where you can freely enter text as you would do normally; and “command mode” which is a bit trickier to explain but as its name implies allows you to navigate and manipulate blocks of text using keyboard commands, with the ability to program new commands. Many of the commands in “command mode” are activated by hitting a single character on your keyboard and the most commonly used commands (for scrolling through the text) were characters on the home keys themselves. For example: ‘j’ moves the cursor down a line and ‘k’ moves it up, while ‘h’ moves the cursor left and ‘l’ moves it right.&lt;/p&gt;&lt;p&gt;Over time I became more proficient with “VI” and found that I could work much more efficiently than with the old Borland IDE text editor I had been using before. But what about my other subjects? What about WordPerfect?&lt;/p&gt;&lt;p&gt;By 2nd or 3rd year a friend showed me how he prepared all his essays using “VI” but compiled them into PostScript files which could then be printed out and handed in.&amp;nbsp; To do this I had to use another tool called “&lt;a href="https://en.wikipedia.org/wiki/LaTeX"&gt;LaTeX&lt;/a&gt;”. This tool was a compiler for a markup language called “&lt;a href="https://en.wikipedia.org/wiki/TeX"&gt;TeX&lt;/a&gt;” which is like HTML but is more of a typesetting system with precision around page layout. In other words, a possible replacement for WordPerfect. 
But this also required learning LaTeX and TeX which I never mastered but did learn enough of what I needed like font changes, indented paragraphs, tables, number lists, bullet lists, and so forth. On top of this I would also borrow “code” from others to add more impressive flourishes. I also followed my friend’s advice and created a little “&lt;a href="https://en.wikipedia.org/wiki/Make_(software)#Makefile"&gt;makefile&lt;/a&gt;” which I used to compile my LaTeX files into PostScript files that I could view before printing which adequately compensated for the lack of WYSIWYG functionality and provided quick feedback.&lt;/p&gt;&lt;p&gt;By the end of my final year in university I was able to write and edit term papers using “VI” faster than I could in WordPerfect while at the same time the quality of the printed page looked like it had come from a professional publisher.&lt;/p&gt;&lt;p&gt;Today, I still use “VI” occasionally, but not for word processing – formats like “LaTeX” are too obscure for the business world and Microsoft Office (or compatible copycats) are the order of the day now. I don’t have a big problem with this, but there is a part of me that yearns for that feeling of mastery I once felt when editing text. It’s probably the reason I do use “VI” when I can – it’s not just more efficient, it feels good.&lt;/p&gt;&lt;p&gt;George R. R. Martin, author of the Game of Thrones series apparently also feels good when using older and simpler tech like “VI” except in his case he uses another tool called &lt;a href="https://en.wikipedia.org/wiki/WordStar"&gt;WordStar&lt;/a&gt;. WordStar is an older word processor that only runs on MS-DOS and has not been updated since the 1990s. Like “VI” it is oriented around the keyboard and home keys in particular making it easy to navigate and manipulate a document without taking your hands off the keyboard. 
I heard from this podcast &lt;a href="https://podcasts.apple.com/ca/podcast/interview-interlude-robert-j-sawyer-y2k-the-long-view/id1455676429?i=1000478353417"&gt;episode&lt;/a&gt;&amp;nbsp;that Robert Sawyer – a popular science fiction writer- also uses WordStar for the same reason. Namely, he prefers a tool that minimizes friction between the thoughts in his head and the words that get put down.&amp;nbsp; By using a streamlined tool like WordStar, Sawyer explains that he can more easily keep his train of thought and remain “in the zone” while writing.&lt;/p&gt;&lt;p&gt;What Sawyer is referring to is known as the “&lt;a href="https://en.wikipedia.org/wiki/Flow_(psychology)"&gt;flow state&lt;/a&gt;” and is now generally regarded to be the ideal state of human performance when we are both happiest and most masterful in our work. If you have ever biked, skateboarded, or skied down a big hill you will know this feeling.&lt;/p&gt;&lt;p&gt;Why then is it that tools like MS Word (which Sawyer eschews in favour of WordStar) do not work like WordStar? 
Robert Sawyer’s specific answer to this question is that MS Word did not evolve around writers and creatives but rather the main users of the tool are secretaries and assistants who appreciate the extra features that allow them to perform advanced typesetting and other sophisticated tasks with relatively little training.&lt;/p&gt;&lt;p&gt;In other words, professional writers depend on tool consistency – even if there are fewer features – since mastery of writing itself is paramount.&amp;nbsp; Executive assistants and other staff involved in document preparation are less concerned with the content of the writing itself and more with the formatting, presentation, and distribution of documents – tasks for which modern word processors like MS Word offer plenty of features.&lt;/p&gt;&lt;p&gt;This is the tension between tools like VI and WordStar, suited to power users, and tools like WordPerfect and MS Word, suited to more casual users.&lt;/p&gt;&lt;p&gt;As a side-note many creative writers use another tool called “&lt;a href="https://en.wikipedia.org/wiki/Scrivener_(software)"&gt;Scrivener&lt;/a&gt;” which is more purpose built for story writing (e.g. there are features to keep track of plots and characters) than say MS Word, but I wouldn’t be surprised if folks like George R. R. Martin stick to WordStar for the simple reason that they can wield it with more efficiency.&lt;/p&gt;&lt;p&gt;A user-friendly tool can spread quickly and aid efficiency. But occasionally in history we see other tools appear that are not so user-friendly but when mastered can change the world.&lt;/p&gt;&lt;p&gt;An example that comes to mind is the horse. A horse being an animal is not normally thought of as technology. Starting in the bronze age around 5,000 years ago horses domesticated on the Eurasian Steppe led to massive waves of migration and change – it is why so many languages fall under the umbrella of “&lt;a href="https://en.wikipedia.org/wiki/Indo-European_languages"&gt;Indo-European&lt;/a&gt;”. 
To get a clearer picture of what made the horse such a formidable tool and why it remained such a powerful tool on the Eurasian Steppe for so long I want to point out a few things about Genghis Khan’s &lt;a href="https://en.wikipedia.org/wiki/Mongol_Empire"&gt;Mongol Empire&lt;/a&gt; during the 13th century.&lt;/p&gt;&lt;p&gt;The &lt;a href="https://en.wikipedia.org/wiki/Mongolian_horse"&gt;Mongolian horse&lt;/a&gt; differed from the stockier European horse – like those horses the &lt;a href="https://en.wikipedia.org/wiki/Normans"&gt;Normans&lt;/a&gt; favoured. Similar to the Mongols, the Normans were also known for striking fear through their armoured knights and horses, which were not only large and powerful but had been bred for combat and could not be easily spooked.&amp;nbsp;&lt;/p&gt;&lt;p&gt;The Mongolian horse on the other hand was smaller and not as powerful as the Normans’ European horse. But the Mongolian horse did have some advantages: The European horse required grain feed in order to thrive which made it more expensive to feed. This meant that the European horse also needed to be on or near a farm where the grain could be supplied. During times of war the European horse was highly effective on the battlefield but was expensive and challenging to maintain because keeping supply lines safe was crucial for success due to the horse’s dependence on grain from farms.&amp;nbsp; On the other hand, the Mongolian horse could feed off the grass in the steppe or any meadow. Even during winter, the Mongolian horse could punch through a top layer of frost and ice to get at the grass (unlike cows which could not penetrate the frost and would starve).&amp;nbsp; A Mongolian soldier did not need to worry about supply lines for the horses and would have additional horses (each soldier usually kept four horses) that could be eaten if necessary. 
This provided more agility during times of war.&lt;/p&gt;&lt;p&gt;Because of this self-contained sustainability of the Mongolian horse in conjunction with the grass food source the steppe provides, the main advantage of the Mongolian horse is that it allowed for a nomadic/itinerant culture to sustain itself through a very horse-centric lifestyle. If you are born into a nomadic horse culture then it is said you are “born on the saddle”. In other words, through the consistency of being able to take horses anywhere and everywhere, the mounted archer quickly develops mastery and can take their mastery to higher levels than a European soldier whose experience with horses would be more sporadic. This is the true power of all great steppe cultures including the &lt;a href="https://en.wikipedia.org/wiki/Xiongnu"&gt;Xiongnu&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/Scythians"&gt;Scythians&lt;/a&gt; that came before the Mongols (even perhaps as far back as the late 4th millennium BCE during the bronze age when the &lt;a href="https://en.wikipedia.org/wiki/Yamnaya_culture"&gt;Yamnaya&lt;/a&gt; culture spread out from the &lt;a href="https://en.wikipedia.org/wiki/Pontic–Caspian_steppe"&gt;Pontic-Caspian Steppe&lt;/a&gt; and seeded all Indo-European cultures). Growing up as a steppe nomad like a Mongol you would begin as a mounted archer as young as 5 or 6 years old, or maybe younger. A child would start on a sheep with a small bow and hunt small animals like rabbits and squirrels. Once they get into their early teens they move to a slightly larger bow on a small horse and begin hunting larger animals like foxes. 
Once adulthood is reached the rider moves on to a full-grown Mongolian horse with the full composite bow and begins hunting large game like deer.&amp;nbsp; The Mongolians would also play these “encircling” games on horseback whereby a group of mounted archers cooperates to form huge circles – as large as 10 kilometres in diameter – and then begins to spiral and close the circle by herding the animals towards its centre, eventually trapping all the animals in the middle. As is now well known these mounted archers could punch far above their weight and inflicted great terror during their raids which eventually allowed them to form empires like the Mongol Empire, which at the time was the largest empire ever formed.&amp;nbsp;&lt;/p&gt;&lt;p&gt;Women also played a key role in these steppe nomad cultures since the horse was very much a leveler of human strength, like the gun is today (which is also why gunpowder eventually displaced horses as the primary war technology). Those “&lt;a href="https://www.nationalgeographic.com/history/article/141029-amazons-scythians-hunger-games-herodotus-ice-princess-tattoo-cannabis"&gt;Amazonian women&lt;/a&gt;” you may have heard about - possibly through the classical Greeks who wrote about them or maybe from the Wonder Woman movie - were likely Scythian steppe warrior women: mounted archers who were part of a much older tradition that goes back to the bronze age. In the &lt;a href="https://en.wikipedia.org/wiki/Xiongnu"&gt;Xiongnu&lt;/a&gt; culture (predecessors to the Mongols), women routinely held the highest status, often more so than men, which we know from the items they were buried with (they prized large ornate belt buckles), in stark contrast to neighbouring civilizations at the time. 
Technology with consistency can be a great equalizer too.&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/history_teach_tools/world_nomad_games_archer.jpeg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="800" data-original-width="503" src="https://cradleofanalytics.blob.core.windows.net/blog/history_teach_tools/world_nomad_games_archer.jpeg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;[&lt;i&gt;An acrobatic archer competing in the World Nomad Games&lt;/i&gt;]&lt;/p&gt;&lt;p&gt;The word “&lt;a href="https://en.wikipedia.org/wiki/Centaur"&gt;Centaur&lt;/a&gt;” – the Greek mythological half-person half-horse creature - comes to mind here.&amp;nbsp;&lt;/p&gt;&lt;p&gt;Centaur is also the word used to describe “&lt;a href="https://en.wikipedia.org/wiki/Advanced_chess"&gt;Advanced Chess&lt;/a&gt;” players. What is Advanced Chess? Advanced Chess was invented by chess grandmaster Garry Kasparov after losing to Deep Blue, a supercomputer developed by IBM. Advanced Chess allows humans to use computers to aid their decision making (e.g. by testing out a move to see what may happen) without relying entirely on the computer’s judgement.&amp;nbsp; This allows the human player to focus on higher order strategy while reducing accidental errors. Centaurs tend to beat both humans (without computers) and computers (without humans) at chess.&lt;/p&gt;&lt;p&gt;These improved chess results are in large part because the human can consistently focus on strategy, as opposed to checking for errors or flaws. The thought process becomes more consistent and more efficient.&lt;/p&gt;&lt;p&gt;New user-friendly tools that can be used by many spread far and quickly, changing people and by extension the world, much as the horse and text editors/word processors changed the world through their very transmission. 
Over time tools begin to accrete additional features and functions. These extras can be useful for certain scenarios, but when they clutter the original purpose, they can sometimes undermine mastery. Other times, if the tool has been extended in a way that does not disrupt the consistency of the original experience, the changes are usually welcomed.&lt;/p&gt;&lt;p&gt;In conclusion, tools that are highly &lt;u&gt;consistent&lt;/u&gt;, &lt;u&gt;quick,&lt;/u&gt; and &lt;u&gt;interactive&lt;/u&gt; in their usage, and that allow for a wide range of possibilities for the user, can be more easily mastered than tools which may have more features but are less consistent in how those features are put together. It is primarily through this consistency that one can achieve mastery, and through mastery that one can achieve a flow state of consciousness. And let us not forget that flow states are most often associated with happiness and contentment, up there with spending time with friends and family.&amp;nbsp;&lt;/p&gt;&lt;p&gt;This is the reason people love their favourite tools.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Postscript&lt;/b&gt;: History is messy, and no analogy is perfect, but analogies are useful and should not always be thrown out due to a perceived contradiction. 
In that spirit, here are some details I left out that add colour and do not directly contradict my argument, but could be taken out of context to confuse some readers:&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style="text-align: left;"&gt;&lt;li&gt;&lt;i&gt;I am using the word ‘nomadic’ to refer to an itinerant agro-pastoralist migration pattern as opposed to the more random wandering pattern that the early hunter-gatherers exhibited&lt;/i&gt;&lt;/li&gt;&lt;li&gt;&lt;i&gt;Although the Mongols did dominate through their mastery of horses and archery, the horse was not the only piece of defining technology: the invention of stirrups, which provide stability for mounted archers, and the invention of the composite bow, which allowed a smaller bow to produce more power, were also part of this “mounted archer” package.&amp;nbsp; Furthermore, the defining feature of Genghis Khan’s Mongolian empire (in contrast to other steppe nomads) was his ability to federate large numbers of disparate tribes and cultures for a common goal, and leverage that diversity to incorporate state-of-the-art warcraft like siege technology&lt;/i&gt;&lt;/li&gt;&lt;li&gt;&lt;i&gt;Although the Normans were feared in large part because of their superior cavalry, it must be pointed out that the most famous and iconic Norman battle of all – the Battle of Hastings – was decided more through circumstance and chance than by cavalry: the English succumbed to a one-two-punch knock-out blow, first from Norwegian Viking invaders led by Harald in the north - who were defeated in the Battle of Stamford Bridge -&amp;nbsp; quickly followed by a separate Norman invasion led by William in the south that converged at Hastings. 
But perhaps this is in line with my original point – the Normans lacked the mastery of the Mongols when it came to horses&lt;/i&gt;&lt;/li&gt;&lt;li&gt;&lt;i&gt;Technologically, mounted archers “born on the saddle” were not new: the Xiongnu - possible ancestors of the &lt;a href="https://en.wikipedia.org/wiki/Huns"&gt;Huns&lt;/a&gt; - also formed a large empire centuries earlier using the power of mounted archers “born on the saddle” and, like Genghis Khan, had also federated tribes – but got caught up in a civil war. Like the Mongols, the Xiongnu and Huns were highly disruptive and nearly brought down the Han Dynasty and Roman Empire, respectively.&amp;nbsp; However, the reputation of the Mongols is better known these days, which is why I am using the Mongol example to illustrate the overwhelming power of mounted archers “born on the saddle” and how technology, when fitted properly to humans, can have highly disruptive consequences in a very short period of time.&lt;/i&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;</description><link>http://hepburndata.blogspot.com/2021/05/what-history-can-teach-us-about-nature.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-7373382713133218412</guid><pubDate>Sun, 07 Feb 2021 20:26:00 +0000</pubDate><atom:updated>2021-02-08T12:20:35.240-05:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Analytics</category><category domain="http://www.blogger.com/atom/ns#">Efficiency</category><category domain="http://www.blogger.com/atom/ns#">Filter</category><category domain="http://www.blogger.com/atom/ns#">List Box</category><category domain="http://www.blogger.com/atom/ns#">Power BI</category><category domain="http://www.blogger.com/atom/ns#">Power of Gray</category><category domain="http://www.blogger.com/atom/ns#">Qlik</category><category 
domain="http://www.blogger.com/atom/ns#">Slicer</category><title>Analytic Efficiency Part 2: User Filtering and Indexing Engine Differences</title><description>&lt;p&gt;&lt;span style="color: #244084; font-family: Carlito; font-size: 13px;"&gt;Recap&lt;/span&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In the last article we looked at the differences between Power BI and Qlik’s indexing engines with respect to their Link-based (as opposed to Merged Cube-based) indexing technologies.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The reason why we performed this experiment is that after reviewing a Qlik patent (granted in April 2020) it became clear to me that Power BI and Qlik’s technologies, while appearing similar from the outside - because they both support complex link-based architectures – would be similar internally.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In fact, the two indexing technologies are quite distinct and this difference is most noticeable in the behaviour and performance of user filtering: Power BI Slicers versus QlikView List Boxes (or Qlik Sense Filters).&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;As such, I ran an experiment to see if this difference also created trade-offs with respect to traditional Business Intelligence operations such as calculating aggregate revenue across two linked 
tables.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;What I discovered was that the performance, when an apples-to-apples comparison could be made, was about the same (at first it looked like Qlik’s performance was 4x better, but this was due to a flaw in my experiment pointed out by a reader).&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;That said, I was expecting Power BI to outperform Qlik in this experiment (and did find a scenario where this was certainly the case), so I was surprised to see the performance was about the same when I ran the experiment with the main use case: high cardinality key linking.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;I believe this is a significant finding because Qlik’s “Power of Gray” (a term used in Qlik’s sales &amp;amp; marketing materials) offers a massive benefit to business analysts (think Excel jockeys) who know how to leverage it.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;This one-minute &lt;a href="https://www.youtube.com/watch?v=1Ha2hc6zpsg"&gt;video&lt;/a&gt; demonstrates Qlik's "Power of Gray".&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;iframe allowfullscreen="" class="BLOG_video_class" height="266" src="https://www.youtube.com/embed/1Ha2hc6zpsg" width="320" 
youtube-src-id="1Ha2hc6zpsg"&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;br /&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;My next blog post will get into this “Power of Gray” and why it is so powerful and how it can be leveraged to cross analytic efficiency thresholds leading to improved business outcomes.&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="color: #244084; font-family: Carlito; font-size: 13px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Introduction&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In this article we are going to explore both Power BI and Qlik’s filter controls – Slicers and List Boxes – as well as the underlying indexing technology which explains why the behaviour of this filtering is so different between the tools and why the “Power of Gray” comes easily to Qlik and why we do not see it in Power BI.&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" 
style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The purpose of this article is not to provide a complete description of the indexing and calculation technologies. Rather this article is intended to complement other materials (such as those we are linking to in this article) to show using an example how these engines differ.&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;But before we get into that I want to go over some additional performance measures I have taken that will help illustrate the qualitative differences between the engines. 
Namely, the differences that analysts would notice when using both tools.&lt;/p&gt;&lt;p class="p4" style="color: #244084; font-family: Carlito; font-size: 13px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Power BI Slicer versus QlikView List Box/Qlik Sense Filter&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In Power BI, the “Slicer” object - which allows users to filter fields and drill into hierarchies - is comparable to QlikView’s “List Box” object, which is effectively the same as Qlik Sense’s “Filter” object. There are some functional differences between QlikView’s List Box and Qlik Sense’s Filter (e.g. 
QlikView’s List Box supports an ‘AND/NOT’ selection mode), but for the features we are discussing in this post, they are identical.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Hereon we will only refer to Power BI’s “Slicer” and QlikView’s “List Box”.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;With that out of the way I need to explain how Power BI and Qlik differ with respect to Slicer versus List Box and why it is not possible to make an apples-to-apples comparison with respect to performance.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Power BI and QlikView both allow users to filter on field values (e.g. selecting the value ‘West CC’ in “Call Centre Name” field using PBI Slicer or Qlik List Box). 
But here are the two main reasons why we cannot compare their performance:&lt;/p&gt;&lt;ol class="ol1"&gt;&lt;li class="li5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Power BI Slicer shows only &lt;i&gt;possible&lt;/i&gt; values (based on other applied filters), whereas Qlik shows all values, including values that are excluded by other field selections&lt;/li&gt;&lt;ol class="ol2" style="list-style-type: lower-alpha;"&gt;&lt;li class="li5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In our example, this means that if we select Call Centre ‘West CC’ in Power BI, the Slicer only shows Work Order IDs that are linked to the ‘West CC’ call centre. Qlik, on the other hand, displays all Work Order IDs, including excluded work orders (which are shown in gray, as opposed to possible Work Order IDs, which are shown in white)&lt;/li&gt;&lt;ol class="ol3" style="list-style-type: lower-roman;"&gt;&lt;li class="li5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;This means the QlikView List Box is showing 5x as many Work Order IDs as the Power BI Slicer&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;li class="li5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Power BI Slicer must always be sorted whereas it is possible to turn off sorting in QlikView’s List Box&lt;/li&gt;&lt;ol class="ol2" style="list-style-type: lower-alpha;"&gt;&lt;li class="li5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: 
normal; line-height: normal; margin: 0px 0px 8px;"&gt;Sorting has a greater &lt;a href="https://en.wikipedia.org/wiki/Time_complexity"&gt;time complexity&lt;/a&gt; than retrieval and listing, and the difference is particularly noticeable when the number of values to be sorted is greater than 8 million.&lt;/li&gt;&lt;li class="li5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Without sorting, the time complexity would be linear – O(n).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;But because sorting has a linearithmic time complexity of&amp;nbsp;O(n log n) and therefore is non-linear, the effort to perform sorting overwhelms any indexing efficiencies. This becomes noticeable even with more than approximately 1 million unique values in the field you are displaying in the filter (since the ‘log n’ factor starts to become significant)&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Putting #1 and #2 together, a performance comparison between Power BI Slicer and Qlik List Box cannot be apples-to-apples because Qlik and Power BI are dealing with different populations of distinct values, while at the same time it is not possible in Power BI to separate the sort operation (with greater time complexity) from the retrieval and display of values. 
&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In other words, Qlik must retrieve and display more unique values than Power BI while sorting that larger population.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;With all that said I still went ahead and took some measurements to illustrate some of these differences and to get a sense of how the tools behaved.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Before I get into the measurements, it is important to remember that most of these measurements are really just measuring a sort operation, and that because we are selecting from 140 million rows, even a single Call Centre has 35 million rows (far more than the 8 million row threshold I mentioned earlier, where we start to see a steep decline in performance).&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;As an aside, it is for a similar reason that DBAs and Data Developers often choose to limit partitions to 8 million rows, and that Microsoft Analysis Services Tabular partitions its own data [by default] into 8 million row “segments” while Microsoft 365 Power BI uses 1 million row “segments”.&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 
0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p6" style="color: #182850; font-family: Carlito; font-size: 12px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Slicer versus List Box Observations and Measurements&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The first thing worth noting is that Power BI itself ran out of memory attempting to display the Slicer when no values were selected.&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/PBI_Slicer_All_CCs.png" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="281" data-original-width="800" height="140" src="https://cradleofanalytics.blob.core.windows.net/blog/images/PBI_Slicer_All_CCs.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; 
font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;To be fair, if this were Azure Analysis Services, this may have been successful due to the larger 8M row Segment Size (as opposed to Power BI’s 1M Segment Size). So AAS would give us more options here while still staying on the Power BI platform.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;With QlikView it did take approximately a minute, but we did successfully see all the values (as shown below) on the same laptop as Power BI.&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/Qlik_ListBox_All_CC.png" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="800" data-original-width="428" height="400" src="https://cradleofanalytics.blob.core.windows.net/blog/images/Qlik_ListBox_All_CC.png" width="214" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: 
normal; line-height: normal; margin: 0px 0px 8px;"&gt;Fortunately, when I began making selections in Power BI the Slicer was processed successfully as you can see below when I selected the “West CC” call centre.&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/PBI_Slicer_West_CC.png" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="243" data-original-width="800" height="121" src="https://cradleofanalytics.blob.core.windows.net/blog/images/PBI_Slicer_West_CC.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&lt;br /&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Overall, the median query performance for Power BI was approximately 19 seconds per selection, as shown below.&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/v1_slicer_power_bi_perf_analyzer.png" style="margin-left: 1em; 
margin-right: 1em;"&gt;&lt;img border="0" data-original-height="449" data-original-width="800" height="224" src="https://cradleofanalytics.blob.core.windows.net/blog/images/v1_slicer_power_bi_perf_analyzer.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;p&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;Here is the underlying JSON Performance data and a .pbix I used to analyze the data.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p2/PowerBIPerformanceData_v1_Slicer.json"&gt;PowerBIPerformanceData_v1_Slicer.json&lt;/a&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p2/V1 Slicer Performance Analysis.pbix"&gt;V1 Slicer Performance Analysis.pbix&lt;/a&gt;&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;With Qlik, I took 
different types of measurements (which are not possible with Power BI).&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The table below shows measurements for an &lt;span class="s1" style="text-decoration-line: underline;"&gt;unsorted&lt;/span&gt; List Box with the same “Work Order ID” field.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The median response time is approximately 2 seconds.&lt;/p&gt;&lt;table cellpadding="0" cellspacing="0" class="t1" style="border-collapse: collapse;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class="td1" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 24px; padding: 4px; width: 39px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Tool&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td1" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 24px; padding: 4px; width: 39px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Size&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td2" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 24px; padding: 4px; width: 56px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 
0px;"&gt;&lt;b&gt;Selection&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td3" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 24px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;List Box Refresh Time (ms)&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 39px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 39px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Huge&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 56px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Central CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: 
right;"&gt;2,000&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 39px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 39px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Huge&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 56px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;East CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;5,390&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 39px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); 
border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 39px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Huge&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 56px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;North CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;2,300&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 39px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 39px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Huge&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 56px;" valign="bottom"&gt;&lt;p 
class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;South CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;2,140&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 39px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 39px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Huge&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 56px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;West CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: 
normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;2,080&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td7" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 39px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Median&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td7" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 39px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;&amp;nbsp;&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td8" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 56px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;&amp;nbsp;&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td9" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s1" style="text-decoration-line: underline;"&gt;&lt;b&gt;2,140&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; 
line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;But if you compare what is shown in the relatively quick unsorted display that QlikView is showing to what was shown earlier for Power BI, you can easily see the difference: Power BI is only showing the possible values, all sorted within the same block.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;But in QlikView, it’s effectively just highlighting the possible values among the originally sorted population.&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/Qlik_ListBox_West_CC.png" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="800" data-original-width="429" height="400" src="https://cradleofanalytics.blob.core.windows.net/blog/images/Qlik_ListBox_West_CC.png" width="214" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;You would have to scroll or glance through this 
list to see all the possible values, which is obviously not practical with millions of distinct values. In some situations this is helpful for making comparisons (typically when the list is short and can act like a visual heatmap), but in an example like this you would probably want to turn on both State and Text sorting, which dramatically impacts performance.&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Another thing you will notice if you look carefully is that it appears as though ‘10_1’ and ‘10_2’ (which are visible in Power BI) are missing from Qlik.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;If you were to scroll down the list you would find them.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The reason they are not displayed is that QlikView is showing the original sort order from the load script.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Often Qlik developers will pre-sort data before loading into Qlik for this reason – I did no such thing in this experiment, which is why some values appear to be missing when in fact they are just farther down the list.&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;I also took measurements when I turned on “State” filtering (whereby ‘possible’ values are listed above ‘excluded’ values) and I took another set of 
measurements when both “State” and “Text A-&amp;gt;Z” filtering were enabled. The table below shows these in comparison with Power BI.&lt;/p&gt;&lt;table cellpadding="0" cellspacing="0" class="t1" style="border-collapse: collapse;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class="td10" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 92px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Tool&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td11" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 155px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Test&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td12" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 177px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Slicer/List Box Refresh Time (seconds)&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td13" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 92px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;QlikView&lt;/p&gt;&lt;/td&gt;&lt;td class="td14" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 155px;" 
valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;No filtering, all values&lt;/p&gt;&lt;/td&gt;&lt;td class="td15" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 177px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;2&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td13" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 92px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;QlikView&lt;/p&gt;&lt;/td&gt;&lt;td class="td14" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 155px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Selection State filtering, all values&lt;/p&gt;&lt;/td&gt;&lt;td class="td15" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 177px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;15&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td13" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 92px;" 
valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Power BI&lt;/p&gt;&lt;/td&gt;&lt;td class="td14" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 155px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Text filtering, possible values&lt;/p&gt;&lt;/td&gt;&lt;td class="td15" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 177px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;19&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td16" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 23px; padding: 4px; width: 92px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;QlikView&lt;/p&gt;&lt;/td&gt;&lt;td class="td17" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 23px; padding: 4px; width: 155px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Selection State and Text filtering, all values&lt;/p&gt;&lt;/td&gt;&lt;td class="td18" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 23px; padding: 4px; width: 177px;" valign="bottom"&gt;&lt;p class="p8" 
style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;25&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Again, I must emphasize that the measurements shown above are not an apples-to-apples comparison between Power BI and Qlik, but they do reveal the contours of the underlying indexing technologies, which are key to the differences between the tools and to the business user experience, and ultimately to user outcomes.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In my next post I will explain why the differences in indexing technology between the two tools affect Analytic Efficiency. 
But first let’s dive into the example that shows the mechanics of the indexing engines themselves.&lt;/p&gt;&lt;p class="p3" style="color: #244084; font-family: Carlito; font-size: 13px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 2px; min-height: 16px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="color: #244084; font-family: Carlito; font-size: 13px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Index Engine Walk-Thru&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Before getting into this example, I will refer you back to my previous blog post which describes the data model. But to quickly recap, there are two tables: “My Calls” and “My Work Orders” and they are linked on a “Call ID”.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Users select “Call Centre Name” from a Slicer or List Box and then see the total “Work Order Amount” aggregated from the “My Work Orders” table based on possible Work Order rows filtered indirectly by the “My Calls” table.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Now let us begin with our example walk thru...&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Given 
our example of two tables: Calls and Work Orders, what is happening in Power BI when we select "Call Centre" = 'North CC' from the "MyCalls" table (using a Slicer) and then are presented with "SUM Work Order Amount" (from the "MyWorkOrders" table) in the form of a KPI Card?&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p6" style="color: #182850; font-family: Carlito; font-size: 12px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Power BI (&lt;a href="https://www.oreilly.com/library/view/the-definitive-guide/9780735698383/ch13.html"&gt;Vertipaq&lt;/a&gt;) Indexing Engine Walk-thru Example&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Based on my reading of Microsoft and SQLBI's explainer presentations combined with my own experiences developing Power BI and Azure Analysis Services, here is what I believe is happening once a user makes the selection (feel free to correct me):&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;1. 
The cache is checked to see if this value has already been calculated.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;If it has been, then no searching or calculation is required, and the result can be shown instantly (constant time).&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;2. If the value has not been explicitly calculated by the cache, it may be possible to use previously cached results to speed up calculation time.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;For example, if the entire population of Work Order Amounts has a minimum of $5 and we can detect the presence of $5 while scanning the indexes, we can short-circuit the calculation because we already know it's not possible to find a smaller value.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;For our example here, only the MAX and MIN can be deduced in this manner, and we can see this in the test results.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Based on reader feedback to my last post, I suspect this is only happening when the DAX expressions are explicitly bundled together using the EVALUATE and SUMMARIZECOLUMNS functions taking advantage of “DAX Fusion” optimizations.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;However, the Report developer would need to be aware of this, and if the Report has separate KPI Cards, then no such optimizations would take place (as of the time of writing).&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; 
font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;3. Assuming the cache cannot help us (which is the true subject of our test), then the first thing we need to do is determine the rows in MyCalls that have been selected.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;4. Power BI partitions data by default into 1 million row segments (Analysis Services partitions into 8 million row segments).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Each segment in turn has its own Dictionary which allows the column values to be optimally compressed depending on the number of distinct values within the given segment.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Since the MyCalls table has 100 million rows, we can expect approximately 100 segments (I say approximately because segment size is in fact 1,048,576 or 2^20 rows, since segment sizes must be a power of 2).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Based on simple calculation (100 million / 2^20 ≈ 95.4), I estimate 96 segments and will use that number going forward.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Each of these 96 segments has its own Dictionary.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The reason for having multiple dictionaries is that within each segment only a subset of values is likely to occur. This allows for better compression and better compression allows for faster reads. 
Thus, each dictionary is consulted to see if it includes 'North CC'.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Now if the segments had been aligned to this column we would only need to check a subset of segments.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Depending on how the MyCalls table has been sorted, it's possible that Power BI is searching only the segments that pertain to the 'North CC' call centre.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This would mean querying approximately 20 segments (as opposed to all 96 segments).&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;5. Once all rows have been read (whether those rows are all lumped into a subset of segments or are spread across all segments), Power BI now needs to determine what linking keys are included in the selection scope.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;In our example the "Link Call ID", which is the field that links the two tables together, is a whole number, meaning that Power BI can store it as a compressed symbol without requiring a dictionary to decode the value, which is more efficient. &lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;6. 
Using the coded "Link Call ID" values, Power BI then determines how these linking values are mapped to the linked "MyWorkOrders" table using the "Relationship Index".&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;In our example with 'North CC' being selected, the Relationship Index will point to a subset of segments in the "MyWorkOrders" table.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Hopefully the tables are aligned, but it's possible they are not. Given that we have approximately 140 million work orders, we would have at least 134 segments.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This would mean Power BI would need to scan at least a fifth of those segments (27 segments).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;That said, I suspect all MyWorkOrders segments would need to be scanned.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The reason is that the MyWorkOrders table does not have a "Call Centre Name" column and only a "Link Call ID" column which does not share the same distribution as the Call Centre names.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;In all likelihood all 134 segments comprising all 140 million rows would be scanned.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;7. As mentioned earlier, integer and whole number values are not by default "hash" encoded into a dictionary and thus do not require an additional lookup, so the values of the "Work Order Amount" can be determined directly from the compressed MyWorkOrders table.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;8. 
From here Power BI is in the final stretch and using the "Relationship Index" can scan all the "Work Order Amount" column indexes for all rows in each of the segments that have the referenced "Link Call ID".&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;9. Power BI now calculates the Sum total of Work Order amount from these underlying values.&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;This leads us to the &lt;u&gt;reasons why Power BI cannot easily query and calculate excluded rows&lt;/u&gt; (i.e., the complement of rows):&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;/p&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;Only text and floating-point values are dictionary encoded in Power BI (whereas in Qlik there is no exception, all values are dictionary encoded since all values are of a 'dual' type and potentially hold both numeric and text values)&lt;/li&gt;&lt;li&gt;PBI Dictionaries can be fragmented across multiple segments, requiring multiple dictionaries to be queried with results merged - a costly operation&lt;/li&gt;&lt;li&gt;Through incremental partition refreshes, PBI dictionaries can contain values that are no longer referenced by any row. 
Power BI even allows administrators/operators to perform a "defrag" on tables using XMLA commands.&lt;/li&gt;&lt;li&gt;Power BI also allows for a DirectQuery mode that directly queries an RDBMS using generated SQL queries.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;DirectQuery is already challenged by latency to begin with, so it would be even more challenging and time consuming to add additional queries to calculate the complement of a given column's values.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;So how does Qlik (QlikView and Qlik Sense) work in this scenario then?&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p6" style="color: #182850; font-family: Carlito; font-size: 12px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik Indexing Engine Walk-thru Example&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Based on my reading of Qlik's patent US: 
10,628,401 B2 and experience working with QlikView and Qlik Sense, here is what I believe is happening once the user makes the selection.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;I must warn you: Qlik’s engine is not as intuitive as Power BI’s Vertipaq engine and does not follow a “&lt;a href="https://en.wikipedia.org/wiki/Column-oriented_DBMS"&gt;Columnar DBMS&lt;/a&gt;” architecture. &lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;For this reason, you can reference &lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p2/Qlik vs Power BI example.xlsx"&gt;this document &lt;/a&gt;which contains the underlying example data I am referring to here.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p2/Qlik vs Power BI example.xlsx"&gt;Qlik vs Power BI example.xlsx&lt;/a&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Here is how it works:&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;1. 
Like Power BI, the cache is checked to see if this value has already been calculated.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;If it has been, then no searching or calculation is required, and the result can be shown instantly (constant time).&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;2. Qlik does not appear to use intermediate cache results (e.g. the Max of the total population).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;If the result has not explicitly been calculated before, it is calculated from scratch without using past queries.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Thus, caching in Power BI appears to be more sophisticated in this regard, although I could be wrong here.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The addendum to my previous blog post presents some evidence to support this.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;3. Qlik takes the 'North CC' value from the user selection and looks it up first in the “MyCalls” BTI. 
BTI stands for "Bi-directional Table Index", which is a dictionary that points back to the row positions of values.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Already we can see that Qlik is starting with the dictionary before even looking at rows.&lt;/p&gt;&lt;table cellpadding="0" cellspacing="0" class="t1" style="border-collapse: collapse;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class="td19" colspan="4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 17px; padding: 4px; width: 460px;" valign="bottom"&gt;&lt;p class="p9" style="font-family: Helvetica; font-size: 16px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: center;"&gt;MyCalls Table&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td20" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 55px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Call ID&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td21" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 99px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Link Call ID&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Call 
Duration&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td23" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Call Centre Name&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td20" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 55px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1&lt;/p&gt;&lt;/td&gt;&lt;td class="td21" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 99px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s2" style="background-color: #ffff0b;"&gt;1&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;td class="td23" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; 
line-height: normal; margin: 0px;"&gt;&lt;span class="s2" style="background-color: #ffff0b;"&gt;North CC&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td20" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 55px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;2&lt;/p&gt;&lt;/td&gt;&lt;td class="td21" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 99px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s2" style="background-color: #ffff0b;"&gt;2&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;9&lt;/p&gt;&lt;/td&gt;&lt;td class="td23" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;span class="s2" style="background-color: #ffff0b;"&gt;North CC&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td24" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 55px;" 
valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;3&lt;/p&gt;&lt;/td&gt;&lt;td class="td25" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 99px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;3&lt;/p&gt;&lt;/td&gt;&lt;td class="td26" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;td class="td27" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;East CC&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td20" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 55px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;td class="td21" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 99px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: 
Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1&lt;/p&gt;&lt;/td&gt;&lt;td class="td23" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;East CC&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td20" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 55px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;5&lt;/p&gt;&lt;/td&gt;&lt;td class="td21" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 99px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;5&lt;/p&gt;&lt;/td&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; 
font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;2&lt;/p&gt;&lt;/td&gt;&lt;td class="td23" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;West CC&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td20" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 55px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;6&lt;/p&gt;&lt;/td&gt;&lt;td class="td21" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 99px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;6&lt;/p&gt;&lt;/td&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1&lt;/p&gt;&lt;/td&gt;&lt;td class="td23" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; 
line-height: normal; margin: 0px;"&gt;Central CC&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td20" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 55px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;7&lt;/p&gt;&lt;/td&gt;&lt;td class="td21" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 99px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;7&lt;/p&gt;&lt;/td&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;td class="td23" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;South CC&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td20" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 55px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: 
right;"&gt;8&lt;/p&gt;&lt;/td&gt;&lt;td class="td21" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 99px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s2" style="background-color: #ffff0b;"&gt;8&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;9&lt;/p&gt;&lt;/td&gt;&lt;td class="td23" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;span class="s2" style="background-color: #ffff0b;"&gt;North CC&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;4. 
The BTI points to all rows containing the value 'North CC' using what is known as a bitmap index.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;For example, if our MyCalls table had eight rows and the 'North CC' calls were on rows 1, 2, and 8, then the bitmap index would look like this: 11000001, which in turn could be encoded into a single byte as 0xC1 in hex or 193 in decimal.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;I want to pause for a moment and point out that it is here we can see a radical difference between how Power BI and Qlik index their data with respect to dictionaries.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In Power BI the dictionary is subordinate to the table rows and is not used for columns that contain whole and fixed decimal numbers, whereas in Qlik the dictionary is at the centre and the table rows are subordinate to the dictionary even when the column values are exclusively integers.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In other words, Power BI is more row-centric versus Qlik which is more dictionary-centric.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;It is this dictionary-centric approach to indexing which allows Qlik to efficiently query excluded values and thus present them to the user in Filter boxes (in Qlik Sense) or List Boxes (in QlikView).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This is what Qlik's marketing refers to as "the power of 
grey".&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;5. Moving on with our example, once Qlik has determined which rows have values - using the BTI's bitmap index - it now needs to resolve "Link Call ID" values for the corresponding table.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;6. Qlik then uses a second index known as an "Inverted Index" or "II" to look for the value symbol references for the "Link Call ID" column. Using an example of our table with eight rows, our inverted index might look like this: 1;2;3;4;5;6;7;8&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Each value in this single-dimension array corresponds to a given symbol value in the "MyCalls" table.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Given that we are interested in rows 1, 2, and 8, we would need to reference Work Orders from the MyWorkOrders table with "Link Call ID" in (1,2,8).&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;7. 
We now take those three values and determine how the symbol value is mapped from the MyCalls table into the MyWorkOrders table.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This is done through an index Qlik calls a "Bidirectional Associative Index" or "BAI".&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;8. Using the "BAI" we map the index positions of the "Link Call ID" values (1,2,8) in MyCalls to the corresponding index positions in MyWorkOrders. In our example, some MyCalls records (about 25%) have no corresponding value in the MyWorkOrders table.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;So those "Link Call ID" values would map to the value '-1' to indicate the value does not exist in the linked table.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The other values would map to their index positions in the linked table.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The following table shows how the values could be mapped for all Call IDs, assuming Call IDs 2 and 6 do not correspond to any "Link Call ID" in the MyWorkOrders table:&lt;/p&gt;&lt;table cellpadding="0" cellspacing="0" class="t1" style="border-collapse: collapse;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class="td28" colspan="2" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 17px; padding: 4px; width: 240px;" valign="bottom"&gt;&lt;p class="p9" style="font-family: Helvetica; font-size: 16px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: 
center;"&gt;MyCalls to MyWorkOrders BAI&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;MyCalls BTI Index&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td29" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 137px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;MyWorkOrders BTI Index&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1&lt;/p&gt;&lt;/td&gt;&lt;td class="td29" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 137px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: 
normal; line-height: normal; margin: 0px; text-align: right;"&gt;2&lt;/p&gt;&lt;/td&gt;&lt;td class="td29" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 137px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;-1&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td9" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;3&lt;/p&gt;&lt;/td&gt;&lt;td class="td30" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 137px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;2&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;td class="td29" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 137px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; 
margin: 0px; text-align: right;"&gt;3&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;5&lt;/p&gt;&lt;/td&gt;&lt;td class="td29" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 137px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;6&lt;/p&gt;&lt;/td&gt;&lt;td class="td29" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 137px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;-1&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 
0px; text-align: right;"&gt;7&lt;/p&gt;&lt;/td&gt;&lt;td class="td29" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 137px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;5&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 94px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;8&lt;/p&gt;&lt;/td&gt;&lt;td class="td29" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 137px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;6&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;At this point we should note that Qlik would also build a corresponding index to go in the opposite direction. 
While that index would not be used in this example (since we are going from MyCalls to MyWorkOrders and not the other way around), it is worth showing how the other BAI would be constructed.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Here is how it would look (keep in mind there are only six distinct "Linked Call ID" values in the MyWorkOrders table):&lt;/p&gt;&lt;table cellpadding="0" cellspacing="0" class="t1" style="border-collapse: collapse;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class="td31" colspan="2" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 17px; padding: 4px; width: 278px;" valign="bottom"&gt;&lt;p class="p9" style="font-family: Helvetica; font-size: 16px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: center;"&gt;MyWorkOrders to MyCalls BAI&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td23" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;MyWorkOrders BTI Index&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td32" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 109px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;MyCalls BTI Index&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td23" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; 
font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1&lt;/p&gt;&lt;/td&gt;&lt;td class="td32" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 109px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td23" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;2&lt;/p&gt;&lt;/td&gt;&lt;td class="td32" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 109px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;3&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td27" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;3&lt;/p&gt;&lt;/td&gt;&lt;td class="td33" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 109px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; 
font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td23" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;td class="td32" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 109px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;5&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td23" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;5&lt;/p&gt;&lt;/td&gt;&lt;td class="td32" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 109px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;7&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td23" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 160px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; 
font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;6&lt;/p&gt;&lt;/td&gt;&lt;td class="td32" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 109px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;8&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;b&gt;Please note:&lt;/b&gt; The above table is shown only as an example of reverse linkage. For the remainder of this walk-through we will NOT be referencing this BAI index, but we will be referring back to the MyCalls to MyWorkOrders BAI index.&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;It is important to note here that these link mappings do not link from table to table, but rather from BTI to BTI.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;It is through this BAI that Qlik directly links Dictionaries with one another.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; 
font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;It is as if the tables themselves don't really exist in any explicit format; the position of values within tables is essentially an attribute of the dictionary.&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;8.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Now that we have the BTI index positions of the selected "Link Call ID" values, Qlik can determine which rows those values are found on by using the BTI for the given Call IDs.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In our example this means that Qlik would be looking up index positions (1,6) in the BTI for the "Linked Call ID" in the MyWorkOrders table.&lt;/p&gt;&lt;table cellpadding="0" cellspacing="0" class="t1" style="border-collapse: collapse;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class="td34" colspan="3" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 17px; padding: 4px; width: 394px;" valign="bottom"&gt;&lt;p class="p9" style="font-family: Helvetica; font-size: 16px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: center;"&gt;MyWorkOrders BTI&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td35" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 69px;" valign="bottom"&gt;&lt;p class="p7" 
style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;BTI Index&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td36" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 133px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Link Call ID Value&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td37" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 174px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Value Position Bitmap&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td38" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 69px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1&lt;/p&gt;&lt;/td&gt;&lt;td class="td39" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 133px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1&lt;/p&gt;&lt;/td&gt;&lt;td class="td40" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 174px;" valign="bottom"&gt;&lt;p class="p10" 
style="font-family: &amp;quot;Courier New&amp;quot;; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;100000000000&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td41" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 69px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;2&lt;/p&gt;&lt;/td&gt;&lt;td class="td42" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 133px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;3&lt;/p&gt;&lt;/td&gt;&lt;td class="td43" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 174px;" valign="bottom"&gt;&lt;p class="p10" style="font-family: &amp;quot;Courier New&amp;quot;; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;010000000000&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td38" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 69px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;3&lt;/p&gt;&lt;/td&gt;&lt;td class="td39" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 133px;" valign="bottom"&gt;&lt;p class="p8" 
style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;td class="td40" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 174px;" valign="bottom"&gt;&lt;p class="p10" style="font-family: &amp;quot;Courier New&amp;quot;; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;001101000000&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td41" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 69px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;td class="td42" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 133px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;5&lt;/p&gt;&lt;/td&gt;&lt;td class="td43" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 174px;" valign="bottom"&gt;&lt;p class="p10" style="font-family: &amp;quot;Courier New&amp;quot;; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;000010100100&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td38" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 69px;" valign="bottom"&gt;&lt;p class="p8" 
style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;5&lt;/p&gt;&lt;/td&gt;&lt;td class="td39" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 133px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;7&lt;/p&gt;&lt;/td&gt;&lt;td class="td40" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 174px;" valign="bottom"&gt;&lt;p class="p10" style="font-family: &amp;quot;Courier New&amp;quot;; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;000000011010&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td38" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 69px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;6&lt;/p&gt;&lt;/td&gt;&lt;td class="td39" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 133px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;8&lt;/p&gt;&lt;/td&gt;&lt;td class="td40" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 174px;" valign="bottom"&gt;&lt;p class="p10" style="font-family: 
&amp;quot;Courier New&amp;quot;; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;000000000001&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Using our example, if we assume that the MyWorkOrders table has twelve (12) rows, the bitmaps for "Linked Call ID" BTI index positions 1 and 6 would look like this:&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;1: &lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;100000000000&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;6: &lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;000000000001&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Qlik then performs a bit-wise OR to produce the following bitmap index:&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; 
font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;1 OR 6 = 100000000001&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;A minor caveat: I mentioned earlier that a bitmap index is used by the Dictionary BTI to determine what rows the value falls on.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;However, when a value is very sparse (I am making a conjecture here), I suspect the bitmaps are further compressed (probably using an offset and run-length encoding). To be clear, while I don't know for sure how the bitmap indexes are managed for sparse value distributions, it doesn't really matter all that much for the purposes of this example.&lt;/p&gt;&lt;table cellpadding="0" cellspacing="0" class="t1" style="border-collapse: collapse;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class="td34" colspan="3" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 17px; padding: 4px; width: 394px;" valign="bottom"&gt;&lt;p class="p9" style="font-family: Helvetica; font-size: 16px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: center;"&gt;MyWorkOrders Table&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; 
line-height: normal; margin: 0px;"&gt;&lt;b&gt;Work Order ID&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td44" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 87px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Link Call ID&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td45" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 170px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Work Order Amount&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1&lt;/p&gt;&lt;/td&gt;&lt;td class="td44" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 87px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1&lt;/p&gt;&lt;/td&gt;&lt;td class="td45" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 170px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; 
line-height: normal; margin: 0px; text-align: right;"&gt;$15&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;2&lt;/p&gt;&lt;/td&gt;&lt;td class="td44" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 87px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;3&lt;/p&gt;&lt;/td&gt;&lt;td class="td45" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 170px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;$10&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td26" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;3&lt;/p&gt;&lt;/td&gt;&lt;td class="td46" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 87px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; 
margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;td class="td47" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 170px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;$20&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;td class="td44" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 87px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;td class="td45" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 170px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;$15&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: 
right;"&gt;5&lt;/p&gt;&lt;/td&gt;&lt;td class="td44" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 87px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;5&lt;/p&gt;&lt;/td&gt;&lt;td class="td45" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 170px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;$5&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;6&lt;/p&gt;&lt;/td&gt;&lt;td class="td44" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 87px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4&lt;/p&gt;&lt;/td&gt;&lt;td class="td45" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 170px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: 
right;"&gt;$10&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;7&lt;/p&gt;&lt;/td&gt;&lt;td class="td44" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 87px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;5&lt;/p&gt;&lt;/td&gt;&lt;td class="td45" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 170px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;$20&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;8&lt;/p&gt;&lt;/td&gt;&lt;td class="td44" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 87px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: 
right;"&gt;7&lt;/p&gt;&lt;/td&gt;&lt;td class="td45" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 170px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;$5&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;9&lt;/p&gt;&lt;/td&gt;&lt;td class="td44" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 87px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;7&lt;/p&gt;&lt;/td&gt;&lt;td class="td45" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 170px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;$15&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: 
right;"&gt;10&lt;/p&gt;&lt;/td&gt;&lt;td class="td44" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 87px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;5&lt;/p&gt;&lt;/td&gt;&lt;td class="td45" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 170px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;$10&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td22" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;11&lt;/p&gt;&lt;/td&gt;&lt;td class="td44" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 87px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;7&lt;/p&gt;&lt;/td&gt;&lt;td class="td45" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 170px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: 
right;"&gt;$5&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td26" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 119px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;12&lt;/p&gt;&lt;/td&gt;&lt;td class="td46" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 87px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;8&lt;/p&gt;&lt;/td&gt;&lt;td class="td47" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 170px;" valign="bottom"&gt;&lt;p class="p8" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;$15&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;10. 
Using the BTI for "Linked Call ID", Qlik locates all rows from which we need to query "Work Order Amount". The "Inverted Index" or "II" for "Work Order Amount" might look like this:&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;1,$5&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;2,$10&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;3,$15&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;4,$20&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;3;2;4;3;1;2;4;3;3;2;1;3.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;If we refer to the merged bitmap index above (using bitwise arithmetic), we can now use the II to determine that the values for "Work Order Amount" are: 3 and 3.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;From the above example, we can see the BTI for Work Order Amount might look like this:&lt;/p&gt;&lt;p 
class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;11. Since both references are to row 3 in the BTI, Qlik can now calculate the SUM of Work Order Amount more efficiently by multiplying the value by the number of instances.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This would be 2 * $15 = $30.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;12. Qlik now presents $30 as the total back to the user.&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;span&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;/span&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span style="color: #244084; font-family: Carlito; font-size: 13px;"&gt;Conclusion&lt;/span&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;As you can see from walking through this example, the method of querying and calculating across linked tables is quite different in Power BI and Qlik.&lt;/p&gt;&lt;p class="p2" style="font-family: Helvetica; 
font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;For further reading (viewing), I suggest watching this excellent video from SQLBI featuring Alberto Ferrari explaining Power BI’s VertiPaq engine:&lt;/p&gt;&lt;p class="p11" style="color: #0b4cb4; font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="text-decoration-line: underline;"&gt;&lt;a href="https://www.sqlbi.com/tv/optimizing-multi-billion-row-tables-in-tabular-sqlbits-2017/"&gt;https://www.sqlbi.com/tv/optimizing-multi-billion-row-tables-in-tabular-sqlbits-2017/&lt;span class="s3" style="color: #0b4cb4;"&gt;&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;I also recommend reviewing Qlik’s US patent 10,628,401 B2 here:&lt;/p&gt;&lt;p class="p11" style="color: #0b4cb4; font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="text-decoration-line: underline;"&gt;&lt;a href="https://patents.google.com/patent/US10628401B2"&gt;https://patents.google.com/patent/US10628401B2&lt;span class="s3" style="color: #0b4cb4;"&gt;&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; 
line-height: normal; margin: 0px 0px 8px;"&gt;On a final note, I should point out that this patent pertains to using Qlik’s indexing technology in a slightly different context (with “big data” disk based systems, as opposed to Qlik’s in-memory technology), but I believe that algorithmically the two inventions are essentially the same.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;This blog post was quite technical and is really for more technically curious and skeptical readers.&lt;/p&gt;&lt;p class="p5" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In my next blog post I will explain why these differences in indexing technology can lead to significantly different user experiences and business outcomes.&lt;/p&gt;</description><link>http://hepburndata.blogspot.com/2021/02/analytic-efficiency-part-2-user.html</link><author>noreply@blogger.com (Neil Hepburn)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" height="72" url="https://img.youtube.com/vi/1Ha2hc6zpsg/default.jpg" width="72"/><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-5727399723667694321</guid><pubDate>Sun, 10 Jan 2021 20:50:00 +0000</pubDate><atom:updated>2021-01-12T19:00:49.076-05:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Analytics</category><category domain="http://www.blogger.com/atom/ns#">Efficiency</category><category domain="http://www.blogger.com/atom/ns#">Power BI</category><category domain="http://www.blogger.com/atom/ns#">Qlik</category><title>Analytic Efficiency: Part 1: Power BI vs. 
Qlik Performance</title><description>&lt;h2 style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; text-align: left;"&gt;&lt;span style="font-family: inherit; font-size: small; font-weight: normal;"&gt;Update 2021-01-12&lt;/span&gt;&lt;/h2&gt;&lt;p style="text-align: left;"&gt;&lt;span style="font-family: Helvetica; font-size: 11px;"&gt;An astute reader (thank you reader!) has pointed out a flaw in part of the experiment. I have since updated and re-run this part of the experiment to address the flaw.&lt;/span&gt;&lt;/p&gt;&lt;p style="text-align: left;"&gt;&lt;span style="font-family: Helvetica; font-size: 11px;"&gt;In summary, the flaw was that I had five (5) Measures in both Power BI and Qlik. But in Power BI these Measures were all being executed as separate queries whereas in Qlik they were being executed as a single query.&lt;/span&gt;&lt;/p&gt;&lt;p style="text-align: left;"&gt;&lt;span style="font-family: Helvetica; font-size: 11px;"&gt;I have since added a new section to the bottom of this blog that describes the change to the experiment and updates the conclusions accordingly.&lt;/span&gt;&lt;/p&gt;&lt;div&gt;&lt;span style="font-family: inherit; font-size: small; font-weight: normal;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;h2 style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; text-align: left;"&gt;&lt;span style="font-family: inherit; font-size: small; font-weight: normal;"&gt;Pre-Amble&lt;/span&gt;&lt;/h2&gt;&lt;div&gt;&lt;span style="font-family: inherit; font-size: small; font-weight: normal;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;This is the first of a three part blog post that will explore 
the concept of “Analytic Efficiency”. But before I introduce part 1 I need to take some time to explain what I mean by “Analytic Efficiency”.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;“Data” and “information” as we have come to realize have always existed - it’s only recently (in large part due to the work of &lt;a href="https://en.wikipedia.org/wiki/Claude_Shannon"&gt;Claude Shannon&lt;/a&gt; in the mid-20th century resulting in Shannon’s “&lt;a href="https://en.wikipedia.org/wiki/Information_theory"&gt;Information Theory&lt;/a&gt;” that showed how everything in the universe could be encoded as a ‘1’ or ‘0’ and therefore the foundation for everything is simply “information”) that we now think of “data” and “information” as abstract concepts unto themselves, and have come to realize that “information” is the fuel and power for decision making through an ancient and consistent process that we still to this day call “analytics”.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The way in which we perform analytics has not significantly changed since &lt;a href="https://en.wikipedia.org/wiki/Aristotle"&gt;Aristotle&lt;/a&gt; first described the ‘&lt;a href="https://en.wikipedia.org/wiki/Syllogism"&gt;syllogism&lt;/a&gt;’ in his work “&lt;a href="https://en.wikipedia.org/wiki/Prior_Analytics"&gt;Prior Analytics&lt;/a&gt;” which was put down in the middle of 4th century BCE - over 2,300 years ago.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Efficiency has improved greatly since then, but fundamentally nothing has changed about the process itself.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;So powerful was 
Aristotle’s approach to logic (ordered thinking) that his books on logic were and are referred to as “Organon” which translates to “The Tool” - as in “The Tool of Philosophy”; Aristotle’s “Tool” was seen as something that was above philosophy itself.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;During the Middle Ages philosophers of all faiths like &lt;a href="https://en.wikipedia.org/wiki/Averroes"&gt;Averroes&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Maimonides"&gt;Maimonides&lt;/a&gt;, and &lt;a href="https://en.wikipedia.org/wiki/Thomas_Aquinas"&gt;Thomas Aquinas&lt;/a&gt;&amp;nbsp;referred to Aristotle simply as “The Philosopher”, as though no other philosophers were worthy of the title. Even to this day Aristotle’s shadow looms large over philosophers and mathematicians working in the “&lt;a href="https://en.wikipedia.org/wiki/Analytic_philosophy"&gt;Analytic tradition&lt;/a&gt;”: Great thinkers like &lt;a href="https://en.wikipedia.org/wiki/René_Descartes"&gt;Descartes&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Baruch_Spinoza"&gt;Spinoza&lt;/a&gt;,&amp;nbsp;&lt;a href="https://en.wikipedia.org/wiki/Gottfried_Wilhelm_Leibniz"&gt;Leibniz&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Immanuel_Kant"&gt;Kant&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Gottlob_Frege"&gt;Frege&lt;/a&gt;, and &lt;a href="https://en.wikipedia.org/wiki/Bertrand_Russell"&gt;Russell&lt;/a&gt;. 
&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Furthermore, &lt;a href="https://en.wikipedia.org/wiki/Euclid"&gt;Euclid&lt;/a&gt;’s “&lt;a href="https://en.wikipedia.org/wiki/Axiomatic_system#Axiomatic_method"&gt;Axiomatic Method&lt;/a&gt;” is built on the foundation of Aristotle’s syllogism, and those that are familiar with the history of mathematics and science will know that Axiomatic Method is what the modern world of science, technology, and engineering is built upon.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Analytics is essentially a recursive game of “&lt;a href="https://en.wikipedia.org/wiki/Twenty_questions"&gt;Twenty Questions&lt;/a&gt;” (more like “infinite questions”) that continually feeds back on itself: You start with an objective question (e.g.“Where are my most profitable customers?” or “Are we living in a computer simulation?”) and you break that big question down into little questions (the Greek origins of the word “&lt;a href="https://www.etymonline.com/word/analytics"&gt;analytics&lt;/a&gt;” means ‘to dissolve’ or ‘break down’) until there are no more questions to ask and at that point the answer either reveals itself or you might find out (such as the case with quantum physics) that the answer cannot be known.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The game of Analytics is simple but it is not always easy. The game can also be described as the “Analytic Lifecycle”. 
The Analytics Lifecycle itself can be broken down (dissolved) into these five steps (as I see it from experience):&lt;/p&gt;&lt;ol class="ol1"&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;b&gt;Identify data &lt;/b&gt;needed to answer the question (this might be easy if the data is in a standard report available to you, or this might be tricky if you are not authorized to see the data in its raw form and must work with a trusted data steward [who might have other priorities] to negotiate a data request).&lt;/li&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;b&gt;Obtain the data&lt;/b&gt; (this may be as simple as downloading a file or it may involve spending billions of dollars to build something like the Large Hadron Collider or a Gravitational-wave Observatory).&lt;/li&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;b&gt;Prepare the data&lt;/b&gt; (this might be something simple like copy-and-pasting data and performing a VLOOKUP across two tables in Excel, or this could involve training a Deep Neural Network to generate predictions as the desired dataset for further analyses into say, ethnic biases of a recommendation tool).&lt;/li&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;b&gt;Analyze the data&lt;/b&gt; (this is the interactive and core part of the game of 20 questions where the analyst works in real time “slicing-and-dicing” and 
“drilling” the datasets that were just prepared, to thoroughly answer the original question by also asking follow-up questions.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Put another way, this is where most of the questions are being asked and answered). This step often depends on how well the data was prepared, what tools are being used for interactive analyses, and most importantly what contextual knowledge the analyst has so they are asking the most relevant and impactful questions.&lt;/li&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;b&gt;Present the answer/insight&lt;/b&gt; back to the original audience (this can entail anything from building a Data Visualization [e.g. a heat map overlaid on a geographic map] to an Infographic to a Data Story presentation; or, if you have the ability or resources, a Conceptual Animation video can be impactful at scale).&lt;/li&gt;&lt;/ol&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;When you reach step 5 and present the result back to the audience, they may be satisfied with the answer, or they may ask a new “Objective Question” based on the answer from the previous question.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Aristotle believed that conclusions from syllogisms that are unexpected are the most interesting of all. Case-in-point: When the LHC detected the existence of the Higgs Boson particle, many saw this as confirmation of what was already known rather than any new insight, and so interest has died down. 
However, when telescopes were able to show that the actual size of the universe is not what Einstein’s equations had predicted, this has led to research surrounding Dark Matter and Dark Energy that continues to propel scientific inquiry to this very day.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Thus the process can and will keep going for as long as there is curiosity and interest and resources.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;——&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;As with any lifecycle process bottlenecks may arise at any step of the process. 
For Parts 1 and 2 of the blog post I want to focus specifically on step #4 of the Analytic Lifecycle, “Analyze the data” with respect to two tools: Power BI and Qlik.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In part one (this post) I will summarize the differences between Power BI and Qlik’s indexing engine and go over the results of two performance tests I performed to illustrate the strengths and weaknesses of each engine.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In part two I will describe - using a simplified example - how exactly the Tabular Model processes queries versus Qlik’s approach.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In part three I will explain what this reveals about the nature of Analytic Efficiency and why MS Excel (and other spreadsheets) dominate Analytics even when we are repeatedly told that this is not how we ought to be doing analytics, and what those scolding messages about Excel are missing about the deeper nature of Analytic Efficiency.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;If you are curious about technology or business but not interested in a deep dive into the underlying mechanics, then I suggest you skip to the third post. 
&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;If you are someone whose mantra is “show me the results”, then this post provides that hard benchmarking data.&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;And, if you are curious about the “HOW” of Qlik and Power BI, then you should read all three posts.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;That’s the end of my preamble, now on with my first post…&lt;/p&gt;&lt;h2 style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: left;"&gt;&lt;span style="font-family: inherit; font-size: small; font-weight: normal;"&gt;Introduction&lt;/span&gt;&lt;/h2&gt;&lt;div&gt;&lt;span style="font-family: inherit; font-size: small; font-weight: normal;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;For many years I have speculated on the internal architecture of Qlik’s indexing engine. 
If you are a regular user of QlikView or Qlik Sense you may have a feeling for what I am talking about here.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;It’s a combination of performance and User Experience that is rather unique and if you know how to use the tool this can work wonders for analysis.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;But until recently I have struggled to articulate this difference and benefit that Qlik has over other BI tools.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;My thinking until recently was that Qlik’s great innovation was what I refer to as a “linking model” (as opposed to “cube model”) and that this was the sole distinction between Qlik and other BI tools. &lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;As such, I was excited when Microsoft released PowerPivot in 2010 since I could see that Microsoft’s new approach to BI would be very similar to Qlik’s: They both shared a linking model that would allow developers to break through the “fan trap” problem that plagued so many cube-based models and led to the clunky “conformed dimensions” solutions I would occasionally see. 
&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;To explain this problem quickly: Traditional cube-based architectures - like those pioneered by Essbase and Cognos and written about extensively by Ralph Kimball in his seminal “Data Warehouse Toolkit” books - have a problem when multiple fact tables (e.g. “Calls” and “Work Orders”) are required. The reason is that when you merge multiple fact tables (e.g. “Calls” and “Work Orders”) into a single table, records from the table with the lower cardinality (in our example this is “Calls”) will be duplicated, since there must be one record for each “Work Order” record even though multiple Work Orders may belong to a single Call.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;There are a few common solutions to this problem but the prescribed industry solution is to effectively synchronize filtering on two cubes such that they appear to be acting as the same cube. But there is a problem with this solution: Conformed Dimensions don’t work with high cardinality “Degenerate Dimensions” like a “Call ID”, and so we must employ highly skilled data modelers to choose other dimensions to link on, such as the Call Date. 
But this can lead to another trade-off known as a “chasm trap”, where it is not possible for the analyst to answer simple questions like: “Show me all the Work Orders for Calls that lasted less than 20 seconds?”&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Neither Qlik nor Power Pivot (which would later evolve into Power BI) has this cube limitation, because they CAN link on the degenerate dimension – on the Call ID. And because the source data model makes this link obvious, little skill is required to figure it out: anyone with just a half-day of training can now build linked models that are far superior to the cube-based models of yesteryear.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This is a huge step forward for Analytics and Business Intelligence.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;And for a while I was convinced that Microsoft had succeeded at copying Qlik’s technology (from the outside) and had made this linking concept mainstream.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;It seemed to me that Microsoft would soon come to dominate the BI industry, and my suspicions have very much materialized into reality.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;All of that said, I also felt that there was still something different about Qlik’s indexing engine that I couldn’t quite put my finger on.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; 
font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;As time went on and Power BI’s march to popularity continued, I began shifting from using and teaching Qlik to using and teaching Power BI. I have been doing this for the past 4 or 5 years now, with almost exclusive use of Power BI over the past 2 years.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;But recently I have dusted off an older version of QlikView 11, and have begun using this tool as a kind of companion to Excel for performing analyses.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;While it would be nice to work in an organization that is all-in on Qlik (as opposed to one that is all-in on Power BI, which is becoming more and more common), I have come to appreciate and embrace a side of Qlik that goes beyond the linking model, which I never fully understood or appreciated until this year.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;So what is this “secret sauce” that has me going back to Qlik even when Power BI is on a tear adding new features at least once a month, and is integrated into everything Microsoft can think of including their new “Synapse” platform (which I will admit is a very compelling system)?&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;h2 style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: left;"&gt;&lt;span style="font-family: inherit; 
font-size: small; font-weight: normal;"&gt;Qlik’s Secret Sauce&lt;/span&gt;&lt;/h2&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;The difference between Power BI’s Tabular Model (a.k.a. Vertipaq engine) and Qlik’s Associative Model is that Power BI is &lt;span class="s1" style="text-decoration-line: underline;"&gt;table-centric&lt;/span&gt; whereas Qlik is &lt;span class="s1" style="text-decoration-line: underline;"&gt;value-centric&lt;/span&gt;. &lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The most salient difference from a user perspective is how Slicers/Filters perform and behave: In Power BI, Slicers can slow down Page load times and only show all values (irrespective of other selections) or all possible values (based on other user selections). 
In QlikView and Qlik Sense, the equivalent Filter boxes have negligible impact on Page load time and always show both the Possible and Excluded values (based on other selections), allowing the user to always see all values of any given field while knowing what is in scope and what is out of scope based on other field selections.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The reason for these differences is that when you make a Slicer selection in Power BI, Power BI must scan through the rows and columns of the underlying tables to find the values to aggregate, or to determine what values are possible in other columns.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This design is relatively intuitive and is how most people would design a linked model BI platform using a Columnar Datastore.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Qlik, on the other hand, takes a completely different approach: Qlik uses selected values to determine which other values are possible through value-to-value mapping indexes known as a “BAI (Bi-directional Associative Index)”. If that sounds confusing, don’t worry, as I will explain in the next blog post exactly how this works (if you care to read). The benefit of this approach is quite straightforward: For any given column Qlik can – without having to perform additional calculations – tell you all the distinct values that are possible and those that are excluded. 
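&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;To make the idea of “possible” versus “excluded” values a little more concrete, here is a minimal Python sketch. It is only a toy illustration of a value-centric lookup and is not Qlik’s actual BAI implementation. The sample rows, the “Work Order Type” values, and the names ROWS, build_index and classify are all hypothetical and chosen purely for illustration.&lt;/p&gt;&lt;pre&gt;

```python
from collections import defaultdict

# Hypothetical toy data (not from the experiment): one dict per row.
ROWS = [
    {"Call Centre": "North", "Work Order Type": "Install"},
    {"Call Centre": "North", "Work Order Type": "Repair"},
    {"Call Centre": "South", "Work Order Type": "Repair"},
    {"Call Centre": "East", "Work Order Type": "Upgrade"},
]


def build_index(rows):
    """Map every field to {value: set of row numbers holding that value}."""
    index = defaultdict(lambda: defaultdict(set))
    for row_id, row in enumerate(rows):
        for field, value in row.items():
            index[field][value].add(row_id)
    return index


def classify(index, selections, field):
    """Split the distinct values of `field` into (possible, excluded)."""
    # Start with every row, then narrow by each selected value.
    in_scope = set().union(*index[field].values())
    for sel_field, sel_value in selections.items():
        in_scope = in_scope.intersection(index[sel_field].get(sel_value, set()))
    # A value is possible if at least one of its rows is still in scope.
    possible = {
        value
        for value, row_ids in index[field].items()
        if not row_ids.isdisjoint(in_scope)
    }
    excluded = set(index[field]) - possible
    return possible, excluded


index = build_index(ROWS)
possible, excluded = classify(index, {"Call Centre": "North"}, "Work Order Type")
print(sorted(possible))  # ['Install', 'Repair']
print(sorted(excluded))  # ['Upgrade']
```

&lt;/pre&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In this toy version every distinct value keeps its own set of row numbers, so a selection is answered by intersecting a few sets rather than by re-scanning the table. A real engine will differ in many ways, and this sketch makes no claim about how Qlik stores or compresses its indexes.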
&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;This is why QlikView’s List Box and Qlik Sense’s Slicer object can show you both possible and excluded values, and can render these lists so quickly.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;It is also why in Qlik you can perform full text searches against all fields with relatively little effort.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Power BI in contrast has a rather sluggish User Experience when it comes to Slicer performance, and does not show excluded values.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Furthermore, only recently has a full text search option been added (which has been a big help), but this applies only to a specific Slicer and only searches the Slicer’s list of possible values (not the excluded values), and as such it falls short of a useful full text search engine.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;At this point if you’re still reading you might be asking the question “Does this really matter?” and “How does different technology impact typical usage patterns, like filtering on KPIs and Charts?”&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;After all, BI tools are mostly sold on the promise of data visualization and interactive drill-down charts.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; 
font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;To answer this question, I devised an experiment and tested it with two dataset variants that had characteristics that I thought would exercise both tools and illuminate the question of overall performance.&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;h2 style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: left;"&gt;&lt;span style="font-family: inherit; font-size: small; font-weight: normal;"&gt;Experiment Overview&lt;/span&gt;&lt;/h2&gt;&lt;div&gt;&lt;span style="font-family: inherit; font-size: small; font-weight: normal;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The experiment pertains to a fictional Telecom company with a Call Centre (let’s call it “MyTelco”), that earns revenue through service subscriptions (e.g. 
television,&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Internet, etc.).&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;MyTelco has a call centre which logs call centre activity to two tables:&lt;/p&gt;&lt;ol class="ol1"&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;MyCalls&lt;/li&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;MyWorkOrders&lt;/li&gt;&lt;/ol&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;MyCalls lists all the Calls into the call centre and MyWorkOrders lists all the Work Orders generated by the call centre (through a customer call).&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The MyCalls table can be filtered on by the “Call Centre” column/dimension and there are five (5) call centres (North, South, East, West, and Central).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The calls are randomly distributed by Call Centre throughout the Call table.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The Work Order table contains Work Orders that were generated (and linked) by a Call.&lt;/p&gt;&lt;p 
class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The Work Order table therefore links to the Call table using the “Call ID” from MyCalls.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Linking (as opposed to merging) the Call and Work Order tables is necessary to avoid a “fan trap”, whereby a single Call has multiple Work Orders, thereby potentially duplicating the Call record and skewing the Call Duration.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;For example, if we had a Call which lasted 4 minutes and generated 2 Work Orders and we merged the MyCalls record with the MyWorkOrders records, the call duration would be counted twice and would go from 4 minutes to 8 minutes. This is why we must link instead of merge the two tables, and it is also why both Power BI and Qlik are well suited to this challenge.&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Each Call may have 0, 1, 2, or 3 Work Orders. Therefore some Calls will not link to any Work Order.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The KPIs (i.e. Measures) are all based on the Work Order Amount (measured in dollars) and all calculate in linear time. 
These Measures are:&lt;/p&gt;&lt;ol class="ol1"&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;COUNT([Work Order Amount])&lt;/li&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;SUM([Work Order Amount])&lt;/li&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;AVG([Work Order Amount])&lt;/li&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;MAX([Work Order Amount])&lt;/li&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;MIN([Work Order Amount])&lt;/li&gt;&lt;/ol&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;This shows the KPIs as implemented in both Power BI and Qlik, respectively:&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span 
class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/power_bi_central2.png" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" data-original-height="521" data-original-width="800" height="260" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/power_bi_central2.png" width="400" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Power BI&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;span class="Apple-converted-space"&gt;&lt;br /&gt;&lt;/span&gt;&lt;p&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&lt;/span&gt;&lt;/p&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/qlik_central.png" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" data-original-height="255" 
data-original-width="782" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/qlik_central.png" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Qlik&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;span class="Apple-converted-space"&gt;&lt;br /&gt;&amp;nbsp;&lt;/span&gt;&lt;p&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;It is worth noting that all of these measures have a linear time complexity, as opposed to something like a COUNT DISTINCT or MEDIAN, which, when computed by sorting, has a greater than linear time complexity, specifically O(n log n).&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;I did originally include MEDIAN but found that Power BI threw out-of-memory errors for this. After some reflection I realized that it was not necessary to include this for this experiment since the purpose of the experiment is to test the Relationship/Link indexing performance as opposed to the aggregation performance. 
To that end, linear measures more clearly reveal these differences.&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;I only included the MEDIAN to begin with because it is good and common practice to test performance using non-linear Measures. 
But I’m glad I gave this MEDIAN a second thought and removed it because it would have only added noise.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;h2 style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: left;"&gt;&lt;span style="font-family: inherit; font-size: small; font-weight: normal;"&gt;Experiment Details and Results&lt;/span&gt;&lt;/h2&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;To recap, I created two datasets each with two tables.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The datasets are based on hypothetical Call Centre data composed from two tables:&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Dataset 1 (high cardinality linking):&lt;/p&gt;&lt;ol class="ol1"&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;“MyCalls” (100 million rows); and&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; 
font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;“MyWorkOrders” (140 million rows). &lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="text-decoration-line: underline;"&gt;100 &lt;i&gt;million&lt;/i&gt; distinct Call IDs&lt;/span&gt; link these tables together.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The tables have a one-to-many relationship going from “MyCalls” -&amp;gt; “MyWorkOrders”.&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;This is how the linked model appears in Power BI (for the high cardinality example):&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;/p&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a 
href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/power_bi_data_model_v1.png" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" data-original-height="461" data-original-width="627" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/power_bi_data_model_v1.png" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Power BI&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;p&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Here is how the linked model appears in Qlik (for both high and low cardinality):&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/qlik_data_model_v1_and_v2.png" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" data-original-height="259" data-original-width="683" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/qlik_data_model_v1_and_v2.png" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Qlik&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; 
margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Dataset 2 (low cardinality linking):&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;ol class="ol1"&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;“MyCalls” (100 million rows); and&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;“MyWorkOrders” (140 million rows). &lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Dataset 2 is nearly identical to Dataset 1, but has only &lt;span class="s1" style="text-decoration-line: underline;"&gt;100 distinct Call IDs &lt;/span&gt;that link these tables together.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Also, when I changed the Call ID values, this led to a many-to-many relationship between “MyCalls” and “MyWorkOrders”. 
That said, this change to many-to-many does not appear to have any negative impact on performance.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Here is how the low cardinality Dataset v2 model appears in Power BI:&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/power_bi_data_model_v2.png" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" data-original-height="475" data-original-width="588" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/power_bi_data_model_v2.png" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Power BI&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;p&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Most of the columns including “Work Order Type” and “Call Duration” are not used for anything and can be ignored.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The 
relevant columns are:&lt;/p&gt;&lt;ol class="ol1"&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;“Call Centre Name”: the field we are filtering on in the parent table&lt;/li&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;“Link Call ID”: the field linking both tables&lt;/li&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;“Work Order Amount”: the field we are aggregating in the child table&lt;/li&gt;&lt;/ol&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;What I was expecting before putting this test together was that Power BI would have outperformed Qlik in both tests. I wasn’t even planning on performing the second low-cardinality test because I inferred based on my understanding of how the indexes worked that Power BI would have outperformed Qlik.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;But to my surprise I got the opposite result: Qlik’s index performed nearly four times (4x) faster than Power BI for linear aggregations across linked tables. 
Furthermore, Qlik’s disk and memory footprint were also approximately 3x smaller and, as mentioned earlier, there were non-linear aggregations I could continue to do in Qlik – like calculate the Median Work Order Amount – which in Power BI triggered an out-of-memory error.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In response to this finding, I produced another test dataset of the same overall volume but using a minuscule fraction of the Call IDs to link the two tables together.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Specifically, I went from 100 million link keys (based on Call ID) down to 100 link keys – one millionth the original number of keys.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Through this change I was able to get better performance out of Power BI than from Qlik.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;But I should point out that the performance differences, while objectively different, were harder to perceive since both tools produced sub-second response times. Contrast this second test with the first test (which used high cardinality linking keys), where I would typically wait around 1.9 seconds for Qlik and 6 or 7 seconds for Power BI. 
1.9 seconds versus 6 or 7 seconds is noticeable, and with each passing second you can feel your mind beginning to wander.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;b&gt;To conclude&lt;/b&gt;: &lt;i&gt;For high cardinality linking keys Qlik performed 4x (rounded) better than Power BI, and for low cardinality linking keys Power BI performed 15x (rounded) better and both performed in sub-second time.&lt;/i&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;i&gt;&lt;br /&gt;&lt;/i&gt;&lt;/p&gt;&lt;h2 style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: left;"&gt;&lt;span style="font-family: inherit; font-size: small; font-weight: normal;"&gt;Experiment Biases, Supporting Calculations, and Artefacts&lt;/span&gt;&lt;/h2&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;At this point I should provide some more information about my background and biases: I have worked with Qlik since 2009 and Power BI (in the form of Power Pivot) since its release in 2010, but not in earnest until around 2016.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;From 2009 through to 2017 I was a heavy user of Qlik, but around 2017 client demands began to shift to Power BI and I followed suit 
and have been working almost exclusively with Power BI since 2018 (although I have begun to move back to using Qlik more these days). If I am to be honest with myself, my bias is more towards Qlik, mainly because I believe I have a faster Analytics Velocity with Qlik.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;That said, I see both tools (which both support linking models) as more similar to each other than to the majority of other cube-oriented BI tools, and I am happy to use either Qlik or Power BI for this reason.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;With all of that said, I wanted to ensure that the testing and measurement were as fair to Power BI as possible, and that if there were errors in measurement, they would benefit Power BI. To that end I should explain how I measured performance in both Qlik and Power BI.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In Power BI Desktop (I used November 2020) there is a built-in tool you can use called Performance Analyzer.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This tool provides precise measurements of query time, render time, and other times (e.g.
waiting).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;I used this tool for all Power BI measurements.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;I also excluded non-query measurements from my final measurements keeping only the “Execute DAX Query.” Finally, I prepared 5 aggregation KPIs and took the fastest of the 5 for each measurement (overall Page Load time should in fact be based on the maximum load time).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This last choice I made favours Power BI the most.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In QlikView (I used version 11) there is no built-in Performance Analyzer tool so instead I took manual measurements using my phone as a stopwatch.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;In my approach I would simultaneously click into the QlikView List Box while tapping my phone to start the stopwatch while keeping my eye on the Statistics Box (which contains the KPIs).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Then when I could see Qlik had finished loading new results I would hit stop on my stopwatch. There is a slight lag – up to 200 ms – that occurs when taking measurements in this way, so all my Qlik measurements are probably a bit longer than they should be.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;And because I only looked for the overall KPI/Statistics box to refresh I was not able to separate out the individual KPI load times nor was I able to separate out render time from query time. 
This approach slightly handicaps Qlik.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;To recap, this is how I favoured Power BI over Qlik while taking measurements:&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;table cellpadding="0" cellspacing="0" class="t1" style="border-collapse: collapse;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class="td1" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 225px;" valign="top"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;b&gt;Power BI&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td1" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 225px;" valign="top"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;QlikView&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td2" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 36px; padding: 4px; width: 225px;" valign="top"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Precise measurements using Performance Analyzer&lt;/p&gt;&lt;/td&gt;&lt;td class="td2" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 
36px; padding: 4px; width: 225px;" valign="top"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Imprecise measurements using separate physical stopwatch. Lag time from seeing KPI render to hitting stop.&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td1" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 225px;" valign="top"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Only measured DAX Query time&lt;/p&gt;&lt;/td&gt;&lt;td class="td1" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 225px;" valign="top"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Measured query time and render time&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td1" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 225px;" valign="top"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Out of 5 KPIs, used the fastest to calculate&lt;/p&gt;&lt;/td&gt;&lt;td class="td1" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 225px;" valign="top"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Out of 5 KPIs, used the slowest to
calculate&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;All that notwithstanding, and in the interest of science and transparency, I will make these further disclosures: I loaded the data using Power BI September 2020 and performed tests using the September 2020, October 2020, and November 2020 versions of Power BI, and noticed that November 2020 was a bit slower than September 2020 and October 2020, but not significantly.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Thus Power BI may have added additional features that take up CPU.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The latest versions of Power BI are much more feature rich than QlikView 11 IR and those additional features may also contribute to performance bottlenecks. &lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Finally, Power BI is using a modern HTML5 render engine whereas QlikView is based on a native Windows API which generally renders quickly.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;But with all of that said, the data still points to a clear performance benefit of Qlik’s engine over Power BI’s.
And while I cannot promise the same result in the latest Qlik Sense version, I expect it would be similar.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Let’s get to the result details.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The table below summarizes the measurements and what our conclusions are.&lt;/p&gt;&lt;table cellpadding="0" cellspacing="0" class="t1" style="border-collapse: collapse;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class="td3" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Tool&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 44px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Selection&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;V1 (High Cardinality) Duration (ms)&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0,
0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;V2 (Low Cardinality) Duration (ms)&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td7" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 23px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td8" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 23px; padding: 4px; width: 44px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Central CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td9" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 23px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1480&lt;/p&gt;&lt;/td&gt;&lt;td class="td10" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 23px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1200&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td11" style="border-color: rgb(0, 0, 0); 
border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td12" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 44px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;East CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td13" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1530&lt;/p&gt;&lt;/td&gt;&lt;td class="td14" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1010&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td11" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td12" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 44px;" 
valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;North CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td13" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1450&lt;/p&gt;&lt;/td&gt;&lt;td class="td14" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1100&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td11" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td12" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 44px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;South CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td13" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; 
font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1660&lt;/p&gt;&lt;/td&gt;&lt;td class="td14" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;990&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td11" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td12" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 44px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;West CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td13" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1550&lt;/p&gt;&lt;/td&gt;&lt;td class="td14" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; 
font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1170&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td15" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Total (median)&lt;/p&gt;&lt;/td&gt;&lt;td class="td16" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 44px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;/td&gt;&lt;td class="td17" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s1" style="text-decoration-line: underline;"&gt;1530&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td18" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s1" style="text-decoration-line: underline;"&gt;1100&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td7" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 23px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p
class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Power BI&lt;/p&gt;&lt;/td&gt;&lt;td class="td8" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 23px; padding: 4px; width: 44px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Central CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td9" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 23px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;7759&lt;/p&gt;&lt;/td&gt;&lt;td class="td10" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 23px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;71&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td11" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Power BI&lt;/p&gt;&lt;/td&gt;&lt;td class="td12" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 44px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; 
font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;East CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td13" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;6209&lt;/p&gt;&lt;/td&gt;&lt;td class="td14" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;59&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td11" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Power BI&lt;/p&gt;&lt;/td&gt;&lt;td class="td12" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 44px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;North CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td13" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: 
normal; margin: 0px; text-align: right;"&gt;7734&lt;/p&gt;&lt;/td&gt;&lt;td class="td14" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;76&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td11" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Power BI&lt;/p&gt;&lt;/td&gt;&lt;td class="td12" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 44px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;South CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td13" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;4752&lt;/p&gt;&lt;/td&gt;&lt;td class="td14" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: 
right;"&gt;38&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td3" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Power BI&lt;/p&gt;&lt;/td&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 44px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;West CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;6231&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;89&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td3" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Total (median)&lt;/p&gt;&lt;/td&gt;&lt;td class="td4"
style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 44px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s1" style="text-decoration-line: underline;"&gt;6231&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s1" style="text-decoration-line: underline;"&gt;71&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td3" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Qlik Improvement factor&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 44px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; 
font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s1" style="text-decoration-line: underline;"&gt;&lt;b&gt;4&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td11" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 136px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Power BI Improvement factor&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td12" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 44px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;/td&gt;&lt;td class="td13" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 161px;" valign="bottom"&gt;&lt;p class="p6" style="font-family: Helvetica; font-size: 11px; font-stretch: 
normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;/td&gt;&lt;td class="td14" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 159px;" valign="bottom"&gt;&lt;p class="p7" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s1" style="text-decoration-line: underline;"&gt;&lt;b&gt;15&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;What follows below are more details including screenshots, Performance Analyzer data and a Performance Analyzer Report I built to analyze the data.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;For Power BI I have attached the Performance Analyzer Output.&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/performance/PowerBIPerformanceData_v1.json"&gt;PowerBIPerformanceData_v1.json&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: 
normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/performance/PowerBIPerformanceData_v2.json"&gt;PowerBIPerformanceData_v2.json&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;I have also attached corresponding Power BI Reports so you can explore this Performance Analyzer output.&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/performance/V1 Performance Analysis.pbix"&gt;V1 Performance Analysis.pbix&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/performance/V2 Performance Analysis.pbix"&gt;V2 Performance Analysis.pbix&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span 
class="Apple-converted-space"&gt;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/v1_power_bi_performance_analyzer.png" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" data-original-height="445" data-original-width="800" height="223" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/v1_power_bi_performance_analyzer.png" width="400" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;V1 Performance Analysis for Power BI&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/v2_power_bi_performance_analyzer.png" 
style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" data-original-height="451" data-original-width="800" height="225" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/v2_power_bi_performance_analyzer.png" width="400" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;V2 Performance Analysis for Power BI&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;span class="Apple-converted-space"&gt;&lt;br /&gt;&lt;/span&gt;&lt;p&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;These are the software versions used for the experiment:&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;ol class="ol1"&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Power BI: I used version 2.87.1061.0 64-bit (November 2020).&lt;/li&gt;&lt;li class="li3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Qlik: I used QlikView version 11 IR (released in 2012).&lt;/li&gt;&lt;/ol&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: 
normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Here are screenshots showing my laptop specs:&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/laptop_spec_1.png" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="178" data-original-width="800" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/laptop_spec_1.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;p&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&lt;/span&gt;&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/laptop_spec_2.png" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="158" data-original-width="209" 
src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/laptop_spec_2.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&amp;nbsp;&lt;p&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Finally, I have included the actual Reports themselves and the underlying data.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;I should note that I am not including the High Cardinality versions of the reports (which are identical in structure to the low cardinality reports). Instead, I am including the low cardinality reports and a “tiny” sample of both the high and low cardinality data files so you can see what they look like. 
&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In lieu of the huge reports, I have included these screenshots so you can see how big the documents are and how much they differ in size for the same underlying data.&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="Apple-converted-space"&gt;&lt;/span&gt;&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/file_explorer.png" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="339" data-original-width="689" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/file_explorer.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&amp;nbsp;&lt;p&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="Apple-converted-space"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/file_explorer2.png" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="141" data-original-width="364" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/file_explorer2.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: 
normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Here are the reports for Power BI and Qlik, respectively:&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/example/Analyze Fictional Call Centre Data v2.pbix"&gt;Analyze Fictional Call Centre Data v2.pbix&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/example/Analyze Fictional Call Centre Data_v2.qvw"&gt;Analyze Fictional Call Centre Data_v2.qvw&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; 
line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;Here are the data files:&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/data/MyCalls.csv"&gt;MyCalls.csv&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/data/MyCalls_v2.csv"&gt;MyCalls_v2.csv&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/data/MyWorkOrders.csv"&gt;MyWorkOrders.csv&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/data/MyWorkOrders_v2.csv"&gt;MyWorkOrders_v2.csv&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&lt;br 
/&gt;&lt;/span&gt;&lt;/p&gt;&lt;h2 style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;span style="font-family: inherit; font-size: small; font-weight: normal;"&gt;Experiment Update 2021-01-12&lt;/span&gt;&lt;/h2&gt;&lt;div&gt;&lt;span style="font-family: Helvetica; font-size: 11px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Helvetica; font-size: 11px;"&gt;As I mentioned at the top of this post, an astute reader has pointed out that the experiment is not entirely fair because I am invoking five queries in Power BI (one for each KPI Card visualization), while in Qlik there is only one visualization and hence only one query.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Helvetica; font-size: 11px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Helvetica; font-size: 11px;"&gt;As a result of this feedback, I have re-run the test using the V1 dataset (which uses high cardinality linking keys), but instead of testing with five Measures I am now testing with a single SUM measure.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Helvetica;"&gt;&lt;span style="font-size: 11px;"&gt;You can see the updated dashboards for Power BI and Qlik below, respectively.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Helvetica;"&gt;&lt;span style="font-size: 11px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Helvetica;"&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/v1_power_bi_all_sum.png" imageanchor="1" style="margin-left: auto; margin-right: 
auto;"&gt;&lt;img border="0" data-original-height="224" data-original-width="800" height="113" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/v1_power_bi_all_sum.png" width="400" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Power BI&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: 11px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/v1_qlik_all_sum.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" data-original-height="206" data-original-width="769" height="107" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/v1_qlik_all_sum.png" width="400" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Qlik&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;span class="Apple-converted-space"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="p1" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The table below summarizes the measurements and shows how significantly our totals have changed.&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 
8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&lt;/span&gt;&lt;/p&gt;&lt;table cellpadding="0" cellspacing="0" class="t1" style="border-collapse: collapse;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class="td1" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Tool&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td2" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Selection&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td3" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;V1 SUM only (High Cardinality) Duration (ms)&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: 
Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Central CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1600&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;East CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1160&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; 
font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;North CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1310&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;South CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: 
right;"&gt;1610&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;West CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1520&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td7" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Total&lt;/p&gt;&lt;/td&gt;&lt;td class="td8" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;/td&gt;&lt;td class="td9" style="border-color: rgb(0, 0, 
0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s1" style="text-decoration-line: underline;"&gt;1520&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Power BI&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Central CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1694&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Power BI&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" 
style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;East CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1546&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Power BI&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;North CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1553&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; 
border-width: 1px; height: 10px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Power BI&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;South CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1625&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td1" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Power BI&lt;/p&gt;&lt;/td&gt;&lt;td class="td2" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;West CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td3" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" 
style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1667&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td1" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Total&lt;/p&gt;&lt;/td&gt;&lt;td class="td2" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;/td&gt;&lt;td class="td3" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s1" style="text-decoration-line: underline;"&gt;1625&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td1" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Qlik Improvement factor&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td2" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; 
width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;/td&gt;&lt;td class="td3" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s1" style="text-decoration-line: underline;"&gt;&lt;b&gt;1&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td1" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Power BI Improvement factor&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td2" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;/td&gt;&lt;td class="td3" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s1" style="text-decoration-line: 
underline;"&gt;&lt;b&gt;1&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;Here is the Power BI performance data visualized in my Performance Analyzer dashboard:&lt;/p&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/v1_SUM_power_bi_performance_analyzer.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" data-original-height="445" data-original-width="800" height="223" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/v1_SUM_power_bi_performance_analyzer.png" width="400" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;V1 SUM only Power BI Performance Output&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;For transparency I have 
attached both the Performance Analyzer JSON data and the PBIX I used to analyze the data.&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/performance/PowerBIPerformanceData_v1_SUM.json"&gt;PowerBIPerformanceData_v1_SUM.json&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/performance/V1 SUM Performance Analysis.pbix"&gt;V1 SUM Performance Analysis.pbix&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;As you can see from the table above, the difference between Power BI and Qlik is imperceptible (to a human), with Qlik showing very slightly better performance. But the delta is far enough within the margin of error that I would call it a tie. I'm sure it would be possible to re-run the experiment and get a better result for Power BI under the right conditions. 
Again, I see the updated result as effectively a tie.&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;This materially changes the conclusion from: "Qlik's engine can demonstrate performance 3x better than Power BI for large high cardinality linking relationships" to "Power BI and Qlik's engine show equivalent performance for high cardinality linking relationships."&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;That notwithstanding, my original expectation was that Power BI would outperform Qlik in this test.&amp;nbsp; The reason I expected this is that Qlik's engine is optimized for its Slicer/Filter experience, which I discussed above in the section "Qlik's Secret Sauce".&amp;nbsp; In the original context this tie is still somewhat of an unexpected finding.&amp;nbsp; But to be clear, let me restate the conclusion:&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p1" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;b&gt;2021-01-12, to conclude (superseding previous conclusion above)&lt;/b&gt;: &lt;i&gt;For high cardinality linking keys Qlik performed approximately the same as Power BI, and for low cardinality linking keys Power BI performed 15x (rounded) better and both performed in sub-second time.&lt;/i&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; 
font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;Harkening back to the preamble of this post, Analytics is a feedback loop mostly driven by curiosity and interest.&amp;nbsp; In that spirit I was curious as to how Qlik would perform if I separated out the single visualization object into five (5) separate objects, similar to the five separate Power BI KPI Cards.&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;You can see below how the modified QlikView dashboard looks after this change.&lt;/p&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/qlik_all_separated.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" data-original-height="320" data-original-width="613" height="209" src="https://cradleofanalytics.blob.core.windows.net/blog/pbi_v_qlik_p1/images/qlik_all_separated.png" width="400" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;QlikView with separate Visualizations for each KPI/Measure&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; 
min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;After separating out the Visual objects into five separate objects I re-performed the test selections and measured the latency.&lt;/p&gt;&lt;p class="p1" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;The table below summarizes the measurements in this new test with five separate visualizations, one for each Measure/KPI.&lt;/p&gt;&lt;table cellpadding="0" cellspacing="0" class="t1" style="border-collapse: collapse;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class="td1" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Tool&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td2" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;Selection&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;td class="td3" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 11px; padding: 4px; width: 207px;" 
valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&lt;b&gt;V1 SUM only (High Cardinality) Duration (ms)&lt;/b&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Central CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1890&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p 
class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;East CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1710&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;North CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1640&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; 
font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;South CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;1480&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td4" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Qlik&lt;/p&gt;&lt;/td&gt;&lt;td class="td5" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;West CC&lt;/p&gt;&lt;/td&gt;&lt;td class="td6" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 10px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; 
margin: 0px; text-align: right;"&gt;1790&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="td7" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 175px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;Total&lt;/p&gt;&lt;/td&gt;&lt;td class="td8" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 58px;" valign="bottom"&gt;&lt;p class="p2" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;/td&gt;&lt;td class="td9" style="border-color: rgb(0, 0, 0); border-style: solid; border-width: 1px; height: 12px; padding: 4px; width: 207px;" valign="bottom"&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: right;"&gt;&lt;span class="s1" style="text-decoration-line: underline;"&gt;1710&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;/span&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;As you can see from these results they are very similar to what we found when we Measured for the 
original combined visualization as well as for the revised "SUM only" version of the experiment.&amp;nbsp; In other words, it would appear as though Qlik is doing a significantly better job at coordinating the separate queries than Power BI, and we are not paying much of a price for the additional Measures.&amp;nbsp; This is not to say that it's not possible to combine the five Measures in Power BI and get a similar result.&amp;nbsp; But it is a common design pattern to use multiple KPI Cards on a given Page, so I feel the Qlik scenario is more typical.&lt;/span&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;It was not my intent to test this aspect of the query engine, but I thought it worthwhile to share this insight all the same, as it reveals something about both tools.&lt;/span&gt;&lt;/p&gt;&lt;p class="p4" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px; min-height: 13px;"&gt;&lt;span class="Apple-converted-space"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/p&gt;&lt;h2 style="font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; text-align: left;"&gt;&lt;span style="font-family: inherit; font-size: small; font-weight: normal;"&gt;Next Blog Post&lt;/span&gt;&lt;/h2&gt;&lt;div&gt;&lt;span style="font-family: inherit; font-size: small; font-weight: normal;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;p class="p3" style="font-family: Helvetica; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;In my next blog post I am going to explain – using an example – how each of the 
underlying indexing engines physically works and why they are so different.&lt;/p&gt;&lt;p&gt;&lt;span style="font-family: Helvetica; font-size: 11px;"&gt;Stay tuned.&lt;/span&gt;&lt;/p&gt;</description><link>http://hepburndata.blogspot.com/2021/01/analytic-efficiency-part-1-power-bi-vs.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total><georss:featurename>Toronto, ON, Canada</georss:featurename><georss:point>43.653226 -79.3831843</georss:point><georss:box>15.342992163821151 -114.5394343 71.963459836178842 -44.226934299999996</georss:box></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-3836159278136024519</guid><pubDate>Fri, 09 Oct 2020 00:11:00 +0000</pubDate><atom:updated>2020-10-08T20:11:47.372-04:00</atom:updated><title>Snowflake’s Market Valuation and Network Effects</title><description>&lt;p&gt;&lt;b&gt;TL/DR: &lt;/b&gt;&lt;i&gt;Snowflake’s $63 billion valuation shortly after its IPO is primarily based on user network effects rather than its database innovations. Tech journalists and other analysts don’t want to sound cynical by reminding people of “lock in” and would rather discuss innovative features. But technical innovation is a distraction here. Whether you are an investor or a customer, it behooves us to understand the subtle nature of network effects and lock in, or else we are doomed to lurch from one locked-in vendor silo to the next. But there is reason for hope.&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;A few weeks ago I was listening to a podcast interview with Bill Gates.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;At one point the conversation turned to Bill’s relationship with Warren Buffett. 
Gates recalled an early conversation he had with Buffett where he mentioned that the ‘Oracle of Omaha’ was struggling to understand how Microsoft could compete with IBM in the long run.&amp;nbsp; This was in 1991 and IBM at that time was still considered the juggernaut of IT and the conventional wisdom was still “nobody got fired for buying IBM”.&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;While Gates didn’t come out and say it explicitly during the interview, the reason of course has to do with network effects or more specifically what we now refer to as user or vendor “lock in”.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;The same reason that Microsoft was able to compete with and eventually displace IBM was the reason that Microsoft was nearly broken up in the late 1990s. Once a critical mass of users and applications is reached, it is nigh impossible to break that network without some kind of technology disruption occurring, in which case a new technology or vendor takes over and the cycle repeats itself.&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Network effects, lock-in, and barriers to entry for competitors are something many people are aware of, but tech journalists and other analysts don’t always remind us of this when I think they should.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;My introduction to the network effect concept came in the 1980s when I learned that the inferior VHS videotape standard had displaced the superior BetaMax format that Sony had developed.&amp;nbsp; The story goes something like this (I am recollecting from memory, apologies for any errors): Sony had developed two video cassette technologies: VHS and BetaMax.&amp;nbsp; VHS was deemed a prototype and not worth protecting as IP.&amp;nbsp; BetaMax was the quality product Sony was aiming for.&amp;nbsp; So VHS was out there as a viable standard but initially had no backing to speak of.&amp;nbsp; When the adult film industry realized they 
could sell movies directly to customers through home video, their efforts were stymied when Sony refused to allow its BetaMax to be used for distributing “lewd and unsavoury” content.&amp;nbsp; With no access to BetaMax the adult film industry simply took the path of least resistance and distributed through VHS.&amp;nbsp; Since there were no IP restrictions for VHS, this opened the standard to other companies like RCA, Philips, Panasonic and other home electronics manufacturers to manufacture and sell their own VCRs.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Within a few short years VHS became the de-facto standard for home video.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;As an epilogue I do recall BetaMax hanging on for a while by emphasizing the quality aspect of their invention.&amp;nbsp; However, even that was displaced by the LaserDisc format which could provide both quality and features that neither VHS nor BetaMax could provide.&amp;nbsp; By the mid nineties there were two home video “networks”: Low-end VHS tapes for the masses and high-end LaserDiscs - which offered superior picture and sound quality, director commentary, and elaborate liner notes - for film buffs.&amp;nbsp; The esteemed Criterion Collection was effectively born out of the network effects of the LaserDisc format.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;By the late nineties the DVD format began to pick up steam and because DVDs could offer LaserDisc features at VHS prices, both of the latter standards would soon succumb to the same fate as the BetaMax format.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;An episode of The Simpsons has a funny joke about this where we see a garbage dump with piles of VHS tapes, LaserDiscs, DVDs, and an empty area with a sign that says “Reserved for Blu-Ray”.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;I suspect (even hope) many readers just skimmed through the last two paragraphs.&amp;nbsp; As anyone who cares about or follows the 
technology industry knows, network effects are the name of the game in the modern world of technology.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;In a sense it has been like this for a while: in the “olden days” you could be a shrewd inventor like Morse or Edison and work the levers of the patent system to protect your invention and build up a network effect.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Alternatively, you could be like IBM or Apple and develop a strong reputation and brand that keeps customers loyal while scaring people away from competitors’ products.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;In ancient times (and even still today) innovation was limited to what could be taught and passed on.&amp;nbsp; Hero of Alexandria invented the steam engine nearly 2,000 years before it would be re-invented. He wrote down instructions on how to make one, but anyone who took the time to build one probably kept the knowledge to themselves.&amp;nbsp; We even romanticize such inventors as “wizards” when many of them were really just knowledge hoarders protecting their secrets.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;While I have my reservations about Edison, IBM, and even Apple, I think building businesses on patents and strong reputations is not necessarily a bad thing for the simple fact that this dynamic propels innovation.&amp;nbsp; The inventors in the pre-modern world would also have to compete.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;On the other hand, businesses that are able to exploit network effects to grow quickly are also worthy of praise.&amp;nbsp; These companies, which often come out of nowhere, can quickly shake up otherwise calcified industries. 
Everyone loves an underdog, especially when it beats the champ.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;But taken to extremes, all of these things (patents, brands, and network effects) can end up stifling competition or, worse, creating institutions that society depends on but which are effectively undemocratic. Companies like Facebook, Google, Microsoft, Amazon, Apple, and, for good measure, Western Union all leverage network effects to great success.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Do you think it’s possible to out-innovate Google with a better search engine? Maybe it’s possible, but you also have to realize that Google has the world’s largest database of user search history which is now the main driver of search results.&amp;nbsp; Good luck in getting that, but maybe it’s possible?&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Apple has hundreds of thousands of developers locked into its App ecosystem (which competes only with Google’s).&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Facebook I don’t need to explain; they no longer even bother boasting about the size of their network and basically act like the Saudi Arabia of social networking.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Amazon in some ways has the most genuine competition to deal with. 
But when you compare it to its next best competitor Walmart, Amazon is miles ahead in terms of its integration between supply chains, warehouses, and delivery chains, not to mention the fact that Amazon also has its own cloud infrastructure, “AWS”, which it can cheaply tap into for all of its digital needs.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;The trillions of dollars these companies are worth are based mostly, in my opinion, on these network effects more so than any particular technology or innovation you can point to.&amp;nbsp; We live in a world that is increasingly substituting glorious free market competition with the less splendid monopolistic competition.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Whenever I see talk of a new start-up, I feel like the honest question to ask is: How does this company combat existing network effects that are working against it, and how can it bring about network effects that lock its customers in going forward?&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;When I phrase it like this, it sounds a bit anti-capitalist or crude to even wish for lock-in.&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;It is so obvious that this is what is happening and yet I rarely see this level of frankness when new technologies are discussed.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;In other words, I believe the way we talk about technology and innovation is dangerously infused with magical thinking that ultimately plays into this dynamic.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;This now brings me to Snowflake, the cloud database company that recently went public.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;As of October 2nd 2020 on the NYSE, “SNOW” closed at $227 with a combined Market Cap of $63 billion.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;That is an extraordinary sum of money for a relatively new technology company that was founded in 2012 (8 years old).&lt;/p&gt;&lt;p&gt;&lt;br 
/&gt;&lt;/p&gt;&lt;p&gt;Compare this to other companies in the business data analytics space on October 2nd:&amp;nbsp;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul style="text-align: left;"&gt;&lt;li&gt;Domo (DOMO): $1.1 billion (founded in 2010)&lt;/li&gt;&lt;li&gt;TeraData (TDC): $2.4 billion (founded in 1979)&lt;/li&gt;&lt;li&gt;Looker was recently sold to Google for $2.6 billion (founded in 2012)&lt;/li&gt;&lt;li&gt;Cloudera (CLDR): $3.28 billion market cap (founded in 2008)&lt;/li&gt;&lt;li&gt;Databricks, still private but last valued at $6.2 billion (founded in 2013)&lt;/li&gt;&lt;li&gt;Alteryx (AYX): $8.1 billion market cap (founded in 2010)&lt;/li&gt;&lt;li&gt;Tableau was recently sold to Salesforce.com for $15.3 billion (founded in 2003)&lt;/li&gt;&lt;li&gt;Oracle (ORCL) closed with a market cap of $177.1 billion (founded in 1977)&lt;/li&gt;&lt;li&gt;SAP SE (SAP) closed with a market cap of $189.4 billion (founded in 1972)&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Based on this smattering of companies, it would seem like Snowflake is an outlier in the data &amp;amp; analytics space.&amp;nbsp; It’s a newer company, but is valued at least 4 times as much as its closest peer (Tableau).&amp;nbsp; On the other end of the spectrum Snowflake is roughly one third of the value of Oracle and SAP, both highly established businesses with deep customer bases.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;I can point out many companies that would struggle to keep operating without Oracle and SAP, so their hefty valuations make sense to me.&amp;nbsp; I can also look at a company like Tableau and appreciate its value, mainly based on the strong brand image it was able to cultivate and the loyalty this has generated.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;With Snowflake I am not seeing - at a ground level - the level of business process lock-in that Oracle and SAP enjoy.&amp;nbsp; Nor am I seeing the level of brand recognition that 
Tableau has among business users (although that has ironically been boosted by the IPO itself).&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Furthermore, the vast majority of IT departments (and businesses in general) tend to prefer homogeneous architectures.&amp;nbsp; This means that they would rather purchase from a single vendor with “one throat to choke” than deal with multiple vendors and the friction that comes with cross-vendor integrations.&amp;nbsp; So if SAP forms the centre of business functions and processes then you would buy SAP products even if they are not quite as good as the competitor’s product.&amp;nbsp; I’m not saying this is always the right thing to do, but it often is the most efficient approach, and more to the point it’s a more easily defensible position, a la “Nobody ever got fired for buying IBM”.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;So how does Snowflake overcome these huge barriers and go on to be valued at $63 billion? The answer of course is network effects.&lt;/p&gt;&lt;p&gt;But how exactly?&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Before I answer that question, I’ll just explain briefly how Snowflake works and how it is different from traditional databases.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Most databases combine both the data (storage) and the compute power (VM [virtual machine]) into a single service.&amp;nbsp; While modern data warehouses do a decent job of scaling large numbers of users and workloads, they do not scale elastically. What this means is you need to reserve ahead of time the amount of compute hardware for the database.&amp;nbsp; As a result you are often paying for unused compute power, and you might also hit an upper limit if too many users are querying the data warehouse at the same time. 
It’s not a common thing but it happens, and quite frankly it is rare to see problems that cannot easily be fixed by simply adding more hardware, which can be accomplished with a few clicks on cloud platforms.&amp;nbsp; You can even automate this scaling.&amp;nbsp; There are also benefits to this constraint in that costs are more predictable and easier to budget for.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Nevertheless, this user isolation concern has been known for decades and techies tend to gravitate towards these types of problems, even if they are mostly theoretical.&amp;nbsp; Originating from Google and beginning around 2010, the concept of a “Data Lake” began to emerge as the solution to this problem. The idea was (and still is) that you run programs on clusters of VMs (i.e. one or more cloud compute nodes) and read and write from a specialized file system known as HDFS (Hadoop Distributed File System, modeled on the Google File System) that is designed for both resilience (i.e. 
protection from data loss) and performance.&amp;nbsp; This means that data developers and data analysts alike can blend data from multiple sources (essentially just flat files like CSV files, but also&amp;nbsp; binary optimized flat files like ORC and Parquet formatted files), and then write back out their results as other flat files.&amp;nbsp; However, the problem with this approach is that there has never been a consensus for how developers manage the Metadata that surrounds these flat files.&amp;nbsp; It’s a bit like asking hundreds of authors to fill up a library with books but without any indexing system.&amp;nbsp; Eventually you will be left with a library teeming with books but with no ability to find what you’re looking for or even understand who wrote what and why.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Metadata management is a crucial service that databases provide (often in a sub-system known as the “information schema”) so users can quickly understand how tables are related and what assumptions we can make of their contents.&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;The way in which Data Lake centric platforms solve this problem is to create a managed Metadata service that fills this role.&amp;nbsp; If you have heard of “Presto” or “Hive” or “Impala”, these software systems piggyback on Data Lakes and provide a Metadata service layer which allows data developers to work with Data Lakes while maintaining the crucial Metadata in a centralized location available to all other users. For example, Databricks and Cloudera both allow their users to create “Hive” databases, whereby all the data is stored in a Data Lake and the computation costs are charged back to the user.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;I would argue that this approach is as good as, if not better than, the Snowflake DB approach. 
The reason is that when you are querying Snowflake data you need to use Snowflake’s own database compute engine (regardless of the fact that it is perfectly isolated).&amp;nbsp; To be fair to Snowflake, they have built an engine that performs well under the vast majority of situations.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;But the Data Lake approach has no constraints when it comes to computation or how files are being accessed.&amp;nbsp; You also don’t need to go through the Metadata (e.g. Hive) layer if you don’t want to. For example, you could develop specialized file formats and corresponding compute engines that encode and decode those files for specialized applications that don’t fit neatly into pre-existing patterns.&amp;nbsp; Again, to be fair to Snowflake they can and have extended their platform to allow this type of customization too, but you have to wait for Snowflake to do this.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Let’s get back to this extraordinary market valuation of $63 billion. If there are established alternatives to Snowflake as we just discussed, and those alternatives are already baked into the eco-systems of major cloud vendors (e.g. Databricks and Cloudera are already baked into Microsoft Azure), and it is undesirable to use vendors outside of the dominant platform, then why would so many people abandon homogeneous architectures and complicate the IT landscape with Snowflake?&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;The answer I believe comes down to something much simpler: Convenient user access.&amp;nbsp; Not unlike VHS videotapes.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;A pattern I often see play out in the modern enterprise goes something like this: Technologists (researchers, consultants, early adopters, IT staff, etc.) 
hear about some bleeding edge new technology that “all the cool kids” are using.&amp;nbsp; This is often associated with some kind of hype cycle, like “big data” or “machine learning” or “blockchain”. Then there is a rush to evaluate and even adopt the technology.&amp;nbsp; Some “proof of concept” or trial is produced that might show some potential.&amp;nbsp; Usually there is a one-time data dump/extract provided along with a few business users who are onboarded with the specific purpose of evaluating the potential of the technology.&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;But then when the proof-of-concept is over and the business users themselves are left to their own devices, they rarely go back and use or even ask for the new technology.&amp;nbsp; Occasionally a user might mention it or even ask if they can use it, but more often than not they will go back to data sources they readily have access to and tools they are comfortable with.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;With all of that said, doesn’t Snowflake also have the same problem, and isn’t it made worse by the fact that Snowflake does not (yet) have the advantage of being sold under the banner of a big cloud provider like Microsoft Azure or Amazon AWS or Google GCP?&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Yes and no.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Yes, because there is much of the same type of friction you normally run into when sharing data within the organization or enterprise.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;No, because there is significantly less friction when you are collaborating with people outside of the organization or enterprise.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;It is this cross-organization data collaboration where Snowflake really stands out and where I can see justifying its astronomical Market Cap of $63 billion.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Allow 
me to elaborate…&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;The challenge with most modern data platforms is they do not work out-of-the-box with most data tools (e.g. Excel). Instead, the data developer or data user must adjust to a new way of working that is often more awkward than the old way.&amp;nbsp; For example, take that “Data Lake” I was mentioning earlier.&amp;nbsp; To get data into or out of a Data Lake requires special desktop tools (e.g. Azure Storage Explorer) and command line tools that most business analysts don’t have access to.&amp;nbsp; Since these platforms were designed by technologists for “data scientists”, they often lack many simple features most business users take for granted and often feel clunky. For example, if you want to open a file from Azure Data Lake Store, you can’t just browse using Windows File Explorer and then pop open a data file in Excel. Instead you first need Azure Storage Explorer (which may not be available as a standard Enterprise application) and then even if you get it you need to download the file locally first before being able to peer into its contents.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;The point being, no matter what modern Data &amp;amp; Analytics platform you are using, there will be some change to the way you work.&amp;nbsp; Thus in this scenario it doesn’t matter if you are using a Microsoft Data Lake or a Snowflake DB.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Now here is where things get interesting.&amp;nbsp; Once you pivot to using a modern Data &amp;amp; Analytics platform, whether it’s Azure Databricks or Azure Synapse or Google BigQuery or Amazon Redshift, if you decide at some point that you need to collaborate with groups or individuals who are outside of your organization (or perhaps even in other departments), you start running into brick walls.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;For example, let’s say you are an insurance company that wants to share data 
related to fraud for the purpose of catching serial fraudsters (this is a real thing btw).&amp;nbsp; Basically your only option is to find some agreed upon data store that is deployed specifically for this very purpose and that requires building data pipelines to populate and maintain.&amp;nbsp; Even keeping a single table refreshed daily would be a big ordeal given all of the concerns that normally are associated with moving data over the public Internet.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;But you might think “Hey could we not just provide some kind of guest account into the Data Lake or whatever database platform we are using?”&amp;nbsp; The answer is “Yes, but.”&amp;nbsp; The reason it’s not so simple is security.&amp;nbsp; Namely,&amp;nbsp; most cloud Data Platforms are really just one of many components that are managed under a common cloud service for a given “tenant”.&amp;nbsp; For this reason the big cloud providers (Microsoft, Amazon, Google) tend to put up many barriers that keep external users from logging in easily.&amp;nbsp; By default there are network firewalls and what are known as “conditional access policies” that will take into account your device, location and other factors and require you to perform “multi-factor authentication”, and that’s even if you can get past these policies to begin with.&amp;nbsp; These cloud platforms are designed - on purpose - to minimize friction internally while introducing much friction externally.&amp;nbsp; That external friction is done in the name of security but it also has the effect of nudging people into using more of the cloud vendor’s tools.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;This is where Snowflake really shines: It is designed to allow data sharing across customers while not compromising security.&amp;nbsp; While this&amp;nbsp; might sound like something that only applies to cross-organization (internet) sharing and collaboration, it can also apply to intra-organization 
(intranet) sharing and collaboration.&amp;nbsp; This is because all of those Cloud Data Platforms I just listed tend to be very modular and isolated in their design. These days most organizations have several Data Lakes and other Big Data platforms, but those Data Lakes are often just silos used by only a small number of persons.&amp;nbsp; Contrast this with Snowflake, which only behaves like a silo if you want it to; the moment you want to share data with another Snowflake user, whether they belong to your organization or not, you have the simple and easy option to do so. To be clear, it is possible to lock Snowflake down and make it a silo if you want, but it’s also much easier to change those settings when you don’t. The architecture never commits you to the extreme form of isolation that the other cloud platforms do.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;While we’re on the topic, I should also point out that Snowflake has another advantage over Data Lakes (which is the alternative that most closely matches Snowflake’s approach): With Snowflake you have fine grained control over what specific rows and columns you want to share, whereas in the Data Lake world, the most granular object is a flat file (table), which tends to be made up of many rows and columns.&amp;nbsp; Yes, it is possible to create flat files that are a custom materialized “view” into the data.&amp;nbsp; But you would need to have a data pipeline that keeps that file refreshed on a regular basis.&amp;nbsp; The “custom view” feature in Snowflake (same as RDBMS VIEWs) allows data developers to rapidly and securely collaborate without requiring any “accidental complexity” from data pipeline scripting to creep in.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;So what are the implications of all of this?&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;For Snowflake, if they can get to a certain critical mass they could get to the tipping point where their service simply 
becomes a cost of doing business.&amp;nbsp; At that point their $63 billion market cap will make sense and many investors might even be wishing they had bought sooner.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;This is what Dropbox (NASDAQ:DBX) has been trying to do for a while.&amp;nbsp; Their current market cap is $8 billion down from a high of nearly $17 billion in June 2018.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;But Dropbox is more of a consumer technology, and consumers are fickle and don’t really depend on storage services the same way, because very few people maintain their own software systems that manage data.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Snowflake is not a consumer data storage service like Dropbox (btw, I am a fan of Dropbox and wish them well) because Snowflake is much more plugged in to Enterprise IT.&amp;nbsp; Snowflake has more “lock in” potential.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;I’m sure Snowflake knows this and is using its IPO money to pursue this “winner take all” end game.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;But what about the big cloud vendors, surely they are working on something to combat Snowflake?&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;There are two main challenges they are up against:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ol style="text-align: left;"&gt;&lt;li&gt;They definitely need to make their existing data platforms more thoroughly elastic.&amp;nbsp; Platforms like Azure Databricks work more like this, but deployment and administration are not as streamlined as they could be when compared against Snowflake.&lt;/li&gt;&lt;ol&gt;&lt;li&gt;This can be challenging but it’s very much a tractable problem.&lt;/li&gt;&lt;/ol&gt;&lt;li&gt;They might need to figure out a way to make cross-organization sharing easier while not compromising security for the other components in the same cloud platform.&amp;nbsp; This is a much harder problem to solve because of all the trade-offs 
involved.&amp;nbsp; A solution that might benefit data sharing in a data warehouse or data lake might create vulnerabilities or complexities elsewhere.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;My advice to these giants would be to play to their own strengths and reduce the friction within their own service perimeters. Earlier in this post I mentioned how Microsoft Azure business users must use Azure Storage Explorer to access data in Azure Data Lake Stores.&amp;nbsp; I should also point out that Azure Data Lake Store itself only supports up to 32 users or user groups from a security standpoint.&amp;nbsp; As a result it’s now common to see dozens of Data Lakes and other data platforms - all silos within the same cloud vendor - sprinkled throughout various departments.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;If Microsoft, Google, and Amazon were to focus on these friction points they wouldn’t need to build a competitor to Snowflake.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Instead what I see is a half-baked approach whereby business users are nudged towards more business friendly platforms like Microsoft 365 (formerly Office 365)&amp;nbsp; and Google Workspace (formerly G Suite) while leaving the real power tools for just the IT department.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;The so-called Data Scientist is basically stuck between these two worlds and often will just resort to running local Data Prep tools on their laptop like Python or Alteryx or KNIME or whatever they can get their hands on.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;To sum up, it is important to recognize that technically innovative features can distract us from what actually drives the market value of a given business.&amp;nbsp; Tech journalists spend most of their time discussing technical innovations as if this is what drives these businesses, when instead the driver is subtle network effects that can arise out of the absence or inclusion of 
user friction.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;What does this mean for the future of technology and business?&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Are we doomed to hop from one locked-in platform to another for ever and ever?&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;The good news is that, as this [paywalled] &lt;a href="https://www.economist.com/business/2020/10/07/google-antitrust-and-how-best-to-regulate-big-tech"&gt;article&lt;/a&gt;&amp;nbsp;in The Economist points out, most people now realize this is a problem and governments are beginning to regulate the tech industry in much the same way other “winner take all” industries like utilities and banking have been regulated in the past.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;The EU in particular appears to be working on legislation that if passed would open up companies like Google, Apple, and Facebook to more competition.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;But as The Economist also points out “the devil is in the details” and getting regulation right is tricky (but worth the effort).&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;This means we should also be looking for solutions within technology itself.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;History does not tolerate lock-in forever and eventually something better emerges out of these constraints. It is that “something better” that I am working on and I hope others are working on this same problem too. 
Much progress has been made due to the open source software movement, both for commercial and non-commercial users.&amp;nbsp; But open source software is not enough because most of what I just described is ironically based on large swathes of open source software and yet here we are.&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;But I can see a way forward from these silos.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;How do you think we will break past these network effects and allow a more naked form of innovation to be the arbiter of success?&lt;/p&gt;</description><link>http://hepburndata.blogspot.com/2020/10/snowflakes-market-valuation-and-network.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-4371492401729228145</guid><pubDate>Wed, 29 Apr 2020 00:08:00 +0000</pubDate><atom:updated>2020-04-28T20:08:37.544-04:00</atom:updated><title>The Power of Diversity: Why I Love the Humble and Organic CSV File</title><description>These days when we hear the word 'diversity', different things might come to mind for different people. Modern connotations of immigration or affirmative action have infused the word with notions of multiculturalism, gender equality, and other forms of social levelling.&amp;nbsp; I am supportive of that, but what I want to discuss here is diversity from a more abstract perspective.&lt;br /&gt;
&lt;br /&gt;
Lest you feel that it's nigh impossible to discuss diversity without taking a political position, I will come out and say it right now: I am pro-diversity. Anti-diversity people can stop reading right now. That's the end of this post for you. Have a nice day.&lt;br /&gt;
&lt;br /&gt;
If you're still reading, let's take a step back and look first at diversity from a cosmic perspective...&lt;br /&gt;
&lt;br /&gt;
Life itself here on earth emerged from a process known as &lt;a href="https://en.wikipedia.org/wiki/Abiogenesis"&gt;Abiogenesis&lt;/a&gt;.&amp;nbsp; There's a lot we don't know about this process, but what we do know is that it came about through the power of diversity.&amp;nbsp; Abiogenesis involved mixing of complex molecules (proteins, possibly delivered through meteors from space), shifting temperatures and shifting movement (through tectonic plate shifting), shifting gravitational patterns (through the orbit of the moon), shifting light and radiation patterns (through the orbit of the earth).&amp;nbsp; Even the part of the universe we inhabit has been shown to have more atomic diversity than the other parts of the universe.&amp;nbsp; For example, phosphorus, a key building block for life, happens to be available in our neck of the cosmic woods but not most other places in the cosmos.&amp;nbsp; Without it we wouldn't exist.&amp;nbsp; It turns out this element is not evenly spread around the universe, and we happen to be &lt;a href="https://www.popularmechanics.com/space/solar-system/a19685943/alien-life-phosphorus/"&gt;luckily placed in this cosmic soup of diverse elements&lt;/a&gt;.&amp;nbsp; We also need the right amount of radiation so mutations may occur, allowing new forms of diversity to emerge and further evolve.&lt;br /&gt;
&lt;br /&gt;
Given enough time and space, our existence is both inevitable and miraculous depending on how you look at it.&amp;nbsp; What sets earth apart from most other planets in the visible universe is not only that we have the right conditions to allow life to take hold once it arrives, but just as importantly the right amount of diversity so life may spring from chaos, and not just thrive in a simple state, but evolve to ever increasing levels of complexity.&lt;br /&gt;
&lt;br /&gt;
Turning to the present moment, COVID-19 is&amp;nbsp;likely&amp;nbsp;also the product of diversity.&amp;nbsp; Most epidemiologists agree that COVID-19 - like the SARS virus before it - emerged from animal markets in Wuhan, China through natural evolution.&amp;nbsp; There is a long history connecting plagues to animal domestication. It has something to do with animals being brought to mingle together in ways that would never occur in the wild. Again, novel diversity.&lt;br /&gt;
&lt;br /&gt;
This is why Europeans who first emigrated to &lt;i&gt;North&lt;/i&gt; America brought disease that wiped out most of the indigenous population. These diseases evolved in European farms through similar processes that led to COVID-19.&amp;nbsp; Conversely, Europeans that emigrated to &lt;i&gt;South&lt;/i&gt; America were on the receiving end, constantly battling disease that evolved out of&amp;nbsp;the tropical jungle's diverse ecosystem.&lt;br /&gt;
&lt;br /&gt;
Bio-diversity may have also led to the extinction of Neanderthals. Our old cousins may have been wiped out or partially decimated by diseases that originated from &lt;a href="https://www.cnn.com/2016/04/15/health/humans-responsible-for-neanderthal-extinction-by-transferring-diseases/index.html"&gt;Homo Sapiens homeland in Africa&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
While it may seem that diversity of the kind I speak of here emerged from the "forces of nature" and is outside the purview of human life, I want to draw your attention to another example of how diversity shaped the trajectory of human civilization.&lt;br /&gt;
&lt;br /&gt;
If you want to hear the full story, just listen to my podcast "&lt;a href="https://cradleofanalytics.blob.core.windows.net/coa/index.html"&gt;Cradle of Analytics&lt;/a&gt;", and jump to episode 10 "Meet the Flintstones".&amp;nbsp; I'll summarize the key points here:&lt;br /&gt;
Interest/usury, which is the backbone of Lending, Capitalism, and Free Markets, was not invented by anyone on record. It was not invented by Adam Smith. It was not invented by the Greeks.&amp;nbsp; It wasn't even really invented by the Sumerians who first practiced it over 5,000 years ago.&lt;br /&gt;
Rather, it appears to have emerged out of a soup of diversity in the plains of Mesopotamia between the Tigris and Euphrates sometime around 8,000 years ago, among the people we now refer to as the Ubaidian Culture.&lt;br /&gt;
&lt;br /&gt;
We believe that interest emerged during this period because of the discovery of clay "counting tokens".&amp;nbsp; These "counting tokens", with symbols depicting animals and other assets, were placed in clay envelopes (known as 'bullae') about the size and shape of a softball and kiln-fired, preserving them to this day.&lt;br /&gt;
&lt;br /&gt;
We cannot say for sure why or how these tokens were created and used, but there is something of a consensus among archeologists, anthropologists, historians, and economists that these counting tokens were the original contracts and they were for lending with interest.&lt;br /&gt;
&lt;br /&gt;
It sounds almost impossible, even anachronistic, that people were taking out interest-based loans nearly 8,000 years ago, but if you follow me I'll show how this may have happened:&lt;br /&gt;
&lt;br /&gt;
Imagine you are a sheep herder who owns a flock of sheep.&amp;nbsp; You use the sheep for their wool which you provide to your family and extended family (your tribe).&amp;nbsp; One day something happens, and a pride of lions destroys a quarter of your flock (there used to be lions in Mesopotamia at this time - but recreational hunters &lt;a href="https://en.wikipedia.org/wiki/Lion_Hunt_of_Ashurbanipal"&gt;later wiped them out&lt;/a&gt;). The situation is dire but there is still hope. You have the skills to get your flock back up to its original size, you just need a couple of rams for breeding and a few extra ewes to fill the gap.&amp;nbsp; You also know of another sheep herder who is less than a day's walk away.&amp;nbsp; Borrowing sheep in times of need is quite common and surely you would lend a few sheep to someone you know who is in need?&lt;br /&gt;
&lt;br /&gt;
But as you are walking to the other sheep herder's town you realize that you don't really know this person that well and he is from another tribe and worships a different god. He might not feel the need to do you any favours. The fear of rejection is going through your mind as you keep walking.&amp;nbsp; You need to figure out how to sweeten the deal for this guy. Simply asking for help as an act of charity might be a bridge too far. So you are thinking "how can I repay this person to make it worth their while, when I have nothing to give up front"?&amp;nbsp; And then you are hit with an idea: You realize that once you can get the rams to breed more lambs you may have a small surplus of sheep, and you can pledge that surplus back to the lender. Win win. Bingo!&lt;br /&gt;
&lt;br /&gt;
You might not be a mathematician, but you know how to count and how to add and subtract numbers for the purpose of keeping track of your sheep.&amp;nbsp; So through your basic knowledge of quantities you come up with a number of lambs that you feel is a fair amount.&amp;nbsp; After all, it is your skill as a sheep herder that is allowing these sheep to safely multiply (and now you're even better because you have since learned how to deal with hazards from lions).&lt;br /&gt;
&lt;br /&gt;
You eventually make it to the foreign sheep herder before sundown and manage to meet him in person.&amp;nbsp; You tell him of your predicament and then quickly explain that you realize you're not from the same family nor worship the same god but would like to make a deal that will make it worth his time.&amp;nbsp; You then explain to the foreign sheep herder that you will borrow 50 sheep from his flock for a year including at least two rams, and then you will repay him the entire flock of 50 sheep plus an additional 10 lambs.&lt;br /&gt;
The foreign sheep herder - who already has far more sheep than he or his tribe actually needs - considers this offer.&amp;nbsp; In his mind he might think to himself "Well I've always been proud of the fact that I own the most sheep in my town, but there is another herder who lives within a two-day walk from me that I have heard has the most sheep of any herder. I have always envied and fantasized about being that person. I bet that I could be that person if I allow for this loan and it works out for me.&amp;nbsp; Heck, if I pull off a few more of these loans I could become the most powerful sheep herder in all of Mesopotamia."&lt;br /&gt;
And so one man's fantasy of salvation is married up with another man's fantasy of domination.&lt;br /&gt;
One last thing is required to seal the deal: A contract.&lt;br /&gt;
&lt;br /&gt;
Fortunately, being near the Tigris (or Euphrates) there is no shortage of soft clay.&amp;nbsp; While you were on your way to meet the sheep herder you scooped up some clay and formed it into small discs - one for each sheep that you would repay. Using a reed you etched a small symbol on each token. Sixty (60) tokens in all. It took about an hour to do this.&lt;br /&gt;
On your instruction, your lender takes the tokens, bakes them in a kiln, and then takes the fired tokens and places them in a larger clay envelope.&amp;nbsp; You then imprint a symbol known to represent your tribe; this symbol is a representation of the god your tribe worships and who protects you.&amp;nbsp; Your god is now a party to the contract.&amp;nbsp; Don't mess this up or you could bring a much bigger catastrophe to your tribe. Before sealing this contract you make it clear what will happen if the sheep cannot be repaid: You will give your life to this man.&amp;nbsp; You will allow him to enslave you.&amp;nbsp; Hopefully this will not happen.&lt;br /&gt;
The contract has been signed and so the deal has been made.&amp;nbsp; Interest/usury has now emerged as a new concept.&lt;br /&gt;
&lt;br /&gt;
If the more powerful sheep herder had been an enemy, our hero would not have bothered to make the journey. It would have been too dangerous.&amp;nbsp; But if the more powerful sheep herder had been part of the same tribe, our hero would not have felt the need to make such a deal and would have made an appeal to charity.&amp;nbsp; And if there was no clay to scoop up and shape along the way, none of this would work.&lt;br /&gt;
&lt;br /&gt;
It is only through the ability of these individuals - each with their own selfish needs - to collaborate that a new invention emerges, thrives, and multiplies: Interest-based contracts.&lt;br /&gt;
A lot of things needed to happen in order for this contract to emerge, but most importantly a diverse population competing for resources is at the core of this organic invention.&lt;br /&gt;
&lt;br /&gt;
You can also look at interest as a kind of 'virus' just like COVID-19.&amp;nbsp; This is why interest/usury is expressly forbidden by all Abrahamic religions (unless it's for people in another religion - just don't charge people in your own religion).&amp;nbsp; The oldest critique against interest was discovered in India and dated to the &lt;a href="http://www.alastairmcintosh.com/articles/1998_usury.htm"&gt;second millennium BCE&lt;/a&gt;.&amp;nbsp; India (and by extension China) never really had interest for most of their history until nearly the 20th century.&lt;br /&gt;
&lt;br /&gt;
What do I think of interest?&amp;nbsp; Well I can see why the Indian Vedics were suspicious of it. I think it's rocket fuel that has a force multiplier effect that has no equal, and should be treated like the powerful rocket fuel it is.&amp;nbsp; I think it's both useful and dangerous.&amp;nbsp; As a rule of thumb, keeping interest below 10% can be sustainable, but interest above 20% is difficult to sustain.&amp;nbsp; That's just the beginning of what is needed to regulate interest and I won't get into all of that here.&lt;br /&gt;
&lt;br /&gt;
Viruses are also useful.&amp;nbsp; We now increasingly use viruses to combat genetic diseases like &lt;a href="https://www.nature.com/articles/d41586-018-07646-w"&gt;Sickle Cell Anemia&lt;/a&gt;.&amp;nbsp; And it was only through the study of extremely weird and obscure viruses that emerged out of novel pools of diversity that we ever came to discover that it's possible for a virus to re-program our genetic code to cure disease.&amp;nbsp; It's actually one of the most mind-blowing inventions of the 21st century.&lt;br /&gt;
&lt;br /&gt;
And this now brings us to why I love CSV (comma separated values) Files for storing structured data more than any other format out there...&lt;br /&gt;
In short, it's because CSV is an organic standard that was selected from a primordial soup of diversity.&lt;br /&gt;
&lt;br /&gt;
There are many file formats out there for structured data.&amp;nbsp; If you are on the business side, it's probably Microsoft Excel.&lt;br /&gt;
Or maybe it's CSV.&lt;br /&gt;
&lt;br /&gt;
If you are on the technology side it's probably something like XML or JSON or YAML or Pickle, or if you work with 'big data', either Parquet or Avro or ORC.&lt;br /&gt;
Or maybe it's CSV.&lt;br /&gt;
&lt;br /&gt;
What do I mean by CSV being organic?&lt;br /&gt;
What I mean is that the CSV format was selected and refined through a process of diversity similar to how interest/usury was sparked out of Mesopotamia's diversity and similar to how powerful viruses are selected from the novel mingling of animals.&lt;br /&gt;
&lt;br /&gt;
As best I can tell, there is no inventor of the format as it is today - it simply emerged, and was only later written up to look like a top-down engineered standard.&lt;br /&gt;
You can tell this from looking at this list of competing &lt;a href="https://en.wikipedia.org/wiki/Comparison_of_data-serialization_formats"&gt;file formats&lt;/a&gt; and from the CSV&amp;nbsp;&lt;a href="https://tools.ietf.org/html/rfc4180#page-2"&gt;RFC introduction&lt;/a&gt;.&amp;nbsp; CSV is the only standard with no clear inventor on this file formats list.&amp;nbsp; It's just there with a retroactively defined RFC (RFCs are documents published by the &lt;a href="https://www.ietf.org/"&gt;IETF&lt;/a&gt;, the Internet's main standards body).&lt;br /&gt;
This &lt;a href="https://blog.sqlizer.io/posts/csv-history/"&gt;article&lt;/a&gt; provides a brief history of how the standard may have originated and points out that it emerged around the same time as the Relational Database (RDBMS).&amp;nbsp; There were lots of other standards at this time too. But the CSV has been embraced by folks on both the Business side and Technology side of most organizations, so there is something about it that has universal appeal.&amp;nbsp; I think it's because it's both compact and easily readable and almost reminds us of lists we might otherwise read in a book or newspaper.&amp;nbsp; I also think its popularity has to do with the fact that it closely aligns to the &lt;a href="https://en.wikipedia.org/wiki/Relational_model"&gt;Relational Model&lt;/a&gt;&amp;nbsp;- a perspective neutral approach to data modelling, which I'll explain later.&lt;br /&gt;
&lt;br /&gt;
No other data file format has been embraced by both the Business side and Technology side of organizations like the CSV File.&lt;br /&gt;
&lt;br /&gt;
There is power here.&lt;br /&gt;
&lt;br /&gt;
Why is it not even more popular then?&lt;br /&gt;
&lt;br /&gt;
There are a couple of main reasons why CSVs frustrate technologists.&lt;br /&gt;
I'm going to explain those reasons below, but will indent the text so you can skip over these reasons if you don't really care about what technologists think.&lt;br /&gt;
&lt;br /&gt;
&lt;blockquote class="tr_bq"&gt;
First, because CSVs were never designed as a standard in the first place, the encoders do not always encode information consistently.&amp;nbsp; A common situation is when you have a comma in the text value itself.&amp;nbsp; This in turn has led to another organic invention of simply requiring that values that contain commas must be enclosed in double quote characters.&amp;nbsp; Furthermore, any double quote inside a quoted text value must be 'escaped' by preceding it with another double quote.&lt;br /&gt;If I wanted to encode this text in CSV: Hello, "world"&lt;br /&gt;I would encode it like this: "Hello, ""world"""&lt;br /&gt;That's basically the only rule; it's all you need to know.&lt;br /&gt;But as simple as that rule is, many programs that export to CSV neglect to enforce it (or at least used to - most programs no longer make this mistake).&lt;br /&gt;Second and more recently (well since around 2010 when Hadoop came on to the scene) there has been another criticism of CSV: CSV files rely on newline characters as record delimiters while permitting those same newline characters to be enclosed in quotes (just like commas).&lt;br /&gt;Most systems don't have a problem with this, but in the world of Hadoop there are libraries known as "SerDes" (short for Serializer/Deserializer), and the most popular SerDe for CSV files has made it a rule that CSVs cannot contain newline characters in string values. 
The reason is that Hadoop operates on a "divide and conquer" paradigm (known as map/reduce), and wants to split data off into chunks as efficiently as possible with the smallest chunk being a single record.&amp;nbsp; So if you can assume that all chunks are separated by newline characters, end of story, then you can more efficiently split up files for sub-processing by respective worker nodes.&amp;nbsp; But if you need to perform the additional step of looking for enclosure characters then it is harder to achieve the same level of processing efficiency.&lt;br /&gt;So if you ask me, it's just a matter of correcting the SerDes and respecting the CSV format's simplicity. In other words, I should be able to choose a SerDe that doesn't have this limitation.&amp;nbsp; But for reasons I'm not aware of, this doesn't seem to be in the offing.&lt;/blockquote&gt;
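To make the quoting rule in the indented section above concrete, here is a minimal sketch in Python using its standard csv module.&amp;nbsp; The language and the helper names (encode_row, decode_rows) are my own choices for illustration, not anything a particular exporter or Hadoop library uses; the point is only to show the quoting behaviour and why a quoted newline trips up anything that splits records purely on line breaks:&lt;br /&gt;

```python
import csv
import io


def encode_row(values):
    """Write one record using the default 'excel' dialect (minimal quoting)."""
    buffer = io.StringIO()
    # lineterminator is set explicitly so the result is the same on every platform
    csv.writer(buffer, lineterminator="\n").writerow(values)
    return buffer.getvalue()


def decode_rows(text):
    """Parse CSV text back into a list of records."""
    # newline="" passes line breaks inside quoted fields through untouched
    return list(csv.reader(io.StringIO(text, newline="")))


# A value with a comma and double quotes: the quotes are doubled and the
# whole field is wrapped in quotes.
line = encode_row(['Hello, "world"'])
print(line)

# A value containing a newline is also wrapped in quotes, so one record can
# span two physical lines. This is the case that naive line splitting gets wrong.
two_line_record = encode_row(["line one\nline two", "second field"])
print(len(two_line_record.splitlines()))   # physical lines in the text
print(len(decode_rows(two_line_record)))   # logical records seen by a CSV parser
```

The last two counts differ (two physical lines, one logical record), and that gap is exactly what a splitter that treats every newline as a record boundary cannot see.&lt;br /&gt;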
&lt;br /&gt;
While pushback against CSVs mainly comes from technologists, business users will often prefer the Excel workbook format over CSVs. Why is that?&lt;br /&gt;
The reason comes down to Metadata.&amp;nbsp; CSVs just contain the data values with little or no information about how those values should be interpreted.&amp;nbsp; The only information that does exist (if it is even written out) is the header row.&amp;nbsp; Excel workbooks also allow you to bundle together multiple worksheets, which would require separate CSV files.&lt;br /&gt;
&lt;br /&gt;
So I have two responses to the Excel advocate:&amp;nbsp; First, the CSV does have a sufficient amount of Metadata for most list taking purposes and is extremely lightweight.&amp;nbsp; Case-in-point: I maintain several key lists just on the notepad app on my phone.&amp;nbsp; I'll share a portion of one with you here right now (you could even paste this into a text document, save as a CSV, and boom you have a database you can load into a BI application like Power BI or Qlik or Tableau or just Excel):&lt;br /&gt;
-----&lt;br /&gt;
Restaurant,City&lt;br /&gt;
El Coyote,Los Angeles&lt;br /&gt;
Rutt's Cafe,Los Angeles&lt;br /&gt;
Denver Biscuit Co.,Denver&lt;br /&gt;
Bubby's,NYC&lt;br /&gt;
H Bar,Toronto&lt;br /&gt;
Peppercorn,Las Vegas&lt;br /&gt;
The Argo,Vancouver&lt;br /&gt;
-----&lt;br /&gt;
I keep similar lists for books and movies.&lt;br /&gt;
I could use an App I suppose or some services, but then I'm locked into their tools, making it tricky to blend that information with other data I have.&amp;nbsp; Those Apps also don't allow me to visualize and analyze the data in my own way.&lt;br /&gt;
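As a rough sketch of what "boom you have a database" means in practice, the snippet below reads the restaurant list above with Python's standard csv module and counts restaurants per city.&amp;nbsp; Python is just my stand-in here for a BI tool, and the variable names are made up for the example:&lt;br /&gt;

```python
import csv
import io
from collections import Counter

# The list exactly as it appears in the notepad excerpt above.
RESTAURANTS_CSV = """Restaurant,City
El Coyote,Los Angeles
Rutt's Cafe,Los Angeles
Denver Biscuit Co.,Denver
Bubby's,NYC
H Bar,Toronto
Peppercorn,Las Vegas
The Argo,Vancouver
"""

# DictReader uses the header row as the only metadata it needs:
# each record becomes a mapping keyed by column name.
rows = list(csv.DictReader(io.StringIO(RESTAURANTS_CSV)))

# A simple "group by City" over the parsed records.
restaurants_per_city = Counter(row["City"] for row in rows)

for city, count in sorted(restaurants_per_city.items()):
    print(f"{city}: {count}")
```

The header row is all the metadata this needed, which is the point about CSV being lightweight.&lt;br /&gt;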
&lt;br /&gt;
The other reason why Business users might push back on using CSV is that Excel allows them to easily blend Metadata such as text formatting (e.g. underline, bold, italics) and formulas (e.g. A1+B1).&lt;br /&gt;
&lt;br /&gt;
But Metadata is just data, and if Microsoft wanted to, they could separate out all this information into one or more additional CSV files that are then bundled with the underlying data CSV files in a single folder and ZIP file, while presenting everything to the user in a neatly integrated form. Microsoft already does something like this behind the scenes, but in proprietary formats rather than in the zipped CSV files that could have accomplished the same thing.&lt;br /&gt;
&lt;br /&gt;
These proprietary formats like .xlsx and .xls and .xlsm are all well and good if you are working entirely within the Microsoft Office ecosystem, but even if you move to Microsoft's Azure platform, the Excel workbook is persona non grata.&amp;nbsp; So it's limited there.&lt;br /&gt;
&lt;br /&gt;
Microsoft's decision to store structured data in proprietary formats was probably made during a time where the performance of opening and saving an Excel document was a major concern.&amp;nbsp; This is the same thinking that led many mainframe developers to store dates using a 2-digit year as opposed to the full 4-digit year.&amp;nbsp;&lt;br /&gt;
Since the whole 'Y2K' problem (which is still ongoing - just look at your credit card's expiry date) went down, very few people now would see it being sensible to store a 2-digit year given that we are routinely storing movies, pictures, and other forms of unstructured data in volumes that dwarf most structured data stores.&lt;br /&gt;
Storing all structured data and Metadata in CSV format would surely annoy some technologists.&amp;nbsp; But it would also provide a way of flattening away today's database bureaucracy which does far more damage than any minor performance issue.&lt;br /&gt;
But if Microsoft were to do that, they might lose their grip on spreadsheets. So they don't have much motivation to see this happen.&amp;nbsp; It would benefit the vast majority of people, just not them.&lt;br /&gt;
Oh well.&lt;br /&gt;
&lt;br /&gt;
Only the humble CSV file seems to bridge both the Business and Technology worlds.&amp;nbsp; It is the closest thing we have to a lingua franca for structured data.&lt;br /&gt;
&lt;br /&gt;
But it goes deeper than just this encoding.&amp;nbsp; CSVs also nudge us towards 'perspective neutral' data models (i.e. &lt;a href="https://en.wikipedia.org/wiki/Database_normalization"&gt;normalized&lt;/a&gt; models) rather than lock us into hierarchical perspectives.&lt;br /&gt;
&lt;br /&gt;
Take for example the JSON format which is popular among developers and is also a text format similar to CSV. Yet, unlike CSV it prefers a hierarchical representation of data.&lt;br /&gt;
Using my above example of Restaurant and City, which has one row per Restaurant/City pair, a JSON file format might decide to encode this as a parent-child relationship with the parent being the City and the child being the Restaurant, especially if there were many restaurants listed under a single city.&lt;br /&gt;
To look up any Restaurant I would first need to traverse the parent City in order to get to it.&amp;nbsp; This is also known as "pointer chasing" and is the reason we shifted from Network and Hierarchical databases (like &lt;a href="https://en.wikipedia.org/wiki/Integrated_Data_Store"&gt;IDS&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/IBM_Information_Management_System"&gt;IMS&lt;/a&gt;) to Relational databases starting in the 1970s and throughout the 1980s and 1990s.&amp;nbsp; The problem with hierarchical data models is that they tend to lock you into a certain perspective whereas the relational model (which CSVs align to) works more like the human mind, where we can jump from list to list to list without any preconceived hierarchy.&lt;br /&gt;
&lt;br /&gt;
Because most data developers now prefer the relational model, I routinely see JSON documents that are encoded in such a way as to not be hierarchical and simply be lists just like CSV files. Ironic, but expected.&amp;nbsp; CSVs on the other hand are much more compact and readable than any JSON document for this purpose.&lt;br /&gt;
&lt;br /&gt;
It's also for this reason that spreadsheet tools like Excel cannot directly open JSON documents.&amp;nbsp; In order to analyze a JSON document in a tool like Power BI, it is necessary to "flatten" the hierarchies into explicit rows and columns.&amp;nbsp; Defenders of hierarchical formats like JSON point out that they are useful because they allow us to blend a number of related entities into a single document. It's similar to the reason why Business users like to bundle multiple worksheets into a single .xlsx workbook.&lt;br /&gt;
My response to this is: Why not just store the CSVs in a single folder? You can then easily move the folder as a ZIP or RAR file, and its contents can more easily be analyzed as rows and columns, and more easily combined in novel ways.&lt;br /&gt;
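For anyone who has not had to do the "flattening" mentioned above, here is a minimal sketch of it in Python.&amp;nbsp; The nested JSON layout (a "cities" list with "name" and "restaurants" keys) is hypothetical, invented only to mirror the City-as-parent encoding described earlier, and the function names are mine:&lt;br /&gt;

```python
import csv
import io
import json

# Hypothetical city -> restaurants nesting of a few rows from the list above.
NESTED_JSON = """
{
  "cities": [
    {"name": "Los Angeles", "restaurants": ["El Coyote", "Rutt's Cafe"]},
    {"name": "Denver", "restaurants": ["Denver Biscuit Co."]}
  ]
}
"""


def flatten(document):
    """Turn the parent/child nesting into one (Restaurant, City) row per pair."""
    return [
        (restaurant, city["name"])
        for city in document["cities"]
        for restaurant in city["restaurants"]
    ]


def to_csv(rows):
    """Serialize the flat rows as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Restaurant", "City"])
    writer.writerows(rows)
    return buffer.getvalue()


flat_rows = flatten(json.loads(NESTED_JSON))
print(to_csv(flat_rows))
```

Note that flatten has to know the shape of the hierarchy in advance; a different nesting would need a different function, whereas the CSV on the other side always looks the same.&lt;br /&gt;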
&lt;br /&gt;
As a data architect, the CSV feels like that Mesopotamian clay in hand.&amp;nbsp; I can employ the power of relational algebra to combine and analyze CSVs at the speed of thought - pivoting from entity to entity as my stream of consciousness takes me.&amp;nbsp; Tools like Power BI and Qlik Sense allow me to rapidly connect these sets together like snapping together Lego blocks.&lt;br /&gt;
Give me a set of CSVs and some interesting questions at 9 AM, and I'll have your answers and insights by noon.&lt;br /&gt;
But give me a hierarchical format, and I'm mucking around with an unwanted homework assignment of picking apart hierarchies and flattening them into CSVs that can be manipulated like my beloved Lego blocks.&lt;br /&gt;
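To give a feel for the "snapping together" described above, here is a small sketch of a relational join over two CSVs in plain Python.&amp;nbsp; The second CSV (City,Country) is hypothetical and invented for this example, as are the helper names read_csv and natural_join; a BI tool does the equivalent for you when two tables share a column name:&lt;br /&gt;

```python
import csv
import io

# A few rows from the restaurant list above.
RESTAURANTS_CSV = """Restaurant,City
El Coyote,Los Angeles
H Bar,Toronto
The Argo,Vancouver
"""

# Hypothetical second CSV, invented for this sketch.
CITIES_CSV = """City,Country
Los Angeles,USA
Toronto,Canada
Vancouver,Canada
"""


def read_csv(text):
    """Parse CSV text into a list of records keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def natural_join(left, right, key):
    """Inner join two lists of records on a shared column name."""
    index = {}
    for record in right:
        index.setdefault(record[key], []).append(record)
    return [
        {**left_record, **right_record}
        for left_record in left
        for right_record in index.get(left_record[key], [])
    ]


joined = natural_join(read_csv(RESTAURANTS_CSV), read_csv(CITIES_CSV), "City")

# Pivot to a question the first list alone could not answer.
canadian = [row["Restaurant"] for row in joined if row["Country"] == "Canada"]
print(canadian)
```

Neither file had to be restructured to answer the new question, and the join could just as easily have started from the cities side.&lt;br /&gt;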
&lt;br /&gt;
Why does any of this matter?!?&lt;br /&gt;
Well this blog post is really just warming up for something much bigger: A rethink of how we manage and access all structured information.&lt;br /&gt;
&lt;br /&gt;
And what I will attempt to demonstrate in the near future is that it is possible to manage ALL structured information through the humble CSV, if we have the collective will to make this happen. Think of CSVs as universal, well-worn data Lego blocks.&lt;br /&gt;
&lt;br /&gt;
And why does that matter?&lt;br /&gt;
Most information problems can be boiled down to a combination of human and technology bureaucracies leading to data gate-keeping and information asymmetry.&amp;nbsp; In many cases gate-keeping is desired and with merit.&amp;nbsp; But in most cases I come across, data gatekeeping is enabled through a by-product of technological bureaucracies&amp;nbsp;built on an alliance of information system vendors who build the systems, consultants who implement these systems, and custodians that operate these systems.&lt;br /&gt;
Show me a software vendor working together with consultants and custodians, and I'll show you a locked-in service layer that often results in confusion and bureaucracy to those on the wrong side of the service.&lt;br /&gt;
&lt;br /&gt;
Perhaps that's the way it has always been with Information Technology. &lt;a href="https://wiki.c2.com/?CyberCrud"&gt;Cyber Crud&lt;/a&gt; has been with us for a very long time, just ask &lt;a href="https://en.wikipedia.org/wiki/Ted_Nelson"&gt;Ted Nelson&lt;/a&gt;.&lt;br /&gt;
I hope to show you that this is not how it has to be.&lt;br /&gt;
&lt;br /&gt;
Stay tuned...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;</description><link>http://hepburndata.blogspot.com/2020/04/the-power-of-diversity-why-i-love.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-1168906514177260876</guid><pubDate>Tue, 31 Mar 2020 01:00:00 +0000</pubDate><atom:updated>2020-05-03T08:57:55.963-04:00</atom:updated><title>Does this Patent (which expires today) Explain Qlik and Power BI’s Opposing Philosophies?</title><description>&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Over the past five years or so I have taken a deeper interest in philosophy. More to the point, I have become especially curious as to WHY certain philosophies are adopted over others.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;If you have had the chance to listen to my podcast “&lt;a href="https://cradleofanalytics.z27.web.core.windows.net/"&gt;Cradle of Analytics&lt;/a&gt;”, you may know that I spent ten episodes (and nearly nine hours) explaining the “Origins of Analytics”, and why I believe that so-called “deductive-analytical thinking” is at the root of Western culture and philosophy. While the capability to think deductive-analytically comes from writing, the &lt;i&gt;appetite&lt;/i&gt; for this mode of thinking can be traced back to Mesopotamian financial interest-based contracts from as far back as 5,900 BCE, starting with the Ubaidian culture, followed by the Sumerians, who developed temples into early banking institutions.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The Phoenicians would ultimately transmit this knowledge of usury to the Greeks in the 9&lt;/span&gt;&lt;span class="s2" style="font-kerning: none; font-size: 7.3px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal;"&gt;&lt;sup&gt;th&lt;/sup&gt;&lt;/span&gt;&lt;span class="s1" style="font-kerning: none;"&gt; century BCE.&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;In my view, Western culture and philosophy have more-or-less been adopted as a “meta-philosophy” that subconsciously affects the core of our very thinking. It is part of our identity now.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;I argue in the podcast that this appetite for “deductive analytical” thinking (or “digital thinking” as opposed to “analog thinking”) emanated from Mesopotamia and was eventually codified by Aristotle in the 4&lt;/span&gt;&lt;span class="s2" style="font-kerning: none; font-size: 7.3px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal;"&gt;&lt;sup&gt;th&lt;/sup&gt;&lt;/span&gt;&lt;span class="s1" style="font-kerning: none;"&gt; century BCE, and that this codification of deductive logic (the “syllogism” as described in Aristotle’s “Prior Analytics”) quickly led to the development of the “axiomatic method” (as codified by Euclid), and would eventually lead to the scientific revolution as best exemplified by Johannes Kepler (superseding Galileo Galilei) and best codified by Charles Sanders Peirce (superseding William Whewell).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This in turn has led us to humankind’s ultimate product of thinking efficiency: The invention of the Turing complete programmable computer. This invention will eventually automate all rules-based tasks – if we allow it to, and possibly allow humanity to expand across the universe in a scalable manner – if we have the will.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;In other words, I believe that there is a vein of history that you can trace from today back nearly 8,000 years and see that there has been for some time evidence of “digital thinking”. In fact, all Western religions (Judaism, Christianity, Islam) run off this same basic operating system of so-called “sovereign laws” which are structured as ‘If A then B’. &lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Digital code runs deep in our collective imagination even if we don’t realize this consciously.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Digital thinking is often wrong – but it is always efficient.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I should point out that non-Western “analog” thinking tends to be based on – you guessed it – thinking through analogies.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Charles Sanders Peirce coined the term “abduction” which describes a specific application of analogy thinking to solve problems.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Abduction is very difficult to explain and is often conflated with deduction, and philosophers and psychologists argue over what it is to this very day.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;The point is, digital thinking tends to be more top-down and is efficient because it leverages what we might call “artificial intelligence” (which is to say formulas and rules that have already been tried and tested), but can sometimes lead you astray, often with disastrous consequences.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Whereas “analog” thinking leans more on abduction and is where genius resides, but is less powerful without the aid of formulas and rules to build upon.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I could go on here, but I don’t want to digress too much.&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;If you are specifically curious about the origins of Western philosophy, just listen to the podcast.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;The reason I bring this up, and the main point I am making here and in my podcast, is that I have come to see top-down thinking and bottom-up thinking as philosophies unto themselves, and we often don’t realize when we are embracing one philosophy and not the other. In this blog post I will explain and demonstrate that Microsoft Power BI has adopted a more “top down” philosophy whereas Qlik offers a more “bottom up” philosophy. I will also explain why I believe the “bottom up” philosophy is often ignored, even though I believe it offers advantages and benefits over Microsoft’s “top down” philosophy.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Put another way, I might just label these two philosophies (which can both to a large extent be followed within Power BI):&lt;/span&gt;&lt;/div&gt;
&lt;ol class="ol1"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;The Tao of Microsoft Power BI data modelling&lt;/span&gt;&lt;/li&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;The Tao of Qlik data modelling&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;But before I show you the example which illustrates the differences I should go over some history…&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;To understand the relationship between Power BI and Qlik, you need to go back to the origins of Qlik and Power BI.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Here is my abridged version of Qlik and Power BI’s history (I apologize if I have made any errors here):&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;In 1993 the first version of QlikView (originally called “QuikView” after the acronym: Quality; Understanding; Interaction; and Knowledge) was released.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;QlikTech (the owning company) was [and to a large extent still is] based in Sweden. The first two versions of QlikView were built on Microsoft Excel. &lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;What differentiated QlikView from other BI tools was and is its colour coded filter boxes (known as “List Boxes” in QlikView and “Filters” in Qlik Sense). I refer to this capability as Qlik’s “State-Aware User-Experience” or “State-Aware UX”.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;In 1997 version 3 of QlikView was released. This version had been entirely rewritten in the compiled “C” language for maximum efficiency.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;This version of Qlik also codified two other aspects of Qlik that would become both indispensable and difficult to explain:&lt;/span&gt;&lt;/div&gt;
&lt;ol class="ol1"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;A SEMI-JOIN link indexing system for efficient semi-joins&lt;/span&gt;&lt;/li&gt;
&lt;ol class="ol2" style="list-style-type: lower-alpha;"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;The main benefit is that it allows us to easily integrate multiple fact tables without running into a “fan trap”, the duplicates that normally occur when JOINing multiple Fact tables to form “cubes”&lt;/span&gt;&lt;/li&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;The other benefit is that it enables self-serve Data Prep, since it is possible to naïvely (or playfully) integrate new entities in a modular way that is safe and does not corrupt existing entities through merge/join set operations which can lead to “fan trap” duplicates&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;An associative storage and query engine.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This associative engine stores every unique data element value exactly once with all tables and columns pointing back to a reference of this value – regardless of whether the value is a number, a date or a string.&lt;/span&gt;&lt;/li&gt;
&lt;ol class="ol2" style="list-style-type: lower-alpha;"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;The benefit of this associative index is to both:&lt;/span&gt;&lt;/li&gt;
&lt;ol class="ol3" style="list-style-type: lower-roman;"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Improve data compression; and&lt;/span&gt;&lt;/li&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Facilitate full text and numeric range searching, so that even if you don’t know where a specific value might be located in the data model (e.g. your last name), you can easily search against all rows and columns instantly from a single search bar through this “associative index”.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;It’s really quite a useful feature&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;It should be noted that Qlik refers to ALL of its innovations under the banner “Associative” even though this is the only feature that is strictly associative (based on the academic definition of “associative”)&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/ol&gt;
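&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;To make the “fan trap” concrete, here is a minimal Python sketch. It is only an illustration of the general idea and not Qlik’s actual implementation; the two fact tables (orders and shipments), their columns, and both function names are hypothetical. The first function JOINs the two fact tables before summing, while the second keeps them separate and only filters each one by the linked key, which is what I mean by a semi-join.&lt;/span&gt;&lt;/div&gt;

```python
# Two hypothetical fact tables that share a customer key.
orders = [
    {"customer": "A", "order_amount": 100},
    {"customer": "A", "order_amount": 50},
    {"customer": "B", "order_amount": 70},
]
shipments = [
    {"customer": "A", "shipped_qty": 1},
    {"customer": "A", "shipped_qty": 2},
    {"customer": "A", "shipped_qty": 3},
    {"customer": "B", "shipped_qty": 4},
]


def joined_order_total(orders, shipments, customer):
    """JOIN both fact tables on customer, then sum: this is the fan trap."""
    rows = [
        (o, s)
        for o in orders
        for s in shipments
        if o["customer"] == s["customer"] == customer
    ]
    # Every order row is repeated once per matching shipment row.
    return sum(o["order_amount"] for o, _ in rows)


def semi_join_order_total(orders, shipments, customer):
    """Keep the fact tables separate and only filter by the linked key."""
    # Keys from the selection that actually occur in the shipments table.
    linked_keys = {s["customer"] for s in shipments if s["customer"] == customer}
    # Semi-join: keep order rows whose key is linked; no row is duplicated.
    return sum(o["order_amount"] for o in orders if o["customer"] in linked_keys)


fan_trap_total = joined_order_total(orders, shipments, "A")
semi_join_total = semi_join_order_total(orders, shipments, "A")
print(fan_trap_total, semi_join_total)  # prints: 450 150
```

&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Customer A has two orders totalling 150, but the JOIN repeats each order once per shipment (three times), so the joined total is inflated to 450. The semi-join version never multiplies rows, so it returns the correct 150.&lt;/span&gt;&lt;/div&gt;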
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;To summarize, here are Qlik’s salient features that distinguished it from its competitors at this time (and to a large extent even to the present day):&lt;/span&gt;&lt;/div&gt;
&lt;ol class="ol1"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;State-Aware User Experience&lt;/span&gt;&lt;/li&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Linking SEMI-JOIN Model&lt;/span&gt;&lt;/li&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Associative full text searching&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="p2" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px; min-height: 13px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
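&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;The associative storage idea can also be sketched in a few lines of Python. This is a toy model under my own assumptions and not how Qlik is actually built: the SymbolStore class, its method names, and the sample tables are all hypothetical. Each distinct value is stored once, every cell holds only a reference to it, and a single search can therefore report every table and column where a value lives.&lt;/span&gt;&lt;/div&gt;

```python
from collections import defaultdict


class SymbolStore:
    """Toy value store: each distinct value is kept once; cells hold integer ids."""

    def __init__(self):
        self.values = []                    # id -> value (each distinct value once)
        self.ids = {}                       # value -> id
        self.cells = {}                     # (table, column, row) -> id
        self.locations = defaultdict(set)   # id -> set of (table, column, row)

    def put(self, table, column, row, value):
        # Store the value only the first time it is seen.
        if value not in self.ids:
            self.ids[value] = len(self.values)
            self.values.append(value)
        value_id = self.ids[value]
        self.cells[(table, column, row)] = value_id
        self.locations[value_id].add((table, column, row))

    def search(self, text):
        """Return every (table, column) holding a value whose text contains `text`."""
        needle = text.lower()
        hits = set()
        for value_id, value in enumerate(self.values):
            if needle in str(value).lower():
                hits.update((t, c) for t, c, _ in self.locations[value_id])
        return sorted(hits)


store = SymbolStore()
store.put("Customers", "LastName", 0, "Garcia")
store.put("Customers", "LastName", 1, "Smith")
store.put("Orders", "SalesRep", 0, "Garcia")
store.put("Orders", "SalesRep", 1, "Garcia")
store.put("Orders", "Amount", 0, 100)

distinct_count = len(store.values)
hits = store.search("garcia")
print(distinct_count, hits)
```

&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Five cells are loaded but only three distinct values are stored (the compression benefit), and one search for a last name finds it in both the Customers and Orders tables without the user needing to know where to look.&lt;/span&gt;&lt;/div&gt;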
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Even with these benefits Qlik did not at this time displace incumbent BI (Business Intelligence) platforms like Cognos, Business Objects, Microstrategy, and Microsoft Analysis Services, in the same way the automobile displaced the horse.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;One possible explanation for this is that most BI platforms are never used for business analytics to begin with and are merely glorified data extraction engines, whereby users will simply:&lt;/span&gt;&lt;/div&gt;
&lt;ol class="ol1"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Find a report (or reports) they need&lt;/span&gt;&lt;/li&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Download all reports into an Excel friendly format (e.g. CSV, TAB, XLS, XLSX)&lt;/span&gt;&lt;/li&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Open the CSV or Excel documents in a new Excel workbook and begin a process of manual Data Prep usually entailing: Cut-copy-paste; VLOOKUP (to integrate data); and Pivot Tables (to summarize data)&lt;/span&gt;&lt;/li&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Present the final output in the form of a beautified Excel document or PowerPoint/PDF Report&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="p2" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px; min-height: 13px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;That notwithstanding, there is a need for interactive dashboards that go above and beyond glorified data extraction engines, so you would think that Qlik would be more successful during this time with so little competition.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Unfortunately, this is not how technology and innovation work.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Henry Ford is reputed to have said “If I asked people what they wanted, they would ask for a faster horse.”&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The implication is that people would not ask for an automobile because they could not conceive of it, even though they would clearly benefit from its efficiencies.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;History has been kind to this quote (Ford himself was not a very nice man).&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;With something like an automobile it is easy to see its benefits and why one would want an automobile over a horse.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;But the benefits of most innovations are not as apparent, because most inventions are just a cog in a bigger machine.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Unless you can overturn the entire machine itself (and not just improve parts of it), then you are always in a vulnerable position as an innovator.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;This unfortunately is where Qlik has always been: a set of clever – even brilliant –&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;inventions, but not enough to overturn the entire eco-system that Business Intelligence software thrives in.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This eco-system includes databases, ETL tools, Semantic Layers, and Data Visualization tools, not to mention the IT professionals who have staked their careers on learning these tools and how they interact with one another.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I will continue to explain Qlik’s history, but first I should point out when&lt;i&gt; I&lt;/i&gt; first came to learn about Qlik and how I saw its history unfold before my own eyes.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;It was in 2005 while on a vacation in Florida. I reconnected with a friend of my wife who had recently become very interested in Business Intelligence.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;She had heard that I had also worked in BI and wanted to get my opinion about some software she and a colleague had developed some dashboards with.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;At the time I was somewhat new to BI but had read Ralph Kimball’s “Data Warehouse Toolkit” and had learned the Cognos stack.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;I understood the ins-and-outs of “cubes”, “dimensions”, “measures”, and detail-drill through reports.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;In my mind I felt somewhat confident that I knew what “Business Intelligence” was and how it could help companies.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;But what I saw in that demo that morning over brunch blew my mind.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The first thing I noticed was the speed.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Everything was instantaneous with no lag.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;She would click in a chart and everything on the page would appear to magically update itself to be consistent with her selections.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;I then noticed how she would casually go to any field and filter on it.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;It felt like the world of Dimensions and Measures I had been accustomed to had been flattened and that hierarchies were something arbitrary, and that one could merely follow one’s own train of thought to answer any business question.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I was blown away and could not stop thinking about Qlik.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;After returning from my vacation I continued to think about the demo I saw.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;It just didn’t line up with what I knew about BI through Cognos. I would mention it to people who I thought were BI experts, and they had never heard of it.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;When I searched Google for reasons Qlik was not more popular, I eventually found one that made some sense to me: Qlik’s indexing technology required all data be loaded into RAM (computer memory) and that RAM was scarce so Qlik was not a realistic technology for many large companies with lots of data.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;A big problem was that most computers ran on 32-bit operating systems that could only reference 4 GB of RAM in total and only 2 GB of RAM for any given program.&lt;/span&gt;&lt;/div&gt;
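&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;The 4 GB and 2 GB figures fall straight out of the arithmetic of 32-bit addresses, as this small Python sketch shows. The variable names are my own, and the even split between the operating system and the program is an assumption based on the default 32-bit Windows configuration.&lt;/span&gt;&lt;/div&gt;

```python
GIB = 2 ** 30  # bytes in one gibibyte

# A 32-bit pointer can form 2**32 distinct byte addresses.
addressable_bytes = 2 ** 32
total_gib = addressable_bytes // GIB

# Assumption: the default 32-bit Windows split, where half of the
# address space is reserved for the operating system.
per_process_gib = total_gib // 2

print(total_gib, per_process_gib)  # prints: 4 2
```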
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;But around 2008 something started to change.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This was around the time that Microsoft “Windows 7” came out, and Windows 7 was selling more 64-bit versions than 32-bit versions.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;In turn this meant that 64-bit versions of applications could now address up to 2 TB (terabytes) of RAM, which at the time was more than most companies had in their data warehouses.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;As a result, Qlik was now positioned to take on large volumes of data and began to surge in popularity. &lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;This would be the beginning of a new golden era for Qlik.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;By 2009 Microsoft caught wind of this disruptive trend and had announced a new Business Intelligence platform was under development.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This platform was code named “Project Gemini” and was slated for an initial release by the end of 2010.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;I followed this news closely and recall Project Gemini having the most lead time of any Microsoft project to date.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;In other words, it felt like Microsoft was firing a shot across the bow against Qlik.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I was excited by this news.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;I liked the idea that Microsoft was embracing Qlik’s innovations and looked forward to having an alternative to Qlik with all the same benefits.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;I had experimented with other tools like Tableau and Microstrategy and others but because they were all built on cube/OLAP based platforms (even Tableau), they lacked the flexibility, responsiveness, and User Experience that Qlik had.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;By 2010 Microsoft had released the first deliverable from Project Gemini: PowerPivot, a plug-in for Excel.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;After experimenting with PowerPivot for a while I could see its potential.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;I was relieved to see Microsoft had essentially copied Qlik’s SEMI-JOIN linking model, which Microsoft dubbed “The Tabular Model”. They also introduced a new language to go along with the Tabular Model called DAX (Data Analysis Expressions), replacing the older MDX (Multi-Dimensional Expressions) language which had been built for the “Multi-Dimensional Model”, which now seemed obsolete.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Currently, MDX is no longer supported by Microsoft in their cloud products – it’s Tabular and DAX all the way now.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I warned BI sales people that Microsoft was just warming up and had laid the foundation for disrupting the BI industry given their huge platform leverage – more so with their Office suite than the Windows platform.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I am aware of Microsoft’s history as the “fast follower” who gobbles up disruptive technologies like Qlik in the same way Star Trek’s “Borg” assimilates new life forms into its Cube Colonies.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Microsoft is after all a creature of capitalism, and capitalism is all about leverage, and Microsoft’s platforms give it huge amounts of leverage. It’s not even close to an even playing field in the software industry.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Case-in-point: There was a big lawsuit that ended in 2000 with Microsoft being ordered to be broken up after destroying Netscape (and countless other small companies).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;But then George W. Bush was elected and that judge’s decision was overruled and basically canceled.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This was still a black-eye for Microsoft and they behave more gently than in those days – some say Google would not exist had this not been the case – but few people know anything about Business Intelligence (it’s not quite a consumer technology), so I figured Qlik would be in Microsoft’s crosshairs.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;To be clear, I am speculating here.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;I don’t have any documentary evidence to present here that Microsoft was indeed “following” Qlik; This is an educated guess.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Regardless of their motivations, Microsoft had developed a tool that for once presented a formidable competitive product to Qlik.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;It should be noted that Tableau was also disruptive (their golden era would arrive around 2012) as they had developed a very user friendly “drag-and-drop” interface for self-serve BI.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;They also invested heavily in engaging data visualizations.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;It has been said that Narcissus was seduced by the image and in this same way Tableau was very seductive; it had superior aesthetic to its competition.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Tableau also wisely invested in academia by giving its licenses for free to all college and university students. But Tableau eventually would see its own growth stunted for it lacked the depth of tools like Qlik and Power BI, and many users would often hit brick walls concerning scalability and performance with Tableau.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Enough about Tableau.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;It would not be until 2015 that Power BI was released.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The main reason for it taking so long between the release of PowerPivot and Power BI is that Microsoft unfortunately lost time on a fruitless detour.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Namely, in response to the criticisms of PowerPivot being too much of an Excel tool and not like a “real BI tool”, Microsoft’s answer was to invent “PowerView”.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;PowerView was a server based product that would finally allow IT departments to roll out “Enterprise Business Intelligence” solutions through the user’s web browser.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The technology was built on another Microsoft technology called “Silverlight”. Silverlight ended up being shut down as it was deemed by the wider technology community to be something of a withered rump following a keynote speech by Steve Jobs where Jobs stated that he would not support Adobe Flash on the iPhone as it was seen by Apple as a battery hog.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Instead, Jobs went on, Apple would throw its support behind the open HTML5 standard which could be made more energy efficient to run.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;That was the death knell for Adobe Flash and would be the death knell for Microsoft Silverlight, and in turn PowerView.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;In 2015 Power BI was released – built entirely on the open HTML5 standard that Steve Jobs was hyping for the iPhone.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;In 2016 I was a consultant working for a large business and technology consulting firm.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;We were technology advisors and perhaps I was lucky, but I can genuinely say that the people I was working for were happy for me to recommend Qlik as the best Business Intelligence software with Power BI being the next best alternative starting in 2015. &lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Nevertheless, I couldn’t help but notice that even if I could convince a business department that Qlik was the way to go, I would inevitably get blocked by an IT department claiming that some other tool had already been decided upon and that there was no room for Qlik.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;I would explain the reasons why Qlik was able to meet business scenarios better than the tool they had decided on.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Sometimes I would even be able to persuade the person I was speaking with, but they would inevitably send me an e-mail explaining the decision was out of their hands.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Some of my colleagues at the consulting firm I worked for took notice of Qlik and admired it, but if their careers were deeply intertwined with Microsoft or another large vendor, they would continue to cast doubt on Qlik’s ability to do this or that (usually scaling beyond 2 Terabytes).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;But as soon as Power BI started to gain traction, the Microsoft consultants were quick to forget their skepticism of Qlik and were almost instantly True Believers in Power BI.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;It was at this point that I could see the power of Microsoft’s leverage working from all sides.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Part of me was happy to see that Qlik had a strong competitor as it gave me more leverage when negotiating license terms with Qlik. I had also hoped that others would see the underlying relationship between Qlik and Power BI.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;This has barely happened.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I mentioned earlier that “Narcissus was seduced by the image”.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;If you are lucky enough to get a captive audience and show them a 10-minute demo of some BI dashboard, no one in the audience will be able to tell you anything about the underlying data model and how it works.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;It’s invisible. &lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;And when it comes to features like State-Aware UX and Full-text associative searching, unless you have already worked with them you will lack the connoisseurship required to appreciate their benefits.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;To use an analogy, when it comes to Business Analytics, most people’s level of connoisseurship would barely allow them to tell the difference between an overcooked Salisbury Steak and a perfectly cooked Chateaubriand.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;If this analogy comes across as snobby and out-of-touch, well that’s partly my point.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The finer things in life are typically out of the reach of most people – not as much for lack of money, but mostly for lack of education.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Yes, the two are related, but it is also possible to have one without the other, and in this context, connoisseurship is more important than money, and connoisseurship (refined education) can be achieved without a lot of money if you are passionate about the subject at hand.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;This finally brings me to the point of this blog: I have recently concluded that while Microsoft may have been a “fast follower” of Qlik, the Microsoft Power BI culture lacks the connoisseurship of Qlik and has – in my opinion – subconsciously embraced a philosophy that is more top-down and cube-like than what it is truly capable of.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;What are these two philosophies exactly?&lt;/span&gt;&lt;/div&gt;
&lt;ol class="ol1"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Qlik’s philosophy tilts bottom-up, and believes that all Business Intelligence should work like the human mind – where we can hop from concept to concept with no preconceived structures&lt;/span&gt;&lt;/li&gt;
&lt;ol class="ol2" style="list-style-type: lower-alpha;"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;The example they often cite is how our thought process works if we lose something like our keys:&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;ol class="ol3" style="list-style-type: lower-roman;"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;We scan through a list of places we have been (e.g. outside walking, driving, going to the mall, etc.)&lt;/span&gt;&lt;/li&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Then we drill down on subjects that we think are more relevant; we might start to list out all the places in the shopping mall we visited&lt;/span&gt;&lt;/li&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;After jumping from list to list, we eventually find what we were looking for: “I left my keys in the shoe store in the mall when I put them down to pay for the shoes”&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/ol&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Microsoft’s philosophy tilts top-down and embraces the perspective of the CEO&lt;/span&gt;&lt;/li&gt;
&lt;ol class="ol2" style="list-style-type: lower-alpha;"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Time should be spent up-front planning out Dimensions and Hierarchies that in turn reflect a “single version of the truth”&lt;/span&gt;&lt;/li&gt;
&lt;ol class="ol3" style="list-style-type: lower-roman;"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;We always start with Big Rocks (e.g. Years and Countries) that can be broken into smaller rocks (Dates and Towns)&lt;/span&gt;&lt;/li&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Outliers can be discovered by drilling progressively from Big Rocks down to Little Rocks&lt;/span&gt;&lt;/li&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;A key outlier will eventually be discovered that can be used to drive better results&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/ol&gt;
&lt;/ol&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I should point out that Qlik’s philosophy allows for an embrace of both the bottom-up and the top-down.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;But Microsoft’s top-down philosophy is more difficult to invert.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Put another way: “Splitters can be lumped more easily than lumpers can split.”&lt;/span&gt;&lt;/div&gt;
&lt;div class="p3" style="color: blue; font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s3" style="font-kerning: none; text-decoration-line: underline;"&gt;&lt;a href="https://en.wikipedia.org/wiki/Prefactoring"&gt;https://en.wikipedia.org/wiki/Prefactoring&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;So in my view, Qlik is more of a “Splitter” philosophy, with Microsoft embracing more of a “Lumper” philosophy.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;As you shall see, it is possible to make Power BI act like Qlik. Well, almost.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;To explain more of what I mean here and how I came to this conclusion, I have created an example that I believe clearly illustrates these two philosophies.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;In this example I have a small dataset that includes 2019 populations for three countries:&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;ol class="ol1"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Australia;&lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Canada; and&lt;/span&gt;&lt;/li&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;United States&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;This dataset includes national totals as well as totals for each State or Province. However, I have not included any non-state districts or territories, so the country total is not exactly the same as the total for all provinces and states.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I have instrumented this dataset such that it supports “natural linking”.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Namely, I have named each column/field such that I am indicating to both Qlik and Power BI how the tables ought to be linked.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;In Qlik’s case, linking is linking. There is no option for telling it how to link, apart from which fields to connect.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;But with Power BI, a link may be either “Single” direction or “Both” directions (also known as bi-directional links).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;So it is here we can see Microsoft showing its hand when it makes one link “Single” and the other “Both”.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I have pasted below snippets of the raw data so you can see how the fields are laid out:&lt;/span&gt;&lt;br /&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px; text-align: center;"&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/dim_country.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="118" data-original-width="189" src="https://cradleofanalytics.blob.core.windows.net/blog/images/dim_country.png" /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;b style="font-family: Carlito;"&gt;Dim_Country&lt;/b&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;b style="font-family: Carlito;"&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;b style="font-family: Carlito;"&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/fact_country.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="103" data-original-width="205" src="https://cradleofanalytics.blob.core.windows.net/blog/images/fact_country.png" /&gt;&lt;/a&gt;&amp;nbsp;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px; text-align: center;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;b&gt;Fact_Country&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px; text-align: center;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/dim_statprov.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="455" data-original-width="450" height="320" src="https://cradleofanalytics.blob.core.windows.net/blog/images/dim_statprov.png" width="316" /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px; text-align: center;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px; text-align: center;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;b&gt;Dim_StatProv&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px; text-align: center;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/fact_statprov.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="457" data-original-width="284" height="320" src="https://cradleofanalytics.blob.core.windows.net/blog/images/fact_statprov.png" width="198" /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px; text-align: center;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px; text-align: center;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;b&gt;Fact_StatProv&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px; text-align: center;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Once I load the data into Power BI as is, Power BI automatically links up the entities based on a combination of looking at Field names and profiling the underlying data cardinality. &lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
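&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;To make the idea of linking on field names plus cardinality concrete, here is a minimal Python sketch. It is not Power BI’s actual detection algorithm; it simply proposes a link wherever two tables share a column name and labels each side “one” or “many” by checking whether the values in that column are unique. The key names (CountryKey, StatProvKey) and the sample rows are hypothetical stand-ins for the fields shown in the screenshots above.&lt;/span&gt;&lt;/div&gt;

```python
from itertools import combinations

# Hypothetical tables and key names, for illustration only.
# Populations are rounded values in millions, not the post's dataset.
TABLES = {
    "Dim_Country": {
        "CountryKey": ["AU", "CA", "US"],
        "Country": ["Australia", "Canada", "United States"],
    },
    "Dim_StatProv": {
        "StatProvKey": ["NSW", "VIC", "ON", "QC", "CA", "TX"],
        "CountryKey": ["AU", "AU", "CA", "CA", "US", "US"],
    },
    "Fact_StatProv": {
        "StatProvKey": ["NSW", "VIC", "ON", "QC", "CA", "TX"],
        "StatProv Population": [8.1, 6.6, 14.6, 8.5, 39.5, 29.0],
    },
}


def side(values):
    """Return 'one' if every value in the column is unique, else 'many'."""
    return "one" if len(set(values)) == len(values) else "many"


def detect_links(tables):
    """Propose a link for every column name shared by a pair of tables."""
    links = []
    for (left, lcols), (right, rcols) in combinations(tables.items(), 2):
        for name in sorted(set(lcols).intersection(rcols)):
            cardinality = side(lcols[name]) + "-to-" + side(rcols[name])
            links.append((left, right, name, cardinality))
    return links


for link in detect_links(TABLES):
    print(link)
```

&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;In this sketch the shared name alone decides which tables connect, which is all Qlik needs. The extra “one” or “many” label is the kind of profiling a tool could use to choose a default link direction.&lt;/span&gt;&lt;/div&gt;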
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Here is Power BI’s automatically generated model:&lt;/span&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/pbi_data_model_default_single_direction.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="206" data-original-width="1417" height="92" src="https://cradleofanalytics.blob.core.windows.net/blog/images/pbi_data_model_default_single_direction.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I have highlighted the key fields so you can see clearly how the tables are linked.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Notice how the link between Dim_Country and Dim_StatProv flows in one direction, top-down from Country to State/Province.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Now if you look at the Qlik model you can see it’s more-or-less the same, with the main difference being that all links are bi-directional (all links in Qlik are always bi-directional – there is no Single direction linking in Qlik).&lt;/span&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/qlik_data_model.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="216" data-original-width="1048" height="130" src="https://cradleofanalytics.blob.core.windows.net/blog/images/qlik_data_model.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;
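&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;One way to picture the difference between the two models is to ask which tables a selection can reach. The Python sketch below is a deliberate simplification: it treats every link as either Single or Both, whereas a real Power BI model can mix the two, and the link list itself is a hypothetical reading of the diagrams. With Single links a selection only travels in the direction the link points; with Both it travels either way.&lt;/span&gt;&lt;/div&gt;

```python
from collections import deque

# Hypothetical link list for the four tables in the post.
# Each pair is (source, target): a Single link filters from source to target.
LINKS = [
    ("Dim_Country", "Fact_Country"),
    ("Dim_Country", "Dim_StatProv"),
    ("Dim_StatProv", "Fact_StatProv"),
]


def reachable(selected_table, links, bidirectional):
    """Return the set of tables that a selection on selected_table filters."""
    neighbours = {}
    for source, target in links:
        neighbours.setdefault(source, set()).add(target)
        if bidirectional:
            # A "Both" link also lets the filter flow back to the source.
            neighbours.setdefault(target, set()).add(source)

    seen = {selected_table}
    queue = deque([selected_table])
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(sorted(reachable("Dim_StatProv", LINKS, bidirectional=False)))
print(sorted(reachable("Dim_StatProv", LINKS, bidirectional=True)))
```

&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Under these assumptions, selecting a State or Province with Single links reaches only Dim_StatProv and Fact_StatProv, while with bi-directional links it reaches all four tables, including the parent country.&lt;/span&gt;&lt;/div&gt;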
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;If you just look at these diagrams it will not be clear how this link direction actually impacts the User Experience, so I have built the same dashboard in both Power BI and Qlik Sense so we can get a better sense of how this subtle difference changes the user experience.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;To give the dashboards something to analyze I have also created the following three Measures:&lt;/span&gt;&lt;/div&gt;
&lt;ol class="ol1" style="font-size: medium;"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Total Country Population&lt;/span&gt;&lt;/li&gt;
&lt;ol class="ol2" style="list-style-type: lower-alpha;"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Defined as: SUM([Country Population])&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Total State/Province Population&lt;/span&gt;&lt;/li&gt;
&lt;ol class="ol2" style="list-style-type: lower-alpha;"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Defined as: SUM([StatProv Population])&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;State Total Percent of Country Population&lt;/span&gt;&lt;/li&gt;
&lt;ol class="ol2" style="list-style-type: lower-alpha;"&gt;
&lt;li class="li1" style="font-family: Carlito; font-size: 11px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px 0px 8px;"&gt;&lt;span class="s1" style="font-kerning: none;"&gt;Defined as: SUM([StatProv Population]) / SUM([Country Population])&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/ol&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I should also remind you that “Total Country Population” does not equal “Total State/Province Population” because I have left out any regions that are not strictly states or provinces. For example, DC (District of Columbia) and Yukon are not included in the State/Province populations but are included as part of “Country Population”.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Let us take a look and compare the Power BI and Qlik dashboards respectively, starting with this Power BI Report (dashboard):&lt;/span&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/pbi_country_populations_single_no_selections.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="804" data-original-width="1416" height="362" src="https://cradleofanalytics.blob.core.windows.net/blog/images/pbi_country_populations_single_no_selections.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;In this Power BI Report we can see everything looks quite normal and the numbers all appear to make sense except for the bar chart on the lower right (where I have placed a large red ‘X’) which shows the States/Provinces ranked as a percentage of their country’s population.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;We can see California is ranked at the top of the list, when the correct answer should be Ontario. &lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The reason why we get California is that the Single Direction link is preventing Power BI from linking the State’s population back to its parent population, so it is comparing against the population of all three countries. This phenomenon is known as a “chasm trap” and Power BI has stepped right into this trap so to speak.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;That said, there may be ways of getting around this chasm trap using advanced DAX expressions; I don’t want to mislead readers into thinking there is only one way to solve this problem. My point is to show how the default assumptions Power BI makes when creating links can lead to misleading totals when using Measures that aggregate across linked fact tables.&lt;/span&gt;&lt;/div&gt;
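&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;To make the denominator problem concrete, here is a minimal Python sketch using two made-up countries and two made-up states; none of these names or figures come from the dashboards. It is not how Power BI or Qlik evaluate the Measure. It only shows that dividing each state by the total of all countries, rather than by its own country, can flip the ranking.&lt;/span&gt;&lt;/div&gt;
&lt;pre&gt;
```python
# Hypothetical figures, not the populations used in the dashboards.
country_population = {"Country A": 100, "Country B": 1000}
state_rows = [
    {"state": "A1", "country": "Country A", "population": 40},
    {"state": "B1", "country": "Country B", "population": 120},
]


def percent_of_all_countries(row):
    # Denominator ignores the parent country (the chasm-trap style result).
    total = sum(country_population.values())
    return 100 * row["population"] / total


def percent_of_own_country(row):
    # Denominator is only the population of the state's parent country.
    return 100 * row["population"] / country_population[row["country"]]


def top_state(metric):
    # Name of the state that ranks first under the given percentage metric.
    return max(state_rows, key=metric)["state"]


print(top_state(percent_of_all_countries))
print(top_state(percent_of_own_country))
```
&lt;/pre&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;With these placeholder numbers the first call ranks B1 on top and the second ranks A1 on top, which mirrors the California versus Ontario difference.&lt;/span&gt;&lt;/div&gt;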
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;For comparison let’s look at Qlik’s dashboard:&lt;/span&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/qlik_country_populations_both_no_selections.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="749" data-original-width="1600" height="298" src="https://cradleofanalytics.blob.core.windows.net/blog/images/qlik_country_populations_both_no_selections.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;In this Qlik dashboard we can see that Ontario is ranked at the top of the list with 38.72% of Canada’s population.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;I have placed a large green checkmark to make it easy to find.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;You can also see that most of the States/Provinces on this list are from Australia or Canada, as you would expect.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;In the Power BI Report, by contrast, the top of the list is mainly made up of large US states.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;The reason why Qlik gets the correct answer here is that its links are all bi-directional (“Both” directions) with no “Single” links at all.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Now let’s look at what I really wanted to demonstrate here which is the interface for making field selections (generally known as “filtering”).&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;In this first Power BI example, this is how the Report appears when a country is selected in the Country Slicer:&lt;/span&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/pbi_country_populations_single_country_selection.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="815" data-original-width="1427" height="364" src="https://cradleofanalytics.blob.core.windows.net/blog/images/pbi_country_populations_single_country_selection.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;As you can see, the interface is intuitive.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;We can see which country (Canada) is selected as well as the alternate countries that are possible to select.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Furthermore, in the “State/Province” Slicer we can see all the provinces that are specific to Canada, while all of the States in both the USA and Australia are completely hidden.&lt;br /&gt;We can also see that when we select a single country, our chart in the lower right corner now produces the correct results, since we have effectively eliminated the “chasm trap” by explicitly filtering on a single country.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;If my objective is to drill down from the top (e.g. Countries) down to the bottom (e.g. States/Provinces, or perhaps lower down), then this presentation and UX is decent. &lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Let’s compare this to Qlik’s dashboard with the same selection:&lt;/span&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/qlik_country_populations_both_country_selection.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="757" data-original-width="1600" height="302" src="https://cradleofanalytics.blob.core.windows.net/blog/images/qlik_country_populations_both_country_selection.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;As we can see the same country “Canada” has been selected and we can clearly see the 10 provinces of Canada with a white background.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;But where things differ is that we can now see a list of the excluded States from the other countries. While our analysis is not strictly concerned with these regions, they are available for selection should we choose to change our course of questioning.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I’ve already shown you a couple minor points of contrast between Power BI and Qlik, but now I want to take you to the most salient point of contrast between the two tools.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Namely, what happens when we select a “State/Province”.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Here is how this appears in Power BI when we select the province of Ontario:&lt;/span&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/pbi_country_populations_single_stateprov_selection.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="808" data-original-width="1409" height="366" src="https://cradleofanalytics.blob.core.windows.net/blog/images/pbi_country_populations_single_stateprov_selection.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;In my view, there are two issues with this selection. First, I can see at the bottom right that my percentage calculation is off (as was the case earlier when no selections were made).&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Ontario is only showing as 3.70% of the total population when it should be 38.72%.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;What is new and most frustrating about this is that when we look at the Country Name Slicer we cannot easily see what country Ontario belongs to.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Yes, I know it’s obviously Canada, and most people know their own states or provinces.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;But I can imagine there are some Americans who might not know if a non-American State/Province is Canadian or Australian, and this dashboard will not tell us that unless we exhaustively go back to the top and click on each country one by one.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Let us finally compare with the Qlik example:&lt;/span&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/qlik_country_populations_both_stateprov_selection.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="754" data-original-width="1600" height="300" src="https://cradleofanalytics.blob.core.windows.net/blog/images/qlik_country_populations_both_stateprov_selection.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;As we can see the chart in the bottom-right is producing the correct result of 38.72%.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;More significantly, we can clearly see that Ontario belongs to Canada by looking at the “Country Name” Filter box.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;The reason why this is significant is that Qlik is allowing us to go back up the hierarchy but choose any path we would like.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;In other words, Qlik is basically telling us the hierarchy need not be traversed as a hierarchy but however we would like.&lt;/span&gt;&lt;/div&gt;
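&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;As a rough illustration of this behaviour (a toy Python sketch, not Qlik’s actual engine), selecting a state can be thought of as labelling every country as either associated with the selection or excluded by it, while keeping both kinds visible in the list. The rows and the function name country_states are hypothetical.&lt;/span&gt;&lt;/div&gt;
&lt;pre&gt;
```python
# Hypothetical rows; a tiny stand-in for the State/Province table.
rows = [
    ("Canada", "Ontario"),
    ("Canada", "Quebec"),
    ("USA", "California"),
    ("Australia", "Victoria"),
]


def country_states(selected_state):
    # Countries linked to the selected state are labelled "possible";
    # all other countries stay in the result but are labelled "excluded".
    possible = {country for country, state in rows if state == selected_state}
    return {
        country: ("possible" if country in possible else "excluded")
        for country, _ in rows
    }


print(country_states("Ontario"))
```
&lt;/pre&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;The design point is that nothing is dropped from the list: the excluded countries remain available, so the next question can start from any of them.&lt;/span&gt;&lt;/div&gt;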
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;And this gets to the essence of the “tao of Qlik”: Qlik is allowing us to ignore the typical user constraints of hierarchies and instead invites our minds to wander.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;To a cynic, this idea of “wandering” through data and breaking down hierarchies might come across as some post-modern hippy dippy mumbo jumbo.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;But in my experience this flexibility has led to tangible benefits.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;You see, Analytics is basically a game of “20 questions” (if you have never heard of “20 questions” – it started as a British gameshow and these days is a fun game you can play over a dinner table with just two people). The more questions you can ask, the faster you can drill down to the final answer. I have used Qlik’s flexibility both to solve previously unsolved business problems and to develop very flexible dashboards. After demoing one such dashboard I was asked “How did you do this so quickly? Was it you or was it the tool?”, to which I responded: “I appreciate the compliment, and I am proud of my work, but I would not have been able to complete it so quickly without this Qlik tool and its unique features.”&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I am a critic and skeptic at heart and the idea of rallying behind a for-profit corporation is not how I see myself. I am somewhat uncomfortable lavishing this praise on Qlik.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;The point of this post is to show that my objections to other tools are not based on some “Coke versus Pepsi” preference, but are rather rooted in fundamental philosophical differences, and I wish that other tools and vendors would get this point and start competing on the same philosophical ground that Qlik has laid out.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;In that spirit, let’s go back to Power BI and modify its linked model to make that Single Direction link bi-directional to see how this changes things.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;First here is a picture of the modified model so you can see the difference:&lt;/span&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/pbi_data_model_modified_both_direction.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="257" data-original-width="1403" height="116" src="https://cradleofanalytics.blob.core.windows.net/blog/images/pbi_data_model_modified_both_direction.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Now let’s take a look at the dashboard when there are no selections.&lt;/span&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/pbi_country_populations_both_no_selections.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="806" data-original-width="1405" height="366" src="https://cradleofanalytics.blob.core.windows.net/blog/images/pbi_country_populations_both_no_selections.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Here we can immediately see in the lower-right bar chart that the Percent calculation is correct and Ontario is at the top of the list.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Now let’s take a look at the scenario where we select a State/Province:&lt;/span&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://cradleofanalytics.blob.core.windows.net/blog/images/pbi_country_populations_both_stateprov_selection.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="802" data-original-width="1410" height="364" src="https://cradleofanalytics.blob.core.windows.net/blog/images/pbi_country_populations_both_stateprov_selection.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Here too, we can see something of an improvement as Power BI is telling us that Ontario belongs to Canada.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;But we cannot see the alternative countries now, whereas before when there was a Single direction link, we could.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;So by replacing that Single direction link with a Both direction link, we have gone from the “tao of Microsoft” to the “tao of Qlik”. &lt;span class="Apple-converted-space"&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;But there is still a bit of a User Experience trade-off with Power BI.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Why is that?&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;I believe the answer comes down to this patent held by Qlik: US Patent # 6037938 with the inventor&amp;nbsp;&lt;/span&gt;Håkan Wolgé&lt;/div&gt;
&lt;div class="p2" style="color: blue; font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s2" style="font-kerning: none; text-decoration-line: underline;"&gt;&lt;a href="https://patents.google.com/patent/US6037938"&gt;https://patents.google.com/patent/US6037938&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;The patent is called “Method and a device for displaying information about a large number of independent data elements”. Here is the abstract of the patent:&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;“A method of displaying a large number of interdependent data elements in a computer screen area that is small compared with the number of data elements to be displayed is disclosed. Each data element is defined by a data element type and a data element value and has an associated status value. The method comprises the steps of displaying the data element types of the data elements as a scrollable list in the computer screen area, and, for each data element type which defines more than one different data element which has a predetermined status value, displaying a predetermined indication thereof in association with the data element type in the scrollable list; and sorting, in response to a change of the status value of at least one of the data elements, the data element types in the scrollable list according to the status values of the data elements defined by the data element types. An article of manufacture including a computer-readable medium having stored thereon a computer program for carrying out the method is also disclosed.”&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Can I say for sure this patent is the reason Microsoft is not displaying excluded values in a similar fashion to Qlik?&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;No, I cannot say that, and I highly doubt Microsoft would ever confirm my suspicion, for legal reasons.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;But if I am correct in my belief, given that this patent expires today, March 30&lt;/span&gt;&lt;span class="s3" style="font-kerning: none; font-size: 7.3px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal;"&gt;&lt;sup&gt;th&lt;/sup&gt;&lt;/span&gt;&lt;span class="s1" style="font-kerning: none;"&gt; 2020, then anyone, including Microsoft should be able to copy this invention without legal repercussions in the US.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;I am not a lawyer so do not take what I am saying here as legal advice in any way shape or form.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Also I don’t know what other jurisdictions the patent has been filed under and how that might complicate Intellectual Property claims.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;Also, I might be barking up the completely wrong tree – this could all be explained by underlying constraints to Microsoft’s technology that I am not aware of.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;To conclude, I wish Qlik the very best and feel they deserve the fruits of their invention.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &lt;/span&gt;But given that they never really got the Business Intelligence world at large to understand the benefits of their bottom-up philosophy to Business Intelligence this might be a good thing for the world at this juncture.&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;Still I feel bad for Qlik and I will continue to root for them.&lt;span class="Apple-converted-space"&gt;&amp;nbsp; &amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="p1" style="font-family: Carlito; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin-bottom: 8px;"&gt;
&lt;span class="s1" style="font-kerning: none;"&gt;But life’s not fair.&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
</description><link>http://hepburndata.blogspot.com/2020/03/does-this-patent-which-expires-today.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-6944056287979844018</guid><pubDate>Sat, 26 Aug 2017 20:00:00 +0000</pubDate><atom:updated>2017-08-26T16:00:34.213-04:00</atom:updated><title>A Modular Data System for Digital Advertising</title><description>&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;This post is a follow-up to my original posting (and paper) titled “A Modular Approach to Solving the Data Variety Problem”.&lt;/span&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;In response to that posting a LinkedIn commenter (Mark B.) asked the following [paraphrased] question to understand how he might use a modular approach to build a data analysis system to handle the following scenario:&lt;/span&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;“As a digital marketer, I would like to see how the variation in advertising images is related to responses by different audiences.”&lt;/span&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;Thank you for this question, Mark.&amp;nbsp; Since you have identified two subjects, Images and Advertisements, this is an ideal jumping-off point to illustrate the benefits of taking a modular approach to analytics.&lt;/span&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;To give you the short answer, using a modular approach we can ask and answer cross-subject questions that would normally be prohibitively expensive to answer:&lt;/span&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;“What images give me the best click-thru and conversion rates?”&lt;/span&gt;&lt;/li&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;“Do older images have the same click-thru rate as newer images?”&lt;/span&gt;&lt;/li&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;“Including the cost of image production, what is the overall cost of my Ad Campaigns?”&lt;/span&gt;&lt;/li&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;“Is there any relationship between the cost of an image and its click-thru and conversion rate by gender?”&lt;/span&gt;&lt;/li&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;“Do images with a positive sentiment perform better than those with a neutral or negative sentiment?”&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;How does a modular approach allow us to answer these questions so easily? It all comes down to being able to leverage Dimensions and Measures already developed for each Subject on its own (i.e. Images and Ad Impressions) and then being able to combine those Subjects into a unified multi-Subject Graph that can be easily queried.&lt;/span&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;Recapping my paper, if you take a modular approach to analytics, you can decompose your analyses into separate “Subjects” (tables), and then further decompose those Subjects into Subsets. Each of these sub-components can be developed independently of the others.&amp;nbsp; Once these components (stored as portable data files) are “docked in” to the main repository, they can be “lobbed” and “linked” together by users to form graphs that allow for cross-subject analyses.&lt;/span&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;br /&gt;
Let’s first break this down into the two subjects at hand: Images and Ad Impressions.&lt;/span&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;Let’s now tackle the first Subject “Images”. We may have a team responsible for developing reports to analyze Image statistics.&amp;nbsp; For example, this team may have developed a set of Dimensions and Measures that allows them to determine how much Images cost to produce, how old they are, and what type of sentiment they are intended to produce.&amp;nbsp; Since images would presumably be developed by different teams, they would have their own reports (represented as tables) segregated by team.&amp;nbsp; Since each team’s reports would conform to a standard published schema, they could be combined to form a single cross-department report.&amp;nbsp; For example, “Team A” and “Team B” could combine their image reports into a single “Image” Subject table.&lt;/span&gt;&lt;/div&gt;
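&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;br /&gt;As a rough illustration of the “lob” step described above, the following Python sketch stacks two teams’ image reports into one “Image” Subject after checking each row against a published schema. The column names, the sample rows, and the lob function itself are hypothetical, made up for this example rather than taken from the paper.&lt;/span&gt;&lt;/div&gt;
&lt;pre&gt;
```python
# Hypothetical published schema for the "Image" Subject (column names are assumptions).
IMAGE_SCHEMA = ("image_id", "team", "production_cost", "created_on", "sentiment")


def lob(subsets, schema):
    """Vertically stack Sub-Sets into one Subject, rejecting rows that break the schema."""
    subject = []
    for subset in subsets:
        for row in subset:
            # Every row must carry exactly the published columns.
            if set(row) != set(schema):
                raise ValueError(f"row does not conform to schema: {row}")
            subject.append(row)
    return subject


# Illustrative Sub-Sets, one per team.
team_a = [
    {"image_id": "IMG-1", "team": "A", "production_cost": 500.0,
     "created_on": "2017-01-15", "sentiment": "positive"},
]
team_b = [
    {"image_id": "IMG-2", "team": "B", "production_cost": 1200.0,
     "created_on": "2017-06-01", "sentiment": "neutral"},
    {"image_id": "IMG-3", "team": "B", "production_cost": 300.0,
     "created_on": "2016-11-20", "sentiment": "negative"},
]

image_subject = lob([team_a, team_b], IMAGE_SCHEMA)
print(len(image_subject))  # 3
```
&lt;/pre&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;Rejecting a non-conforming row at lob time is only one possible design choice; a real system might instead set the offending Sub-Set aside for review.&lt;/span&gt;&lt;/div&gt;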
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;Moving on to the second subject “Ad Impressions”. Again, there may be multiple teams running multiple advertising campaigns across multiple advertising platforms over several months.&amp;nbsp; The teams responsible for managing these ad campaigns might even be different based on the Ad Campaign or the Digital Advertising Platform the ads are being served up on.&amp;nbsp; Like the Image team,&amp;nbsp; these advertising teams would also have a set of Dimensions and Measures that would allow them to determine how often an ad was clicked on, how many conversions (e.g. goal actions) there were, what the dollar amount of the conversions is, and how these metrics break out by gender and other demographic &amp;amp; psychographic variables (which may be specific to the ad platform).&amp;nbsp; Since each team’s report would conform to a published schema, they could also be combined to form a single report.&amp;nbsp; This combined “report” would constitute the “Ad Impression” subject.&lt;/span&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;I have just described two different Subjects, each with their own set of Dimensions and Measures, and each composed of their own sub-sets of data.&amp;nbsp; Where the modular approach becomes relevant is that it is now possible for users to locate these sub-sets and “lob” these sub-sets into larger subjects and then “link” these subjects to form graphs that allow for cross-Subject analyses.&amp;nbsp; Namely, we can now ask and answer the questions we raised near the beginning of this post:&lt;/span&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;“What images give me the best click-thru and conversion rates?”&lt;/span&gt;&lt;/li&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;“Do older images have the same click-thru rate as newer images?”&lt;/span&gt;&lt;/li&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;“Including the cost of image production, what is the overall cost of my Ad Campaigns?”&lt;/span&gt;&lt;/li&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;“Is there any relationship between the cost of an image and its click-thru and conversion rate by gender?”&lt;/span&gt;&lt;/li&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;“Do images with a positive sentiment perform better than those with a neutral or negative sentiment?”&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;However, there is one piece missing from the picture: In order to make this possible, we would need to define a simple “bridge” table for connecting the image profiles to the ad impressions.&amp;nbsp; This bridge table would be developed and maintained by the team that has access to the information required to link the two subjects together.&lt;/span&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;br /&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;The following diagram shows how sub-sets sharing the same schema (as depicted with their own colour) can be “lobbed” together to form larger subjects, and how subjects sharing a linking column can be SEMI-JOIN linked together to form a graph for cross-subject analytics.&lt;/span&gt;&lt;/div&gt;
&lt;div&gt;
&lt;span style="font-kerning: none;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWwiaHicPOHMJ9kKzS_wgUu9oJeFAJ6frnEltyl3ffHEr7wAsLL8xg_NjPiv5Ack8gygF1o_JC4w8NghuQ_tCuL_N_DxC2vU1RcBVdGZ2UIzsV07empT2WKC5GytHZA2DbQQIK/s1600/Ad+Analytics+Example.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" data-original-height="612" data-original-width="792" height="247" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWwiaHicPOHMJ9kKzS_wgUu9oJeFAJ6frnEltyl3ffHEr7wAsLL8xg_NjPiv5Ack8gygF1o_JC4w8NghuQ_tCuL_N_DxC2vU1RcBVdGZ2UIzsV07empT2WKC5GytHZA2DbQQIK/s320/Ad+Analytics+Example.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;
&lt;br /&gt;&lt;/div&gt;
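&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;To make the “link” step a little more concrete, here is a minimal Python sketch of one way a bridge table and a semi-join might be used to answer “What images give me the best click-thru rates?”. The table layouts, column names, and helper functions are assumptions for illustration only. The idea is to roll Ad Impressions up to the image grain before combining, so that image rows (and their costs) are never repeated.&lt;/span&gt;&lt;/div&gt;
&lt;pre&gt;
```python
from collections import defaultdict

# Illustrative Subjects and bridge table (all names and values are hypothetical).
images = [
    {"image_id": "IMG-1", "production_cost": 500.0},
    {"image_id": "IMG-2", "production_cost": 1200.0},
    {"image_id": "IMG-3", "production_cost": 300.0},
]
ad_impressions = [
    {"ad_id": "AD-10", "impressions": 1000, "clicks": 50},
    {"ad_id": "AD-10", "impressions": 3000, "clicks": 90},
    {"ad_id": "AD-11", "impressions": 2000, "clicks": 20},
]
bridge = [
    {"ad_id": "AD-10", "image_id": "IMG-1"},
    {"ad_id": "AD-11", "image_id": "IMG-2"},
]


def semi_join(left, right, key):
    """Keep rows of `left` whose key appears in `right`; left rows are never duplicated."""
    keys = {row[key] for row in right}
    return [row for row in left if row[key] in keys]


def click_thru_by_image(impression_rows, bridge_rows):
    """Roll impressions up to one click-thru rate per image via the bridge table."""
    ad_to_image = {row["ad_id"]: row["image_id"] for row in bridge_rows}
    totals = defaultdict(lambda: {"impressions": 0, "clicks": 0})
    for row in impression_rows:
        image_id = ad_to_image.get(row["ad_id"])
        if image_id is None:
            continue  # this ad is not linked to any image
        totals[image_id]["impressions"] += row["impressions"]
        totals[image_id]["clicks"] += row["clicks"]
    return {image_id: t["clicks"] / t["impressions"] for image_id, t in totals.items()}


used_images = semi_join(images, bridge, "image_id")
rates = click_thru_by_image(ad_impressions, bridge)
for image in used_images:
    print(image["image_id"], image["production_cost"], rates[image["image_id"]])
```
&lt;/pre&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;Note that “AD-10” has two impression rows, yet “IMG-1” still appears only once in the result. A naive row-by-row join would have repeated its production cost, which is the kind of double counting the “Fan Trap” refers to.&lt;/span&gt;&lt;/div&gt;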
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;Astute readers might point out that there is nothing preventing a determined analyst with access to the underlying data from answering the same questions. While it is true that the end result can be achieved through current approaches, these approaches tend to be prohibitively expensive. Here is what is different about the modular approach:&lt;/span&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;Users can integrate data through user-friendly graphical interfaces allowing them to vertically “lob” Sub-Sets into Subjects and then horizontally link those customized Subjects without fear of introducing duplicates through the common “Fan Trap” problem that bogs down most data integration efforts&lt;/span&gt;&lt;/li&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;Users can independently develop new Subjects and Subject Sub-Sets, and then “dock in” those Subjects and Sub-Sets in a self-serve manner, without relying on IT assistance, while still conforming to enterprise data governance rules. This protects Metadata Integrity and Data Integrity, so that data can be safely located and integrated by other users&lt;/span&gt;&lt;/li&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;Users can “time travel” by choosing an older “AS-OF” date and time, and performing analyses across data that was current as of that date&lt;/span&gt;&lt;/li&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;Data files are portable and can be potentially moved to wherever they are needed for either analysis or downstream processing&lt;/span&gt;&lt;/li&gt;
&lt;ul style="list-style-type: disc;"&gt;
&lt;li style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; margin: 0px;"&gt;&lt;span style="font-size: 13.2px; line-height: normal;"&gt;&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;An example file name for the first Ad Impression Subject Sub-Set (as shown in the above diagram) might be: “&lt;/span&gt;&lt;span style="-webkit-font-kerning: none; font-family: 'Courier New'; line-height: normal;"&gt;AdImpression_V1_CAMDAPMON_SF-G-2017-04_AS-OF 2017-08-26 153100.csv&lt;/span&gt;&lt;span style="font-kerning: none;"&gt;”&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/ul&gt;
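&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;The “AS-OF” idea in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it treats everything before “_AS-OF ” in a file name as an opaque Sub-Set key, reads the timestamp as YYYY-MM-DD HHMMSS, and the earlier file in the sample list is invented for the example.&lt;/span&gt;&lt;/div&gt;
&lt;pre&gt;
```python
from bisect import bisect_right
from datetime import datetime

AS_OF_MARKER = "_AS-OF "  # separator as it appears in the example file name


def parse_as_of(file_name):
    """Split a file name into (Sub-Set key, AS-OF timestamp)."""
    stem = file_name.rsplit(".", 1)[0]         # drop the ".csv" extension
    key, stamp = stem.rsplit(AS_OF_MARKER, 1)  # text before the marker is the key
    return key, datetime.strptime(stamp, "%Y-%m-%d %H%M%S")


def version_as_of(file_names, key, as_of):
    """Return the newest file for `key` whose AS-OF time is not later than `as_of`."""
    versions = []
    for name in file_names:
        found_key, stamp = parse_as_of(name)
        if found_key == key:
            versions.append((stamp, name))
    versions.sort()
    stamps = [stamp for stamp, _ in versions]
    index = bisect_right(stamps, as_of)  # number of versions at or before as_of
    return versions[index - 1][1] if index else None


files = [
    # Invented earlier version, for illustration only.
    "AdImpression_V1_CAMDAPMON_SF-G-2017-04_AS-OF 2017-08-01 090000.csv",
    # File name quoted in the post.
    "AdImpression_V1_CAMDAPMON_SF-G-2017-04_AS-OF 2017-08-26 153100.csv",
]
key = "AdImpression_V1_CAMDAPMON_SF-G-2017-04"

print(version_as_of(files, key, datetime(2017, 8, 15)))  # the 2017-08-01 file
print(version_as_of(files, key, datetime(2017, 7, 1)))   # None
```
&lt;/pre&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;Because older files are never overwritten in this sketch, “time travel” amounts to choosing which file version to read.&lt;/span&gt;&lt;/div&gt;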
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;On top of all of this, other Subjects such as “Web Session” could be “docked in” to the larger repository, allowing Data Analysts to incorporate any Dimensions and Measures developed for the “Web Session” Subject (e.g. ‘Session Duration’) into analyses relating to Images and Ad Impressions.&amp;nbsp; For example, we could ask and answer the question “What images have not been used for the past 7 days of Web Sessions?”&lt;/span&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal; min-height: 12px;"&gt;
&lt;span style="font-kerning: none;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div style="-webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; font-family: 'Helvetica Neue'; font-size: 11px; line-height: normal;"&gt;
&lt;span style="font-kerning: none;"&gt;This example provides a small glimpse into how a modular approach to data management opens up new analytical opportunities that would normally not survive cost/benefit analysis using current approaches.&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
</description><link>http://hepburndata.blogspot.com/2017/08/a-modular-data-system-for-digital.html</link><author>noreply@blogger.com (Neil Hepburn)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" height="72" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWwiaHicPOHMJ9kKzS_wgUu9oJeFAJ6frnEltyl3ffHEr7wAsLL8xg_NjPiv5Ack8gygF1o_JC4w8NghuQ_tCuL_N_DxC2vU1RcBVdGZ2UIzsV07empT2WKC5GytHZA2DbQQIK/s72-c/Ad+Analytics+Example.png" width="72"/><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-6565479031039823618</guid><pubDate>Sun, 06 Aug 2017 16:24:00 +0000</pubDate><atom:updated>2017-08-06T12:25:10.344-04:00</atom:updated><title>A Modular Approach to Solving The Data Variety Problem</title><description>I have been working on a paper in my spare time on the weekends for a number of months now. &amp;nbsp;My goal with this paper is to change the thinking around data and ultimately bridge the chasm between how IT and the Business think about data management and in particular Data Warehousing and Business Analytics.&lt;br /&gt;
&lt;br /&gt;
I am publishing the full paper here as a PDF and will publish portions of this paper piecemeal over the coming days and weeks, beginning with the Executive Summary.&lt;br /&gt;
&lt;br /&gt;
I encourage readers to share this paper and discuss the ideas contained within. &amp;nbsp;I also encourage readers to send their feedback. &amp;nbsp;Since I am a human being and am as sensitive to criticism of my work as the next person, I only ask that you couch any negative criticism in a way that is civil.&lt;br /&gt;
Based on feedback, I may create new versions of this paper which you can easily distinguish by the paper's AS-OF date.&lt;br /&gt;
&lt;br /&gt;
Before I sign off, &amp;nbsp;I would like to thank Jane Roberts for her time in reviewing this paper and for her contributions. Thank you, Jane!&lt;br /&gt;
&lt;br /&gt;
Here is the paper:&lt;br /&gt;
&lt;a href="https://drive.google.com/open?id=0B6QIZPV6OQqgb2p4SVVUb1k1Rzg"&gt;A Modular Approach to Solving The Data Variety Problem AS-OF 2017-08-06&lt;/a&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;div class="page" title="Page 1"&gt;
&lt;div class="layoutArea"&gt;
&lt;div class="column"&gt;
&lt;span style="font-family: &amp;quot;cambria&amp;quot;; font-size: 13.000000pt; font-weight: 700;"&gt;Executive Summary
&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;Big Data has fully captured the popular imagination. Companies like Google, Facebook, Apple, Amazon,
and Microsoft process petabytes of data daily. Limits that once seemed impossible are now the new
normal. In spite of this, analysts and managers still struggle to answer unexpected questions at
executive speed.
&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;&lt;br /&gt;&lt;/span&gt;
&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;The reason is that while much attention has been given to these remarkable data volumes, a different
but related problem has come into sharp focus: The Data Variety Problem. Namely, organizations
continue to struggle to manage and query the ever increasing &lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt; font-style: italic;"&gt;variety &lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;of data originating from sources
including, but not restricted to: IT controlled systems (e.g. ERPs, PoS systems, subscriber billing systems,
etc.); 3&lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 7.000000pt; vertical-align: 4.000000pt;"&gt;rd &lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;Party managed systems (e.g. cloud CRMs, cloud marketing DMPs); and Business controlled
departmental tracking spreadsheets, grouping lookup tables, and adjustment tables.
&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;&lt;br /&gt;&lt;/span&gt;

     &lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;The approach to locating, obtaining, and integrating these sources of data is highly manual. Case in
point: in the Alteryx-commissioned study &lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt; font-style: italic;"&gt;Advanced Spreadsheet Users Survey&lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 7.000000pt; vertical-align: 4.000000pt;"&gt;i&lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;, published in December
2016, IDC discovered that &lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;“&lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;$60 billion [is] wasted in the U.S. every year &lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;by advanced spreadsheet users.”
&lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;Yet this report only provides a small glimpse into this problem and misses the bigger opportunity:
Organizations urgently need a one-size-fits-all ‘happy path’ for consuming and producing an ever-accelerating
increase in the &lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt; font-style: italic;"&gt;variety &lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;of structured data.
&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;&lt;br /&gt;&lt;/span&gt;
&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;In this paper &lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;– &lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;drawing on algebra, computer science, systems thinking, history, psychology, and 20
years of experience as a data practitioner &lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;– &lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;Neil Hepburn posits that the best approach to addressing this
problem for the long term is through embracing a &lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt; font-style: italic;"&gt;modular &lt;/span&gt;&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;data warehousing system. Neil goes on to
describe how such a modular data warehousing system could be designed and built using readily
available tools and technology, and what challenges must be overcome to realize this vision.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;&lt;br /&gt;&lt;/span&gt;
&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;&lt;a href="https://drive.google.com/open?id=0B6QIZPV6OQqgb2p4SVVUb1k1Rzg"&gt;A Modular Approach to Solving The Data Variety Problem AS-OF 2017-08-06&lt;/a&gt;&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family: &amp;quot;calibri&amp;quot;; font-size: 11.000000pt;"&gt;&lt;br /&gt;&lt;/span&gt;

    &lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
</description><link>http://hepburndata.blogspot.com/2017/08/a-modular-approach-to-solving-data.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-8747235686777204330</guid><pubDate>Thu, 29 Dec 2011 00:25:00 +0000</pubDate><atom:updated>2012-02-25T11:17:01.798-05:00</atom:updated><title>Perspective is Everything: Why even the most intelligent software architects don't understand the Relational Model</title><description>A few weeks ago I stumbled on this article "&lt;a href="http://cacm.acm.org/magazines/2011/4/106584-a-co-relational-model-of-data-for-large-shared-data-banks/fulltext"&gt;A co-Relational Model of Data for Large Shared Data Banks&lt;/a&gt;"  ("coRel" hereon) in the on-line version of ACM Queue (this is the Association for Computing Machinery's magazine).  The article was authored by two employees (Erik Meijer and Gavin Bierman) of Microsoft.&lt;br /&gt;&lt;br /&gt;The authors summarize their thesis thus: "Contrary to popular belief, SQL and noSQL are really just two sides of the same coin."  The problem with this article is that the authors are asking entirely the wrong question.  They are looking at the world with a very narrow and single-minded perspective of data - a perspective which was conventional wisdom up until the 1970s (academically), and the 1980s (commercially).&lt;br /&gt;&lt;br /&gt;In technical terms, the authors are basically asking this question: If it is possible to implement a Network Model using a Key/Value [NoSQL] database, and it is possible to implement a Network Model using a Relational database, then can the two be queried and modified by the same declarative language?  The answer to this question is a resounding yes.  Unfortunately, the Network Model is not perspective neutral, which is why the Relational Model was invented.&lt;br /&gt;&lt;br /&gt;Backing up a bit, allow me to explain what I mean.  
The Relational Model takes a perspective neutral approach, and regards all entities, no matter how insignificant they may seem, as "first class citizens".  Other data models such as the Network Model and Hierarchical Model lock the data into a given perspective and make certain entities "first class citizens" and others "second class citizens".  For example, if you have ever organized your inbox e-mails into folders, or documents on your computer into folders, you have probably chosen a certain hierarchy.  Maybe you organized your folders by customer, so that way when a customer asks a question, you can quickly go to the right folder and find all the necessary information.  But what if a project manager comes to you and starts asking questions about a particular project, and that project cuts across customers?  What normally happens here is you start searching through each customer folder looking for e-mails or documents that pertain to the project.  Most people will just copy those project documents (or create short-cuts to them) into another folder.  We've all been through these searching and sorting exercises.  When software developers are confronted with the same problem, they pretty much do the same thing - they reorganize (or refactor) the data.&lt;br /&gt;&lt;br /&gt;However, if the data were Normalized  (i.e. modeled relationally) to begin with, no such reorganization would be necessary.&lt;br /&gt;&lt;br /&gt;The authors' (and the majority of software developers') myopia is apparent in their view of history, the example data model they provide, and even in computer science theory.&lt;br /&gt;&lt;br /&gt;Let's talk about history first.  coRel has this to say:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Codd's relational model and SQL allowed implementations from different  vendors to be (near) perfect substitutes, and hence provided the  conditions for perfect competition. 
Standardizing on the relational  model and SQL created a secondary network effect around complementary  producers such as educators, tool vendors, consultants, etc., all  targeting the same underlying mathematical principles. Differences  between actual relational database implementations and SQL dialects  became to a large extent irrelevant.&lt;/blockquote&gt;&lt;br /&gt;While it's true that standardization around SQL led to wide adoption, such standardization had already emerged prior to the introduction of the relational model.  Namely, CODASYL (the same body that created COBOL) developed a standard around the aforementioned Network Model, often referred to as the DBTG model after CODASYL's Data Base Task Group (DBTG). Much of what you see in modern SQL standards actually comes from this standard - in particular the separation of DDL (data definition language) from DML (data manipulation language).  However, CODASYL vendors (and there were a lot of them) were blindsided by Codd's relational model.&lt;br /&gt;&lt;br /&gt;Interestingly, Codd's original language was not SQL, but rather Alpha.  Also, the first two major RDBMS vendors had competing standards:  Ingres used a language called Quel; and Oracle and IBM used SQL. But because Ingres was always based on the relational model, it was able to simply slap on support for SQL.  Ingres lives on to this day in the form of PostgreSQL.  The other non-RDBMS vendors also live on to this day, but tend to serve particular niches (e.g. IBM's IMS is still heavily used in banking).  There was nothing inherently special about SQL, and other relational languages are still around and continue to be invented.  
What is special is the underlying Relational Model.&lt;br /&gt;&lt;br /&gt;For a better explanation of why the Relational Model entered the marketplace, here's a passage I scanned in from the article "The Commercialization of Database Management Systems, 1969-1983" found in the IEEE Annals of the History of Computing, Volume 31, Number 4, October-December 2009:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEFuGYSm4uTQcw3kEpVdXwoBpU4-BYkar8ZiXtH9oMuDNxye5MMJ1Ie2FLuiLTOJcsulN2025rch-eLmd_96P8R4yGh7GKMOHt5A_DZdsYS5ONEx0Iml5Tw9TyODUdWoZ-wDZa/s1600/relational+model+history.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 197px; height: 400px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEFuGYSm4uTQcw3kEpVdXwoBpU4-BYkar8ZiXtH9oMuDNxye5MMJ1Ie2FLuiLTOJcsulN2025rch-eLmd_96P8R4yGh7GKMOHt5A_DZdsYS5ONEx0Iml5Tw9TyODUdWoZ-wDZa/s400/relational+model+history.png" alt="" id="BLOGGER_PHOTO_ID_5692393791501357954" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I am now going to talk about the sample data model used by coRel.  The example is taken from Amazon's SimpleDB.  
Here is what the original data looks like, as described by Amazon:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnl7l1beNQ9JhJ1rA-BUXAtA7FYV4afGzf3TblU-zh2D7E-8tCa6mbtbRO7Fh5d8yPIKQT_osLqraQglsnx0IKOqigUbwiPFO23CiKScmWD4BPit7zI4n7lPmKw3iQRUlVn9vs/s1600/SimpleDB+model.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 170px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnl7l1beNQ9JhJ1rA-BUXAtA7FYV4afGzf3TblU-zh2D7E-8tCa6mbtbRO7Fh5d8yPIKQT_osLqraQglsnx0IKOqigUbwiPFO23CiKScmWD4BPit7zI4n7lPmKw3iQRUlVn9vs/s400/SimpleDB+model.png" alt="" id="BLOGGER_PHOTO_ID_5693580308949624738" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Here is how the authors model this using an object model (essentially a Network Model):&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiz3KGhss0pJ7zfOMjT4B262VxwWR7CNsKtxpeonGgPj7dGUDt9snf8NF_x2vSU85MIRjCm9Rx7dxbJ4ramfOJ6P-9sqPclUKScplW6C4hOrwIQa3hrwyd-O8bhO-Jij1PK0Nb8/s1600/meijer-1.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 290px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiz3KGhss0pJ7zfOMjT4B262VxwWR7CNsKtxpeonGgPj7dGUDt9snf8NF_x2vSU85MIRjCm9Rx7dxbJ4ramfOJ6P-9sqPclUKScplW6C4hOrwIQa3hrwyd-O8bhO-Jij1PK0Nb8/s400/meijer-1.png" alt="" id="BLOGGER_PHOTO_ID_5693989670853783010" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Here is how the authors model this using a Relational Model:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhqZkTNxcC2D7f-Pq8JLrr8i-DvVRCj_fRIXAhOn-I5xrz9R_HkFLahyphenhyphen2rc3Sp16Iu9gsbW53g-IOdXVzUCnzd2wUJEH_OBQtX3XMAQrb9ro5fY3mUv8KChwFDoX-buBBsMM7Og/s1600/meijer-3.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 268px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhqZkTNxcC2D7f-Pq8JLrr8i-DvVRCj_fRIXAhOn-I5xrz9R_HkFLahyphenhyphen2rc3Sp16Iu9gsbW53g-IOdXVzUCnzd2wUJEH_OBQtX3XMAQrb9ro5fY3mUv8KChwFDoX-buBBsMM7Og/s400/meijer-3.png" alt="" id="BLOGGER_PHOTO_ID_5693989677328298194" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If you're an experienced relational modeller, you will observe that the model is not in BCNF (Boyce-Codd Normal Form), as two of the entities (Ratings and Keywords) have overlapping candidate keys.&lt;br /&gt;&lt;br /&gt;Here is what the data model should look like in BCNF:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEizoQ-9CE_nWwLCmsPj6tBWY-yCfvub6m6h1pkdbUkvANLZx8S95GWk6x2-fuY-6_ikX1yuiAWp3YSlKdumHXG-8fzYQLfrhXOEBjkpsmUrPctTo5jaHLzwYad6I3Jah_FvHLlc/s1600/normalized+SimpleDB+sample.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 244px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEizoQ-9CE_nWwLCmsPj6tBWY-yCfvub6m6h1pkdbUkvANLZx8S95GWk6x2-fuY-6_ikX1yuiAWp3YSlKdumHXG-8fzYQLfrhXOEBjkpsmUrPctTo5jaHLzwYad6I3Jah_FvHLlc/s400/normalized+SimpleDB+sample.png" alt="" id="BLOGGER_PHOTO_ID_5693609435489503890" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;From the perspective of the Product Catalog application, this change seems somewhat academic.  
However, if we extend our normalized data model to include Tweets retrieved via Twitter keyword searches - which is important from the perspective of a marketer - things get more interesting.  Here's the updated model:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRFRLHrsUpnjPxE5AK7PvANh8D4hcyfT-uKl_aSkl_SST6u_IfMHNt1D_RYCUZ7TTBR5XUHrWT6vzeAj9VjufKr7l3R8zcCFtae41vAIdsx5ZlXhfSp3KzIfCXG_WpiyYlOnZj/s1600/normalized+SimpleDB+sample+w+Tweet.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 350px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRFRLHrsUpnjPxE5AK7PvANh8D4hcyfT-uKl_aSkl_SST6u_IfMHNt1D_RYCUZ7TTBR5XUHrWT6vzeAj9VjufKr7l3R8zcCFtae41vAIdsx5ZlXhfSp3KzIfCXG_WpiyYlOnZj/s400/normalized+SimpleDB+sample+w+Tweet.png" alt="" id="BLOGGER_PHOTO_ID_5693609439661746066" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;What is important to note in this normalized model is that all entities are "first class citizens".  As a Product Catalog application developer, I can ask questions or make changes to Keywords and Products without having to involve Tweets.  Or, as a marketer, I can ask questions about Tweets and Keywords without having to involve Products.&lt;br /&gt;&lt;br /&gt;When data is seen in a larger context with many different perspectives, the relational model makes sense. While it may be more efficient to model data for a particular perspective (i.e. the product catalog application) using a Network or Object model, the same model can be very &lt;span style="font-style: italic;"&gt;inefficient&lt;/span&gt; from any other perspective, and can lead to anomalies and contradictions in the data.   This point is lost on many developers, since most only deal with a single perspective of the data.  
The following paragraph in coRel makes this very clear:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Summarizing what we have learned so far, we see that in order to use a relational database, starting with a natural hierarchical object model, the designer needs to normalize the data model into multiple types that no longer reflect the original intent; the application developer must reencode the original hierarchical structure by decorating the normalized data with extra metadata; and, finally, the database implementer has to speed up queries over the normalized data by building indexes that essentially re-create the original nested structure of the data as well.&lt;/blockquote&gt;&lt;br /&gt;See the problem?  There is rarely such a thing as a 'natural' hierarchy. Perspective is everything, and depending on how we view an ontology, we can ascribe many different hierarchies.&lt;br /&gt;&lt;br /&gt;Now, you may be wondering if it is possible to represent a relational model in a NoSQL database, such as a key/value store.  The answer is: sort of.  While it is possible to recreate the structure of the relational model, it is not possible to centralize the integrity of the relational model.  This is not a trivial point.  Referential integrity (and other forms of integrity, such as uniqueness, nullability, and value domain constraints) is what ensures the correctness of ad hoc queries.  When such constraints are removed, it is up to the application developer to examine the underlying data and perform numerous tests to ensure its integrity.  The end result is poor data integrity and poor data quality.  I can speak quite frankly on this last point, as I see the difference between poorly constrained data models and well constrained data models all the time.  
Just like the second law of thermodynamics, when unconstrained, over time data entropy tends to infinity.&lt;br /&gt;&lt;br /&gt;In other words, the relational model allows us to manage information as a &lt;span style="font-style: italic;"&gt;separate concern&lt;/span&gt;.  You might even say this is the whole point of the relational model.&lt;br /&gt;&lt;br /&gt;I now want to move on to the theoretical aspects of the paper, in particular the question of &lt;span style="font-style: italic;"&gt;compositionality&lt;/span&gt;. CoRel's authors define compositionality as: "the ability to arbitrarily to combine complex values from simpler values without falling outside the system".  They go on to say:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;SQL is rife with noncompositional features. For example, the semantics  of NULL is a big mess: why does adding the number 13 to a NULL value, &lt;tt style="font-size: 1.25em;"&gt;13+NULL&lt;/tt&gt;, return NULL, but summing the same two values, &lt;tt style="font-size: 1.25em;"&gt;SUM(13, NULL)&lt;/tt&gt;, returns 13?&lt;/blockquote&gt;&lt;br /&gt;A more precise definition of compositionality comes from Wikipedia, which states: "An important aspect of denotational semantics of programming languages is compositionality, by which the denotation of a program is constructed from denotations of its parts."&lt;br /&gt;&lt;br /&gt;SQL queries guarantee compositionality since they don't have any side effects.  Contrast this with most concurrent programming languages (e.g. Java, C#, Python), which do not guarantee compositionality, since it's possible to write modules which impact other modules.&lt;br /&gt;But I don't think the authors were thinking along these lines.  They're really arguing that SQL is inconsistent, and they point to the example with the NULLs.&lt;br /&gt;&lt;br /&gt;The reason why there is this perception of non-compositionality is that sets and tuples are treated as primitives.  
You cannot make an aggregate function out of tuple functions, and you cannot make a scalar function out of aggregate functions.&lt;br /&gt;&lt;br /&gt;NULLs are controversial to this day, and Codd even wanted to take things a step further, and distinguish between "unknown but applicable" and "unknown but inapplicable".&lt;br /&gt;Codd's basic argument for the inclusion of NULLs (and three-valued logic) can be summarized thus: In the real world, handling unknown values is inherently complex.  Instead of thrusting the complexity back to the user, the RDBMS should handle unknowns "correctly" - in so far as the behavior correctly models real world behavior.  This can result in NULLs being counter-intuitive, but just because something is counter-intuitive doesn't mean it's wrong (think flat earth intuition, round earth reality).&lt;br /&gt;&lt;br /&gt;For example, let's say you are a hotel manager and you want to know the average number of days a guest stayed for.  Assume you have a database where front reception can log check-in and check-out times.  Obviously when a guest hasn't checked out, the check-out time is unknown (or doesn't exist), so that attribute would be NULL. When taking the average of check-out [minus] check-in, you will only be including rows where both the check-in and check-out times are known, since when a tuple contains an unknown element [unless stated otherwise] the tuple as a whole cannot be known, and should be eliminated from the set.  This is what you want.  
Putting it in a SQL query, it would look like this:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span style="font-family:courier new;"&gt;SELECT&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;  AVG(DATEDIFF(CheckOutTime, CheckInTime))&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;FROM&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;  BOOKINGS&lt;/span&gt;&lt;/blockquote&gt;&lt;br /&gt;As you can see, it's a very simple query to write and validate, and reflects the correct handling of NULLs. Quite the opposite of a "big mess".&lt;br /&gt;&lt;br /&gt;But the whole question of compositionality is also missing the point with SQL.  SQL is not a computational language - it's a data retrieval language, based on the relational model.  When we're talking about data, compositionality is not our main concern - normalization is.  Data which is not normalized is like a program which is non-compositional.  Nasty side effects can and will arise.&lt;br /&gt;---&lt;br /&gt;The funny thing about the relational model is that it is predicated on Relational Algebra, which is completely orthogonal to the Universal Turing Machine.  The former is about logic, and the latter is about flow. They are not in competition.&lt;br /&gt;&lt;br /&gt;But when I hear people say things like "it's about time somebody built a better database than those stupid RDBMSs", it's akin to saying "it's about time somebody built a better Universal Turing Machine".  Makes no sense really.&lt;br /&gt;&lt;br /&gt;Before I conclude this post, I want to share with you an excerpt from Joe Celko's "Thinking In Sets" which is very telling:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Many years ago, the INCITS H2 Database Standards Committee (née ANSI X3H2 Database Standards Committee) had a meeting in Rapid City, South Dakota.  We had Mount Rushmore and Bjarne Stroustrup as special attractions.  Mr. 
Stroustrup did his slide show with overhead transparencies (yes, this was before PowerPoint was ubiquitous!) about Bell Labs inventing C++ and OO programming, and we got to ask questions.&lt;br /&gt;&lt;br /&gt;One of the questions was how we should put OO features into the working model of the next version of the SQL standard, which was known as SQL3 internally.  His answer was that Bell Labs, with all their talent, had tried four different approaches to this problem and they came to the conclusion that it should not be done.  OO was great for programming but deadly for data.&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;Summing up.  While I am being critical of the coRel paper, there was clearly a lot of thought that went into it, and the authors come across as being intelligent and having a good pedigree.  My point is that there is an institutional bias towards application-centric data modeling, which comes at the expense of perspective-neutral data modeling - i.e. the Relational Model.&lt;br /&gt;It has been my experience that this bias has led to a great deal of friction between software or application architects and data architects, much to the frustration of both.&lt;br /&gt;&lt;br /&gt;My greatest hope is that by educating students at an earlier age, this deep-rooted bias can be avoided.  This, I will point out, is a long-running project of mine.  
A blog for another day.</description><link>http://hepburndata.blogspot.com/2011/12/perspective-is-everything-why-even-most.html</link><author>noreply@blogger.com (Neil Hepburn)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" height="72" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEFuGYSm4uTQcw3kEpVdXwoBpU4-BYkar8ZiXtH9oMuDNxye5MMJ1Ie2FLuiLTOJcsulN2025rch-eLmd_96P8R4yGh7GKMOHt5A_DZdsYS5ONEx0Iml5Tw9TyODUdWoZ-wDZa/s72-c/relational+model+history.png" width="72"/><thr:total>7</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-736489366422038755</guid><pubDate>Sun, 09 Oct 2011 23:36:00 +0000</pubDate><atom:updated>2011-10-09T19:44:04.665-04:00</atom:updated><title>Keys, Data Syncronization, and the Relational Model</title><description>A common problem faced by database developers involves keeping schemas and data in sync between environments.  The problem isn't even restricted to development.  Often databases must be kept synchronized in operational scenarios. Data is often replicated for reasons of high availability and performance.&lt;br /&gt;In this posting, I want to illustrate a real-world scenario that frequently comes up, and a modeling approach that ensures this scenario never turns into a problem.  
Even if you have not encountered this scenario, it's worth reading this post in its entirety for tips on better data modeling, and why the relational model is still very relevant.&lt;br /&gt;&lt;br /&gt;As anyone who has had to keep their calendar or contacts in sync between devices knows, and as developers accustomed to version management systems like Visual SourceSafe (VSS) or Subversion will recognize, the problem really boils down to the following three requirements:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;If the data is in the source data store but not the target, then add it to the target.&lt;/li&gt;&lt;li&gt;If the data is in the target but not in the source, then remove it from the target.&lt;/li&gt;&lt;li&gt;If the data is both in the source and the target, overwrite the target's attributes with the source's attributes.&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;Simple, right?  Indeed everything is simple... IF you have identifiers which are consistent between the source and target data stores.  For the purposes of this blog posting, I'm going to focus on the use case of syncing data between a development server and a test server, which is equivalent to syncing between a test server and a production server. &lt;br /&gt;&lt;br /&gt;I am not going to get into the intricacies of syncing calendar or personal contact data - that scenario is actually quite different, since there is never a universally agreed upon key or way of identifying individuals, so you really don't have a proper key to compare against.  That's why most PIM synchronization software tends to take a "fuzzy" approach when syncing up contacts.  Companies like Facebook and LinkedIn are in a privileged position to address this ongoing problem, as they are effectively becoming a de facto registry for contact information. 
But I digress...&lt;br /&gt;&lt;br /&gt;For this posting, to make my examples more explicit I'll refer to the Microsoft technology stack, although all of the major RDBMS vendors support the same functionality and tools.&lt;br /&gt;&lt;br /&gt;The crux of the problem when synchronizing most data comes down to the alignment of keys (or identifiers if you prefer).&lt;br /&gt;&lt;br /&gt;A common conundrum that comes up when modeling data is defining the right primary key.  For example, I can create an employee table and use the employee's social insurance number as my primary key.  Or perhaps I don't have access to the social insurance number, and instead use telephone number and first and last name as my primary key.  In the latter case, people are known to change their name or their telephone number.  The key is effectively out of our control, so in order to have a reliable key we end up creating what I refer to as a "technical key".  Technical keys are often auto-generated from incrementing sequences, but they can just as easily be generated from a GUID. It is common to find auto-generated primary keys because not only is the data modeler in complete control of the key's values, but they are also compact, especially when referenced as a foreign key from other [dependent] tables.  Data synchronization problems begin here, simply because the technical key depends on the database or application that generated the key.  If you import the same data into a different database for the first time, it can and probably will have a different auto-generated key.  I am well aware that it is possible to load data into other databases while preserving keys, and if you're taking a "master slave" approach to data synchronization you have nothing to worry about, since you're effectively just copying the data over.  
However, if you already have data loaded in the target table and you simply need to update a few columns, then you're going to run into problems.&lt;br /&gt;&lt;br /&gt;A good solution to this problem is to define an alternate natural key.  Basically this is just a UNIQUE index on one or more columns.  For our above example, this could be composed of name and telephone number.  Given that natural keys can differ between environments, the developer should perform some analysis to assess and quantify the risk of a mismatch.&lt;br /&gt;Once you have determined the natural keys are in sync, you can [as an example] use Visual Studio 2010 Schema Compare to perform a data schema comparison which will generate a SQLCMD script (assuming you are promoting schema changes), and you can also use Visual Studio 2010 Data Compare to generate a SQL script to perform all the UPDATEs, INSERTs, and DELETEs.  In VS 2010 you can choose whether to use a table's primary key, or one of the alternate natural keys you have defined.&lt;br /&gt;&lt;br /&gt;I should point out that if your table doesn't have any set of columns (apart from the primary key) which can be guaranteed to be unique, and the primary keys themselves are generated and therefore database server specific, you should try your best to rectify this situation, as you now have a more fundamental problem, which I won't be addressing in this blog (hint: You need to start looking at the target's change history).&lt;br /&gt;&lt;br /&gt;So far so good.  But what if you have a dependent/child table whose natural key depends on the parent table?  Let's say there is a table called EMPLOYEE_INVENTORY which is a list of items that have been provisioned to the employee.  The natural key for this table might be the composite of EMPLOYEE_ID and the INVENTORY_ITEM_ID (and the Primary Key for EMPLOYEE_INVENTORY is an auto-generated key).  
To keep things simple, assume the Primary Key and Natural Key for INVENTORY_ITEM are one and the same: the universal SKU # of the inventory item.  So in summary, the primary key for EMPLOYEE_INVENTORY "ID" is a technical key, and the natural alternate key is "EMPLOYEE_ID" + "INVENTORY_ITEM_ID".&lt;br /&gt;&lt;br /&gt;Now we have a bit of a problem when it comes to keeping EMPLOYEE_INVENTORY in synch.  Namely, half of its Natural Key is derived from the auto-generated Primary Key in its parent table (EMPLOYEE).  Should we wish to synch based on Natural Keys, we're forced to develop code to perform lookups, comparisons, UPDATEs, INSERTs, and DELETEs - we can no longer rely on data synching software like VS 2010 Data Compare to do this for us.  While there is nothing inherently complicated about this, it will invariably take you a chunk of time to write this code, whether you do it in an ETL tool like SSIS, or stick to procedural code.  If you have another dependent/child table which in turn depends on EMPLOYEE_INVENTORY (e.g. EMPLOYEE_INVENTORY_LOG), things get more complicated and you end up spending considerably more time to complete the task.&lt;br /&gt;&lt;br /&gt;The preferred approach is to create what is known as an UPDATABLE VIEW, which will allow you to substitute the parent table's natural key for its technical primary key.  Just to clarify, an UPDATABLE VIEW is exactly what its name says it is: a VIEW you can UPDATE.  As you can imagine, there are limitations as to which VIEWs can be updated (clearly anything with an aggregate would not be updatable).  
For our scenario, though, the SELECT behind an UPDATABLE VIEW might look like this:&lt;br /&gt;&lt;blockquote&gt;SELECT&lt;br /&gt;    e.TELEPHONE_NUMBER,        --Natural Key&lt;br /&gt;    e.FIRST_NAME,            --Natural Key&lt;br /&gt;    e.LAST_NAME,            --Natural Key&lt;br /&gt;    ei.INVENTORY_ITEM_ID,        --Natural Key&lt;br /&gt;    ei.LAST_ACTIVITY_DATE,        --Attribute to synch&lt;br /&gt;    ei.QUANTITY            --Attribute to synch&lt;br /&gt;FROM&lt;br /&gt;    EMPLOYEE AS e INNER JOIN&lt;br /&gt;    EMPLOYEE_INVENTORY AS ei ON e.ID = ei.EMPLOYEE_ID&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;For SQL Server, if you want to make this VIEW UPDATABLE, there are two things you need to do.  First, you need to define it with the "SCHEMABINDING" option.  Second, you need to define a UNIQUE CLUSTERED INDEX on the natural key columns of the view (i.e. "TELEPHONE_NUMBER", "FIRST_NAME", "LAST_NAME", "INVENTORY_ITEM_ID").&lt;br /&gt;&lt;br /&gt;You are now in a position where you can use data synchronization tools like VS 2010 Data Compare to automatically synchronize the data for you.  Because we haven't had to write any procedural code, we can focus instead on problems which arise from the data itself (e.g. a natural key mismatch), as opposed to getting bogged down in throwaway code.&lt;br /&gt;&lt;br /&gt;Problem solved.&lt;br /&gt;&lt;br /&gt;********&lt;br /&gt;&lt;br /&gt;What I want to illustrate in this blog is the power that comes from mindful modelling and adhering to relational principles.&lt;br /&gt;&lt;br /&gt;In the last couple of years, &lt;a href="http://en.wikipedia.org/wiki/Nosql"&gt;NoSQL&lt;/a&gt; databases like MongoDB, Redis, Cassandra, and Google DataStore have flourished.  Indeed, these databases provide significant advantages for application developers in terms of scalability.  They also feel like an ORM layer, but with much greater efficiency, so they are very desirable to application developers.  
And not to be ignored, many of these new databases are open source and can be used for little or no money.  Case in point: I'm planning on building a new hobby application using Google DataStore, since I can get up and running without paying a cent.&lt;br /&gt;&lt;br /&gt;The downside of these modern database technologies is that they suffer from many of the same limitations that plagued pre-relational database developers.  Namely, there is a "perspective lock-in".  What this means is that once the application developer has modeled data for their application's use cases, it may be difficult for future applications to use the data for their own purposes.  It will also be difficult to run any sort of ad hoc queries without first exporting the data to an analytical RDBMS.&lt;br /&gt;&lt;br /&gt;This is not to say that NoSQL databases should not be used.  This is to say that they should be chosen with eyes wide open and a clear understanding of the trade-offs involved.  In fact, I believe the most compelling reason to use a NoSQL database is low cost (in particular hardware costs).  Let me repeat that: If you absolutely cannot afford to scale using an RDBMS and you can't see yourself bootstrapping yourself along, then go with a NoSQL database.&lt;br /&gt;&lt;br /&gt;Put another way, virtually every NoSQL innovation I've seen has been absorbed into an RDBMS.  Take for example binary large objects (BLOBs) like videos, music, and documents: Microsoft (and presumably others) now allow you to access these objects directly through the file system, while allowing them to be managed transactionally through the RDBMS.  Analytical databases like ParAccel and Vertica allow for petabyte scaling - and still allow full relational capabilities.  Contrast this with Google DataStore, which doesn't even support basic JOINs or aggregations (GROUP BYs).  This means you have to write this code on your own. 
Not only is that going to be cumbersome and error prone, it's also going to perform worse.  This is why you're beginning to see "SQL layers" (such as &lt;a href="http://research.google.com/pubs/pub37200.html"&gt;Google Tenzing&lt;/a&gt;) being added to NoSQL databases to speed up commonly requested tasks.  Even Facebook, which started the Cassandra project, still uses MySQL (combined with Memcached in a sharded configuration).&lt;br /&gt;&lt;br /&gt;There's also the notion of &lt;a href="http://en.wikipedia.org/wiki/Eventual_consistency"&gt;BASE&lt;/a&gt; versus &lt;a href="http://en.wikipedia.org/wiki/ACID"&gt;ACID&lt;/a&gt;, and the types of business models each can tolerate.  But that's a discussion for another day...&lt;br /&gt;&lt;br /&gt;And so I will end this blog entry.</description><link>http://hepburndata.blogspot.com/2011/10/keys-data-syncronization-and-relational.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-6432683526047097680</guid><pubDate>Mon, 27 Dec 2010 16:16:00 +0000</pubDate><atom:updated>2010-12-30T12:14:52.542-05:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">andrew maguire; cftc; silver manipulation; wall street journal; wsj; citizen journalism</category><title>The Decline of Trust: Follow-up to Andrew Maguire post</title><description>Earlier this year I blogged about an alleged case of silver manipulation as raised by Andrew Maguire to the CFTC.  If you haven't already, I urge you to read that &lt;a href="http://hepburndata.blogspot.com/2010/04/curious-case-of-andrew-maguire-my.html"&gt;post &lt;/a&gt;before continuing on with this post.&lt;br /&gt;&lt;br /&gt;I ended my post with a challenge to any serious journalist to properly investigate Andrew Maguire's case.  So far I have yet to see anything resembling a real investigation undertaken by the media establishment.  
While I cannot say for sure why this is, given how juicy a story the Andrew Maguire case appears to be, I am beginning to see an interesting pattern emerge that may partly explain it.  I must say that I'm a bit surprised by my own findings, but they are enough for me to draw conclusions.&lt;br /&gt;&lt;br /&gt;For a period of time the only newspaper that reported the Andrew Maguire story was the New York Post.  You can read the first report &lt;a href="http://www.nypost.com/p/news/business/jpmorgan_chase_story_in_uk_DsMN4PnXFoQG5KdevIsQ7N"&gt;here&lt;/a&gt;, which was first published in March (shortly after Maguire blew the whistle).  A follow-up report can be found &lt;a href="http://www.nypost.com/p/news/business/metal_are_in_the_pits_2arTlGNbMK7mb1uJeVHb0O/1"&gt;here&lt;/a&gt;.  However, a third story was published shortly after I posted my last blog entry, which in hindsight is most interesting.  You can read it &lt;a href="http://www.nypost.com/p/news/business/feds_probing_jpmorgan_trades_in_gZzMvWBqOJpB55M7Rh9vwM"&gt;here&lt;/a&gt;.  What's interesting about this last story is that one week following its original publication, a correction was issued which basically nullified the entire story.  Namely, JPMorgan responded by saying there was no investigation to begin with.  The correction is posted at the bottom of the story.  The NYPost never countered.&lt;br /&gt;&lt;br /&gt;Fast forward to October 8th of this year.  Reuters reported that &lt;a href="http://en.wikipedia.org/wiki/Cftc"&gt;CFTC &lt;/a&gt;commissioner Bart Chilton had publicly announced that the CFTC was investigating silver manipulation.  Oddly, the statement wasn't from the CFTC itself. Rather, Chilton felt the public deserved some answers and was speaking of his own accord.  
You can read that story &lt;a href="http://uk.reuters.com/article/idUKPTIP43716920101008?loomia_ow=t0:s0:a54:g12:r2:c0.342390:b38816858:z3"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;By the end of October, Reuters published a follow-up story reporting that the CFTC was in particular investigating JPMorgan and HSBC for silver manipulation.  You can read that story &lt;a href="http://uk.reuters.com/article/idUKTRE69Q0SE20101027?loomia_ow=t0:s0:a54:g12:r6:c0.425572:b38792836:z3"&gt;here&lt;/a&gt;.  This would appear to directly contradict JPMorgan's statement to the NYPost back in May that it was NOT being investigated for silver manipulation.  JPMorgan declined to comment on the matter.&lt;br /&gt;&lt;br /&gt;A day later, Reuters reported the first of at least five class action lawsuits filed by investors targeting JPMorgan and HSBC.  Here is the &lt;a href="http://uk.reuters.com/article/idUSTRE69R2G420101028?loomia_ow=t0:s0:a54:g12:r1:c0.440279:b38816858:z3"&gt;article&lt;/a&gt;, and here is a scanned in &lt;a href="http://www.gata.org/files/SilverManipulationLawsuit-10-27-2010.pdf"&gt;copy of the court filing&lt;/a&gt; as filed in the Southern District of New York.  About a week later, a &lt;a href="http://www.marketwire.com/press-release/Kaplan-Fox-Sues-JP-Morgan-HSBC-on-Behalf-Investors-Silver-Futures-Options-Contract-Losses-1347390.htm"&gt;second class action lawsuit was filed&lt;/a&gt;.  You can read the court filing &lt;a href="http://www.kaplanfox.com/templates/kaplanfox/images/content/pdfs/silver%20futures%20class%20action%20complaint.pdf"&gt;here&lt;/a&gt;. A week after that, a third law firm posted a &lt;a href="http://www.businesswire.com/news/home/20101111006211/en/Girard-Gibbs-LLP-Investigates-J.P.-Morgan-HSBC"&gt;press release&lt;/a&gt; announcing they were investigating silver manipulation by JPMorgan and HSBC on behalf of their clients.  
As a follow-up, Reuters posted another &lt;a href="http://uk.reuters.com/article/idUSTRE6A03W720101101?loomia_ow=t0:s0:a54:g12:r2:c0.469812:b39649786:z3"&gt;article&lt;/a&gt; providing historical context to these cases.  The article begins with the obligatory mention of the &lt;a href="http://en.wikipedia.org/wiki/Hunt_brothers"&gt;Hunt brothers&lt;/a&gt;, some mention of "gold bugs" and "conspiracies", but no mention of Andrew Maguire himself.&lt;br /&gt;&lt;br /&gt;To be fair to the media establishment, on November 3rd Andrew Maguire was mentioned by name in this Wall Street Journal article describing a fourth lawsuit against JPMorgan and HSBC.  You can read that article &lt;a href="http://online.wsj.com/article/PR-CO-20101103-913285.html"&gt;here&lt;/a&gt;.  Apart from that brief mention in the WSJ, most newspapers either leave out mentioning Andrew Maguire by name, or they describe a "London based Metals Trader".&lt;br /&gt;&lt;br /&gt;For those who are curious about how the manipulation works, a fifth &lt;a href="http://www.gata.org/node/9462"&gt;class action lawsuit&lt;/a&gt; was just launched on December 28th, which describes in detail HOW the manipulation works and how JPMorgan and HSBC profited from it.  It also describes WHY they were in a unique position to do so.&lt;br /&gt;&lt;br /&gt;Let's pause for a moment and ask some questions. Why is it that only the NYPost originally reported on the story? Why didn't the NYPost continue to follow the story they claim to have broken after it developed? Why doesn't Reuters mention the name of Andrew Maguire?  Why don't the WSJ and Reuters print or mention the original e-mail transcript of the correspondence from Andrew Maguire to the CFTC?  Is this just a one-off oversight?&lt;br /&gt;&lt;br /&gt;Perhaps.&lt;br /&gt;&lt;br /&gt;But around the same time Bart Chilton's announcement was made, another interesting and related story came to light.  
Namely, The Washington Post &lt;a href="http://www.washingtonpost.com/wp-dyn/content/article/2010/10/19/AR2010101907216.html"&gt;reported&lt;/a&gt; that one of the two CFTC judges (Judge Painter) had announced his retirement and that he had issued an Order requesting that all seven of his open cases NOT be transferred to the other CFTC judge, Judge Levine.  I have quoted the key paragraph of the one page order (which you can read &lt;a href="http://www.scribd.com/doc/39746954/Judge-Painter-Notice-and-Order-dcpdf-1"&gt;&lt;span style="text-decoration: underline;"&gt;here&lt;/span&gt;&lt;/a&gt;) below so you can see for yourself:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;There are two administrative law judges at the Commodity Futures Trading Commission: myself and the Honorable Bruce Levine.  On Judge Levine's first week on the job, nearly twenty years ago, he came into my office and stated that he had promised Wendy Gramm, then Chairwoman of the Commission, that we would never rule in a complainant's favor.  A review of his rulings will confirm that he has fulfilled his vow.  Judge Levine, in the cynical guise of enforcing the rules, forces &lt;span style="font-style: italic;"&gt;pro se&lt;/span&gt; complainants to run a hostile procedural gauntlet until they lose hope, and either withdraw their complainant or settle for a pittance, regardless of the merits of the case. See Michael Schroeder, &lt;span style="font-style: italic;"&gt;If You've Got a Beef With a Futures Broker, This Judge Isn't for You - In Eights [sic] Years at the CFTC, Levine Has Never Ruled in Favor of an Investor&lt;/span&gt;, Wall St. J., Dec. 13, 2000, at A1 (copy attached).&lt;/blockquote&gt;&lt;br /&gt;In the last sentence of the previous paragraph, Painter cites an article that appeared on the front page of the Wall Street Journal back in 2000.  
You can read that article following Painter's Order &lt;a href="http://www.scribd.com/doc/39746954/Judge-Painter-Notice-and-Order-dcpdf-1"&gt;here&lt;/a&gt;.  It is a well researched investigation into the conduct of Judge Levine and sure enough does raise some important and relevant questions.  It's not entirely one sided, but it does put Levine on the defensive.  You would not be unreasonable to think the Wall Street Journal would publish a follow-up article vindicating their original investigation.  Right?&lt;br /&gt;&lt;br /&gt;Wrong.  The Wall Street Journal has taken an entirely different and rather underhanded tack.  Namely, Sarah Lynch penned an &lt;a href="http://online.wsj.com/article/SB10001424052702304011604575564610646663830.html"&gt;article&lt;/a&gt; which is a blatant smear on Judge Painter, describing him as a mentally ill alcoholic who would sleep at work and who was a failure in his private life.  Sandwiched in the middle of the article we find a very brief mention of Painter's Order against Levine, which is immediately followed by "Judge Levine declined to comment, but a former colleague defended Judge Levine's record and said he is fair."  The article continues to smear Painter to the very end, quoting a doctor who diagnosed him with "cognitive impairment, alcoholism and depression".  The tone and content are more in line with something you might read in The National Enquirer or US Weekly.  The article seems highly uncharacteristic and rather beneath a newspaper like the Wall Street Journal with such a distinguished history.&lt;br /&gt;&lt;br /&gt;Reading Lynch's WSJ article after reading Schroeder's original investigative article and the recent Washington Post article left me feeling stunned.  The only thing I will say in the article's defense is that it does help to explain why Painter kept this to himself for so long.  
I suspect his enemies were holding a grenade over his head, and clearly they've pulled the pin.&lt;br /&gt;&lt;br /&gt;I am not the only person who noticed this miscarriage of journalism.  Barry Ritholtz was equally disgusted and &lt;a href="http://www.ritholtz.com/blog/2010/10/judge-cftc-corrupt-wendy-gramm-criminal/"&gt;contacted Sarah Lynch for an explanation&lt;/a&gt;.  Lynch's response was summarized as:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Lynch wrote back to note she did a story on the judge last Friday,  but  it ran on newswires  but was not picked up by WSJ. (Reporters have  no  control over those  editorial decision). The current article is a   follow up to that prior piece.&lt;/blockquote&gt;&lt;br /&gt;This would all suggest that Lynch's article was taken out of context and that something immoral is going on within the higher ranks of the WSJ with respect to reporting around the CFTC and the silver manipulation story.&lt;br /&gt;&lt;br /&gt;******&lt;br /&gt;&lt;br /&gt;I recently went to see the movie "&lt;a href="http://en.wikipedia.org/wiki/Inside_Job_%28film%29"&gt;Inside Job&lt;/a&gt;".  The film documents the reasons that led up to the 2008 economic meltdown, the impact it has had so far, and why nothing has substantially changed in terms of policy and regulation. What surprised me was how interconnected the corruption is.  There is no grand conspiracy of the sort you might think is required to fake the moon landing, or plan 9/11 as an inside job.  Rather, the conspiracy is of incentives and motives which readers of &lt;a href="http://en.wikipedia.org/wiki/Freakonomics"&gt;Freakonomics &lt;/a&gt;will quickly recognize.&lt;br /&gt;&lt;br /&gt;The movie describes the interconnectedness between the big investment banks (e.g. Goldman Sachs, JP Morgan, Morgan Stanley, etc.), the Federal Reserve, the government executive branch (both Bush and Obama), the government regulators (i.e. SEC and CFTC), the rating agencies (i.e. 
Moody's, S&amp;amp;P, and Fitch), and even the prestigious business schools (e.g. Harvard, Columbia, Wharton, etc.).  On that last one, I have to admit I never realized that so many business professors were on the payrolls of these banks.  Actually, they're usually not on a direct payroll, but rather work as consultants through intermediaries like &lt;a href="http://www.analysisgroup.com/"&gt;Analysis Group&lt;/a&gt;.  Inside Job argues that influence into academia is so pervasive that most of these schools' curricula have morphed to reflect the will of the banks, and tend to take a very negative view of regulation.&lt;br /&gt;&lt;br /&gt;In light of this I would argue that the media establishment, which has become highly consolidated, is also susceptible to the same conspiracy of motives that has adversely influenced some of the world's most respected economics professors.  In this blog post I have attempted to present evidence supporting my theory.&lt;br /&gt;&lt;br /&gt;Newspapers like the Wall Street Journal are generally regarded as pillars of the fourth estate.  I no longer believe the WSJ should be trusted and I'm sure others feel the same way.  Since 2008, trust in governments, banks, rating agencies, business schools, and the media has been in steady decline.  This is not good for any society.&lt;br /&gt;&lt;br /&gt;I don't believe there is any silver bullet to resolve this.  However, by continuing to shine as much light as possible on media biases, I believe it is possible for us through &lt;a href="http://en.wikipedia.org/wiki/Citizen_journalism"&gt;Citizen Journalism&lt;/a&gt; and other forms of grass roots reporting to get out of these dark days, hold the media to account, and in turn hold the government, banks, and corporations to account.  
Looks like an uphill battle, but I'm an optimist.</description><link>http://hepburndata.blogspot.com/2010/12/decline-of-trust-follow-up-to-andrew.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-6495797451618635423</guid><pubDate>Mon, 23 Aug 2010 19:15:00 +0000</pubDate><atom:updated>2010-09-05T15:29:52.759-04:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">ETL</category><category domain="http://www.blogger.com/atom/ns#">Kettle</category><category domain="http://www.blogger.com/atom/ns#">Pentaho Data Integration</category><category domain="http://www.blogger.com/atom/ns#">PowerShell</category><category domain="http://www.blogger.com/atom/ns#">SSIS</category><title>Twelve Reasons to use an ETL tool / Experience Report on ETL tools: Pentaho DI, SSIS, and PowerShell</title><description>Over the past couple months I've been jumping between three different ETL tools (well PowerShell is not exactly an ETL tool but has some overlapping functionality).  The experience has given me new perspective on the strengths and weaknesses of each tool.  I hope to share with you my experiences, opinions, and recommendations.&lt;br /&gt;&lt;br /&gt;This report is not going to be as structured, or cover as many tools as something you might find from a Gartner report.  My approach here is to get into specific experiences with the tools as well as discuss why I think ETL tools are important to begin with.  My perspective is not intended to be a definitive decision making tool, but rather a useful component to the decision making process when choosing an ETL tool.&lt;br /&gt;&lt;br /&gt;Before getting into my experiences I'll give you a brief overview of what ETL is, and how these tools fit into the ETL landscape.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://en.wikipedia.org/wiki/Etl"&gt;ETL&lt;/a&gt; stands for "Extract Transform Load".  
ETL tools first started showing up in the mid nineties as a response to the growing demand for &lt;a href="http://en.wikipedia.org/wiki/Data_warehouse"&gt;data warehouses&lt;/a&gt;.  The first major ETL tool was &lt;a href="http://en.wikipedia.org/wiki/Informatica"&gt;Informatica&lt;/a&gt;, and it continues to be one of the best ETL tools available on the market today.  The main reason ETL tools were invented was that it has traditionally been time consuming and error prone to extract data from multiple source systems, merge those data together, and load them into a data warehouse for reporting and analytics.&lt;br /&gt;&lt;br /&gt;To this day, the majority of techies in IT are unaware of what an ETL tool is.  However, those same people are often tasked with solving the very problems ETL tools are designed to solve.  More experienced developers will achieve their goals through a combination of shell scripts and SQL scripts.  Less experienced developers will simply stay in their "comfort zone" and use whatever programming language they happen to be the most familiar with, with little regard for software maintenance and support.&lt;br /&gt;&lt;br /&gt;The task itself of writing a computer program to extract data, merge it (or perform some other transformation), and load those data into a target data warehouse is conceptually very simple for the average developer.  However, it is not until you start running into the following problems that you realize that the "follow your nose" approach doesn't work so well.  Here are some of the issues you will likely run into when maintaining ETL jobs for a data warehouse (I've run into them all).  In other words, here is what an ETL tool is designed for:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Code readability: This is probably the biggest difference between an ETL tool and pretty much every other programming language out there.  
It is driven off of a &lt;a href="http://en.wikipedia.org/wiki/Visual_metaphor"&gt;visual metaphor&lt;/a&gt;.  When I first saw an ETL job, it reminded me of one of those crazy &lt;a href="http://en.wikipedia.org/wiki/Rube_Goldberg_Machine"&gt;Rube-Goldberg machines&lt;/a&gt;.  However, once you acquaint yourself with the iconography, and what all the connecting lines mean, it becomes very easy to look at an ETL job and understand what it is doing, and how it works.  Unfortunately, many developers scoff at this visual approach; after all, "real coders write real code".  I also wonder if many developers are afraid of developing ETL jobs as it lays their code bare for all to see.  Managers on the other hand love this visual approach, as it allows them to scrutinize and partially understand how code is working.  Furthermore, an ETL tool will visually show your data coursing through it from one step to the next, much like observing water flow through tributaries into rivers into lakes.  When a full load takes several hours to complete, it is very reassuring to see what is actually going on with the data.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Data Element mapping: Much of what you're doing in data warehousing involves mapping a source data element to a target schema.  This is tedious work and it's potentially error prone - especially if you're lining up an INSERT statement with a SELECT statement.  ETL tools make it very easy and safe to line up and map data elements in a fraction of the time you would normally take.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Impact Analysis: Agility is often seen as a function of management.  However, if it takes you a month to figure out what has been impacted as a result of a minor schema change (e.g. adding a column to a table), then you're going to get stuck in your tracks before you have the chance to yell "Scrum!".  ETL tools make it easy to identify where data is being sourced from, and where it is being targeted.  
Some ETL tools will even produce impact reports for you.  Keep in mind that this is only one aspect of impact analysis, but every little bit helps.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Incremental loading: By default most developers will develop code to perform a full load on a data warehouse.  This code is not only easy to develop, but is fairly reliable.  Unfortunately, full loads can take a lot of time to complete, and as time goes on will take longer and longer.  Switching to an incremental load is tricky since it involves figuring out what data is new, changed, or has been removed.  Depending on your source system, your options may be limited.  Nevertheless, ETL tools provide functionality to assist with different scenarios.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Parallelism: Writing concurrent software is not for the "faint of heart".  There is an inherent overhead required to manage semaphores and other forms of inter-process communication.  Furthermore, the threat of deadlocks or livelocks in complex systems can often only be discovered through trial-and-error.  There are modern frameworks which greatly simplify parallelism in languages like Java or Python.  However, it is hard to argue against the simplicity of the ETL approach when developing parallel data processing applications.  Furthermore, some ETL tools (e.g. IBM DataStage Enterprise Edition) can even parallelize JOINs.  This makes it possible to outperform an in-place JOIN within an RDBMS like Oracle (most SQL queries still run single threaded).&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Checkpointing and job recovery: Imagine you have just developed a script to perform a large data warehouse load.  Let's say you run the script on a weekly basis, and the load takes 8 hours to run (you run it over night).  One morning you arrive to find the script has failed 80% into the job.  Some of the tables have been updated, but there are a few that haven't been updated.  
You can run the job from the beginning, but this is problematic since you need to back out all the data you have already inserted (otherwise you'll get duplicate data).  Alternatively, you can pinpoint the point of failure and run from there.  However, this requires a code change, which means you will need to test your modified code before getting it into production (and let's not forget the psychological pressure when people are waiting on fresh data).  ETL tools make it relatively easy to checkpoint your code, and resume from a failed step.  Some ETL tools have built-in checkpointing which means you don't have to even instrument your code for checkpointing. Furthermore, if code changes are required [due to a failed job], they are more often a change in configuration, rather than a real change in code, so it is generally safe to make a configuration change and resume the job with less risk than traditional scripting.  That said, not all ETL tools support checkpointing.  However, even ETL tools which don't explicitly support checkpointing can be more easily instrumented to checkpoint than traditional scripting or programming languages.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Logging and monitoring: I've already touched upon this point with respect to monitoring.  To reiterate, ETL tools allow you to see (seeing is believing) data flowing from one step to the next.  It's powerful data visualization that is not unlike a floor plan or map visualization.  Namely, it can communicate far more information than a bar chart or line chart ever will.  
As for logging, ETL tools by default centralize logs and implicitly log what you need to log (with configurable levels of granularity), without you having to instrument your code.&lt;/li&gt;&lt;li&gt;Centralized Error Handling:  Scripting languages don't provide centralized error handling (this has more to do with the OS and legacy applications), so you're forced to always explicitly check for errors and call the appropriate error handling routine.  ETL tools often include generic error handling routines which are invoked regardless of where the error originated or how it was raised.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://en.wikipedia.org/wiki/Slowly_changing_dimension"&gt;Slowly Changing Dimensions&lt;/a&gt;: Although I've listed this as #9, perhaps this should be #1 from a data quality perspective.  If you don't implement a Type II (or higher) SCD policy, your precious data warehousing investment will most certainly degrade over time.  Slowly Changing Dimensions are crucially important, as they allow you to show a consistent view of history.  I could go on at greater length as to what an SCD is, but if you're not sure just read the Wikipedia article.  At any rate, ETL tools have built-in SCD steps which allow you to define the technical key, the natural key, the effective begin and end date fields, and in some cases a version field and "current version" flag field.  Without an ETL tool SCDs are a major pain in the butt to manage, and developers will often not implement a Type II SCD policy, often because they aren't aware of what an SCD policy is, but also due to the added effort.  Don't forget the classic cop-out: "Hey, if the business analyst doesn't ask for it, I'm not going to build it".  SCDs are one of those classic business requirements that few people realize they need until it's too late.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Pivoting.  
It is increasingly popular to see data being stored and logged in what are known as &lt;a href="http://en.wikipedia.org/wiki/Entity-attribute-value_model"&gt;Entity Attribute Value&lt;/a&gt; data models.   I generally prefer to see explicit data models over generalized data models (which effectively treat the database as a "bit bucket").  But they're here to stay, and in some cases actually make sense.  However, it is virtually impossible to analyze EAV data without pivoting it back into columns.  This is another task which is well served through an ETL tool, and which can be very cumbersome if you intend to accomplish it through traditional scripting.  In short, an ETL tool allows you to specify the grouping ID (essentially the primary key for the row), the pivot key column, and the various pivot values you wish to map to columns.  Furthermore, if you need to extract data out of some kind of BLOB or something like an XML or JSON document, this too is easily achieved through standard ETL steps.  If you were to do this through scripting, you would probably have to rely on temporary tables and plenty of code.  In other words, just like with Slowly Changing Dimensions, there are fewer &lt;a href="http://en.wikipedia.org/wiki/Function_point"&gt;Function Points&lt;/a&gt; to worry about with an ETL tool.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;In-memory batch manipulation:  Often when manipulating data sets in bulk, many developers will rely on temporary tables, and from there perform batch operations against those temporary tables.  While this works, and is sometimes even necessary, ETL tools allow you to more easily process large batches of data as part of a continuous in-memory pipeline of records.  
It is this notion of a "data pipeline" which also distinguishes ETL from other programming paradigms, such as scripting approaches (even those scripting languages that support piping operations).&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Distributed transaction control: While it is possible to create distributed transactions using a transaction coordinator, often developers who eschew ETL tools will also avoid using other high level tools, such as a distributed transaction coordinator.  Some ETL tools include built-in integration with distributed transaction coordinators.  However, in my own experience I have never had the need for two-phase-commit in an ETL tool.  I suppose if I needed to ensure that my data warehouse was always 100% consistent, even during load times, then this makes sense.  However, I would urge caution when using transaction managers, since large transactions can easily lead to locking, not to mention they perform horribly under full load scenarios, and a single COMMIT statement can take hours (depending on how you've configured indexes).&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;Before I get into my current experiences with PowerShell, SSIS, and Pentaho DI (PDI), I'd like to point out that there is also something called ELT (Extract Load Transform).  ELT differs from ETL in that it tends to rely on the target &lt;a href="http://en.wikipedia.org/wiki/Dbms"&gt;DBMS &lt;/a&gt;to perform the transformations.  ELT tools achieve this through code generation.   Sometimes they are referred to as "code generating ETL" versus most other ETL which distinguishes itself with the moniker "engine-based ETL".  
ELT works like this:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Extract data from source database(s) using standard connectivity&lt;/li&gt;&lt;li&gt;Load data in bulk into target database&lt;/li&gt;&lt;li&gt;Generate and submit SQL DML statements within target database to achieve desired data transformations&lt;/li&gt;&lt;/ol&gt;Popular ELT tools include the open source Talend and Oracle Data Integrator (formerly Sunopsis). The strength of this approach is that the target DBMS can be leveraged to achieve superior performance.  Furthermore, it is possible to override the code generation and hand-tune the generated code.&lt;br /&gt;&lt;br /&gt;If you target a robust DBMS (e.g. Oracle), the results can be impressive.  However, since you are relying on the target DBMS, your options are more limited.  Yet this is not the main reason I am wary of the ELT approach.  My worry with ELT tools is that they can easily break the visual metaphor by relying too much on hand-written code.  Furthermore, during the transformation stages it is impossible to see what is going on (I prefer maximum transparency during long running transformations).  Over time, your ELT jobs can devolve into something resembling scripting.&lt;br /&gt;&lt;br /&gt;I should point out that when I first started developing transformations and jobs using Ascential DataStage (now an IBM product), I found that most data was being read from and written to the same Oracle database, where I was working.  At my workplace I was quickly able to run circles around most ETL developers using bread-and-butter SQL queries.  Indeed, Oracle has one of the best (possibly THE best) query optimizers out there, and few of my peers understood how to take advantage of DataStage's parallel extender, and were probably fairly average ETL developers.  But since that time I have come full circle and am a firm believer in the visual metaphor.  
Yes, I may sacrifice some performance, but at the time I was doing this I didn't have the perspective that I'm giving you in the aforementioned 12 reasons to use an ETL tool.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;PowerShell&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Getting back to my recent experiences, I'll start with PowerShell.  To be clear, PowerShell is NOT an ETL tool.  It's a shell scripting tool.  When I first discovered it (as part of MS SQL Server 2008) I was very impressed with what I saw.  There was a time many years ago when I was a proficient UNIX shell script programmer.  As any UNIX buff knows, mastery of the Bourne/Korn or C Shell (and all the standard command line tools) is the fastest route to becoming a UNIX power user.  Occasionally I'll still fire up Cygwin (I used to use MKS toolkit) and will pipe stuff through sed or awk.  Vi is still my favourite text editor.&lt;br /&gt;&lt;br /&gt;I was therefore very eager to embrace PowerShell as a potential ETL disruptor (one should always be on the lookout for disruptive technologies).  There's a lot to love about PowerShell.  As much as I will diss Microsoft for its monopolistic practices, they are one of the few true innovators of the back-office.  PowerShell is no exception.  Let me quickly point out some of the more powerful features of PowerShell:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Objects can be piped: Piping in UNIX is a way of life, but it can also be tedious as you're forced to serialize and deserialize streams of text, in order to achieve the desired result.  PowerShell allows you to pipe .NET objects from one shell app to the next.  For example, it's possible to pipe a result set, or directory listing as an object.  This comes in handy quite a bit.  
For example, it is very easy to retrieve a particular set of columns from a query's result set, without having to sift through and parse a whole bunch of text&lt;/li&gt;&lt;li&gt;Snap-ins can be used to move between shell contexts: Currently the only snap-ins I've used are for the OS shell, and the MS SQL Server shell.   The OS shell is the default shell, and listing "entities" will basically just list files and directories as you would normally expect.  But while in SQLServer mode, the shell lists databases and tables as though they were directories and files.  While most DBAs are accustomed to running database commands using a query shell window, PowerShell goes far beyond what is easily possible using a basic SQL (or T-SQL) query interface.&lt;/li&gt;&lt;li&gt;Parsing files is easy: Going back to the object pipes, it is also very easy to pipe the contents of a file, and parse those contents as a file object.  PowerShell has also reduced the number of steps required to open and read a file, making such operations second nature.  Let's face it, processing text files is as common an activity in shell processing as ever (e.g. you need to parse a log file for an error message), so it's very much a pleasure to use PowerShell for these tasks&lt;/li&gt;&lt;li&gt;Great documentation: It's not often I compliment software documentation, but I feel it should be acknowledged that Microsoft even rethought those dry UNIX "man" pages.  Perhaps it's just one technical writer at MS making this difference.  But whoever you are, hats off for having a good sense of humour and making the dry and technical a pleasure to read.&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;With all that said, I've decided that PowerShell in its current form (I used version 2.0) has some major drawbacks from an ETL perspective.  First off, by definition it's a "scripting language".  So by definition it will never follow the visual metaphor.  Depending on your perspective this may be a good thing.  
But I for one am a huge believer in code visualization.  While it is possible to run a script and examine log output in real time, it's not easy to tell what is happening, and where, when tasks are running in parallel, or what the relationships between those tasks are.&lt;br /&gt;&lt;br /&gt;The second major issue I have with PowerShell is that it simply cannot perform the function of an ETL tool.  Originally I had coded my ETL tasks as T-SQL Stored Procedures, which were basically wrappers for INSERT/SELECT statements.  Unfortunately under high volumes, these queries caused database locking to occur.  Apart from using TEMP tables, my only other option was to perform my JOINs within an ETL tool.  This all but makes PowerShell a complete showstopper from a data warehousing perspective when compared to an ETL tool.  Nevertheless, I did for a while attempt to call my individual ETL transformations within PowerShell, and instead treat PowerShell as a job controller.&lt;br /&gt;&lt;br /&gt;The third issue with PowerShell is that I couldn't find a simple way to log everything.  Depending on the type of error being raised I had to capture it in any one of three ways.  Furthermore, I basically had only one easy option for where to put the log: into a text file.&lt;br /&gt;&lt;br /&gt;Overall I am very impressed with PowerShell from the point of view that Microsoft has elevated scripting to the next level. But as an ETL job controller, it simply cannot do as good a job as an ETL tool like Pentaho DI or SSIS.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;SSIS&lt;br /&gt;&lt;span style="font-size:100%;"&gt;&lt;br /&gt;At first I wasn't overly impressed with SSIS.  It struck me as overly architected, and too reliant on scripting steps, thus undermining the all important visual metaphor.  I have come to believe that its architecture is more elegant than meets the eye, but I still believe it could use a heck of a lot more built-in steps.  
For example, the lack of an INSERT/UPDATE step is a glaring omission.  I also couldn't find any built-in step for hashing (which comes in very handy when comparing multiple columns for changes) or an explicit DELETE or UPDATE step (you must accomplish DELETEs and UPDATEs using an OLE DB transformation step).  SSIS also lacks a Slowly Changing Dimension step.   Instead, SSIS provides you with a wizard which will generate multiple steps for you out of existing components.  &lt;/span&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;I don't agree with Microsoft's approach here since it makes SCDs less configurable, and more reliant on moving parts.&lt;/span&gt;&lt;span style="font-size:100%;"&gt;  Contrast this with Pentaho DI which includes all of these steps, as well as supporting a single configurable Slowly Changing Dimension step.  That said, it is possible to add in third party steps, or simply code the step using the built-in scripting components.  However, the problem with third-party add-ins is that they complicate deployments.  And the problem with over reliance on scripting steps is that they can't be explicitly configured, rather logic must be coded. As I keep mentioning, scripting steps break with the visual metaphor, so you have to either read the step's label or open it up to know what it means (ETL iconography makes it easily possible to see what is going on in a single glance).&lt;br /&gt;&lt;br /&gt;Continuing my gripes with SSIS, there is some confusion as to what steps to use when it comes to database connectivity.  Namely, SSIS 2008 offers both OLE DB and ADO.NET connectors.  I'm currently using OLE DB, but should I be using ADO.NET now?  I'm not too concerned about that dilemma.  What drives me bananas is that they also include a SQLServer destination step.  It turns out that this step only works properly if you're running your SSIS server on the same server as your SQL Server.  
For me this is maddening, and I'm not sure why Microsoft even bothered including this step - I couldn't discern any difference in performance from the OLE DB destination step.&lt;br /&gt;&lt;br /&gt;Okay, that's the bad news.  The good news is that SSIS has some pretty nifty features making it worthwhile, especially if you're executing on a Microsoft strategy.  I'll list out the strong points of SSIS, and what sets it apart from Pentaho:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Checkpointing and transaction management: This is probably my favourite feature of SSIS, when compared to Pentaho DI.  Checkpointing is more or less what you would expect.  If a job fails part way through, SSIS records (in a local XML file) the point of failure, as well as all of the job variable values.  This allows an operator to investigate the cause of failure (typically by examining logs), take corrective action, and then restart the job from the point of failure.  In the world of ETL, this scenario is not uncommon.  On top of checkpointing, SSIS also makes it possible to contain multiple steps as part of a single transaction, so that if any step within the transaction fails, previous steps are rolled back.  This is accomplished through tight integration with Microsoft's Distributed Transaction Coordinator - a component of the Windows operating system.  I have experimented with this feature, and have got it to work successfully, although at first it was less than straightforward to configure.  That said, I've never had a need for this feature.  Also, this transaction management (from what I can tell) only works at the job level, and not at the transformation level, which is where it's more needed.  One last thing I should point out about checkpointing:  Don't confuse job checkpointing with incremental loading.  
Incremental loading (as opposed to full refreshes) requires the ETL developer to identify when data has been INSERTed, UPDATEd, or DELETEd since the last time the job was run, which requires upfront analysis and cannot be addressed solely by a tool.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Consolidated logging, and the ability to log to the Windows Event Log.  All ETL tools produce a single log.  Most ETL tools allow for varying levels of verbosity, all from a single setting.  SSIS goes one step further and nicely integrates with the Windows Event Viewer which, if you're a Microsoft shop, is a HUGE benefit.  As I mentioned earlier, logging was a sore spot for PowerShell.  So the difference between SSIS and PowerShell is day-and-night.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Consolidated error handling.  Typically when a job fails (for any reason), you want your ETL job to fire off an e-mail to technical support.  SSIS makes this very easy to do, as there is a single consolidated Error Handling job which you can develop for.  Both PowerShell and Pentaho DI appear to lack this feature.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;C# and VB.NET scripting steps: Most ETL tools include a scripting step.  This makes it possible to perform arbitrary transformations, as well as doing other tasks like adding or removing rows from the pipeline, or even adding and removing columns.  Pentaho DI relies on either Java or JavaScript for its main scripting step.  Personally, I'm fine with just the JavaScript step in Pentaho.  SSIS allows you to script in either VB.NET or C# (C# is only supported in SSIS 2008 or later) which, if you're already a .NET developer, is a major benefit.  Furthermore, it's significantly easier for that same code to be made into a standalone data flow step component (which gets you back to the visual metaphor).  
My only warning with SSIS's scripting step is that I can easily see developers overly relying on it to perform the majority of their transformations.  As I keep saying: the visual metaphor is the most powerful concept behind ETL, and dramatically lowers post-implementation costs, such as support costs, and the cost of impact analysis.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Persistent look-up cache: This is new with SSIS 2008.  This feature allows you to physically store data in a local binary file optimized for keyed look-ups. I've only gone so far as to test that this thing works, and can be used in a look-up step.  I'm not exactly sure what the primary business driver is behind this feature.  It would be nice if Microsoft would elaborate on what the recommended use cases are, and just as importantly, where NOT to use this feature.  My concern is that developers may use this feature in bulk look-up situations, where a sort/merge/join would perform just as well if not better.  Compare this to Pentaho DI, which offers a "stream lookup" step.  Can't say which approach is better though.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Excellent SQLServer integration: This goes without saying.  My only gripe here is the dedicated SQLServer target step which appears only to work if the job is running on the same server as the SQLServer DB.  I found out the hard way that it won't work otherwise. While I could go on about some of the subtle benefits of the SQLServer integration (e.g. you can easily configure batch INSERTs into cluster indexed tables to perform very well), the biggest advantage is the same advantage you get whenever you homogenize technology under a single vendor.  Namely, by using SSIS and SQLServer together, if something goes wrong Microsoft can't blame some other vendor.  Nor can they say "well, it's really not designed to work for _that_ database."  
These games that vendors will play with you are the main reason why IT departments prefer homogeneous architectures.  On the downside, there is also very tight MS Excel integration. As of this writing though, this integration doesn't work when running SSIS in 64-bit mode.  Maybe with Office 2010 this will be fixed now, since Office 2010 is the first version of Office to ship a 64-bit version.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Dynamic step configurations using expressions: SSIS includes a declarative expression language which allows you to dynamically configure most settings.  The most obvious use of expressions is to configure the database connection string(s), based on a variable value (which is set at run-time).  At a higher level, expressions make it much easier to develop generic packages/jobs which share the same code base, but have different source data stores and target data marts.  I feel Pentaho DI is lacking in this department.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:100%;"&gt;Built-in text analytics steps.  This is one of the first things I noticed about SSIS.  Basically there are two Data Flow (i.e. transformation) steps which support basic text analytics.  The first is a keyword extractor.  Namely, the step can be configured to extract relevant keywords from documents.  These keywords can then be used to classify documents, making them easier to search against and report on.  The second text analytic step allows you to search a given set of text against a list of keywords retrieved from a configurable database query.  This step is very useful for doing things like sentiment analysis (i.e. determining if a block of text includes words like "cool" or "fun" versus words like "sucks" or "fail"), but it could also be used to flag sensitive information, such as people's names.  
While Pentaho DI does not include any built-in steps like these, it does have a regular expression step (which SSIS does not explicitly include), but if I had to choose, I prefer SSIS's text analytics functionality.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-size:100%;"&gt;So as you can see SSIS has some very powerful features.  If you're already a Microsoft shop and already use MS SQLServer, and have a pool of developers and application support personnel already trained on Microsoft's technology, I doubt you'll find a better ETL tool to meet your requirements.  Even if you're not using SQLServer but are doing most of your development on a Microsoft stack, it's probably a very good fit.  However, if you're on a non-Microsoft platform, there are many other options you should consider.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Pentaho Data Integration (PDI)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Which brings me to Pentaho DI, one of the most platform neutral ETL tools out there.  I am an unabashed fan of Pentaho, partly, I'll admit, because I have quite a bit of experience working with it.  Before I get into what it is I like and dislike about Pentaho DI I'll explain how it's different from other ETL tools at a very high level.&lt;br /&gt;&lt;br /&gt;Pentaho DI is an "Open Source" ETL tool, based on the Kettle project.  However, unlike Linux which has many different flavours and supporting vendors, there is only one vendor, Pentaho, that supports the tool.  This is not unlike MySQL which was supported only by Sun and now Oracle.  I'm fine with that.&lt;br /&gt;&lt;br /&gt;Just to digress for a moment, years ago I read "&lt;a href="http://en.wikipedia.org/wiki/The_Cathedral_and_the_Bazaar"&gt;The Cathedral and the Bazaar&lt;/a&gt;" which is basically a manifesto for open source development.  
It argues that people will self-select projects to work on, and this will drive the future of software development.   I don't agree with the overarching thesis, but there are merits to this argument.  The problem with volunteer based projects is that people tend to volunteer for selfish reasons.  In the case of Linux, it is seen by many as a way of combating Microsoft's perceived tyranny.  Shared adversaries are after all the strongest bonds between humans (I heard that from an anthropologist, and believe it to be true).  When it comes to other forms of crowdsourcing, people often derive enjoyment from contributing.  For example, I've contributed to &lt;a href="http://www.openstreetmap.org/"&gt;Open Street Map&lt;/a&gt; adding in streets and landmarks in my neighbourhood.  If I had more time on my hands I'd probably be doing more of it.  It's kind of fun.  I can also see how millions of other people might find Open Street Map fun.  And let's not forget how popular Wikipedia is.  But the problem is that most of the work people get paid to do is not necessarily fun - even in the software world.  Some things need to get done, even if nobody wants to do them.  The other problem with open source projects is that until they reach a critical mass, competing projects can pop up, which not only dilutes the overall pool of talent but also reduces the likelihood of any one project succeeding.  For example, if there were many different open source operating systems at the time Linux began to emerge, all competing with each other, would we have Linux today?  I'm not so sure.  My point in all this is that I support companies like Pentaho backing open source projects, even if that means more corporate control over the project.  The important thing is that the entire source code is available, and anyone can contribute to that source code.  
This therefore puts Pentaho in the position of being a _service_ company, which is really the future of software any way you look at it.&lt;br /&gt;&lt;br /&gt;Okay, moving on to Pentaho DI itself.  First, the bad news.  As with any piece of software, there are bugs.  Now, if you have a support contract with Pentaho, this is never a problem as they will provide you with a patch (and you will receive intermediate patch releases).  It's also possible to install those patches yourself (or fix the bug yourself), but that involves building the application from source files. That said, all of the bugs I've encountered can be worked around.  And to be fair, I haven't found any bugs yet in the latest version (version 4.0).&lt;br /&gt;&lt;br /&gt;Now on to the good news.  Here is what I love about Pentaho DI:&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-size:100%;"&gt;Plenty of built-in steps:  Although I've had to resort to the JavaScript scripting step on more than one occasion (version 4.0 now includes a Java scripting step), most transformations can be entirely built with the built-in steps.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-size:100%;"&gt;Web ETL steps built-in:  PDI comes with several web extraction steps, including GET and POST lookups, WSDL lookups, and a step to check if a web service is even available.  There is also a step for decoding XML documents.  For JSON documents, I just use the JavaScript step (it's not too hard to find JavaScript code to decode JSON documents).  I mention all of these steps because I believe web ETL is a big part of the future of ETL.  This is especially true now with so many data services out there.  
Contrast this to SSIS, which is rather weak in this department.&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-size:100%;"&gt;Single Type II Slowly Changing Dimension step: Unlike SSIS (which doesn't have a single SCD step, but rather uses a wizard to generate multiple steps), PDI includes a single Type II SCD step.  I've used it quite a bit, and it works very well and is simple and straightforward to configure.&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-size:100%;"&gt;Normalized database repository:  The other ETL tools I have used (SSIS and DataStage) will allow you to save jobs in the form of a file, or within a database.  However, I find those files to be relatively opaque and difficult to query (I suppose with the right tools it would be possible to query them).  PDI on the other hand stores all of its Jobs and Transformations in a normalized database.  This makes it possible to query your ETL code using standard SQL queries.  I have taken advantage of this by writing queries to produce simple impact analysis reports, showing which tables are impacted by which jobs.  Since impact analysis is a huge aspect of data Enterprise Architecture, this is a real benefit.  Furthermore, because PDI jobs are stored within a centralized RDBMS database, I find that it is much easier to deploy and troubleshoot jobs in production than it is with SSIS.  To be fair, SSIS does allow packages to be deployed to SQLServer, but they're really just deployed as an opaque "BLOB" which can't be easily queried.&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-size:100%;"&gt;Java based (runs on any platform): There was a time when Java applications were not considered appropriate for applications requiring high performance, such as an ETL tool.  
Well, to my surprise, Java has caught up pretty closely to binary native compiled code.  As such, the benefits of "write once, run anywhere" are starting to come into focus.  Indeed, I have run PDI on both Windows and Linux machines without incident. I also can't tell the difference between the performance of SSIS versus PDI.&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size:130%;"&gt;&lt;span style="font-size:100%;"&gt;Clustering/Cloud deployment: I honestly don't have any hands-on experience deploying PDI as part of a cluster, or within a cloud.  I have, however, heard some impressive benchmarks, with reports of over 200,000 records processed per second within a single transformation.  Because it is possible to cluster PDI servers, it is possible to scale individual ETL transformations out across multiple server nodes.  Furthermore, Pentaho is now beginning to offer cloud based solutions.  This is also something I have no hands-on experience with, but I like what I'm hearing.  Contrast this with SSIS, which is not "cluster aware" and must be installed as "standalone" in a Windows Cluster.  To be fair, I believe part of the reason for SSIS's shortcoming is that it is more transaction aware than PDI, and with that comes certain challenges.  I also think that it is possible to run SSIS as part of a cluster, but Microsoft does not recommend this.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;Well, that's my experience report.  Hope you learned something.  Feel free to e-mail me or comment with any corrections.  It's quite possible I've made an incorrect assumption, so if any PowerShell, SSIS or PDI developers are reading this, don't hesitate to chime in.  
I'll happily update this blog if I agree there is a problem with something I've written.</description><link>http://hepburndata.blogspot.com/2010/08/twelve-reasons-to-us-etl-tool.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>3</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-823127920195033533</guid><pubDate>Tue, 20 Jul 2010 01:43:00 +0000</pubDate><atom:updated>2010-07-19T22:49:52.934-04:00</atom:updated><title>The Tooth Fairy, Easter Bunny, and the Long Form Census Privacy Breech</title><description>"What do you think of the long form census?"  I ask inquisitively, adding "You know, the census form which one out of every five persons needs to fill out, in order to provide information about stuff like ethnicity, religion, and income".&lt;br /&gt;&lt;br /&gt;"I don't want to give out information to the government if I don't have to."  I'm sternly told.&lt;br /&gt;&lt;br /&gt;"Why not?" I ask politely.  "It will never get released." I elaborate: "I've worked quite a bit with the Canadian census and with StatCan.  I know from dealing with them that it is impossible for them to release any private information."  I qualify my argument further "StatCan has strict privacy regulations preventing them from releasing identifiable information of either an individual or a business.  It's the law."  I go on "they have a 'rule of 3' whereby information must be aggregated to at least 3 parties. Furthermore, they also have a 'round to 5 rule', whereby all respondent counts are rounded to the nearest 5".  I natter on some more "there has never been a privacy breach, even for a terrorism related investigation.  The chief statistician reports directly to the prime minister of Canada.  The OECD ranks StatCan as the best national statistics body in the world, and their workers take great pride in all of this, and see privacy protection as their 'Prime Directive' - I know these guys."&lt;br /&gt;&lt;br /&gt;No change.  
"I don't care.  You never know what they'll do with this information.  That's just what they tell you."&lt;br /&gt;&lt;br /&gt;Hmmm, I had to dig deeper.  "But they 'the big bad government' [not StatCan] already know how many bedrooms you have, where you live, what you buy, who your friends are.  If you believe they are corrupt, they can access bank records, your Internet history, your telephone calls, your property records.  Even I can bring up Google Street View and take a look at your house.  I can browse your Facebook profile [which Facebook surreptitiously has opened up unbeknownst to many].  They can check your vehicle registration, check your credit rating, your insurance premiums, your credit and debit card purchases.  All they need to do is send a bunch of faxes and make a few phone calls.  They have the authority already.  You're too late."  Feeling a bit frustrated, I become more forceful in my argument "The only way you can get around them is by living like the Unabomber, Ted Kaczynski.  You would have to abandon your home and all your possessions and start again.  You'd have to drive up to northern Ontario under the cover of night, build a log cabin, and live there for the rest of your life, living off the land - even making your own soap.  And there's no guarantee that will work."&lt;br /&gt;&lt;br /&gt;It's no use, his mind is made up.  No counterargument, just "I don't want to give out my personal information to the government."&lt;br /&gt;&lt;br /&gt;I give up on this conversation and change the subject.  I can see it's going nowhere.  Thank goodness for blogs...&lt;br /&gt;&lt;br /&gt;The problem is, without a census, it's really difficult to make educated policy decisions.  The census was specifically designed to be analyzed in aggregate.  It was not and is not intended to be used as a repository of personal information.  
Detail records are nonetheless required in order to generate accurate pictures at a higher level, and these analyses are almost always based around geography.&lt;br /&gt;&lt;br /&gt;As you may well be aware, our Prime Minister, Stephen Harper, has single-handedly dismantled one of the most essential and proud pieces of Canadian infrastructure: the Canadian Census Long Form.  As such, he is hobbling all levels of government, including provincial and municipal, not to mention the thousands of businesses that depend on this information to make informed decisions like where to open their storefront.  To make matters worse, he is greatly hindering foreign investors from investing in Canada.  The biggest obstacle to making any kind of investment is the lack of basic information.  Canada has traditionally been a very safe place for companies to invest, in large part due to the educated work force and middle class consumers.  However, if you don't know where that workforce lives, or where those consumers are, it's a lot tougher to justify those multimillion dollar investments.&lt;br /&gt;&lt;br /&gt;But the reason why Harper has decided to drop the Long Form census is that he knows there are a lot of people who have unfounded and irrational fears of the Census Long Form.  Many of those people are immigrants who come from countries with tyrannical governments that do abuse their power (although ironically these governments also don't need the census to locate and persecute).&lt;br /&gt;&lt;br /&gt;Stephen Harper understands this hysteria all too well, and is using it to bolster support for himself and his government, facts be damned.  And in selling this hatchet job, he is using every dirty trick in the book.  Last week the line was "the government doesn't need to know how many bedrooms you have in your house" - a clear attempt to conflate the census with "the government should stay out of your bedroom", a common tag-line that comes up when sexual liberties are being discussed.  
Today it was "why does the government need to infringe on people's privacy?", a line that Glenn Beck or Bill O'Reilly may attempt to deliver with a straight face.&lt;br /&gt;&lt;br /&gt;There is something also very "meta" about all this.  Harper is effectively abandoning Evidence Based Decision making in order to... have everyone else abandon Evidence Based Decision making.  It is shameless fear mongering, and it is bad for everyone, including those who don't like filling out forms.&lt;br /&gt;&lt;br /&gt;On that note, I would like to conclude with a plea to reach out and help people understand why the census is so important, and why privacy fears are unfounded.  A good start is to keep this conversation going.  This &lt;span style="font-style: italic;"&gt;should &lt;/span&gt;be our prime minister's job.  Instead he would rather we be fearful of the bogeyman and hide under our beds.&lt;br /&gt;&lt;br /&gt;Pathetic.</description><link>http://hepburndata.blogspot.com/2010/07/tooth-fairy-easter-bunny-and-long-form.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-8524213278155940408</guid><pubDate>Sat, 08 May 2010 02:47:00 +0000</pubDate><atom:updated>2010-05-08T20:18:31.294-04:00</atom:updated><title>The Curious Case of Andrew Maguire / My challenge to serious journalists</title><description>In August 2008 economic world history was made, as a major world record was smashed.  You probably never heard this news - even if you read the business section of a major newspaper, or work in the finance industry.  Instead, what you may have heard was the consequence of this news.  Namely: gold and silver prices hitting a major slump.&lt;br /&gt;&lt;br /&gt;But behind those price slumps was the biggest and most concentrated naked short position ever held for any commodity.  To give you some numbers, world silver production is 680 million oz annually.  
The short position of 3 (or fewer) US banks increased 11 fold from July to August, with a naked short position of 169 million oz.  That's 25% of all annual silver production!  Gold followed a similar pattern and the same 3 banks increased their short position in August to account for 11% of all annual gold production. For those that aren't familiar with "&lt;a href="http://en.wikipedia.org/wiki/Short_sell"&gt;short selling&lt;/a&gt;", it's the practice of selling something NOW (that you have borrowed), and paying for it in the future.  So, if the price goes down you make a profit.  If the price goes up you get "squeezed" and have to take a loss.&lt;br /&gt;&lt;br /&gt;If you'd like to see the numbers for yourself, they're all available in the official Bank Participation Report for &lt;a href="http://www.cftc.gov/dea/bank/dmojul08f.htm"&gt;July &lt;/a&gt;and &lt;a href="http://www.cftc.gov/dea/bank/deaaug08f.htm"&gt;August&lt;/a&gt;.  While it is possible that the banks themselves were not ordering the short-selling (but rather possibly clients), it is very unlikely that all short-sellers just happened to go to the same bank.  It should be noted that this short-selling of silver caused the price of silver to halve from $20 down to $10.&lt;br /&gt;&lt;br /&gt;So why would these banks be betting so heavily against silver and gold?  Only the banks themselves know for sure, and they aren't saying much.  Nevertheless, things worked out well for them: by dumping so much bullion on the market at once, they in effect triggered a larger sell-off (in part due to &lt;a href="http://en.wikipedia.org/wiki/Stop_loss_order#Stop_orders"&gt;stop loss orders&lt;/a&gt;) which created a prophecy that fulfilled itself.  These banks more than likely generated huge profits.&lt;br /&gt;&lt;br /&gt;Not surprisingly many people (especially "&lt;a href="http://en.wikipedia.org/wiki/Gold_bug"&gt;Gold Bugs&lt;/a&gt;") cried foul and claimed there was market manipulation.  
Independent Toronto based investor &lt;a href="http://harveyorgan.blogspot.com/"&gt;Harvey Organ&lt;/a&gt; submitted a &lt;a href="http://www.capitolconnection.net/capcon/cftc/032510/Presentations/Panel%204/CFTC%20Hearing%20on%20Metals%20Harvey%20Organ.pdf"&gt;written statement to the CFTC&lt;/a&gt;, which he accompanied with oral testimony at their March 25th hearing.  You can see the video &lt;a href="http://www.capitolconnection.net/capcon/cftc/032510/cftc-archive-wmv.htm"&gt;here &lt;/a&gt;(skip to 4:40:00 into the webcast).  Silver analyst Ted Butler also wrote a piece on this event referring to it as &lt;a href="http://news.silverseek.com/TedButler/1219417468.php"&gt;The Smoking Gun&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Of course this doesn't actually &lt;span style="font-style: italic;"&gt;prove &lt;/span&gt;there was any manipulation going on.  A common retort to the conspiracy theorists was "If there was a conspiracy, surely &lt;span style="font-style: italic;"&gt;somebody &lt;/span&gt;would spill the beans".  Well, as if a Sasquatch were to walk out of the forest and into broad daylight, so arrived Andrew Maguire, a London based metals trader who mainly trades in the silver market.  Maguire was privy to a very small network of metals traders (ostensibly with JP Morgan Chase at the helm) who would co-ordinate "takedowns" in the silver market through the aforementioned "&lt;a href="http://en.wikipedia.org/wiki/Naked_short"&gt;naked short selling&lt;/a&gt;" of huge concentrated quantities of silver contracts.  Whether this constitutes long term or short term manipulation is certainly up for debate.  But what Mr. Maguire's revelations appear to show (if they are true) is that manipulation is happening on a fairly regular basis, and at the great expense of honest investors.  
It should be pointed out that Maguire himself profited from these manipulations and is said to be a wealthy man.&lt;br /&gt;&lt;br /&gt;For several months, Andrew Maguire provided evidence to the &lt;a href="http://en.wikipedia.org/wiki/Cftc"&gt;Commodity Futures Trading Commission&lt;/a&gt; (this is the regulatory body [similar to the SEC] that regulates commodities trading, including gold and silver) that demonstrated how the manipulations were orchestrated, and how they played out.  I have pasted the entire e-mail trail (which was later leaked to the &lt;a href="http://gata.org/"&gt;Gold Anti-trust Action Committee&lt;/a&gt;) at the bottom of this blog.  However, as you can hear in this &lt;a href="http://kingworldnews.com/kingworldnews/Broadcast/Entries/2010/3/30_Andrew_Maguire_&amp;amp;_Adrian_Douglass.html"&gt;audio interview&lt;/a&gt;, Maguire himself was prevented from attending the CFTC hearing on bank position limits on March 25th.  Nevertheless, Adrian Douglas and Bill Murphy (of GATA) were able to attend, and were able to provide testimony on Maguire's behalf.  You can see the footage on YouTube &lt;a href="http://www.youtube.com/watch?v=Sl2zi3khUFI"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;But for me, what's most disturbing is that there has been no media coverage from any major newspaper or television network.  And what's even weirder is that if you listen to the interview it's stated that interviews were lined up with Bloomberg and Reuters, but were both cancelled within 24 hours with no explanation.  It is this fact alone that has prompted me to write this blog entry.&lt;br /&gt;&lt;br /&gt;I have spent the last four weeks researching this, and so far all signs point to market manipulation.  In fact after a while it becomes hard to understand how the market could not be manipulated.  To make matters worse, if you read the prospectuses on Gold and Silver ETFs (e.g. 
here is the GLD &lt;a href="http://www.spdrgoldshares.com/media/GLD/file/SPDRGoldTrustProspectus.pdf"&gt;prospectus &lt;/a&gt;and &lt;a href="http://us.ishares.com/content/stream.jsp?url=/content/repository/material/prospectus/silver.pdf&amp;amp;mimeType=application/pdf"&gt;SLV &lt;/a&gt;prospectus) your eyes will pop out at how complex and risky they appear to be.  In fact just this week an excellent in depth investigation into these prospectuses was released which showed that there are major undisclosed conflicts of interest for JPMorgan (the custodian for SLV) and HSBC (the custodian for GLD). The report goes on to suggest that the ETF has been architected to effectively avoid any obligations around reporting these conflicts of interest.  In some regards you could say if you wanted to design an investment product that avoids all forms of regulation, but which can be sold by trusted investment banks, you couldn't do a better job than SLV and GLD.  What's also weird is that in Canada I can't even find the prospectus for IGT (the Canadian equivalent of GLD) on the Canadian iShares web site.  And a search conducted just now on the Canadian ishares site for "IGT" returns nothing (but it was there in their web site 2 days ago, and a mention can still be found in the PDF).  For a complete analysis of the SLV and GLD ETFs, I highly recommend you read the well researched &lt;a href="http://solari.com/archive/Precious_Metals_Puzzle_Palace/"&gt;report&lt;/a&gt; by Catherine Austin Fitts and Carolyn Betts.&lt;br /&gt;&lt;br /&gt;One of the interesting items brought up in this report is the conflict of interest that the US Government itself has with respect to this manipulation.  
I quote the following paragraph:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;The absence of these disclosures is particularly disturbing given the potential conflicts between the banks' responsibilities serving as custodians and trustee and their responsibilities and liabilities as members and shareholders of the Federal Reserve Bank of New York. The NY Fed serves as the depository for the US government and as agent for the Exchange Stabilization Fund on behalf of the U.S. Secretary of Treasury. The ESF allows the Secretary of the Treasury to deal in gold, foreign exchange, and other instruments of credit and securities. NY Fed member banks typically serve as agents of the NY Fed in providing services. In addition, JPMorgan Chase and HSBC maintain responsibilities as Primary Dealers of U.S. government securities. When called upon to defend the U.S. government’s interests in the bond market, or the U.S. dollar’s interest in the currency market, or to help prevent another financial meltdown, whose interests will be primary? Will it be the central bank and government with pressing national security interests or retail investors?&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;I was curious to see what, if anything, the US government has to say about the obvious decline in silver prices due to that massive short sell in August of 2008.  As such I referred to the &lt;a href="http://minerals.usgs.gov/minerals/pubs/commodity/silver/"&gt;US Geological Survey Annual Summary on silver&lt;/a&gt; for 2008 and 2009 (released in the &lt;a href="http://minerals.usgs.gov/minerals/pubs/commodity/silver/mcs-2009-silve.pdf"&gt;2009&lt;/a&gt; and &lt;a href="http://minerals.usgs.gov/minerals/pubs/commodity/silver/mcs-2010-silve.pdf"&gt;2010&lt;/a&gt; reports, respectively).  Given that they comment on changes in the average price of silver, I was curious to see what they would say about the precipitous drop in average price for 2009.  
Well, it's either complete ignorance or Orwellian (depending on your point of view), but the report states (contradicting its own reported numbers) that the price actually went &lt;span style="font-style: italic;"&gt;up&lt;/span&gt; in 2009.  And by up, they clearly mean &lt;span style="font-style: italic;"&gt;down&lt;/span&gt;.  Cue Seth Meyers and Amy Poehler: "Really USGS!  Really!!!"&lt;br /&gt;&lt;br /&gt;To be fair, I have found the following &lt;a href="http://www.financialsense.com/editorials/2010/0423.html"&gt;rebuttal&lt;/a&gt; from Jeffrey Christian of The &lt;a href="http://www.cpmgroup.com/main.php"&gt;CPM Group&lt;/a&gt; (a consultancy which in part serves the bullion banks).  While Mr. Christian's points are valid, they don't do much to put my mind at ease.  Firstly, he mainly rebuts "red herring" arguments that aren't important, and doesn't bother to comment on the heart of Mr. Maguire's allegations or the activity in August 2008.  Secondly, none of his explanations use much real data, relying instead on abstractions and providing scant details of any names, numbers, or dates.  I am skeptical of his sincerity in getting to the bottom of this problem.  I think a truly dedicated consultant would say "Hey, I don't know for sure, but let's figure out how to get these facts".  He doesn't go there.&lt;br /&gt;&lt;br /&gt;So what's my point?  Why am I blogging this?&lt;br /&gt;The reason I am blogging this is that I simply cannot figure out why no serious journalist has investigated Andrew Maguire's story.   I mean let's face it: either this is all one big hoax (which is a story unto itself), or this is part of a massive fraud involving rigged markets and major international players.  This is the kind of story journalists dream of.  
What's going on here people!&lt;br /&gt;&lt;br /&gt;So, before I post a copy of Andrew Maguire's e-mails to the CFTC, I put forth the following challenge to any and all serious journalists out there:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Obtain a copy of or confirm or deny the existence of conversations between Andrew Maguire and the CFTC&lt;/li&gt;&lt;li&gt;Obtain a comment from the CFTC on their interactions with Andrew Maguire&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Obtain a comment from the five (5) Bullion banks on the massive short of silver for August 2008 clarifying whether they were behind the short or not, and why&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Record an interview with Andrew Maguire to learn more about his relationship with the bullion banks.&lt;/li&gt;&lt;/ol&gt;And so I end this blog post with a copy of the allegedly leaked e-mails between Maguire and the CFTC.  Enjoy!&lt;br /&gt;&lt;br /&gt;----&lt;br /&gt;From: Andrew Maguire&lt;br /&gt;Sent: Tuesday, January 26, 2010 12:51 PM&lt;br /&gt;To: Ramirez, Eliud [CFTC]&lt;br /&gt;Cc: Chilton, Bart [CFTC]&lt;br /&gt;Subject: Silver today&lt;br /&gt;&lt;br /&gt;Dear Mr. Ramirez:&lt;br /&gt;&lt;br /&gt;I thought you might be interested in looking into the silver trading today. It was a good example of how a single seller, when they hold such a concentrated position in the very small silver market, can instigate a selloff at will.&lt;br /&gt;&lt;br /&gt;These events trade to a regular pattern and we see orchestrated selling occur 100% of the time at options expiry, contract rollover, non-farm payrolls (no matter if the news is bullish or bearish), and in a lesser way at the daily silver fix. I have attached a small presentation to illustrate some of these events. 
I have included gold, as the same traders to a lesser extent hold a controlling position there too.&lt;br /&gt;&lt;br /&gt;Please ignore the last few slides as they were part of a training session I was holding for new traders.&lt;br /&gt;&lt;br /&gt;I brought to your attention during our meeting how we traders look for the "signals" they (JPMorgan) send just prior to a big move. I saw the first signals early in Asia in thin volume. As traders we profited from this information but that is not the point as I do not like to operate in a rigged market and what is in reality a crime in progress.&lt;br /&gt;&lt;br /&gt;As an example, if you look at the trades just before the pit open today you will see around 1,500 contracts sell all at once where the bids were tiny by comparison in the fives and tens. This has the immediate effect of gaining $2,500 per contract on the short positions against the long holders, who lost that in moments and likely were stopped out. Perhaps look for yourselves into who was behind the trades at that time and note that within that 10-minute period 2,800 contracts hit all the bids to overcome them. This is hardly how a normal trader gets the best price when selling a commodity. 
Note silver instigated a rapid move lower in both precious metals.&lt;br /&gt;&lt;br /&gt;This kind of trading can occur only when a market is being controlled by a single trading entity.&lt;br /&gt;&lt;br /&gt;I have a lot of captured data illustrating just about every price takedown since JPMorgan took over the Bear Stearns short silver position.&lt;br /&gt;&lt;br /&gt;I am sure you are in a better position to look into the exact details.&lt;br /&gt;&lt;br /&gt;It is my wish just to bring more information to your attention to assist you in putting a stop to this criminal activity.&lt;br /&gt;&lt;br /&gt;Kind regards,&lt;br /&gt;Andrew Maguire&lt;br /&gt;&lt;br /&gt;* * *&lt;br /&gt;&lt;br /&gt;From: Ramirez, Eliud [CFTC]&lt;br /&gt;To: Andrew Maguire&lt;br /&gt;Sent: Wednesday, January 27, 2010 4:04 PM&lt;br /&gt;Subject: RE: Silver today&lt;br /&gt;&lt;br /&gt;Mr. Maguire,&lt;br /&gt;&lt;br /&gt;Thank you for this communication, and for taking the time to furnish the slides.&lt;br /&gt;&lt;br /&gt;* * *&lt;br /&gt;&lt;br /&gt;From: Andrew Maguire&lt;br /&gt;To: Ramirez, Eliud [CFTC]&lt;br /&gt;Cc: BChilton [CFTC]&lt;br /&gt;Sent: Wednesday, February 03, 2010 3:18 PM&lt;br /&gt;Subject: Re: Silver today&lt;br /&gt;&lt;br /&gt;Dear Mr. Ramirez,&lt;br /&gt;&lt;br /&gt;Thanks for your response.&lt;br /&gt;&lt;br /&gt;Thought it may be helpful to your investigation if I gave you the heads up for a manipulative event signaled for Friday, 5th Feb. The non-farm payrolls number will be announced at 8.30 ET. There will be one of two scenarios occurring, and both will result in silver (and gold) being taken down with a wave of short selling designed to take out obvious support levels and trip stops below. 
While I will no doubt be able to profit from this upcoming trade, it is an example of just how easy it is to manipulate a market if a concentrated position is allowed by a very small group of traders.&lt;br /&gt;&lt;br /&gt;I sent you a slide of a couple of past examples of just how this will play out.&lt;br /&gt;&lt;br /&gt;Scenario 1. The news is bad (employment is worse). This will have a bullish effect on gold and silver as the U.S. dollar weakens and the precious metals draw bids, spiking them higher. This will be sold into within a very short time (1-5 mins) with thousands of new short contracts being added, overcoming any new bids and spiking the precious metals down hard, targeting key technical support levels.&lt;br /&gt;&lt;br /&gt;Scenario 2. The news is good (employment is better than expected). This will result in a massive short position being instigated almost immediately with no move up. This will not initially be liquidation of long positions but will result in stops being triggered, again targeting key support levels.&lt;br /&gt;&lt;br /&gt;Both scenarios will spell an attempt by the two main short holders to illegally drive the market down and reap very large profits. Locals such as myself will be "invited" on board, which will further add downward pressure.&lt;br /&gt;&lt;br /&gt;The question I would expect you might ask is: Who is behind the sudden selling and is it the entity/entities holding a concentrated position? How is it possible for me to know what will occur days before it will happen?&lt;br /&gt;&lt;br /&gt;Only if a market is manipulated could this possibly occur.&lt;br /&gt;&lt;br /&gt;I would ask you watch the "market depth" live as this event occurs and tag who instigates the move. 
This would surly help you to pose questions to the parties involved.&lt;br /&gt;&lt;br /&gt;This kind of "not-for-profit selling" will end badly and risks the integrity of the COMEX and OTC markets.&lt;br /&gt;&lt;br /&gt;I am aware that physical buyers in large size are awaiting this event to scoop up as much "discounted" gold and silver as possible. These are sophisticated entities, mainly foreign, who know how to play the short sellers and turn this paper gold into real delivered physical.&lt;br /&gt;&lt;br /&gt;Given that the OTC market (where a lot of the selling occurs) runs on a fractional reserve basis and is not backed up by 1-1 physical gold, this leveraged short selling, where ownership of each ounce of gold has multi claims, poses a very large risk.&lt;br /&gt;&lt;br /&gt;I leave this with you, but if you need anything from me that might help you in your investigation I would be pleased to help.&lt;br /&gt;&lt;br /&gt;Kind regards,&lt;br /&gt;Andrew T. Maguire&lt;br /&gt;&lt;br /&gt;* * *&lt;br /&gt;&lt;br /&gt;From: Andrew Maguire&lt;br /&gt;To: Ramirez, Eliud [CFTC]&lt;br /&gt;Sent: Friday, February 05, 2010 2:11 PM&lt;br /&gt;Subject: Fw: Silver today&lt;br /&gt;&lt;br /&gt;If you get this in a timely manner, with silver at 15.330 post data, I would suggest you look at who is adding short contracts in the silver contract while gold still rises after NFP data. It is undoubtedly the concentrated short who has "walked silver down" since Wednesday, putting large blocks in the way of bids. This is clear manipulation as the long holders who have been liquidated are matched by new short selling as open interest is rising during the decline.&lt;br /&gt;&lt;br /&gt;There should be no reason for this to be occurring other than controlling silver's rise. 
There is an intent to drive silver through the 15 level stops before buying them back after flushing out the long holders.&lt;br /&gt;&lt;br /&gt;Regards,&lt;br /&gt;Andrew&lt;br /&gt;&lt;br /&gt;* * *&lt;br /&gt;&lt;br /&gt;From: Andrew Maguire&lt;br /&gt;To: Ramirez, Eliud [CFTC]&lt;br /&gt;Cc: BChilton [CFTC]; GGensler [CFTC]&lt;br /&gt;Sent: Friday, February 05, 2010 3:37 PM&lt;br /&gt;Subject: Fw: Silver today&lt;br /&gt;&lt;br /&gt;A final e-mail to confirm that the silver manipulation was a great success and played out EXACTLY to plan as predicted yesterday. How would this be possible if the silver market was not in the full control of the parties we discussed in our phone interview? I have honored my commitment not to publicize our discussions.&lt;br /&gt;&lt;br /&gt;I hope you took note of how and who added the short sales (I certainly have a copy) and I am certain you will find it is the same concentrated shorts who have been in full control since JPM took over the Bear Stearns position.&lt;br /&gt;&lt;br /&gt;It is common knowledge here in London among the metals traders that it is JPM's intent to flush out and cover as many shorts as possible prior to any discussion in March about position limits. I feel sorry for all those not in this loop. A serious amount of money was made and lost today and in my opinion as a result of the CFTC's allowing by your own definition an illegal concentrated and manipulative position to continue.&lt;br /&gt;&lt;br /&gt;Bart, you made reference to it at the energy meeting. Even if the level is in dispute, what is not disputed is that it exists. Surely some discussions should have taken place between the parties by now. 
Obviously they feel they can act with impunity.&lt;br /&gt;&lt;br /&gt;If I can compile the data, then the CFTC should be able to too.&lt;br /&gt;&lt;br /&gt;I would think this is an embarrassment to you as regulators.&lt;br /&gt;&lt;br /&gt;Hoping to get your acknowledgement.&lt;br /&gt;&lt;br /&gt;Kind regards,&lt;br /&gt;Andrew T. Maguire&lt;br /&gt;&lt;br /&gt;* * *&lt;br /&gt;&lt;br /&gt;From: Andrew Maguire&lt;br /&gt;To: Ramirez, Eliud [CFTC]&lt;br /&gt;Sent: Friday, February 05, 2010 7:47 PM&lt;br /&gt;Subject: Fw: Silver today&lt;br /&gt;&lt;br /&gt;Just logging off here in London. Final note.&lt;br /&gt;&lt;br /&gt;Now that gold is undergoing short covering, please look at market depth right now in silver and evidence the large selling blocks in a thin market being put in the way of silver regaining the technical 15 level, which would cause a short covering rally and new longs being instigated. This is resulting in the gold-silver ratio being stretched to ridiculous levels.&lt;br /&gt;&lt;br /&gt;I hope this day has given you an example of how silver is "managed" and gives you something more to work with.&lt;br /&gt;&lt;br /&gt;If this was long manipulation in, say, the energy market, the shoe would be on the other foot, I suspect.&lt;br /&gt;&lt;br /&gt;Have a good weekend.&lt;br /&gt;&lt;br /&gt;Andrew&lt;br /&gt;&lt;br /&gt;* * *&lt;br /&gt;&lt;br /&gt;From: Andrew Maguire&lt;br /&gt;Sent: Tuesday, February 09, 2010 8:24 AM&lt;br /&gt;To: Ramirez, Eliud [CFTC]&lt;br /&gt;Cc: Gensler, Gary; Chilton, Bart [CFTC]&lt;br /&gt;Subject: Fw: Silver today&lt;br /&gt;&lt;br /&gt;Dear Mr. 
Ramirez,&lt;br /&gt;&lt;br /&gt;I hadn't received any acknowledgement from you regarding the series of e-mails sent by me last week warning you of the planned market manipulation that would occur in silver and gold a full two days prior to the non-farm payrolls data release.&lt;br /&gt;&lt;br /&gt;My objective was to give you something in advance to watch, log, and follow up in your market manipulation investigation.&lt;br /&gt;&lt;br /&gt;You will note that the huge footprints left by the two concentrated large shorts were obvious and easily identifiable. You have the data.&lt;br /&gt;&lt;br /&gt;The signals I identified ahead of the intended short selling event were clear.&lt;br /&gt;&lt;br /&gt;The "live" action I sent you 41 minutes after the trigger event predicting the next imminent move also played out within minutes and exactly as I outlined.&lt;br /&gt;&lt;br /&gt;Surely you must at least be somewhat mystified that a market move could be forecast with such accuracy if it was free trading.&lt;br /&gt;&lt;br /&gt;All you have to do is identify the large seller and if it is the concentrated short shown in the bank participation report, bring them to task for market manipulation.&lt;br /&gt;&lt;br /&gt;I have honored my commitment to assist you and keep any information we discuss private,however if you are going to ignore my information I will deem that commitment to have expired.&lt;br /&gt;&lt;br /&gt;All I ask is that you acknowledge receipt of my information. The rest I leave in your good hands.&lt;br /&gt;&lt;br /&gt;Respectfully yours,&lt;br /&gt;&lt;br /&gt;Andrew T. Maguire&lt;br /&gt;&lt;br /&gt;* * *&lt;br /&gt;&lt;br /&gt;From: Ramirez, Eliud&lt;br /&gt;To: Andrew Maguire&lt;br /&gt;Sent: Tuesday, February 09, 2010 1:29 PM&lt;br /&gt;Subject: RE: Silver today&lt;br /&gt;&lt;br /&gt;Good afternoon, Mr. Maguire,&lt;br /&gt;&lt;br /&gt;I have received and reviewed your email communications. 
Thank you so very much for your observations.</description><link>http://hepburndata.blogspot.com/2010/04/curious-case-of-andrew-maguire-my.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-1027545442081009944</guid><pubDate>Sat, 19 Sep 2009 20:58:00 +0000</pubDate><atom:updated>2009-09-20T20:59:46.760-04:00</atom:updated><title>What Does This Mean? Cost effective Metadata Management</title><description>It's been a long while since I've updated this blog.  I've been very busy with a number of other projects. But with the wife and kids away this weekend I've found some time to get some ideas out that I've been sitting on for the past few weeks.&lt;br /&gt;&lt;br /&gt;This blog post is on Metadata.  If you're not sure what Metadata is, you can read an older post I published a couple of years ago &lt;a href="http://hepburndata.blogspot.com/2006/12/metadata-defined.html"&gt;here&lt;/a&gt;.  My goal is to go over the challenges of retrieving metadata, how they can be overcome, and how to prioritize one's efforts.&lt;br /&gt;&lt;br /&gt;I have yet to walk into an organization that has any kind of Metadata Management Strategy in place.  I occasionally hear about organizations - usually very large and mature - that do have a handle on their Metadata, but I myself have yet to walk into an organization where even 25% of the data elements are documented.  Even most data warehouses lack good Metadata.  The problem with Metadata is that it's not seen as being on any critical path, and is hence below the "priority threshold" and never gets done.  Yes, some IT manager knows it should get done, but there's always something more important to do, no matter what's going on.&lt;br /&gt;&lt;br /&gt;Building a business case to source metadata is nearly impossible, since its uses are almost always unexpected.  
I recently read an article from one of those big research organizations explaining how to build a business case for Master Data Management.  The article was mainly predicated on improving data quality for the purpose of optimizing Direct Mail campaigns (read: junk mail), with the goal of reducing the number of wasted mail-outs.  I took a look at the companies interviewed for the article and as expected they're mainly just big software vendors.   My experience with Metadata is that it is most useful when you least expect it, and most companies end up scrambling to figure out what their data means, or worse, suffer the consequences.&lt;br /&gt;&lt;br /&gt;What to do?  Well, there are really two problems here: first, how do I ensure I'm properly capturing Metadata on a go-forward basis; second, how do I retroactively retrieve Metadata for legacy systems?&lt;br /&gt;&lt;br /&gt;Addressing the first question, I will describe a &lt;span style="font-style: italic;"&gt;realistic&lt;/span&gt; approach which can be applied to organizations of any size (including a 1-person company), and which does NOT incur additional development costs or require an outlay for new systems and staff.  This approach is in part based on a distilled version of the ISO 11179 standard for Metadata Registries.  I'm a big fan of the ISO 11179 standard since it gets to the essence of the problem, and is not bogged down in tools and technologies.  It allows for a wide degree of interpretation which can be as robust or simple as your needs require.  Most importantly, it helps us formulate the important &lt;span style="font-style: italic;"&gt;questions &lt;/span&gt;we should be asking about the data.&lt;br /&gt;&lt;br /&gt;At the heart of ISO 11179 is the Data Element.  For most organizations this is just a column in a table, but it can be an XML attribute, an HTML meta tag, or a field in a VSAM file.  
The ISO 11179 approach breaks down data elements into what is referred to as a "Data Element Concept", and a "Value Domain" (VD).  The Data Element Concept (DEC) describes the semantics of the data, which should be expressed in "plain English" terms.  The DEC is not concerned with the nitty gritty representation of the information, just what it means to a layperson.  For example, I might have a data element which contains a patient's recorded temperature, taken when they entered the hospital.  So the DEC could be written as "patient's recorded temperature at time of hospital triage".&lt;br /&gt;&lt;br /&gt;However, this is only part of the picture.  How this data is encoded and represented is also critical to our understanding - primarily to support data interoperability.  This is where the Value Domain comes in.  When describing our VD, we need to ask the following questions:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Is there a discrete set of permitted values (known as an Enumerated Value Domain), or must the VD be described as a range of values (known as a Described Value Domain)?&lt;br /&gt;&lt;/li&gt;&lt;li&gt;If the VD is an Enumerated Value Domain, what is the Value Meaning behind each permitted value?  List them all.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;If the VD is not Enumerated, but rather "Described", we need to know how the data is encoded and what its valid range of values is.  Depending on the data in question, different encodings are possible.  I'll give you some common examples: for dates and times, what timezone are we using, what is our time measurement unit (e.g. 24-hour, 12-hour, or POSIX time), and how accurate is the time?  For monetary values, what currency are we transacting in?  For physical measures, what is our dimensionality, unit of measure, accuracy, and upper and lower limits?  
For text-based fields, what characters or patterns are allowed or disallowed?&lt;/li&gt;&lt;/ol&gt;In our main example, the VD could be described as "temperature measured in Celsius, integer values only".&lt;br /&gt;&lt;br /&gt;It should also be recognized that a DEC and VD both derive from the same Conceptual Domain (CD).  In our example, the Conceptual Domain would simply be "temperature".&lt;br /&gt;&lt;br /&gt;Putting our DEC and VD together we get the following definition:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Patient's recorded temperature taken at time of hospital triage, measured in Celsius, rounded to the nearest degree (integer values only)&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;We now have a reasonably comprehensive data definition which can be used not only to &lt;span style="font-style: italic;"&gt;explain&lt;/span&gt; what the data means to the layperson, but also as a tool to measure data quality.&lt;br /&gt;&lt;br /&gt;Another thing you may notice about this definition is the order in which I've explained its meaning, and how I've added "rounded to the nearest degree" as an added measure of clarity.  It's always important to start with the DEC, followed by the VD.  In other words, start simple, and elaborate later.  One of the challenges of formulating data definitions is finding the right balance between simplicity and accuracy.  Too simple, and you don't always capture the essence of the data element, or its distinguishing features.  Too precise, and you confuse people.  By choosing how you order your words, you can get the best of both worlds.  You should be able to read the first clause and think "okay I get it".  If you're doing more detailed analysis you can look more carefully at the second clause.&lt;br /&gt;&lt;br /&gt;Is that it then?  Is that all you need to capture?  
No, ideally you should be maintaining a Metadata Registry, which can contain all sorts of other details, such as data derivation information, and even semantic classification schemes (e.g. a hierarchy such as a phylogenetic tree).  However, this requires having full-time data stewards and heavy amounts of data governance.  Most companies simply can't afford that.  Furthermore, many companies are adopting JAD or Agile approaches to systems development, and tend to rely on generalist developers to create and maintain data models.&lt;br /&gt;&lt;br /&gt;So if there's no centralized registry, where to store the Metadata? I believe the right approach is through tight coupling of the Metadata to the data schema whenever possible.  After all, the schema definition constitutes metadata unto itself.  This is especially true for relational databases, which form the cornerstone of most IT shops these days.  In particular, every major RDBMS (including Microsoft Access) has a comment field for every column and table.  Furthermore, many RDBMSs, like MySQL and Oracle, even allow you to query this Metadata as just another table, allowing for both search and reporting against company Metadata.  You might say the Metadata registry was there all along!    That said, practically no developers will go out of their way to put data definitions in there.  But if you show them how little effort is required, and what can be pulled out of it, and most importantly, how to go about formulating data definitions, then you would be amazed at how quickly you can start building a complete Metadata Registry, basically for free.&lt;br /&gt;&lt;br /&gt;Similarly, XSD documents contain &amp;lt;xs:annotation&amp;gt; and &amp;lt;xs:documentation&amp;gt; tags which can be used to capture data definitions for data stored in XML.  As with an RDBMS, the definitions can be thrown in there as the data is being modeled in XML.  
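&lt;br /&gt;&lt;br /&gt;To make the XSD case a little more concrete, here is a rough sketch in Python of putting a data definition into a schema and pulling it back out again.  Treat it as an illustration only: the element name triageTemperature, the helper names q, documented_element and definitions, and the choice of Python's standard xml.etree.ElementTree module are all my own assumptions, and a real schema would normally be authored by hand or in a modeling tool rather than built in code.&lt;br /&gt;
&lt;pre&gt;
```python
import xml.etree.ElementTree as ET

# The XML Schema namespace; "xs" is only the conventional prefix for it.
XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def q(tag):
    """Return the namespace-qualified tag name that ElementTree expects."""
    return "{" + XS + "}" + tag


def documented_element(name, xsd_type, definition):
    """Build an xs:element that carries its data definition with it."""
    element = ET.Element(q("element"), {"name": name, "type": xsd_type})
    annotation = ET.SubElement(element, q("annotation"))
    documentation = ET.SubElement(annotation, q("documentation"))
    documentation.text = definition
    return element


def definitions(schema):
    """Collect {element name: definition} for every documented element."""
    found = {}
    for element in schema.iter(q("element")):
        doc = element.find(q("annotation") + "/" + q("documentation"))
        if doc is not None and doc.text:
            found[element.get("name")] = doc.text
    return found


# Hypothetical schema holding the triage temperature example from this post.
schema = ET.Element(q("schema"))
schema.append(documented_element(
    "triageTemperature",
    "xs:integer",
    "Patient's recorded temperature taken at time of hospital triage, "
    "measured in Celsius, rounded to the nearest degree (integer values only)",
))

# A poor man's metadata report: every documented element and its definition.
for name, text in definitions(schema).items():
    print(name + ": " + text)
```
&lt;/pre&gt;
The point of the sketch is that the definition travels with the schema itself, so a report of "what does each element mean" is just a walk over the document, much like querying column comments in an RDBMS.&lt;br /&gt;&lt;br /&gt;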
Unfortunately, what I've found is that because XML is primarily a data interchange format (as opposed to a data store), the beneficiary of the Metadata is someone else.  Nevertheless, it's still important to capture this Metadata since you may get a call from an external party inquiring into the meaning of the data, so there are still selfish reasons for capturing it in the schema document.&lt;br /&gt;&lt;br /&gt;As for other data stores where there is no room for documentation (e.g. VSAM files), there are different approaches you can take.  One is to put the Metadata into a simple Excel spreadsheet, or Wiki, or COBOL copybook.  The question you need to ask yourself is: "Where will people most likely look for the Metadata?"&lt;br /&gt;&lt;br /&gt;Taking things a step further, some vendors - notably IBM and Microsoft - have integrated their data management tools (i.e. RDBMS, ETL tool, and reporting/BI server) so that data lineage can be determined automatically.  This makes it easy to determine where a data element on a report or cube came from.  Knowing this lineage greatly reduces your time for impact analysis when making a change to any system.  In fact, I believe this is the only feasible way of addressing a "where is" search requirement for data elements.&lt;br /&gt;&lt;br /&gt;Okay, I've answered the first big question on how to capture Metadata on a go-forward basis.  Now to address the second question, which is how to capture Metadata from legacy systems.&lt;br /&gt;&lt;br /&gt;The first step is to profile the data.  The basic steps for data profiling involve:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Obtaining a representative sample of the entire data set for the data element in question.&lt;/li&gt;&lt;li&gt;Obtaining a representative sample of the recently produced data for the data element in question.  This is important since the source systems may have changed over time, and so the definition may have changed over time.  
Depending on what your requirements are, you may only need a data definition for the most current data.&lt;/li&gt;&lt;li&gt;Looking for outliers in both the entire and recent data set.  This is often referred to as boundary analysis.  It can either turn up the limits of the value domain, or possibly data quality issues.  Either way you probably want to know this if you're enquiring into the meaning of the data.&lt;/li&gt;&lt;/ol&gt;The second step is to "follow the bread crumbs" to see where the data originated.  This is detective work, so there is no doubt an art to this step, as it may be difficult to obtain and analyze source code.  I once saw a situation where one of the data loaders had no source code at all and was a complete black box (eventually this had to be rewritten for obvious reasons).  The approach I normally take is to "triangulate" around the data, looking for all the clues I can.  For example, I'll look at system documentation and business requirement documents.  I'll also look at downstream applications to see how they're using the data.  Often data is replicated, so sniffing around there can help too.&lt;br /&gt;&lt;br /&gt;After data profiling and analyzing the data lineage, you may be in a position to formulate a data definition.  At this point I recommend taking your best stab at writing the data definition.   You now have an &lt;span style="font-style: italic;"&gt;unconfirmed&lt;/span&gt; data definition. With this data definition in hand you now need to hunt down at least one of the following persons to confirm or clarify it:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Those that are responsible for producing the data  (e.g. customer service rep)&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Those that are responsible for consuming the data (e.g. floor manager)&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Those that are considered an authority on the data (e.g. 
business unit owner)&lt;/li&gt;&lt;/ol&gt;WARNING WARNING DANGER DANGER: most people don't have time to answer this question, and e-mail communication won't always work here.  It's much better to get the person on the phone, or even better, show up to their desk with coffee and donuts.  Treat people with respect, and they will respect you.&lt;br /&gt;&lt;br /&gt;One last thing I'd like to finish on.  I mentioned earlier that Metadata is used when you least expect it.  Well, there's at least one unexpected situation which occurs predictably (huh!?!): Database archiving.  I recently attended a talk by &lt;a href="http://www.amazon.com/Database-Archiving-Keep-Lots-Press/dp/0123747201/ref=sr_1_1?ie=UTF8&amp;amp;qid=1253494610&amp;amp;sr=8-1-fkmr0"&gt;Jack Olson&lt;/a&gt;, one of the world's leaders in database archiving.  Jack pointed out that it is necessary to archive Metadata with data should we hope to interpret it correctly, long after the supporting systems have been retired (makes sense if you think about it).  Legal archiving requirements are pushing retention dates farther and farther out.  If data has been archived, but cannot be clearly interpreted, it is effectively lost (think missing Rosetta Stone).  And if the data is lost, you automatically lose in court.&lt;br /&gt;&lt;br /&gt;That's it for this post.  In my next post I'd like to delve into data modeling.&lt;br /&gt;---&lt;br /&gt;Shameless plug: I'm currently looking for data management work.  
If you know of anyone looking for a data architect or other data/information expertise, please contact me at: neil@hepburndata.com (+1-416-315-5514).</description><link>http://hepburndata.blogspot.com/2009/09/what-does-this-mean-cost-effective.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>1</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-5998097452996853505</guid><pubDate>Thu, 05 Jun 2008 15:21:00 +0000</pubDate><atom:updated>2008-06-05T21:02:32.535-04:00</atom:updated><title>Creative Destruction: Column based Data Warehouses.  The next generation of data warehouse technologies has arrived.</title><description>I first heard about column-based data warehouses in the &lt;a href="http://www.informationweek.com/news/storage/showArticle.jhtml?articleID=206801203"&gt;Feb. 25th print edition of InformationWeek&lt;/a&gt;. The article interviewed &lt;a href="http://en.wikipedia.org/wiki/Michael_Stonebraker"&gt;Michael Stonebraker&lt;/a&gt; (one of the original architects of the relational database, who released the first low-cost RDBMSs under the company &lt;a href="http://en.wikipedia.org/wiki/Ingres"&gt;Ingres&lt;/a&gt;), who has fully embraced this new architecture, which takes a fundamentally different approach to storing and retrieving structured tabular data.&lt;br /&gt;&lt;br /&gt;It's hard to exaggerate the impact column-based database design is likely to have on the DBMS industry.  You may recall the hype around object-oriented databases back in the 90s, but column-oriented databases are qualitatively different for the following two reasons:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;There is a pent-up demand for OLAP databases to reduce storage demands, and improve query performance.&lt;/li&gt;&lt;li&gt;Column-based architecture is primarily a back-end change, and generally does not affect application or data architecture.  
It's a bit like going from a 32-bit OS to a 64-bit OS, but with significantly greater performance returns.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;But what is the difference between traditional record-based DBMSs and column-based DBMSs?  The difference is quite simple to understand, and according to Stonebraker can yield 50-fold performance increases over traditional record-based RDBMSs.  While I have yet to test a column-based database myself, it's not hard to understand why such dramatic performance increases can be had.  The reason is this: Traditional databases (like the ones you're probably using now) treat records as the basic element of storage and retrieval.  It's not possible to access an attribute of a record without first retrieving the entire record.  On the flip side, column-based architectures align data to the column (a.k.a. field or attribute).  As such, when querying, only the data required for the analysis is retrieved.  Furthermore, because the values within a column tend to be similar, column-based warehouses can also enjoy significant compression advantages.  Many people [incorrectly] believe that compression always impedes access times.  However, more often than not, compression can actually speed up retrieval times since there is less data to be moved from the data store through the bus.&lt;br /&gt;&lt;br /&gt;This may seem a little academic, so let me illustrate with an example.  Imagine I was a bank, and had a master customer table which stores the most pertinent attributes for each customer.  For this example, let's say this table contains 100 columns.  Suppose I want to produce a simple report showing all customer accounts that have in excess of $10,000 in their savings account, along with their actual balance.  Using current architectures, my query would retrieve all 100 columns for each customer that matched those search criteria, and then format the results to only show me the account number and balance.  
Let's say that there are 1,000,000 customers that match those criteria.  As such, I'm effectively copying 100 columns for those million customer records into temporary storage (usually memory), so as to display only two columns (account number and account balance).  A column-based data warehouse would instead take a different approach, and only retrieve the two columns which are required, ignoring the other 98 columns right from the get-go.  In other words, I'm moving roughly one fiftieth (1/50th) of the data, assuming the columns are of similar width.  Furthermore, because account balances tend to be within a small range, and are compressed as a single unit, I am moving even less data through my bus.  The end result is a dramatic performance increase.  We're talking about the difference between a query executing in under an hour versus the same query taking more than a day to complete.  With such huge discrepancies in performance, there are real business competitive advantages to be had here... for now.&lt;br /&gt;&lt;br /&gt;I decided to take a quick look at the actual vendors selling these new column-based products.  I've summarized my key findings for six major vendors:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;a href="http://www.calpont.com/"&gt;Calpont&lt;/a&gt;'s CNX Data Warehouse Platform is a drop-in solution for Oracle.  Namely, the solution can be deployed as a standalone DBMS or as an optimization/acceleration layer into an existing DBMS environment.  Currently Oracle is supported, with planned support for DB2.&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.infobright.com/"&gt;Infobright&lt;/a&gt;'s Brighthouse boasts superior compression.  Namely, Infobright claims it can achieve a 40:1 compression ratio. Contrast this with Oracle 11g, which uses record-based storage and advertises a 2:1 compression ratio (although I've heard that depending on the data, it can actually achieve between a 3:1 and 4:1 ratio in tests).  
Furthermore, the company claims it leverages MySQL "making seamless use of the mature connectors, tools, and resources associated with this widely deployed open source database".  From this it sounds like Brighthouse is effectively a forked version of MySQL.  Also worth noting for my local readers is the fact that the business is based here in Toronto.&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.paraccel.com/"&gt;ParAccel&lt;/a&gt;'s Analytic Database boasts a shared-nothing, MPP (massively parallel processing) architecture.  MPP architectures are normally associated with data warehouse appliances, such as Netezza, but there's no reason why an appliance solution is required, so it's nice to finally see a vendor selling this technology.&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.sand.com/"&gt;Sand Technology&lt;/a&gt;'s SAND/DNA Analytics solution claims that no indexing or specialized schemas are required, and that this is a unique feature.  I'm not sure about that last claim, but they are certainly emphasizing ease-of-use as a major selling feature.&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.sybase.com/"&gt;Sybase&lt;/a&gt; also has a columnar DB: Sybase IQ.  Apart from coming from a recognized and stable vendor, Sybase IQ boasts the first petabyte benchmarks.  Interestingly, their whitepaper discusses column-based encryption, which I've never seen before.  As Sybase points out, this is ideal for data aggregators with multi-client services.  However, I would say that Sybase's brand is probably their biggest advantage for the PHB crowd.&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.vertica.com/"&gt;Vertica&lt;/a&gt; is Michael Stonebraker's company.  I particularly like their marketing materials as they provide real benchmarks on their web site here: http://www.vertica.com/benchmarks.  They also emphasize that their technology runs on "green" grids of off-the-shelf servers, and they even have a hosted "cloud" solution.  
Personally, I'm a big fan of hosted solutions, and after all the catastrophes I've witnessed, I would argue that they're lower risk.  But there are those that prefer to drive than fly since it gives them a sense of control over the situation, statistics be damned.&lt;/li&gt;&lt;/ol&gt;It will be interesting to see what shakes out over time.  I suspect that every DB vendor is currently working on a column-based solution of their own, so waiting is certainly an option.</description><link>http://hepburndata.blogspot.com/2008/06/creative-destruction-column-based.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>1</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-9004945050695872731</guid><pubDate>Mon, 02 Jun 2008 16:03:00 +0000</pubDate><atom:updated>2008-06-02T12:50:29.775-04:00</atom:updated><title>On a lighter note, how do we change the culture of software development</title><description>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBAtBBXcrnCWpv1gAYsmrZXjwYuLWjhCdEIcTO_FY2Bo0-e_imUj28GpOLUkwN1xer4Tl05etLOegkI8JDVuU979vAQ9-8ryXA-Gh9qLH7w4hLedjlo84-zvX9MHv1q916oPEn/s1600-h/msstudio_ad.jpg"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBAtBBXcrnCWpv1gAYsmrZXjwYuLWjhCdEIcTO_FY2Bo0-e_imUj28GpOLUkwN1xer4Tl05etLOegkI8JDVuU979vAQ9-8ryXA-Gh9qLH7w4hLedjlo84-zvX9MHv1q916oPEn/s400/msstudio_ad.jpg" alt="" id="BLOGGER_PHOTO_ID_5207315765747322914" border="0" /&gt;&lt;/a&gt;I just noticed this ad on the back of a recent print edition of Information Week.  
I'd seen these ads before (mainly on CNET's web site), but not in print form in a magazine geared towards data management professionals.&lt;br /&gt;&lt;br /&gt;The ad campaign looks like it's targeting 12-year-old boys, but I suppose it must be targeting the 12-year-old boy within us all... or geeky software developers.  But before I start shooting fish in a barrel, I thought I'd share with you a very strange subtext to this mini-narrative.  If you look closely into the crowd scene there is in fact one character who stands out.  This of course is the busty bird woman I suppose our hero is fighting to impress.  In case you missed it, here's the zoomed-in version below.&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiauUWjdsJgsuZk6ITSlHQrNg2BRMpPBGTV3es4ZmouseIBlOxghky957npv2FFnGDXbJnMAgOhX87aHI6A-WJNBA_zqYzv1R8tgYTV6DjUFb5X1RD6n8GItsD3Bbm1HvjnWqmB/s1600-h/bird_woman.jpg"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiauUWjdsJgsuZk6ITSlHQrNg2BRMpPBGTV3es4ZmouseIBlOxghky957npv2FFnGDXbJnMAgOhX87aHI6A-WJNBA_zqYzv1R8tgYTV6DjUFb5X1RD6n8GItsD3Bbm1HvjnWqmB/s320/bird_woman.jpg" alt="" id="BLOGGER_PHOTO_ID_5207317372065091634" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;If anyone has any theories as to why this particular bird woman was chosen (maybe she is in fact concerned about the cow-man in the magic bubble) - I'd really love to know!&lt;br /&gt;&lt;br /&gt;But for me, this ad is a reminder of how haphazard IT decisions tend to be, and how many still view IT as a crafts-based practice.  It's a bit like those sugar cereal commercials that would run during Saturday morning cartoons, as the parental pressure points are well understood.&lt;br /&gt;&lt;br /&gt;Most developers tend to be intelligent, analytical, and creative types.   
If they're lucky, they will work for a dynamic software company that embraces their talents in all phases of product management and design.  However, most are not so lucky and end up working for bureaucratic IT departments with similar expectations.  The developer's role here is merely to take requirements from a System Design Specification, code them in a development environment, and do a little unit testing to ensure the requirements are satisfied.   That's all folks.&lt;br /&gt;&lt;br /&gt;Yet so many developers I meet want to take on business analysis (well, the fun parts), project management (the fun parts), human factors (the fun parts).  Sometimes these guys even want to take on data modeling (but usually treat the DB like a bit bucket for their in-memory data structures).  These guys will also tell you about some new gizmo or technology which is on the cusp of solving all our problems, or will introduce us to the modern age.  Of course all that Change Management, training, data interoperability, service management, risk management, quality management, and all that other stuff is just pointless busywork that gets in the way of true innovation.&lt;br /&gt;&lt;br /&gt;But I actually relate to these guys and totally know where they're coming from.  When they show up for their first week on the job, it's all sunshine and lollipops with so much optimism and hope in the air.  I try my best to foster this passion and excitement, but also explain that large organizations tend to have role-based cultures, and don't always appreciate talented and knowledgeable individuals.  I describe career paths which include business analysis, project management, and application architecture.  But it's never as exciting or free as what they have in mind.&lt;br /&gt;&lt;br /&gt;I personally remember a time not so long ago when systems were being developed on unstable, resource-constrained platforms with low-level languages (e.g. Windows 95 and C++).  
System stability was a major issue, and talented developers who could analyze a core dump file, or who knew the inner workings of memory management, or who could create helper systems to better recover from instabilities were invaluable.  To this day I would reckon there is still value to these talents.  However, most people can't really describe these talents, so their importance is greatly diminished now that Windows is a stable OS, and that software development has been highly abstracted.&lt;br /&gt;&lt;br /&gt;But these killer developers were more than just killer debuggers, they were full-on renaissance men and women, capable of tracing a low-level _if_ statement all the way back up to a business rule, and even suggesting new business processes to accommodate hardware limitations (this still happens).  These were the only people who had complete line-of-sight to all aspects of the business, and these were the people who held the true power.&lt;br /&gt;&lt;br /&gt;My point is, with no new problems for these people to solve, they now must bring the business back down to their level of tools and technologies.  So, I reckon that it is perhaps a cultural issue we must solve first, before we can tackle the obvious problems at hand.   In the meantime, the vendors will be selling more sugar cereal.  
Can't get enough of those Sugar Smacks!</description><link>http://hepburndata.blogspot.com/2008/06/on-lighter-note-how-do-we-change.html</link><author>noreply@blogger.com (Neil Hepburn)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" height="72" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBAtBBXcrnCWpv1gAYsmrZXjwYuLWjhCdEIcTO_FY2Bo0-e_imUj28GpOLUkwN1xer4Tl05etLOegkI8JDVuU979vAQ9-8ryXA-Gh9qLH7w4hLedjlo84-zvX9MHv1q916oPEn/s72-c/msstudio_ad.jpg" width="72"/><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-3025988561065026949</guid><pubDate>Tue, 20 May 2008 14:37:00 +0000</pubDate><atom:updated>2008-05-20T13:08:38.488-04:00</atom:updated><title>Unstructured Data and the Hunt for the Elusive Customer Service KPI / Looking for Work</title><description>The culture of data and analytics is starting to take root among the general population.  Books like "Competing on Analytics" and "Super Crunchers" herald a new era of business whereby every little decision is vetted through intense fact-based scrutiny.&lt;br /&gt;&lt;br /&gt;Well, it would seem to be that way.  The truth is, most processes within most businesses (even the businesses discussed in the aforementioned books) are managed using crude or imprecise measures.  The classic example for me is the new IT project.  Such projects routinely introduce new business processes, data elements, and business rules.  The success or failure of such projects is typically measured based on whether or not the project delivered on time and on budget.  The operational ramifications are rarely considered.  Interestingly, post-implementation costs [which account for at least 80% of the overall project's costs] are rarely measured.  Most people would say that the main reason for this is that it would require periodic follow-up assessments (i.e. "busywork"), and that the original project team has long disbanded.  
I partially agree with this, but I would argue that there is a bigger issue here.&lt;br /&gt;&lt;br /&gt;People (especially those in senior positions) are loath to be measured.  Ask a VP, director, or manager if she likes the idea of measuring her team's performance and she will say "yes!".  Ask the same person if &lt;span style="font-style: italic;"&gt;she &lt;/span&gt;would like to be measured, and you'll get a long-winded answer as to why her performance can't be gauged using numeric measures, and should be based on a "360 review" with all manner of testimony and exhibits.  Indeed, today's performance reviews are more like going on trial than a quantifiable measure of performance. It's okay to subject others to KPIs - just not me!&lt;br /&gt;&lt;br /&gt;None of this should come as any surprise.  But what is interesting is that the new generation of knowledge workers wants to be measured by KPIs, and they want better KPIs to be measured against.  How do I know this?  I know this because I routinely talk to people who are on the front line of customer service, and I ask them as many questions as I possibly can.  Here is what I've learned folks: The generation that grew up with the Internet (usually under the age of 28), and has been asked to work with a computer and a telephone, quickly realizes that their job is being measured by a few simple KPIs, and those KPIs can be quickly gamed.  This is a generation that looks for inefficiencies on eBay, that has developed methods for finding free music and videos, that can quickly fact check for discrepancies when BS is suspected.  These people are playing massively multiplayer online games, are on every major social network, and are relentlessly logical and efficient when working with rules-based systems (i.e. 
companies).&lt;br /&gt;&lt;br /&gt;It should come as no surprise that the new generation of customer service reps, and other front-line knowledge workers are quick to find the path of least resistance when approaching their job.  This is their comfort zone.&lt;br /&gt;&lt;br /&gt;The bad news is that the current KPIs that measure these people are crude and blunt instruments.  Take customer service as an example.  There are two basic KPIs which get used:  The first, "Average Handle Time", or the time it takes to get the customer off the phone.  The second, "Number of Call-backs", or the degree to which the customer had his issue "resolved".  That's basically it.  Of course, most companies perform random audits to keep reps on their toes.  While this "boogeyman" style of management keeps the train on its rails, it hardly provides any goals to aspire to.  Even a callback can be made for any reason under the sun, and it may even be a satisfied customer calling back to spend more money (this contradiction between reality and metrics is often referred to as the "99 foot man paradox").  As for "Average Handle Time"?  Well, I once heard a story about a bunch of call centre reps who had a nice scam going where they would simply hang up on every incoming call (effectively deflecting the caller to other CSRs).  Until they were caught, they were being held up as model CSRs for their lightning efficiency.&lt;br /&gt;&lt;br /&gt;So how does one actually measure customer service so that someone "gaming" the KPI is forced to provide competent customer service at a reasonable cost?  The answer depends a great deal on the company's mission.  However, with today's technology there are a lot of options available, especially given that it's now standard to record each and every inbound call.  Furthermore, the latest voice recognition software does an impressive job of recognizing the majority of words and phrases.  
Focusing on this data set alone, we can start pulling out some interesting KPIs.  We would first need to convert these unstructured data to structured data sets.  This is probably our most difficult task (on that note, Bill Inmon's &lt;a href="http://www.amazon.com/Tapping-into-Unstructured-Data-Intelligence/dp/0132360292/ref=pd_bbs_sr_1?ie=UTF8&amp;amp;s=books&amp;amp;qid=1211300734&amp;amp;sr=8-1"&gt;"Tapping into Unstructured Data"&lt;/a&gt; is one of the better books on this subject).  Once we have a handle on our CSR/customer conversation data, we can start mining for certain terms that would indicate satisfaction or dissatisfaction.  I'm aware that it's even possible to capture a customer's mood and emotion through vocal tone analysis.  In theory, you should even be able to separate out the CSR's tone and words from the customer's tone and words, for even more fine-grained analyses.&lt;br /&gt;&lt;br /&gt;I am not saying that it would be easy to establish a KPI to objectively measure customer satisfaction.  Rather, I am saying that it's not hard to improve upon our current KPIs.&lt;br /&gt;&lt;br /&gt;But what's the point?  First off, customer service is generally pretty bad these days.  This may even have something to do with CSRs gaming the existing KPIs, especially "Average Handle Time".  It also has to do with the complexity of services being offered these days, and the natural frustration that goes along with an excess of business rules that we can't possibly comprehend (there is a greater problem here, that I don't have time to get into in this post).  My point in all of this is to say that we are headed towards a culture of analytics whether we like it or not, and if our KPIs have flaws in them, then they will be manipulated against us by our customers and employees.  The world we live in is complex and nuanced, and one of our best tools for managing this is simplification through the use of comprehensive indexes.  
We just need to get better at designing our KPIs, and ensure they provide us with maximum goal congruence.&lt;br /&gt;&lt;br /&gt;---&lt;br /&gt;&lt;br /&gt;On a completely unrelated note, I have just rolled off a big project, and am looking for new work.  My area of expertise is in data management and enterprise architecture, but I'm also very nuts-and-bolts, and enjoy doing everything from: application support; software development; data modeling; requirements gathering; system sourcing and selection; process architecture; change management; external data procurement and alignment; data warehousing &amp;amp; BI; metadata management; data governance policy; IT strategy; marketing analytics; and pretty much anything else technology or information related you can think of. &lt;br /&gt;&lt;br /&gt;I'm based out of Toronto, but am willing to travel and work anywhere in the world where there's interesting work.  I am incorporated, and prefer contract work, but would also consider full time work if there's a good fit.&lt;br /&gt;&lt;br /&gt;If you know of anything that you think I might be interested in, feel free to contact me at: neil@hepburndata.com</description><link>http://hepburndata.blogspot.com/2008/05/unstructured-data-and-hunt-for-elusive.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-3228092764245256337</guid><pubDate>Thu, 10 Apr 2008 13:50:00 +0000</pubDate><atom:updated>2008-05-09T10:25:51.710-04:00</atom:updated><title>David Letterman can teach us a thing or two about BI</title><description>I've been working a lot these days with data visualizations and presentation reports. I must admit that I've learned a thing or two about how people approach data, both from the IT side and the business side.  
However, after looking at dozens and dozens of data visualizations and executive reports, I have realized that there are effectively two kinds of reports that you can present to an executive, and we should approach and understand them accordingly.&lt;br /&gt;&lt;br /&gt;The first kind of report is what I describe as an "entertaining report".  These reports rely heavily on data visualizations, and while they can and should convey information, their primary purpose is to grab your attention (i.e. entertain you), over and above driving decisions.  Since human beings are instinctively visual beasts, we have a soft spot for these types of reports.  We rarely know what to do with the information we see in these fancy presentations, but we love it all the same as it speaks to us emotionally.  The old saying "seeing is believing" is as true now as it has ever been.&lt;br /&gt;&lt;br /&gt;The second kind of report is what I would describe as "decision driving".  These reports tend to be bland ordered lists, with numbers.   However, these reports are not only the most important in driving decisions, but due to their abstract nature (and our lack of understanding of the decision-maker's predicament) are very difficult to get right the first time.  In fact, these reports tend to be an after-thought since we tend to occupy our imagination with the more wondrous data visualizations, and would rather avoid trying to understand the messy world that the decision making manager has to live in.  In fact, I'm sure I've even seen some IT folk sneer at the decision makers for not appreciating their glorious art.  I've probably sneered myself at one time.&lt;br /&gt;&lt;br /&gt;Going one step further, if we ask ourselves how decisions typically get enacted in business, and look at how people take on decisions, we can see that there is a desire to streamline decision making.  Managers are expected as part of their role to make decisions on a regular basis.  
However, because new decisions represent risk, this in turn leads to stress.  So, if we can help managers make better decisions without increasing their stress, this is what we should strive for.&lt;br /&gt;&lt;br /&gt;I believe that the top 10 (or bottom 10) list is an excellent framework for streamlining decision making.  In fact it's so popular that it is the tool we use to manage our own lives.  I maintain my own to-do lists each day.  If I need to go grocery shopping, I always have a shopping list in hand.  If I need to get my personal spending down, I take a look at my biggest expenses and attack those in order.  In other words, the top 10 list provides a framework for grouping decisions together, and therefore making each subsequent decision easier to tackle.  Furthermore, since we already know that list items get easier as we go down the list, the entire set of decisions seems less daunting since we can get into a groove and track our progress.&lt;br /&gt;&lt;br /&gt;But let's do a thought experiment to give you a better idea of where I'm coming from.  Let's say you were the mayor of Toronto, and you had pledged to reduce crime.  You might think to first get a grasp of where all the crime is happening.  You hire a consultant to explain this to you.  The consultant comes back one month later with an impressive heat-map of the City of Toronto showing in excruciating detail where all the crime hotspots are.  As the city mayor you recognize all the neighbourhoods, and probably aren't too surprised by what you see.  However, you will feel wiser seeing this map as you can now visualize where the crime is taking place (well you might think you can visualize it).  Great! Now it's decision time.  You need to make some hard spending decisions as to where you want to allocate social spending programs, improve community safety, and boost law enforcement.  Is this map sufficient for you to sign these decisions into the budget?  
Perhaps you as mayor could request more heat-maps showing different types of crime like homicide or grand larceny?  Maybe an animated time-series map showing the spread of crime might better help?  Do you feel confident allocating millions of dollars based on moving blotches on a map, even knowing that those blotches are conveying the truth?  Probably not.&lt;br /&gt;&lt;br /&gt;I suspect at this point you will want to start generating good old fashioned lists.   You might want to see: Top ten neighbourhoods, as ranked by a blended crime index.  Or maybe, top ten neighbourhoods, as ranked by velocity of increase in crime over the past 4 years.&lt;br /&gt;&lt;br /&gt;There is no shortage of these top 10 lists you could produce, but the whole time you're dealing with unambiguously ranked neighbourhoods, supported by hard numbers.  As a compromise, I might say that you could add a simple bar chart visualization to help make some numeric comparisons a bit easier.  Either way, you will &lt;span style="font-weight: bold;"&gt;need&lt;/span&gt; to boil things down to a list of some sort, since you will need to verbally articulate the decision you made.  What sounds better: "I have allocated increased spending to: Jane &amp;amp; Finch; Rexdale; and Regent Park, as they currently have the highest indexed crime per capita for the past 5 years standing, according to Statistics Canada".  Or, would you rather say: "If you could have seen the map I saw, you would know to allocate funding to Jane &amp;amp; Finch; Rexdale; and Regent Park".  Yes, if you're lucky, you might get to hold up the map, but then you would be forced to explain its legend.  And if the three neighbourhoods are visually similar to a few other neighbourhoods from a heat-map perspective, then you might be in the awkward situation of squinting your eyes and saying "Well, in my opinion, this blotch looks slightly larger than that blotch".  
However, if there was a clear winner, then maybe the map wouldn't be so bad?  But if there was a clear winner you could state &lt;span style="font-style: italic;"&gt;that&lt;/span&gt; more clearly in verbal terms.&lt;br /&gt;&lt;br /&gt;Show me a data visualization, and I'll show you a top 10 list which does a better job at driving decisions.&lt;br /&gt;&lt;br /&gt;However, with all that said, you might be led to believe there are some exceptions to what I am saying.   For example, experienced meteorologists are able to make reasonably accurate predictions by visually studying animated satellite imagery of weather patterns.  Touché! But I'm not sure if I would even categorize this as BI, since the data never got beyond the video stage into a "fact-based" database.  The same goes for military intelligence studying satellite imagery.  Once again, the photos are being analyzed as-is for enemy presence.&lt;br /&gt;&lt;br /&gt;What I am saying is that the name of the game is to figure out the ideal top 10 (actually top 5 might be better due to limits on our &lt;a href="http://en.wikipedia.org/wiki/The_Magical_Number_Seven%2C_Plus_or_Minus_Two"&gt;capacitative memory&lt;/a&gt;) lists to drive decisions.  Do that, and you will have saved everyone time, and made managers' lives so much easier.  However, the hard part is getting into the head of the decision maker.  If you cannot understand what the decision maker is confronted with, you will just be throwing darts at a board.  Who knows, maybe you'll get lucky.&lt;br /&gt;&lt;br /&gt;Thank you David Letterman.  
You know us so well!</description><link>http://hepburndata.blogspot.com/2008/04/david-letterman-can-teach-us-thing-or.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-8884813959243305222</guid><pubDate>Sat, 01 Dec 2007 19:34:00 +0000</pubDate><atom:updated>2007-12-02T12:00:25.803-05:00</atom:updated><title>The Secret... to Getting Things Done... in Data Management, for Dummies and Idiots... in Three Simple Steps</title><description>I've decided to hop on the dumbing down bandwagon for once.  But instead of just offering your typically mind-numbing advice, and providing a bunch of pointless examples to prove my point, I'm going to try to address the skeptics head-on.&lt;br /&gt;&lt;br /&gt;So what is this "secret", and these "three simple steps" you might ask?  I will get right to the point, and then substantiate my arguments for the skeptics.  That said, as with anything there are always "exceptions that prove the rule", and if you can point those out (and I encourage you to do so), then I'll try to address those arguments as well.  Without further ado, here is "the secret".&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Keep one, and only one copy of your data in an enterprise class RDBMS.&lt;/li&gt;&lt;li&gt;Normalize your entities to third normal form, or Boyce-Codd normal form.&lt;/li&gt;&lt;li&gt;Maintain operational data definitions for each data element, ensuring that the definitions are understood to mean the same thing by all stakeholders.&lt;/li&gt;&lt;/ol&gt;That's it.  Simple, right?  Honestly speaking, if you can pull off what I've just described you will have set the foundation for perfect or near-perfect data quality (which should be the goal of all data management professionals). Well, not so fast.  This is actually a lot harder than it looks, even for a green field application, let alone for legacy systems.  
I will point out some of the pitfalls and some of the solutions to those gotchas, starting from easiest to hardest.&lt;br /&gt;&lt;br /&gt;Problem #1: Keeping only a single copy of the data has the following two major drawbacks:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Data warehouses need to be kept separate from operational data stores, for performance reasons, storage reasons, structural reasons, and auditability reasons.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Putting all your "eggs in one basket" is inherently risky, and leads to a single point of failure.&lt;/li&gt;&lt;/ol&gt;These are both valid points, but I will argue that the technology has more or less arrived that invalidates (or is on its way to invalidating) these arguments.  Generally speaking, we should be striving to let our RDBMS do the data management work, and not our IT professionals.  The second something has to be managed by a human being is the second that you can expect quality issues.  All these tools you see, such as ETL, data discovery, data cleansing, etc. are basically a way of managing the copying of data.  If you can get rid of the copying (or put it in a black box), then you can eliminate the management.&lt;br /&gt;&lt;br /&gt;Getting back to reality: First, it is possible with all major RDBMSs (i.e. DB2, Oracle, and SQL Server) to limit one user's resources, and protect another user's resources.  In other words, it's possible to provide guarantees of service to the operational systems by denying resources to the reporting systems.  If resources become a regular problem it is now relatively straightforward [through clustering technology] to swap in more memory and processing power.  Second, as far as storage is concerned: Storage Area Network [SAN], Information Lifecycle Management [ILM], and table partitioning technologies have got to the point where, with proper configuration, tables can grow indefinitely without impacting operational performance.  
In fact, in Oracle 11g, you can now configure child tables to partition automatically based on parent table partition configuration (before 11g you would have to explicitly manage the child table partitions).  While we're on the topic of 11g, it is also now possible to Flashback forever.  Namely it's possible to see historical snapshots of the database as far back as you want to go.  That said, I don't believe this feature has been optimized for timeseries reporting.  So, this may be the last technology holdout to support the case for building a data warehouse.  Nevertheless, I'm sure this is a problem the RDBMS vendors can solve.  Third, as far as structure is concerned, it is not necessary, and in fact can be counter-productive to de-normalize data into star/snowflake schemas.  I'll address this in my rebuttal to "problem #2".  Fourth, auditability is indeed a big part of the ETL process.  But we wouldn't need audit data if we didn't need to move it around to begin with, so it's a moot point.  If you want to audit access to information, this can easily be done with current trace tools built into all the major RDBMSs.&lt;br /&gt;&lt;br /&gt;On the problem of "putting your eggs in one basket".  While I can see that this has an appeal that can be appreciated by the business, it's really a facile argument.  Simply put, the more databases you need to manage, and the more varieties of databases you have to manage, the less depth of knowledge you have for each individual database, and the weaker the regime for each database.  If you had all your data in a single database, you could then spend more of your time understanding that database, and implement the following:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Routine catastrophe drills (i.e. 
server re-builds, and restores)&lt;/li&gt;&lt;li&gt;Geographically distributed failover servers&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Rolling upgrades (I know this is possible with Oracle)&lt;/li&gt;&lt;li&gt;Improved brown-out support&lt;/li&gt;&lt;/ul&gt;In theory, if you had a lean and mean DB regime, and well trained DBAs you could recover from a fire in your data centre in a matter of hours, and experience nothing more than a brown-out. However, in reality, DBAs and other resources are spread too thin to be able to do this, and in times of this kind of catastrophe you would be lucky to recover your data and have your applications back on-line within a week.&lt;br /&gt;&lt;br /&gt;Problem #2: Keeping data in 3NF or Boyce-Codd Normal form is difficult to work with, and performs poorly in a data warehouse.&lt;br /&gt;&lt;br /&gt;These are also valid points.  First, I will agree that people employed in writing BI reports or creating cubes (or whatever reporting tool they're using), prefer to work with data in a star or snowflake schema.  Organizing data in this format makes it easy to separate qualitative information from quantitative information, and lowers the ramp-up time for new hires.  However, whenever you transform data you immediately jeopardize its quality, and introduce the possibility of error.  But more importantly you &lt;span style="font-weight: bold;"&gt;eliminate&lt;/span&gt; information and options in the process.  The beauty of normalized data is that it allows you to pivot from any angle your heart desires.  Furthermore, things like key constraints constitute information unto themselves, and by eliminating that structure it's harder to make strong assertions about the data itself.  
For example, an order # may have a unique constraint in its original table, but when it's mashed up into a customer dimension, it's no longer explicitly defined that that field must be unique (unless someone had the presence of mind to define it that way, but once again human error creeps in).  Second, as far as performance is concerned, I agree that generally speaking de-normalized data will respond quicker.  However, it is possible to define partitions, clustered indexes and logical indexes, so as to achieve the exact same "&lt;a href="http://en.wikipedia.org/wiki/Big_O_notation"&gt;Big O&lt;/a&gt;" Order.  Therefore the differences in performance are at most a constant factor, and can be solved by adding more CPUs and memory, and thus overall scalability is not affected.&lt;br /&gt;&lt;br /&gt;Problem #3: It is impossible to get people to agree on one version of "the truth".&lt;br /&gt;&lt;br /&gt;While all my other arguments were focused on technology, this problem is clearly in the human realm, and I will submit that this problem will never ever be definitively solved.  Ultimately it comes down to politics and human will.  One person's definition of "customer" may make them look good, but make another person look bad.  Perception is reality, and the very way in which we concoct definitions impacts the way the business itself is run.  A good friend of mine [Jonathan Ezer] posited that there is no such thing as "The Truth", and it is inherently subjective.  As an example, he showed how the former planet Pluto is no longer a planet through a voting process.  Yes, there is still a hunk of ice/rock floating up there in outer space, but that's not what the average person holds to be important.  Fortunately, only scientists were allowed to vote, but if astrologers were invited, they would surely have rained on the scientists' parade.   Okay, so this sounds a little philosophical and airy fairy.  
But consider this: According to Peter Aiken, the IRS took 13 years to arrive at a single definition of the word "child", including 2.5 years to move through legislation.  That said, the advice I can offer to break through these kinds of logjams, and which I've also heard from other experienced practitioners, is to leverage definitions that have already been vetted by external organizations.  Going back to the IRS example, we now have a thorough operational definition that provides us with a good toehold.  Personally, I try to look for ISO or (since I'm Canadian) StatCan definitions of terms, if the word is not clear.  Another great place is Wikipedia or its successor &lt;a href="http://en.citizendium.org/wiki/Main_Page"&gt;Citizendium.&lt;/a&gt;  Apart from that, all I can suggest is to be as diplomatic and patient as possible.&lt;br /&gt;&lt;br /&gt;Am I wrong?  Am I crazy?  Or does data management &lt;span style="font-style: italic;"&gt;need&lt;/span&gt; to be complicated?  Well, I guess the software vendors like it that way.&lt;br /&gt;&lt;br /&gt;-Neil.</description><link>http://hepburndata.blogspot.com/2007/12/secret-to-getting-things-done-in-data.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>2</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-91322004526834191</guid><pubDate>Tue, 14 Aug 2007 23:47:00 +0000</pubDate><atom:updated>2007-08-30T23:27:33.592-04:00</atom:updated><title>Social Networks for the Enterprise</title><description>&lt;div&gt;I've been meaning to put this entry down for weeks now, but in each passing week it seems I come to new realizations about social networks and the direction they are going. 
It's been about two weeks since I've had any new thoughts on the matter, and for now my thinking has settled down.&lt;br /&gt;&lt;br /&gt;But before getting into my thoughts on social networks and how I feel they might be applied to the enterprise, I'd like to share a little story with you. A couple weeks ago I sat down for lunch at the local food-court down the road. An old neighbour of mine - Murray - who I hadn't seen in a while, saw me and sat down at my table. Within no time Murray (who is a senior manager for a large Canadian insurance company) started to vent about Facebook. Facebook, as you may have heard, is the hottest social networking site out there, and in particular is most popular in Toronto with over 700,000 Torontonians, and growing. Murray brought up the fact that Facebook may have a huge valuation of over $10 billion, but is in fact costing companies significantly more than that in lost productivity. While I'd heard about companies (and even the Ontario government) banning access to Facebook, it really dawned on me what a time sucker this thing is. I was tempted to bring up the fact that maybe there were other issues surrounding employee management, and I would argue that this is the overarching problem.  However, since that conversation I've read that about half of all corporations now ban access to Facebook.&lt;/div&gt;&lt;div&gt; &lt;/div&gt;&lt;br /&gt;&lt;div&gt;With all this hype surrounding social networks it's inevitable that people are writing about how they might be applied to the Enterprise.  So far, I haven't read anything that has really impressed me (I've just read this rah rah stuff on ZDNet about how it helped a bunch of people solve problems faster, but didn't explain how or why).  
So I thought I might build on a post I wrote in the past on Wikis in the enterprise, and relate this to social networks, but also [and more importantly] discuss the differences in social dynamics between consumer social networks and enterprise social networks.  Actually, I would say that this is really what has not been discussed in enough detail by the general media: What is the nature of our relations in enterprises versus the nature of our relations with friends and family?&lt;/div&gt;&lt;br /&gt;&lt;div&gt; &lt;/div&gt;&lt;br /&gt;&lt;div&gt;On that note let me first discuss what I am seeing happening on MySpace and Facebook.  I am assuming you, the reader, have a somewhat cursory knowledge of these services.  If you have no idea what these services are used for or why they are important, I suggest you do some research on these sites, and then return to this blog entry. &lt;br /&gt;&lt;br /&gt;Moving on, MySpace was the first major social networking site to capture the popular imagination.  There were sites before this (Six Degrees comes to mind), but MySpace became a hit for the following reasons:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;It was targeted to, and appealed directly to teenagers: Probably the most socially self-conscious group that exists.  This has changed somewhat due to concerns over sexual predators.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;It was completely open.  Anyone could see anybody else's MySpace page without having to register or login.&lt;/li&gt;&lt;li&gt;It was a platform unto itself.  While building a MySpace page is mainly a "fill in the blanks" exercise, users are invited to add "widgets" and from there "pimp out" their MySpace page.  
Of course this spawned a widget cottage industry, which in turn makes the MySpace platform more desirable to its users.&lt;/li&gt;&lt;/ol&gt;Facebook on the other hand succeeded mainly for these reasons:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;It was targeted to college students: Probably the second most self-conscious group that exists.&lt;/li&gt;&lt;li&gt;It was not so open, which made it more conducive to posting private details.  Namely, users could feel more confident about posting personal photographs because the security measures were in place to ensure that only certain people ("friends") could see photos and other personal details.&lt;/li&gt;&lt;li&gt;It included a news feed which allows you to see all your friends' updates.  This is perhaps the most powerful [and originally controversial] feature of Facebook, and the one feature that has generated the most stickiness.&lt;/li&gt;&lt;li&gt;It also is a platform like MySpace.  However, it's an arguably more powerful platform since the underlying capabilities of Facebook are more robust, especially the security. &lt;/li&gt;&lt;/ol&gt;Both MySpace and Facebook have their strengths and weaknesses, but in their current state, I don't see either of them as being an ideal fit for the Enterprise.  The other social network I didn't mention is LinkedIn, which I won't get into, but I also feel that this too is ill suited for the Enterprise.&lt;br /&gt;&lt;br /&gt;In order to understand why this is, you have to ask yourself the following question:  What is the nature of relationships in the Enterprise, and how are they different from relationships in mainstream social networks?&lt;br /&gt;&lt;br /&gt;In a nutshell, I would say that the answer is thus: Normal social networks are typically defined by relationships that both parties willingly desire.  In the Enterprise, relationships tend to be dictated by the Enterprise, and are thus of a utilitarian nature.   
While it's nice to work with people we're friends with, this isn't always going to be the case.  However, if we can make these utilitarian relationships friendlier this is always a good thing.  So, I would propose that any social network for the Enterprise be cognizant of the nature of the relationship, but also facilitate warmer connections.  In that regard, divulging a certain amount of personal information is not a bad thing, but should be managed with a greater amount of astuteness, which should take its cue from what is normally discussed by the watercooler, or what would normally be posted on a cubicle wall (e.g. photos of spouse and kids).&lt;br /&gt;&lt;br /&gt;So, fleshing out the nature of relationships I will describe the following types of Enterprise relationships that I am aware of, and how I think information should be managed with respect to these relationships.  Since this is my fantasy, I will assume that the enterprise has Wikified itself, in the manner that I described a few months &lt;a href="http://hepburndata.blogspot.com/2007_03_01_archive.html"&gt;ago&lt;/a&gt;.  The basic types of relationships, the types of information that should be accessible through those relationships, and how that information should be secured are as follows:&lt;br /&gt;First: Operational versus Project relationships.&lt;br /&gt;Operational relationships are ongoing and indefinite.  Pretty much everyone has a relationship with HR.  Furthermore, everyone has a relationship with the helpdesk.  In some cases you will want to maintain personal relationships (HR is a good candidate here), and in other cases you will want to maintain a relationship with a proxy (the helpdesk is a good candidate here).  For operational relationships, you don't really need to have very much insight into the documents and data that these entities rely on, and for the most part you would just have their contact details, and a few other things that these persons may make public.  
For example, HR could post (or link to) information about Insurance companies, company dress code policy, benefits, etc.  But you don't need to know which IS/IT systems they are using to manage your benefits as this does not concern you.&lt;br /&gt;&lt;br /&gt;Project relationships on the other hand are temporal, but tend to require greater line-of-sight to knowledge.  So, as opposed to our relationship with HR, where we don't really need to know HOW they do their job, in the case of project relationships this line-of-sight is usually a good thing.  As an example: If I'm on a project working with a team of software developers, quality assurance professionals, business analysts, systems analysts, and project managers, it would save me time to be able to see what they're up to.  Speaking in concrete terms, this means I would like to see what documents they are using (i.e. what Wikis they are frequently accessing), what databases they are connecting to, and generally what they are up to (I am also thinking of a Twitter RSS feed here - btw, Twitter on its own has the potential to be an extremely powerful management tool).  I don't need to know everything about their life, just everything that they are doing NOW.&lt;br /&gt;&lt;br /&gt;Second: Hierarchical relationships.  The Enterprise always has been and always will be hierarchical in nature.  Yes, we all aspire to the "flat" egalitarian Enterprise, but frankly speaking this simply goes against human nature.  It will never happen as long as hairless apes run the world.  However, we can manage it.  Namely, it should be simple for our Enterprise social network to apply the correct security and privacy settings based on hierarchy.  I should be able to see everything my subordinate is up to, but not so much of what my boss is up to.  It's all right if she can see what I'm up to though.  It sounds a bit cynical, but this is no different from how our Enterprises currently function.  
As for peers, this gets a bit tricky and should be handled on a case-by-case basis.&lt;br /&gt;&lt;br /&gt;Third: Intra-department versus inter-department versus inter-Enterprise relationships.  I don't have any hard and fast answers here, but this is definitely something that should be considered.  Things of course get tricky when you're talking about relationships that go outside the Enterprise.  Typically these would be vendor relationships, and typically from a knowledge management perspective, this is by default a one-way street.  Namely, the Enterprise should collect information about the vendor, but be hesitant to share anything with them through a social network.  While I can see a time where social networks cross over Enterprises, it's hard to say if this is a priority.  To be sure, there is operational information that is routinely shared.  For example, a shipping company would keep its customers informed about the status of packages and deliveries.  But this hardly has anything to do with insight about any particular person within either Enterprise.&lt;br /&gt;&lt;br /&gt;This is just a sketch of how a social network could be implemented in an Enterprise, and if nothing else some of the things that an Enterprise architect should be mindful of.  At the very least, it should break down barriers of communication, and although I mentioned earlier that hierarchies are inevitable, they also can get in the way and ironically dehumanize us.  
As a simple start, if more large organizations had personal pages where people could add a few photos, say a few things about themselves, and post links to frequently referenced documents, it would make the place a lot less intimidating, and much easier for new hires or new transfers.&lt;br /&gt;---&lt;br /&gt;On a completely different note, I was contacted by Michael who writes the &lt;a href="http://datagovernanceblog.com"&gt;Data Governance Blog&lt;/a&gt;: &lt;a href="http://datagovernanceblog.com/"&gt;http://datagovernanceblog.com/&lt;/a&gt;&lt;br /&gt;Michael had some nice things to say about my own blog and I am very flattered and appreciative of that.  Although I don't blog that often, one of my main goals has been to connect with likeminded individuals out there who see Enterprise Architecture and Data Management as a professional discipline, and who also understand that the discussion is not about Microsoft or Cognos or IBM or any other silver bullet manufacturer, but is something much more nuanced and sophisticated than any of these tech vendors would portray the problem as being.  So, I am more than happy to hear from any others out there who see things the same way I do, or enjoy healthy debate.&lt;br /&gt;&lt;br /&gt;For my next blog entry, I've got something a bit more abstract planned, but with real consequences.  
I am partially basing it on a lecture by my good friend Jonathan Ezer.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;</description><link>http://hepburndata.blogspot.com/2007/08/social-networks-for-enterprise.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-36696550.post-2321482227821630501</guid><pubDate>Thu, 17 May 2007 14:08:00 +0000</pubDate><atom:updated>2007-05-17T10:08:26.024-04:00</atom:updated><title>SOA without IT governance = good luck</title><description>Before getting into my post, I wanted to mention an interview I read in this month's &lt;a href="http://www.wired.com/techbiz/people/news/2007/04/mag_schmidt_qa"&gt;Wired with Eric Schmidt, CEO of Google&lt;/a&gt;. I want to share with you a small excerpt:&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Wired:&lt;/strong&gt; Google’s revenue and employee head count have tripled in the last two years. How do you keep from becoming too bureaucratic or too chaotic?&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Schmidt:&lt;/strong&gt; It’s a constant problem. We analyze this every day, and our conclusion is that the best model is still small teams running as fast as they can and tolerating a certain lack of cohesion. Attempting to provide too much order dries out the creativity. What’s needed in a properly functioning corporation is a balance between creativity and order.&lt;br /&gt;&lt;em&gt;But we’ve reined in certain things. For example, we don’t tolerate the kind of “Hey, I want to have my own database and have a good time” behavior that was effective for us in the past.&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Very interesting... Of all the examples the CEO of Google could have come up with in terms of governance, the one he chose is basically &lt;em&gt;data governance&lt;/em&gt;. I think this is an excellent thing to mention when developers get in a hissy about how they're using data. 
Even the almighty Google adheres to a data governance policy, and the CEO is 100% supportive. Which leads into my blog post, about maintaining SOA services. Something tells me that Google probably does a decent job of governing their web services.&lt;br /&gt;&lt;br /&gt;Now onto my point...&lt;br /&gt;&lt;br /&gt;The SOA revolution is on in full force. It's the shiniest silver bullet to come around in a long time, and to be sure it has some real benefits that cannot be ignored. Unfortunately, I will be surprised if any companies out there that don't already have strong IT governance in place will be able to succeed in achieving their desired ROIs. Of course slick new technology doesn't need a business case, as most CIOs are shamed into implementing a SOA program even if there is no specific need - it simply becomes "common sense".&lt;br /&gt;&lt;br /&gt;Before launching into my critique I must point out that I am a huge supporter of the SOA approach. Web services, like those offered by Google, Amazon.com, Yahoo!, eBay, and others (check out: &lt;a href="http://programmableweb.com"&gt;programmableweb.com&lt;/a&gt; for a comprehensive directory of web services) are without a doubt a standard that's here to stay. Developing future applications using a SOA model clearly makes a lot of sense.&lt;br /&gt;&lt;br /&gt;From a corporate IT perspective, the SOA value proposition is two-fold: First it allows for re-usability like never before. In this respect, SOA's direct antecedent is software components (e.g. COM components, or EJBs); Second, SOA makes building distributed systems a whole lot easier. In this respect SOA's direct antecedent is a mishmash of all sorts of technology (e.g. message passing, RPC [which ODBC uses], store-and-forward, etc.).&lt;br /&gt;&lt;br /&gt;Now here's the rub. If you're going to switch to building things using a SOA approach, you're probably just going to start building services for new applications. 
Those applications in turn will be funded by projects, which will be managed by project managers whose responsibilities are to the success of the project, and not to the success of the IT infrastructure. As the PMI likes to remind us: "Never goldplate". Full disclosure: I am PMI certified. Okay, so what does this mean? This means that while it is possible to build re-usable services, in all likelihood they will be built for a specific application. Fair enough: when the next project comes around that needs something slightly different, we can just extend those services, while at the same time extending the value of those services. Not so fast! The project manager on the second project will likely have to decide: Is it cheaper to extend a live service, or just take the original source code, and extend &lt;em&gt;that&lt;/em&gt; instead, creating a brand-new service that is all but identical to the original service? Well, in spite of all the best intentions, most PMs will quickly cost out the price to regression test the current application(s) using the existing interface (not to mention the logistical headache) and will take the path of least resistance by building a nearly identical new service interface. Eventually over time what you get is a balkanized set of services which IT will constantly talk about "re-factoring" or "consolidating", but in reality there's very little discretionary money to complete a major project like that. Instead, what will happen is there will be some kind of required change that will impact all services. At that point IT will have to decide whether to consolidate or fix each one-by-one. More often than not, it will be the one-by-one fix that you will see. 
The costs of fixing each of these services will greatly outweigh the original investments to consolidate services, but it will just be a constant headache that cannot be solved without a major infrastructure overhaul which some IT disaster may eventually justify.&lt;br /&gt;&lt;br /&gt;You will of course point out that this type of IT sprawl is really just a lack of IT governance. Of course it is lack of governance. The point is: The discipline required to manage the reusability of web services is no different than the discipline to manage the reusability of data, which in turn requires metadata management, which in turn requires solid data governance, which in turn requires solid IT governance.&lt;br /&gt;&lt;br /&gt;To sum up: Implementing a SOA strategy, without any success managing data [and hence metadata], is like boarding a ship with an incompetent navigator. Will you get to your destination? Sure, but it'll take you a lot longer, and cost you a lot more.</description><link>http://hepburndata.blogspot.com/2007/05/soa-without-it-governance-good-luck.html</link><author>noreply@blogger.com (Neil Hepburn)</author><thr:total>0</thr:total></item></channel></rss>