<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Henk&#039;s tech blog</title>
	<atom:link href="http://henkvandervalk.com/feed" rel="self" type="application/rss+xml" />
	<link>http://henkvandervalk.com</link>
	<description>a weblog about IT optimization</description>
	<lastBuildDate>Mon, 19 Dec 2016 20:49:35 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.2.9</generator>
	<item>
		<title>Data loading into a Clustered ColumnStore Index</title>
		<link>http://henkvandervalk.com/data-loading-into-a-clustered-columnstore-index</link>
					<comments>http://henkvandervalk.com/data-loading-into-a-clustered-columnstore-index#comments</comments>
		
		<dc:creator><![CDATA[Henk]]></dc:creator>
		<pubDate>Mon, 19 Dec 2016 11:28:00 +0000</pubDate>
				<category><![CDATA[Performance tip]]></category>
		<category><![CDATA[SQL2016]]></category>
		<category><![CDATA[ColumnStore]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[SQL2016 SP1]]></category>
		<guid isPermaLink="false">http://henkvandervalk.com/?p=1240</guid>

					<description><![CDATA[Introduction With the release of SQL Server 2016 Service Pack 1 the In-Memory ColumnStore technology is now also available in the Standard, Web and even Express and LocalDB Editions. Besides the benefit of only 1 codebase to maintain, this change in policy will also become a clear disk storage space saver due to its high [&#8230;]]]></description>
										<content:encoded><![CDATA[<h3>Introduction </h3>
<p>With the release of SQL Server 2016 Service Pack 1, the In-Memory ColumnStore technology is now also available in the Standard, Web and even Express and LocalDB Editions. Besides the benefit of only one codebase to maintain, this change in policy is also a clear disk storage space saver due to its high data de-duplication and compression ratios and, last but not least, it’s also a serious ad-hoc query performance booster!</p>
<p>The main difference between the SQL flavors is how much CPU power and memory is allocated to tasks like (re-)building a Clustered ColumnStore Index. For example: with the Standard Edition a single core is used (= max. 100% Processor time of the sqlservr process) and querying a CCI happens with a maximum of 2 CPUs (MAXDOP=2), versus leveraging all available CPUs in Enterprise Edition. </p>
<p>Building a Clustered ColumnStore Index (CCI) with SQL Server 2016 Standard Edition:</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image.png" rel="lightbox[1240]" title="100% Processor (1CPU) time of the sqlservr process "><img decoding="async" loading="lazy" title="100% Processor (1CPU) time of the sqlservr process " style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="Single CPU utilization building an ColumnStore" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb.png" width="681" height="289" /></a></p>
<p>Building a CCI with all 4 available cores with SQL Server 2016 Enterprise Edition:</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image9.png" rel="lightbox[1240]" title="All 4 CPU&#39;s (400%) Processor  time in use by the sqlservr process "><img decoding="async" loading="lazy" title="All 4 CPU&#39;s (400%) Processor  time in use by the sqlservr process " style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="All CPU&#39;s utilized building an ColumnStore with SQL EE" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image9_thumb.png" width="685" height="331" /></a></p>
<p>&#160;</p>
<p>The base timings for loading 7.2 GB/~60 million rows from a single TPCH lineitem file don’t show much of a difference between the flavors when bulk inserting the data directly into either a heap table or a table with a CCI; the difference becomes clear when we compare the time needed to build a CCI on a heap table or to rebuild a CCI:</p>
<p>&#160;</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image-1.png" rel="lightbox[1240]" title="Timing overview T-SQL Bulk Insert + (Re-)Building a clustered ColumnStore Index is much faster with SQL Enterprise Edition"><img decoding="async" loading="lazy" title="Timing overview T-SQL Bulk Insert + (Re-)Building a clustered ColumnStore Index is much faster with SQL Enterprise Edition" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="Comparing SQL Standard Edition vs SQL Enterprise Edition" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb-1.png" width="681" height="284" /></a></p>
<p>To summarize: the absolute fastest way to have data available in a table with a Clustered ColumnStore Index is to load into a heap and build the CCI afterwards with SQL 2016 Ent. Ed.</p>
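<p>As a quick sketch, this fastest path comes down to two steps (the heap table name is illustrative; the file, CCI table and index names follow the examples later in this post):</p>
<pre class="csharpcode"><span class="rem">-- Step 1: bulk load into a heap first:</span>
<span class="kwrd">BULK</span> INSERT [dbo].[LINEITEM_HEAP]
<span class="kwrd">FROM</span> <span class="str">'F:\TPCH\lineitem.tbl'</span>
<span class="kwrd">WITH</span> (TABLOCK, FIELDTERMINATOR = <span class="str">'|'</span>, ROWTERMINATOR = <span class="str">'\n'</span>);
<span class="rem">-- Step 2: build the CCI afterwards (SQL 2016 Ent. Ed. leverages all available cores):</span>
<span class="kwrd">CREATE</span> <span class="kwrd">CLUSTERED</span> COLUMNSTORE <span class="kwrd">INDEX</span> [CCI_IDX] <span class="kwrd">ON</span> [dbo].[LINEITEM_HEAP];</pre>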
<p>&#160;</p>
<h3>Direct load into CCI</h3>
<p>For tables that already have a Clustered ColumnStore Index, make sure that you stream directly into Compressed Row Groups to maximize throughput. In order to do so, the Insert <u>batch size</u> should be equal to or larger than 100K rows (102400+ to be precise).</p>
<p>Smaller batches are written into delta store tables first and only later moved by the tuple-mover process into their final compressed Row Group segments, which means SQL Server has to touch the data twice:</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb51.png" rel="lightbox[1240]" title="Stream directly to Compressed Row Groups"><img decoding="async" loading="lazy" title="Stream directly to Compressed Row Groups" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="Delta Store Row Groups " src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb51_thumb.png" width="681" height="218" /></a></p>
<p>&#160;</p>
<p>There are various options to load data and we will go over the most frequently used ones: the Bulk Insert command, BCP and SSIS. Let’s see what is needed to get the best performance and how to monitor progress!</p>
<pre class="csharpcode"><span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> [dbo].[LINEITEM_CCI]
(
    [L_SHIPDATE] [smalldatetime] <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>,
    [L_ORDERKEY] [bigint] <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>,
...
    [L_SHIPINSTRUCT] [<span class="kwrd">varchar</span>](50) <span class="kwrd">COLLATE</span> Latin1_General_100_BIN2 <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>,
    [L_COMMENT] [<span class="kwrd">varchar</span>](255) <span class="kwrd">COLLATE</span> Latin1_General_100_BIN2 <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>
)
<span class="kwrd">go</span>
<span class="rem">-- Create CCI:</span>
<span class="kwrd">CREATE</span> <span class="kwrd"><font style="background-color: #ffff00">CLUSTERED</font></span> COLUMNSTORE <span class="kwrd">INDEX</span> [CCI_IDX] <span class="kwrd">ON</span> [dbo].[LINEITEM_CCI] 
<span class="kwrd">WITH</span> 
(DROP_EXISTING = <span class="kwrd">OFF</span>, 
 COMPRESSION_DELAY = 0) 
<span class="kwrd">ON</span> [<span class="kwrd">PRIMARY</span>]
go</pre>
<style type="text/css">
<p>.csharpcode, .csharpcode pre
{
	font-size: small;
	color: black;
	font-family: consolas, "Courier New", courier, monospace;
	background-color: #ffffff;
	/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt 
{
	background-color: #f4f4f4;
	width: 100%;
	margin: 0em;
}
.csharpcode .lnum { color: #606060; }</style>
<p>&#160;</p>
<h3>1) T-SQL Bulk Insert</h3>
<p>Let us start with the BULK INSERT command:</p>
<pre class="csharpcode"><span class="kwrd">BULK</span> INSERT [LINEITEM_CCI]
<span class="kwrd">FROM</span> <span class="str">'F:\TPCH\lineitem.tbl'</span>
<span class="kwrd">WITH</span>
  ( <strong>TABLOCK</strong>,
    FIELDTERMINATOR = <span class="str">'|'</span>,
    ROWTERMINATOR   = <span class="str">'\n'</span>
    <span class="rem">-- , <font style="background-color: #00ff00">BATCHSIZE = 102400</font></span>
  );</pre>
<h4>&#160;</h4>
<h4>Checking Data Load Progress</h4>
<p>To check the number of rows that have already been loaded into the CCI, even when the Table Lock option is being used, query the new DMV ‘sys.dm_db_column_store_row_group_physical_stats’:</p>
<pre class="csharpcode"><span class="rem">--  Check data loading progress on bulkinserts:</span>
<span class="kwrd">select</span> <span class="kwrd">SUM</span> (total_rows) <span class="kwrd">as</span> Rows_loaded
<span class="kwrd">from</span> sys.dm_db_column_store_row_group_physical_stats
<span class="kwrd">where</span> object_id  = object_id(<span class="str">'LINEITEM_CCI'</span>)</pre>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image10.png" rel="lightbox[1240]" title="Check inflight on the rows loaded!"><img decoding="async" loading="lazy" title="Check inflight on the rows loaded!" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="sys.dm_db_column_store_row_group_physical_stats" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image10_thumb.png" width="192" height="83" /></a></p>
<p>This DMV will also reveal the possible Row Group states in more detail while loading. There are five possible Row Group states while loading data. When you see the state ‘INVISIBLE’, like in the picture below, it means that data is being compressed into a RowGroup. </p>
<blockquote>
<p><font color="#666666">0: INVISIBLE (= RowGroup is in the process of being built from data in the delta store)<br />
1: OPEN (= RowGroup is accepting new records)<br />
2: CLOSED (= RowGroup is filled but not yet compressed by the tuple mover process)<br />
3: COMPRESSED (= RowGroup is filled and compressed)<br />
4: TOMBSTONE (= RowGroup is ready to be garbage collected and removed)</font></p>
</blockquote>
<pre class="csharpcode"><span class="kwrd">SELECT</span>
object_name (object_id) <span class="kwrd">as</span> Table_name, 
transition_to_compressed_state <span class="kwrd">as</span> compressed_state,
transition_to_compressed_state_desc,
delta_store_hobt_id,
state_desc,
total_rows
<span class="kwrd">from</span> sys.dm_db_column_store_row_group_physical_stats
 <span class="kwrd">where</span> object_id  = object_id(<span class="str">'LINEITEM_CCI'</span>)
<span class="kwrd">order</span> <span class="kwrd">by</span> closed_time desc</pre>
<p>&#160;</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image-2.png" rel="lightbox[1240]" title="image"><img decoding="async" loading="lazy" title="image" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="image" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb-2.png" width="685" height="123" /></a></p>
<p>By specifying the <u>Batch Size</u> with a value of 102400 or higher you will achieve maximum performance: data gets streamed and directly compressed into its final RG; this behavior will show up as ‘COMPRESSED’.</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image-3.png" rel="lightbox[1240]" title="Max performance!"><img decoding="async" loading="lazy" title="Max performance!" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb-3.png" width="685" height="154" /></a></p>
<p>You can also check the RowGroup state via sys.column_store_row_groups, a DMV introduced with SQL 2014:</p>
<pre class="csharpcode"><span class="rem">-- Check the ColumnStore RowGroup state:</span>
<span class="kwrd">select</span> <span class="kwrd">state</span>, state_description, <span class="kwrd">count</span>(*) <span class="kwrd">as</span> <span class="str">'RowGroup Count'</span>
<span class="kwrd">from</span> sys.column_store_row_groups
<span class="kwrd">where</span> object_id = object_id(<span class="str">'LINEITEM_CCI'</span>)
<span class="kwrd">group</span> <span class="kwrd">by</span> <span class="kwrd">state</span>, state_description
<span class="kwrd">order</span> <span class="kwrd">by</span> <span class="kwrd">state</span>;</pre>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image-4.png" rel="lightbox[1240]" title="image"><img decoding="async" loading="lazy" title="image" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="image" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb-4.png" width="335" height="75" /></a></p>
<p>&#160;</p>
<h3>Test Result</h3>
<p>Bulk inserting data into a table with a CCI via the Bulk Insert command can be slightly improved by adding the ‘BATCHSIZE=102400’ and TABLOCK options. This brings an ~8% improvement in throughput.</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image-5.png" rel="lightbox[1240]" title="Bulk Insert duration &amp; throughput"><img decoding="async" loading="lazy" title="Bulk Insert duration &amp; throughput" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; float: none; padding-top: 0px; padding-left: 0px; margin-left: auto; display: block; padding-right: 0px; border-top-width: 0px; margin-right: auto" border="0" alt="" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb-5.png" width="452" height="373" /></a></p>
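<p>For reference, the tuned Bulk Insert command with both options enabled would look like this (same file and table as above):</p>
<pre class="csharpcode"><span class="kwrd">BULK</span> INSERT [LINEITEM_CCI]
<span class="kwrd">FROM</span> <span class="str">'F:\TPCH\lineitem.tbl'</span>
<span class="kwrd">WITH</span>
  ( TABLOCK,
    FIELDTERMINATOR = <span class="str">'|'</span>,
    ROWTERMINATOR   = <span class="str">'\n'</span>,
    BATCHSIZE       = 102400
  );</pre>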
<h3>2) BCP.exe</h3>
<p>The BCP utility is still used quite heavily in many production environments, so it is worth a quick check: by default, BCP sends 1000 rows at a time to SQL Server. <a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image-6.png" rel="lightbox[1240]" title="BCP utility sends 1K rows per batch as default."><img decoding="async" loading="lazy" title="BCP utility sends 1K rows per batch as default." style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="BCP" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb-6.png" width="685" height="154" /></a></p>
<p>The time it takes to load 7.2 GB of data via BCP: 530 seconds, or 113K rows/sec.</p>
<p>The RowGroup state shows ‘INVISIBLE’, which means that with the default settings the Delta Store is being used.</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image-7.png" rel="lightbox[1240]" title="State_description = INVISIBLE"><img decoding="async" loading="lazy" title="State_description = INVISIBLE" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb-7.png" width="360" height="103" /></a></p>
<p>To make sure the BCP command streams the data directly into the compressed RG’s you have to add the <strong>batch size ‘-b’</strong> option with a value of at least 102400. I ran various tests with larger batch sizes, up to 1048576, but 102400 gave me the best result.</p>
<p>Command:</p>
<p>BCP [DB].[dbo].[LINEITEM_CCI] in F:\TPCH\lineitem.tbl -S . -c -T -t&quot;|&quot; <font style="background-color: #00ff00">-b 102400 -h &quot;TABLOCK&quot;</font></p>
<p>The RowGroup state now shows ‘COMPRESSED’, which means we bypass the Delta Store and data streams directly into the compressed RG’s:</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image-8.png" rel="lightbox[1240]" title="image"><img decoding="async" loading="lazy" title="image" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="image" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb-8.png" width="282" height="86" /></a></p>
<p>Result: the BCP completed in 457 seconds, or 133K rows per second, which is ~14% faster:<a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image-9.png" rel="lightbox[1240]" title="image"><img decoding="async" loading="lazy" title="image" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; float: none; padding-top: 0px; padding-left: 0px; margin-left: auto; display: block; padding-right: 0px; border-top-width: 0px; margin-right: auto" border="0" alt="image" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb-9.png" width="512" height="386" /></a></p>
<h3>3) SSIS </h3>
<p>During testing I noticed that the default SSIS 2016 settings use memory buffer sizes that can potentially limit the batch size to less than 100K rows. In the example below you see that data landed in delta stores: the RG states are ‘CLOSED’ and the ‘delta_store_hobt_id’ fields are populated, which means the delta stores are leveraged. </p>
<pre class="csharpcode"><span class="rem">-- Monitoring CCI:</span>
<span class="rem">-- Look at RG's  /  delta stores in use? </span>
<span class="kwrd">select</span> 
object_name (object_id) <span class="kwrd">as</span> Table_name, * 
<span class="kwrd">from</span> sys.dm_db_column_store_row_group_physical_stats
 <span class="kwrd">where</span> object_id  = object_id(<span class="str">'LINEITEM_CCI'</span>)
<span class="kwrd">order</span> <span class="kwrd">by</span> closed_time <span class="kwrd">desc</span></pre>
<p>&#160;</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image-10.png" rel="lightbox[1240]" title="SSIS-may leverage DeltaStores - Some buffer tuning maybe needed!"><img decoding="async" loading="lazy" title="SSIS-may leverage DeltaStores - Some buffer tuning maybe needed!" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb-10.png" width="674" height="328" /></a></p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image-11.png" rel="lightbox[1240]" title="image"><img decoding="async" loading="lazy" title="image" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="image" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb-11.png" width="244" height="91" /></a></p>
<p>This was the moment to reach out to my colleagues, who luckily had already noticed this too, and a solution is already there! (see: “<a title="SSIS Data Flow Buffer Auto Sizing capability benefits data loading on Clustered ColumStore tables" href="https://blogs.msdn.microsoft.com/sqlcat/2016/02/29/sql-server-2016-ssis-data-flow-buffer-auto-sizing-capability-benefits-data-loading-on-clustered-columnstore-tables/" target="_blank">Data Flow Buffer Auto Sizing capability benefits data loading into CCI</a>”). To fully leverage the CCI streaming capabilities you have to increase the default memory BufferSize &amp; MaxRows settings: </p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image-12.png" rel="lightbox[1240]" title="The Data Flow Task Defaults"><img decoding="async" loading="lazy" title="The Data Flow Task Defaults" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb-12.png" width="333" height="243" /></a></p>
<p>Change these into 10x larger values:</p>
<p>&#8211; DefaultMaxBufferRows from 10000 into <u>1024000</u> and the most important one:</p>
<p>&#8211; <font style="background-color: #00ff00">DefaultBufferSize</font> from 10485760 into <u>104857600</u>. </p>
<p>Note: the new ‘AutoAdjustBufferSize’ setting should be set to ‘True’ when you load very wide rows of data. </p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image-13.png" rel="lightbox[1240]" title="Optimized SSIS Data Flow Task memory settings"><img decoding="async" loading="lazy" title="Optimized SSIS Data Flow Task memory settings" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; float: none; padding-top: 0px; padding-left: 0px; margin-left: auto; display: block; padding-right: 0px; border-top-width: 0px; margin-right: auto" border="0" alt="" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb-13.png" width="530" height="414" /></a></p>
<p>Also change the values for the Destination adapter:</p>
<p>&#8211; Rows per Batch:&#160; from none into <u>102400</u></p>
<p>&#8211; Maximum Insert commit size: from 2147483647 into <u>102400</u></p>
<p>&#160;&#160; <a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image-14.png" rel="lightbox[1240]" title="Change &#39;Rows per batch&#39; &amp; &#39;Max. insert commit size&#39; to 102400 "><img decoding="async" loading="lazy" title="Change &#39;Rows per batch&#39; &amp; &#39;Max. insert commit size&#39; to 102400 " style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; float: none; padding-top: 0px; padding-left: 0px; margin-left: auto; display: block; padding-right: 0px; border-top-width: 0px; margin-right: auto" border="0" alt="" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image_thumb-14.png" width="456" height="307" /></a>&#160;&#160; </p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/12/image123.png" rel="lightbox[1240]" title="image"><img decoding="async" loading="lazy" title="image" style="border-top: 0px; border-right: 0px; background-image: none; border-bottom: 0px; float: none; padding-top: 0px; padding-left: 0px; margin-left: auto; border-left: 0px; display: block; padding-right: 0px; margin-right: auto" border="0" alt="image" src="http://henkvandervalk.com/wp-content/uploads/2016/12/image123_thumb.png" width="470" height="349" /></a></p>
<h3>Wrap-Up</h3>
<p>The feature parity introduced with SQL Server 2016 SP1 opens up a whole new range of possibilities to benefit from! Hopefully the walkthroughs above help you max out Bulk Insert, BCP and SSIS performance when loading data into a Clustered ColumnStore Index! </p>
]]></content:encoded>
					
					<wfw:commentRss>http://henkvandervalk.com/data-loading-into-a-clustered-columnstore-index/feed</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
			</item>
		<item>
		<title>How-to load data fast into SQL Server 2016</title>
		<link>http://henkvandervalk.com/how-to-load-data-fast-into-sql-server-2016</link>
					<comments>http://henkvandervalk.com/how-to-load-data-fast-into-sql-server-2016#comments</comments>
		
		<dc:creator><![CDATA[Henk]]></dc:creator>
		<pubDate>Mon, 24 Oct 2016 21:30:00 +0000</pubDate>
				<category><![CDATA[Performance tip]]></category>
		<category><![CDATA[SQL2016]]></category>
		<category><![CDATA[SSIS]]></category>
		<category><![CDATA[Workload & Capacity Planning]]></category>
		<category><![CDATA[performance]]></category>
		<guid isPermaLink="false">http://henkvandervalk.com/?p=1198</guid>

					<description><![CDATA[What will be the absolute fastest way to load data from a flatfile into a table within SQL Server 2016? A lot has changed since my initial post on this topic many years ago, like the introduction of In-memory optimized tables and Updateable Columnstore table indexes. Also the list of data transport vehicles to choose [&#8230;]]]></description>
										<content:encoded><![CDATA[<p>What will be the absolute fastest way to load data from a flatfile into a table within SQL Server 2016? A lot has changed since my <a href="http://henkvandervalk.com/speeding-up-ssis-bulk-inserts-into-sql-server" target="_blank">initial post</a> on this topic many years ago, like the introduction of In-memory optimized tables and Updateable Columnstore table indexes. Also the list of data transport vehicles to choose from is growing: besides BCP, the T-SQL Bulk Insert command, SSIS as ETL tool and PowerShell, there are some new ones added, like PolyBase, External R Script or ADF. In this post I will start with checking how much faster the new durable &amp; non-durable In-memory tables are!</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/10/image8.png" rel="lightbox[1198]" title="Overview Data Load options"><img decoding="async" loading="lazy" title="Overview Data Load options" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; float: none; padding-top: 0px; padding-left: 0px; margin-left: auto; display: block; padding-right: 0px; border-top-width: 0px; margin-right: auto" border="0" alt="Overview" src="http://henkvandervalk.com/wp-content/uploads/2016/10/image8_thumb.png" width="548" height="199" /></a></p>
<h2>Setting the Baseline</h2>
<p>For these tests I’m using an Azure DS4_V2 Standard VM with 8 cores/28 GB of RAM and 2 HDD volumes with host caching R/W enabled. (Both LUNs provide ~275 MB/sec R/W throughput although the GUI states a limit of 60 MB/sec.) I generated a single ~60 million row/7.2 gigabyte TPCH lineitem flat file as data to load. </p>
<p>As a baseline for comparison we will use the time it takes to load the file into a Heap table:</p>
<pre class="csharpcode"><span class="kwrd">BULK</span> INSERT [LINEITEM_HEAP]
<span class="kwrd">FROM</span> <span class="str">'F:\TPCH\lineitem.tbl'</span>

<span class="kwrd">WITH</span> 
  (
    FIELDTERMINATOR = <span class="str">'|'</span>,
    ROWTERMINATOR   = <span class="str">'\n'</span>
  );</pre>
<p>This regular Bulk Insert command completes within 7 minutes with an average of 143K rows/sec.</p>
<p>&#160;</p>
<h2>Enabling the test database for Memory optimized tables</h2>
<p>The in-memory tables (introduced in SQL 2014/2016 Enterprise &amp; Developer Edition) are designed for very fast OLTP with many small transactions and high concurrency, which is a completely different type of workload than bulk inserting… but, just out of curiosity, let’s give it a try!</p>
<p>There are 2 types of in-memory tables: durable and non-durable tables. The durable ones will persist data on disk, the non-durable ones won’t. To enable this option we have to do some housekeeping and assign a fast disk volume for hosting these files. </p>
<p>First, alter the database to enable the ‘Contains MEMORY_OPTIMIZED_DATA’ option followed by adding a File location and Filegroup that will contain the Memory-Optimized tables:</p>
<pre class="csharpcode"><span class="kwrd">ALTER</span> <span class="kwrd">DATABASE</span> [TicketReservations] 
<span class="kwrd">ADD</span> FILEGROUP [memory_optimized_filegroup_0] <span class="kwrd">CONTAINS</span> MEMORY_OPTIMIZED_DATA 
<span class="kwrd">GO</span>

<span class="kwrd">ALTER</span> <span class="kwrd">DATABASE</span> [TicketReservations] 
<span class="kwrd">ADD</span> <span class="kwrd">FILE</span> ( NAME = N<span class="str">'memory_optimized_file'</span>, 
FILENAME = N<span class="str">'E:\data\memory_optimized_files'</span> ) 
<span class="kwrd">TO</span> FILEGROUP [memory_optimized_filegroup_0]
<span class="kwrd">GO</span></pre>
<p>The third thing to do is to add a separate memory pool to the SQL Server instance so it can keep all the data we will load into in-memory tables separate from its default memory pool:</p>
<h4>Binding a database to a memory pool</h4>
<p>Extra memory pools are managed via the SQL Server Resource Governor. The steps to define a separate memory pool and to bind a database to it are listed below:</p>
<pre class="csharpcode"><span class="kwrd">CREATE</span> RESOURCE POOL [PoolHk] <span class="kwrd">WITH</span>(
min_cpu_percent=0, 
max_cpu_percent=100, 
min_memory_percent=0, 
max_memory_percent=70, 
AFFINITY SCHEDULER = AUTO
)
<span class="rem">-- Dynamically change the value of MAX_MEMORY_PERCENT  </span>
<span class="kwrd">ALTER</span> RESOURCE POOL PoolHk  
<span class="kwrd">WITH</span>  (MAX_MEMORY_PERCENT = 80 )  
  
<span class="rem">-- RECONFIGURE enables resource governor  </span>
<span class="kwrd">ALTER</span> RESOURCE GOVERNOR <span class="kwrd">RECONFIGURE</span> 
<span class="kwrd">ALTER</span> RESOURCE GOVERNOR RESET <span class="kwrd">STATISTICS</span> </pre>
<p>The fourth and last step is to bind the test database to the new memory pool with the <strong>sys.sp_xtp_bind_db_resource_pool</strong> command.&#160; In order for the binding to become effective we have to take the database offline and bring it back online. Once bound, we can dynamically change the amount of memory assigned to the pool via the ‘<strong>ALTER RESOURCE POOL</strong> PoolHk <strong>WITH</strong> (MAX_MEMORY_PERCENT = 80)’ command.</p>
<pre class="csharpcode"><span class="rem">-- BIND the Database to the Pool</span>
<span class="kwrd">exec</span> sys.sp_xtp_bind_db_resource_pool 
N<span class="str">'TicketReservations'</span>, N<span class="str">'PoolHk'</span>;

<span class="rem">-- take DB offline:</span>
<span class="kwrd">ALTER</span> <span class="kwrd">DATABASE</span> [TicketReservations] <span class="kwrd">SET</span> OFFLINE 
<span class="kwrd">WITH</span> <span class="kwrd">ROLLBACK</span> <span class="kwrd">IMMEDIATE</span>

<span class="rem">-- Bring the Database back Online:</span>
<span class="kwrd">ALTER</span> <span class="kwrd">DATABASE</span> [TicketReservations] <span class="kwrd">SET</span> ONLINE
 
<span class="rem">-- Deleting the Resource Pool:</span>
<span class="rem">-- sys.sp_xtp_unbind_db_resource_pool N'TicketReservations'</span>
<span class="rem">-- DROP RESOURCE POOL [PoolHk]</span></pre>
<h3>Bulk Insert into Durable In-Memory table </h3>
<p>Now that we are all set with the in-memory option enabled, we can create an in-memory table. Every memory-optimized table must have at least one index (either a range or a hash index), which is completely (re)composed in memory and never stored on disk.</p>
<p>A durable table must have a declared primary key, which can be supported by the required index. To support a primary key I added an extra ROWID1 column (filled with a row_number) to the table:</p>
<pre class="csharpcode">---<span class="rem">--  Durable table:</span>
<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> [dbo].[LINEITEM_MO_Final]
(
[ROWID1] [bigint] <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>, 
[L_SHIPDATE] [smalldatetime]  <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>,
..
[L_COMMENT] [<span class="kwrd">varchar</span>](44) <span class="kwrd">COLLATE</span> Latin1_General_100_BIN2 <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>,
<span class="kwrd">constraint</span> RowID_key <span class="kwrd">primary</span> <span class="kwrd">key</span> <span class="kwrd">nonclustered</span> hash (ROWID1) 
<span class="kwrd">WITH</span> (bucket_count=10000000)
)
<span class="kwrd">WITH</span> 
(MEMORY_OPTIMIZED = <span class="kwrd">ON</span>, DURABILITY = SCHEMA_AND_DATA)</pre>
<p>Specifying a batch size of 1 to 5 million rows on the Bulk Insert command helps to persist data to disk while the bulk insert is ongoing (instead of saving it all at the end); doing so minimizes memory pressure on the PoolHk memory pool we created. </p>
<pre class="csharpcode">----------------------------------------------------<span class="rem">--</span>
<span class="rem">-- Bulk insert into Durable Memory Optimized table  --</span>
----------------------------------------------------<span class="rem">--</span>
    <span class="kwrd">BULK</span> INSERT [LINEITEM_Memory_Optimized_D]
    <span class="kwrd">FROM</span> <span class="str">'F:\TPCH\lineitem_export.csv'</span>
    <span class="kwrd">WITH</span> 
      (
    FIELDTERMINATOR = <span class="str">'|'</span>,
    ROWTERMINATOR   = <span class="str">'\n'</span>,
    Batchsize = 5000000
      );</pre>
<p>The data load into the <strong>durable</strong> in-memory table completes in 5 minutes 28 seconds, or 183K rows/sec. That’s an ‘okay’ time, but not that much faster than our baseline. Looking at sys.dm_os_wait_stats shows that the number one wait stat is ‘IMPPROV_IOWAIT’, which occurs when SQL Server waits for a bulk load I/O to finish. The performance counters ‘Bulk Copy Rows/sec’ and ‘Disk Write Bytes/sec’ show flushing-to-disk spikes of 275 MB/sec each time a batch comes in (the green spikes). That is the maximum of what the disk can deliver, but it doesn’t explain everything. Given the minor gain, we will park this one for future investigation.</p>
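<p>To see where the time goes yourself, a generic top-waits query (a sketch; not necessarily the exact query I used) will surface IMPPROV_IOWAIT near the top during such a load:</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> TOP (5) wait_type,
       wait_time_ms/1000 <span class="kwrd">AS</span> wait_time_sec,
       waiting_tasks_count
<span class="kwrd">FROM</span> sys.dm_os_wait_stats
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> wait_time_ms <span class="kwrd">DESC</span>;</pre>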
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/10/image.png" rel="lightbox[1198]" title="Bulk Inserting into In-Memory Durable table"><img decoding="async" loading="lazy" title="Bulk Inserting into In-Memory Durable table" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="Batch gets persisted to disk-5mill rows in 10sec" src="http://henkvandervalk.com/wp-content/uploads/2016/10/image_thumb.png" width="685" height="279" /></a></p>
<h3>Monitoring the Memory Pool</h3>
<p>Via the ‘sys.dm_resource_governor_resource_pools’ DMV we can check whether our in-memory table leverages the newly created ‘PoolHk’ memory pool:</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> pool_id  
     , Name  
     , min_memory_percent  
     , max_memory_percent  
     , max_memory_kb/1024 <span class="kwrd">AS</span> Max_memory_MB  
     , used_memory_kb/1024 <span class="kwrd">AS</span> Used_memory_MB
     , target_memory_kb/1024 <span class="kwrd">AS</span> Target_memory_MB
   <span class="kwrd">FROM</span> sys.dm_resource_governor_resource_pools  </pre>
<p>The output shows this is the case: the 7.2 GB (plus some extra for the ROWID1 column) was loaded uncompressed into the PoolHk memory pool:</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/10/image-1.png" rel="lightbox[1198]" title="Output of the &#39;sys.dm_resource_governor_resource_pools&#39; DMV"><img decoding="async" loading="lazy" title="Output of the &#39;sys.dm_resource_governor_resource_pools&#39; DMV" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="" src="http://henkvandervalk.com/wp-content/uploads/2016/10/image_thumb-1.png" width="685" height="92" /></a></p>
<p>If you try to load more data than the pool has memory available, you will get a proper error message like this one:</p>
<blockquote>
<p>The statement has been terminated.<br />
    <br /><font color="#ff0000">Msg 701, Level 17, State 103, Line 5<br />
      <br />There is insufficient system memory in resource pool &#8216;PoolHk’ to run this query.</font></p>
</blockquote>
<p>To look one level deeper at memory space allocation on a per In-memory table basis you can run the following query (<a href="http://download.microsoft.com/download/8/3/6/8360731A-A27C-4684-BC88-FC7B5849A133/SQL_Server_2016_In_Memory_OLTP_White_Paper.pdf" target="_blank">taken from the SQL Server In-Memory OLTP Internals for SQL Server 2016 document</a>):</p>
<pre class="csharpcode"><span class="kwrd">USE</span> TicketReservations
<span class="kwrd">GO</span>

<span class="kwrd">SELECT</span> object_name(c.object_id) <span class="kwrd">AS</span> table_name,  
allocated_bytes/1024/1024 <span class="kwrd">as</span> alloc_Mbytes,
a.xtp_object_id, a.type_desc,    
minor_id, memory_consumer_id <span class="kwrd">as</span> consumer_id,   
memory_consumer_type_desc <span class="kwrd">as</span> consumer_type_desc,    
memory_consumer_desc <span class="kwrd">as</span> consumer_desc
<span class="kwrd">FROM</span> sys.memory_optimized_tables_internal_attributes a     
<span class="kwrd">JOIN</span> sys.dm_db_xtp_memory_consumers c     
<span class="kwrd">ON</span> a.object_id = c.object_id <span class="kwrd">and</span> a.xtp_object_id = c.xtp_object_id       
<span class="kwrd">LEFT</span> <span class="kwrd">JOIN</span> sys.indexes i  <span class="kwrd">ON</span> c.object_id = i.object_id         
<span class="kwrd">AND</span> c.index_id = i.index_id   
<span class="kwrd">WHERE</span> c.object_id = object_id(<span class="str">'dbo.LINEITEM_Memory_Optimized_D'</span>)
<span class="kwrd">ORDER</span> <span class="kwrd">BY</span> allocated_bytes <span class="kwrd">DESC</span></pre>
<p>The data we just loaded is stored as a varheap structure with a hash index:</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/10/image-2.png" rel="lightbox[1198]" title="Memory Table Heap allocation"><img decoding="async" loading="lazy" title="Memory Table Heap allocation" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="sys.dm_db_xtp_memory_consumers" src="http://henkvandervalk.com/wp-content/uploads/2016/10/image_thumb-2.png" width="685" height="72" /></a></p>
<p>So far so good! Now let’s move on and check out how staging in a non-durable table performs! </p>
<pre class="csharpcode"><span class="rem">----- Non-Durable table:</span>
<span class="kwrd">if</span> object_id(<span class="str">'LINEITEM_Memory_Optimized_ND'</span>) <span class="kwrd">is</span> <span class="kwrd">not</span> <span class="kwrd">null</span>
<span class="kwrd">drop</span> <span class="kwrd">TABLE</span> [dbo].[LINEITEM_Memory_Optimized_ND]
<span class="kwrd">go</span>

<span class="kwrd">CREATE</span> <span class="kwrd">TABLE</span> [dbo].[LINEITEM_Memory_Optimized_ND]
(
    [L_SHIPDATE] [smalldatetime] <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>,
    [L_ORDERKEY] [bigint] <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>,
..
    [L_COMMENT] [<span class="kwrd">varchar</span>](44) <span class="kwrd">COLLATE</span> Latin1_General_100_BIN2 <span class="kwrd">NOT</span> <span class="kwrd">NULL</span>,
<span class="kwrd">INDEX</span> [Staging_index] 
    <span class="kwrd">NONCLUSTERED</span> HASH ([L_SHIPDATE])
    <span class="kwrd">WITH</span> ( BUCKET_COUNT = 2048576)
)
<span class="kwrd">WITH</span> ( MEMORY_OPTIMIZED = <span class="kwrd">ON</span>, 
DURABILITY = SCHEMA_ONLY )</pre>
<h3>Bulk Insert into <u>Non-Durable</u> In-Memory table</h3>
<p>For IMND (In-Memory Non-Durable) tables we do not need a primary key, so we just add a nonclustered hash index and set DURABILITY = SCHEMA_ONLY. The Bulk Insert data load into the non-durable table completes within 3 minutes with a throughput of 335K rows/sec (vs 7 minutes)! This is 2.3x faster than inserting into a heap table. For the staging of data this is definitely a quick win!</p>
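<p>One caveat to keep in mind when staging this way: SCHEMA_ONLY durability means the table definition survives an instance restart, but the rows do not. For example:</p>
<pre class="csharpcode"><span class="rem">-- After a SQL Server instance restart the staging table is empty again:</span>
<span class="kwrd">SELECT</span> COUNT_BIG(*) <span class="kwrd">FROM</span> [dbo].[LINEITEM_Memory_Optimized_ND];  <span class="rem">-- returns 0: only the schema is durable</span></pre>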
<p>&#160;</p>
<h3>SSIS Single Bulk Insert into a Non-Durable table</h3>
<p>Traditionally SSIS is the fastest way to load a file into SQL Server, because SSIS handles all the data pre-processing so the SQL Server engine can spend its CPU ticks on persisting the data to disk. Will this still be the case when inserting the data into a non-durable table? </p>
<p>Below is a summary of the tests I ran with SSIS for this post: the SSIS FastParse option and the ‘DefaultBufferMaxRows’ and ‘DefaultBufferSize’ settings are the main performance boosters. Also, the Native OLE DB (SQLOLEDB.1) provider performs slightly better than the SQL Native Client (SQLNCLI11.1). When you run SSIS and SQL Server side by side, increasing the network packet size isn’t needed.</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/10/image-3.png" rel="lightbox[1198]" title="Dataflow buffer properties"><img decoding="async" loading="lazy" title="Dataflow buffer properties" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-top-width: 0px" border="0" alt="" src="http://henkvandervalk.com/wp-content/uploads/2016/10/image_thumb-3.png" width="685" height="288" /></a></p>
<p>Net result: a basic SSIS package that reads a flat file source and writes the data out directly to the non-durable table via an OLE DB destination performs similarly to the Bulk Insert command into an IMND table: the 60 million rows are loaded in 2 minutes 59 seconds, or 335K rows/sec, identical to the Bulk Insert command. </p>
<h3>SSIS with Balanced Data Distributor </h3>
<p>But wait&#8230; the in-memory tables are designed to work ‘lock &amp; latch’ free, which means we can also load data via multiple streams! That is easy to achieve with SSIS: the Balanced Data Distributor (BDD) brings just that (the BDD is listed in the Common section of the SSIS Toolbox). Adding the BDD component and inserting the data into the same non-durable table with 3 streams provides the best throughput: we are now up to 526,000 rows/sec!</p>
<blockquote>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/10/image-4.png" rel="lightbox[1198]" title="image"><img decoding="async" loading="lazy" title="image" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; float: none; padding-top: 0px; padding-left: 0px; margin-left: auto; display: block; padding-right: 0px; border-top-width: 0px; margin-right: auto" border="0" alt="image" src="http://henkvandervalk.com/wp-content/uploads/2016/10/image_thumb-4.png" width="448" height="115" /></a></p>
</blockquote>
<p>&#160;</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/10/image25.png" rel="lightbox[1198]" title="SSIS package with DBB and 3 Destinations"><img decoding="async" loading="lazy" title="SSIS package with DBB and 3 Destinations" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; float: none; padding-top: 0px; padding-left: 0px; margin-left: auto; display: block; padding-right: 0px; border-top-width: 0px; margin-right: auto" border="0" alt="" src="http://henkvandervalk.com/wp-content/uploads/2016/10/image25_thumb.png" width="536" height="272" /></a></p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/10/image-5.png" rel="lightbox[1198]" title="The BDD automatically balances the rows across  multiple destinations"><img decoding="async" loading="lazy" title="The BDD automatically balances the rows across  multiple destinations" style="border-left-width: 0px; border-right-width: 0px; background-image: none; border-bottom-width: 0px; float: none; padding-top: 0px; padding-left: 0px; margin-left: auto; display: block; padding-right: 0px; border-top-width: 0px; margin-right: auto" border="0" alt="" src="http://henkvandervalk.com/wp-content/uploads/2016/10/image_thumb-5.png" width="536" height="208" /></a></p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2016/10/image13.png" rel="lightbox[1198]" title="Bulk Copy Rows/sec impression"><img decoding="async" loading="lazy" title="Bulk Copy Rows/sec impression" style="border-top: 0px; border-right: 0px; background-image: none; border-bottom: 0px; float: none; padding-top: 0px; padding-left: 0px; margin-left: auto; border-left: 0px; display: block; padding-right: 0px; margin-right: auto" border="0" alt="" src="http://henkvandervalk.com/wp-content/uploads/2016/10/image13_thumb.png" width="536" height="217" /></a></p>
<p>Looking at this very flat line, with only 160% of CPU time used by SQL Server, it seems we are hitting some bottleneck. I quickly tried to be creative by leveraging the modulo function and added 2 more data flows within the package (each processing 1/3 of the data), but that didn’t improve much (1 min 52 sec), so this is a great topic to investigate in a future post!</p>
<p>&#160;</p>
<h3>Wrap-up </h3>
<p>The In-Memory Non-Durable table option brings a serious performance improvement for the staging of data: loading data 2.3x faster with a regular Bulk Insert and up to 3.6x faster with SSIS. This option, primarily designed to speed up OLTP, can also make a huge difference in shrinking your batch window quickly!</p>
<p>(To be continued!)</p>
]]></content:encoded>
					
					<wfw:commentRss>http://henkvandervalk.com/how-to-load-data-fast-into-sql-server-2016/feed</wfw:commentRss>
			<slash:comments>8</slash:comments>
		
		
			</item>
		<item>
		<title>Introducing Microsoft SQL Server 2016 R Services</title>
		<link>http://henkvandervalk.com/introducing-microsoft-sql-server-2016-r-services</link>
					<comments>http://henkvandervalk.com/introducing-microsoft-sql-server-2016-r-services#comments</comments>
		
		<dc:creator><![CDATA[Henk]]></dc:creator>
		<pubDate>Wed, 28 Oct 2015 17:35:14 +0000</pubDate>
				<category><![CDATA[R]]></category>
		<category><![CDATA[SQL2016]]></category>
		<category><![CDATA[SQL R Services]]></category>
		<guid isPermaLink="false">http://henkvandervalk.com/?p=1169</guid>

					<description><![CDATA[With the release of CTP3 SQL Server 2016 and its native In-database support for the open source R language you can now call both R, RevoScaleR functions and scripts directly from within a SQL query and benefit from multi-threaded and multi-core in-DB computations. Yet another SQLServer milestone! The R integration brings a lot of benefits [&#8230;]]]></description>
										<content:encoded><![CDATA[<p>With the release of CTP3 SQL Server 2016 and its native In-database support for the open source R language you can now call both R, RevoScaleR functions and scripts directly from within a SQL query and benefit from multi-threaded and multi-core in-DB computations. Yet another SQLServer milestone! The R integration brings a lot of benefits and new scenarios to your end users; it brings the utility of data science to your applications without the need to ‘export’ the data to your R environment!</p>
<h3>Installing &amp; Enabling SQLR integration</h3>
<p>To get started you will need to install and configure SQL Server 2016 CTP3, Revolution R Open (RRO) and Revolution R Enterprise (RRE) on the server.</p>
<p>&nbsp;</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2015/10/image4.png" rel="lightbox[1169]" title="SQLServer 2016 CTP3 Setup - Advanced Analytics Extensions"><img decoding="async" loading="lazy" style="background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;" title="SQLServer 2016 CTP3 Setup - Advanced Analytics Extensions" src="http://henkvandervalk.com/wp-content/uploads/2015/10/image_thumb4.png" alt="SQLServer 2016 CTP3 Setup" width="465" height="241" border="0" /></a>   <a href="http://henkvandervalk.com/wp-content/uploads/2015/10/image.png" rel="lightbox[1169]" title="Introducing Microsoft SQL Server 2016 R Services"><img decoding="async" loading="lazy" style="background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;" title="" src="http://henkvandervalk.com/wp-content/uploads/2015/10/image_thumb.png" alt="RRE" width="114" height="142" border="0" /></a></p>
<p>The versions I installed:<br />
&#8211; SQL Server 2016 (CTP3.0) &#8211; 13.0.700.139 (X64)<br />
&#8211; Revolution R Open, The Enhanced Open Source R distribution (R version 3.2.2 (2015-08-14) )<br />
&#8211; Revolution R Enterprise for SQL Server 2016 Community Technology Preview (CTP) 3 – (7.5.0)</p>
<p>(To download the bits: <a title="https://msdn.microsoft.com/en-us/library/mt604847.aspx" href="https://msdn.microsoft.com/en-us/library/mt604847.aspx">https://msdn.microsoft.com/en-us/library/mt604847.aspx</a>)</p>
<p>The R integration uses a new service called ‘SQL Server Launchpad’  to ‘launch’ Advanced Analytics Extensions processes; it enables the integration with Microsoft R Open using standard T-SQL statements. (Disabling this service will make Advanced Analytics features of SQL Server unavailable).</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2015/10/image1.png" rel="lightbox[1169]" title="New SQL Server Launchpad service"><img decoding="async" loading="lazy" style="background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;" title="New SQL Server Launchpad service" src="http://henkvandervalk.com/wp-content/uploads/2015/10/image_thumb1.png" alt="New SQL Server Launchpad service" width="362" height="299" border="0" /></a></p>
<p>To enable the R integration, you have to run the sp_configure ‘external scripts enabled’ command and grant permissions to users to execute R scripts via the new db_rrerole role:</p>
<pre class="csharpcode">sp_configure <span class="str">'external scripts enabled'</span>, 1
reconfigure
GO

alter role db_rrerole add member HenkR;</pre>
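<p>To verify the option took effect, you can (for example) read the setting back from sys.configurations:</p>
<pre class="csharpcode"><span class="kwrd">SELECT</span> name, value, value_in_use
<span class="kwrd">FROM</span> sys.configurations
<span class="kwrd">WHERE</span> name = <span class="str">'external scripts enabled'</span>;  <span class="rem">-- value_in_use = 1 after RECONFIGURE</span></pre>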
<p>&nbsp;</p>
<h3>Connecting to SQL Server from your favorite R environment</h3>
<p>Traditionally you would connect from your favorite R IDE to SQL Server to retrieve data for analytics; or you can use the R GUI environment that is part of the installer bits. The default installation path to RGUI is: &#8220;C:\Program Files\RRO\RRO-3.2.2-for-RRE-7.5.0\R-3.2.2\bin\x64\Rgui.exe&#8221;. Also popular is the <a title="Download RStudio Desktop" href="https://www.rstudio.com/products/RStudio/#Desktop" target="_blank">RStudio Desktop</a> that is separately downloadable from <a href="http://www.Rstudio.com">www.Rstudio.com</a> (the version I installed to get started is v0.99.486).</p>
<p>&nbsp;</p>
<h4>1) Connecting R to SQLServer</h4>
<p>To connect your RStudio environment to SQL Server, the traditional way to read data (and import it into memory or a local *.xdf file) looks something like this. It requires setting up a proper connection string; use the ‘hostname\\instancename’ notation to connect to a SQL Server named instance and pull some data from a table:</p>
<pre class="csharpcode">sqlServerConnString &lt;- <span class="str">"Driver=SQL Server;Server=SQL2016EE\\SQL2016CTP3;Database=Oldskool_DB;Uid=sa;Pwd=MyPassword"</span>

sqlServerDataDS &lt;- RxSqlServerData(sqlQuery = <span class="str">"SELECT * FROM LINEITEM_Small"</span>,
                                   connectionString = sqlServerConnString)</pre>
<p>&nbsp;</p>
<h4>2) Offloading R scripts to SQLServer</h4>
<p>&nbsp;</p>
<p>Pulling a large dataset from a database and processing it locally would have required writing it into a local file, which is a single-threaded process. This can take a long time&#8230; Luckily, with CTP3 we can now bring the R script to the data and process it there! To execute an R script directly from a SQL query, use the new ‘sp_execute_external_script’ procedure to, for example, calculate the mean of L_QUANTITY via R:</p>
<pre class="csharpcode"><span class="kwrd">EXECUTE</span>  sp_execute_external_script
                @<span class="kwrd">language</span> = N<span class="str">'R'</span>
              , @script = N<span class="str">'OutputDataSet &lt;- data.frame(mean(InputDataSet[,1]))'</span>
              , @input_data_1 = N<span class="str">'SELECT  L_QUANTITY from LINEITEM_small as L_Quantity '</span>
              <span class="kwrd">WITH</span> <span class="kwrd">RESULT</span> <span class="kwrd">SETS</span> ((col <span class="kwrd">int</span> <span class="kwrd">not</span> <span class="kwrd">null</span>));
go</pre>
<p>SQL Server will execute this external script via its new launchpad service in the BxLserver.exe process (The Revolution Analytics Communication Component for SQL Server).</p>
<h3>R Memory allocation</h3>
<p>What would an article about ‘R’ be without a note about memory usage!<br />
By default the SQLR integration service in the CTP3 build will use up to 20% of the total physical memory available on the host for all its In-DB running R sessions.  You can monitor the Memory usage of the BxLserver.exe process via the Taskmanager Working Set memory column.</p>
<p><a href="http://henkvandervalk.com/wp-content/uploads/2015/10/image2.png" rel="lightbox[1169]" title="Monitor the BxlServer process memory usage via Windows Task manager "><img decoding="async" loading="lazy" style="background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;" title="Monitor the BxlServer process memory usage via Windows Task manager " src="http://henkvandervalk.com/wp-content/uploads/2015/10/image_thumb2.png" alt="Monitor the BxlServer process memory usage via Windows Task manager " width="685" height="152" border="0" /></a></p>
<p>If you cross the line your query will fail with a message similar to the one below:</p>
<p><span style="color: #ff0000;">Msg 39004, Level 16, State 20, Line 8</span></p>
<p>A &#8216;R&#8217; script error occurred during execution of &#8216;sp_execute_external_script&#8217; with HRESULT 0x80004004.</p>
<p>Msg 39019, Level 16, State 1, Line 8</p>
<p>An external script error occurred:</p>
<p>Error in sqlSatelliteCall() :</p>
<p>Calls: sqlSatelliteCall -&gt; .Call</p>
<p>Execution halted</p>
<p><span style="color: #ff0000;">Invalid BXL stream</span></p>
<p><span style="color: #0000ff;">STDOUT message(s) from external script:</span></p>
<p>Failed to allocate or reallocate memory.</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>The resolution for running out of memory like this is to increase the maximum memory allocation, which is a setting in the Rlauncher.config file (located in the C:\Program Files\Microsoft SQL Server\MSSQL13&#8230; \MSSQL\Binn directory). Increase the default setting to a value that is still safe, so it will not interfere with the SQL Server memory allocation.</p>
<p>The last line in the file ‘MEMORY_LIMIT_PERCENT=20’  controls the maximum percentage of physical memory which gets allocated to all the R sessions. Increase the value to allow the R service to consume more memory:</p>
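<p>For example, to let all R sessions together use up to 40% of physical memory (an illustrative value; pick one that leaves SQL Server enough headroom), the last line of Rlauncher.config would read:</p>
<pre class="csharpcode">MEMORY_LIMIT_PERCENT=40</pre>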
<p><a href="http://henkvandervalk.com/wp-content/uploads/2015/10/image3.png" rel="lightbox[1169]" title="Rlauncher Config file"><img decoding="async" loading="lazy" style="background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;" title="Rlauncher Config file" src="http://henkvandervalk.com/wp-content/uploads/2015/10/image_thumb3.png" alt="Rlauncher Config file" width="656" height="297" border="0" /></a></p>
<p>&nbsp;</p>
<h3>Wrap-Up</h3>
<p>Embracing and operationalizing R functionality via the new SQLServer R Services integration has become very simple! It brings a lot of new possibilities to apply advanced analytics to your SQL data, ranging from data exploration to Predictive Modeling. Getting started with R might be a bit challenging but it will enrich your reports for sure!</p>
]]></content:encoded>
					
					<wfw:commentRss>http://henkvandervalk.com/introducing-microsoft-sql-server-2016-r-services/feed</wfw:commentRss>
			<slash:comments>6</slash:comments>
		
		
			</item>
	</channel>
</rss>
