<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0"><channel><title>Ayende @ Rahien</title><link>http://ayende.com/blog/</link><description>Ayende @ Rahien</description><copyright>Copyright (C) Ayende Rahien  2004 - 2012 (c) 2013</copyright><ttl>60</ttl><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/AyendeRahien" /><feedburner:info uri="ayenderahien" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item><title>Raven Xyz: Trying out some ideas</title><description>&lt;p&gt;One of the things that we are planning for Raven 3.0 is the introduction of additional options. In addition to having RavenDB, we will also have RavenFS, which is a replicated file system with an eye toward very large files. But that isn’t what I want to talk about today. Today I would like to talk about something that is currently just in my head. I don’t even have a proper name for it yet. &lt;/p&gt; &lt;p&gt;Here is the deal: RavenDB is very good for data that you care about individually. Orders, customers, etc. You track, modify and work with each document independently. If you are writing a lot of data that isn’t really relevant on its own, but only as an aggregate, that is probably not a good use case for RavenDB.&lt;/p&gt; &lt;p&gt;Examples of such things include logs, click streams, event tracking, etc. The trivial example would be any reality show, where you have a lot of users sending messages to vote for a particular candidate, and you don’t really care about the individual data points, only the aggregate. 
Another example would be tracking how many items were sold in a particular period, broken down by region, etc.&lt;/p&gt; &lt;p&gt;The API that I had in mind would be something like:&lt;/p&gt; &lt;blockquote&gt; &lt;div id="codeSnippetWrapper"&gt; &lt;div id="codeSnippet" style="border-top-style: none; overflow: visible; font-size: 8pt; border-left-style: none; font-family: 'Courier New', courier, monospace; border-bottom-style: none; color: black; padding-bottom: 0px; direction: ltr; text-align: left; padding-top: 0px; border-right-style: none; padding-left: 0px; line-height: 12pt; padding-right: 0px; width: 100%; background-color: #f4f4f4"&gt;&lt;pre style="border-top-style: none; overflow: visible; font-size: 8pt; border-left-style: none; font-family: 'Courier New', courier, monospace; border-bottom-style: none; color: black; padding-bottom: 0px; direction: ltr; text-align: left; padding-top: 0px; border-right-style: none; padding-left: 0px; margin: 0em; line-height: 12pt; padding-right: 0px; width: 100%; background-color: #f4f4f4"&gt;foo.Write(&lt;span style="color: #0000ff"&gt;new&lt;/span&gt; PurchaseMade { Region = &lt;span style="color: #006080"&gt;"Asia"&lt;/span&gt;, Product = &lt;span style="color: #006080"&gt;"products/1"&lt;/span&gt;, Amount = 23 });
foo.Write(&lt;span style="color: #0000ff"&gt;new&lt;/span&gt; PurchaseMade { Region = &lt;span style="color: #006080"&gt;"Europe"&lt;/span&gt;, Product = &lt;span style="color: #006080"&gt;"products/3"&lt;/span&gt;, Amount = 3 });&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/blockquote&gt;
&lt;p&gt;And then you can write map/reduce statements on them like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;div id="codeSnippetWrapper"&gt;
&lt;div id="codeSnippet" style="border-top-style: none; overflow: visible; font-size: 8pt; border-left-style: none; font-family: 'Courier New', courier, monospace; border-bottom-style: none; color: black; padding-bottom: 0px; direction: ltr; text-align: left; padding-top: 0px; border-right-style: none; padding-left: 0px; line-height: 12pt; padding-right: 0px; width: 100%; background-color: #f4f4f4"&gt;&lt;pre style="border-top-style: none; overflow: visible; font-size: 8pt; border-left-style: none; font-family: 'Courier New', courier, monospace; border-bottom-style: none; color: black; padding-bottom: 0px; direction: ltr; text-align: left; padding-top: 0px; border-right-style: none; padding-left: 0px; margin: 0em; line-height: 12pt; padding-right: 0px; width: 100%; background-color: #f4f4f4"&gt;&lt;span style="color: #008000"&gt;// map: project the fields we aggregate on&lt;/span&gt;
from purchase &lt;span style="color: #0000ff"&gt;in&lt;/span&gt; purchases
select &lt;span style="color: #0000ff"&gt;new&lt;/span&gt;
{
    purchase.Region,
    purchase.Product,
    purchase.Amount
}

&lt;span style="color: #008000"&gt;// reduce: total amount per region and product&lt;/span&gt;
from result &lt;span style="color: #0000ff"&gt;in&lt;/span&gt; results
group result by &lt;span style="color: #0000ff"&gt;new&lt;/span&gt; { result.Region, result.Product }
into g
select &lt;span style="color: #0000ff"&gt;new&lt;/span&gt;
{
    g.Key.Region,
    g.Key.Product,
    Amount = g.Sum(x =&amp;gt; x.Amount)
}&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/blockquote&gt;
&lt;p&gt;Yes, this looks pretty much like you would have in RavenDB, but there are important distinctions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We don’t allow modifying writes, nor deleting them.&lt;/li&gt;
&lt;li&gt;Most of the operations are assumed to be made on the result of the map/reduce statements.&lt;/li&gt;
&lt;li&gt;The assumption is that you don’t really care for each data point.&lt;/li&gt;
&lt;li&gt;There are going to be a &lt;em&gt;lot&lt;/em&gt; of those data points, and they are likely to be coming in at a relatively high rate.&lt;/li&gt;&lt;/ul&gt;
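&lt;p&gt;To make the aggregate concrete, here is a minimal sketch in plain LINQ to Objects of the totals that the map/reduce above is meant to produce. It is only an illustration: the PurchaseMade class and the in-memory array are assumptions made up for this example (the property names follow the Write() calls above), and it says nothing about how the real thing would be implemented.&lt;/p&gt;
&lt;blockquote&gt;&lt;pre&gt;using System;
using System.Linq;

// Hypothetical event type; the property names follow the Write() calls above.
public class PurchaseMade
{
    public string Region { get; set; }
    public string Product { get; set; }
    public int Amount { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-in for the stream of written data points (an assumption).
        var purchases = new[]
        {
            new PurchaseMade { Region = "Asia", Product = "products/1", Amount = 23 },
            new PurchaseMade { Region = "Europe", Product = "products/3", Amount = 3 },
            new PurchaseMade { Region = "Asia", Product = "products/1", Amount = 7 }
        };

        // Group by region and product, then sum the amounts in each group.
        var totals = from purchase in purchases
                     group purchase by new { purchase.Region, purchase.Product } into g
                     select new { g.Key.Region, g.Key.Product, Amount = g.Sum(x =&amp;gt; x.Amount) };

        foreach (var total in totals)
            Console.WriteLine("{0} {1} {2}", total.Region, total.Product, total.Amount);
    }
}&lt;/pre&gt;&lt;/blockquote&gt;
&lt;p&gt;With these three data points, this prints one total per region and product pair: Asia products/1 30, then Europe products/3 3. The individual purchases are never read back, only the totals.&lt;/p&gt;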
&lt;p&gt;Thoughts?&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/7JTkik5n1bk" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/7JTkik5n1bk/raven-xyz-trying-out-some-ideas</link><guid isPermaLink="false">http://ayende.com/blog/162273/raven-xyz-trying-out-some-ideas?key=ffe21554-c27c-48df-90a0-0b386ea1e1ff</guid><pubDate>Fri, 24 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/162273/raven-xyz-trying-out-some-ideas?key=ffe21554-c27c-48df-90a0-0b386ea1e1ff</feedburner:origLink></item><item><title>And that is why I wrote my own db…</title><description>&lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/And-that-is-why-I-wrote-my-own-db_110D0/image_2.png"&gt;&lt;img title="image" style="border-top: 0px; border-right: 0px; background-image: none; border-bottom: 0px; padding-top: 0px; padding-left: 0px; margin: 0px; border-left: 0px; display: inline; padding-right: 0px" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/And-that-is-why-I-wrote-my-own-db_110D0/image_thumb.png" width="677" height="143"&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/XOFN_56FmKw" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/XOFN_56FmKw/and-that-is-why-i-wrote-my-own-db</link><guid isPermaLink="false">http://ayende.com/blog/162242/and-that-is-why-i-wrote-my-own-db?key=51595403-8f57-410f-9f2b-2de2b1db11e1</guid><pubDate>Thu, 23 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/162242/and-that-is-why-i-wrote-my-own-db?key=51595403-8f57-410f-9f2b-2de2b1db11e1</feedburner:origLink></item><item><title>RavenDB Dynamic Aggregation Webinar</title><description>&lt;div id="scid:5737277B-5D6D-4f48-ABFC-DD9C333F4C5D:3f8acdda-3816-4eaf-8bfe-1864b1a0867c" class="wlWriterEditableSmartContent" style="float: none; padding-bottom: 0px; padding-top: 0px; padding-left: 
0px; margin: 0px; display: inline; padding-right: 0px"&gt;&lt;div&gt;&lt;object width="448" height="252"&gt;&lt;param name="movie" value="http://www.youtube.com/v/yuHQDrjgf7E?hl=en&amp;amp;hd=1"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/yuHQDrjgf7E?hl=en&amp;amp;hd=1" type="application/x-shockwave-flash" width="448" height="252"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/div&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/RrwVUGKSZB4" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/RrwVUGKSZB4/ravendb-dynamic-aggregation-webinar</link><guid isPermaLink="false">http://ayende.com/blog/162274/ravendb-dynamic-aggregation-webinar?key=b4c97fe6-82f1-40c1-a0cd-e2f2c25a45dd</guid><pubDate>Thu, 23 May 2013 08:13:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/162274/ravendb-dynamic-aggregation-webinar?key=b4c97fe6-82f1-40c1-a0cd-e2f2c25a45dd</feedburner:origLink></item><item><title>Rhino Mocks' future</title><description>&lt;p&gt;
	Well, it seems like Rhino Mocks has a new daddy &lt;img alt="Smile" class="wlEmoticon wlEmoticon-smile" src="http://ayende.com/blog/Images/Windows-Live-Writer/Rhino-Mocks-future_7BE8/wlEmoticon-smile_2.png" style="border-top-style: none; border-left-style: none; border-bottom-style: none; border-right-style: none" /&gt;. In particular, &lt;a href="https://twitter.com/meisinger2"&gt;Mike Meisinger&lt;/a&gt; has kindly agreed to take over the project as project lead.&lt;/p&gt;
&lt;p&gt;
	You can discuss this on &lt;a href="http://groups.google.com/group/rhinomocks"&gt;the mailing list for Rhino Mocks&lt;/a&gt;, or, to hear more about Mike&amp;rsquo;s plans, &lt;a href="http://meisinger2.wordpress.com/2013/05/19/rhino-mocks-new-home/"&gt;visit his blog&lt;/a&gt;.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/TNw3ikdmG00" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/TNw3ikdmG00/rhino-mocks-future</link><guid isPermaLink="false">http://ayende.com/blog/162241/rhino-mocks-future?key=77d77553-2cd5-486c-b7d3-6400b74f3b99</guid><pubDate>Wed, 22 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/162241/rhino-mocks-future?key=77d77553-2cd5-486c-b7d3-6400b74f3b99</feedburner:origLink></item><item><title>RavenDB, Victory</title><description>&lt;p&gt;Jeremy Miller’s post about &lt;a href="http://jeremydmiller.com/2013/05/13/would-i-use-ravendb-again/"&gt;Would I use RavenDB again&lt;/a&gt; has been making the rounds. It is a good post, and I was asked to comment on it by multiple people.&lt;/p&gt; &lt;p&gt;I wanted to comment very briefly on some of the issues that were brought up:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Memory consumption – this is probably mostly related to long-lived sessions, whereas we expect sessions to be much more short lived.&lt;/li&gt; &lt;ul&gt; &lt;li&gt;The 2nd level cache is mostly there to speed things up when you have relatively small documents. If you have very large documents, or routinely have requests that return many documents, that can be a memory hog. That said, the 2nd level cache is limited to 2,048 items by default, so that shouldn’t really be a big issue. And you can change that (or even turn it off) with ease.&lt;/li&gt;&lt;/ul&gt; &lt;li&gt;Don’t abstract RavenDB too much – yeah, that has pretty much been our recommendation for a while. &lt;/li&gt; &lt;ul&gt; &lt;li&gt;I don’t see this as a problem. 
You have just the same issue if you are using any OR/M against an RDBMS.&lt;/li&gt;&lt;/ul&gt; &lt;li&gt;Bulk Insert – the issue has already been fixed. In fact, IIRC, it was fixed within a day or two of the issue being brought up.&lt;/li&gt; &lt;li&gt;Eventual Consistency – Yes, you need to decide how to handle that. As Jeremy said, there are several ways of handling that, from using natural keys with no query latency associated with them to calling WaitForNonStaleResultsAsOfNow();&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;Truthfully, the thing that really caught my eye wasn’t Jeremy’s post, but one of the comments:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-Victory_C687/image_2.png"&gt;&lt;img title="image" style="border-top: 0px; border-right: 0px; background-image: none; border-bottom: 0px; padding-top: 0px; padding-left: 0px; margin: 0px; border-left: 0px; display: inline; padding-right: 0px" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-Victory_C687/image_thumb.png" width="682" height="314"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Thank you, we spend a &lt;em&gt;lot&lt;/em&gt; of time on that!&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/lXllFzY0rCk" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/lXllFzY0rCk/ravendb-victory</link><guid isPermaLink="false">http://ayende.com/blog/162209/ravendb-victory?key=a6981724-0863-4dcb-82c5-889f1ce15203</guid><pubDate>Tue, 21 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/162209/ravendb-victory?key=a6981724-0863-4dcb-82c5-889f1ce15203</feedburner:origLink></item><item><title>Fixing up the build process</title><description>&lt;p&gt;There is a big problem in the RavenDB build process. 
To be rather more exact, there is a… long problem in the RavenDB build process.&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Fixing-up-the-build-process_7889/image_2.png"&gt;&lt;img title="image" style="border-top: 0px; border-right: 0px; background-image: none; border-bottom: 0px; padding-top: 0px; padding-left: 0px; margin: 0px; border-left: 0px; display: inline; padding-right: 0px" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Fixing-up-the-build-process_7889/image_thumb.png" width="335" height="402"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;As you can imagine, when the build process runs for that long, it doesn’t get run too often. We already had several runs of “let us optimize the build”. But… the actual reason for the tests taking this long is a bit sneaky.&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Fixing-up-the-build-process_7889/image_4.png"&gt;&lt;img title="image" style="border-top: 0px; border-right: 0px; background-image: none; border-bottom: 0px; padding-top: 0px; padding-left: 0px; margin: 0px; border-left: 0px; display: inline; padding-right: 0px" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Fixing-up-the-build-process_7889/image_thumb_1.png" width="767" height="133"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;To save you from having to do the math, this means an average of 1.15 seconds per test.&lt;/p&gt; &lt;p&gt;In most tests, we actually have to create a RavenDB instance. That doesn’t take too long, but it does take some time. And we have a lot of tests that use the network, because we need to test how RavenDB works on the wire.&lt;/p&gt; &lt;p&gt;From that perspective, it means that we don’t seem to have any real options. Even if we cut the average cost of running the tests by half, it would still be a 30 minute build process. &lt;/p&gt; &lt;p&gt;Instead, we are going to create a layered approach. 
We are going to freeze all of our existing tests and move them to an Integration Tests project. We will create a small suite of tests that cover just core stuff with RavenDB, and use that. Over time, we will be adding tests to the new test project. When that becomes too slow, we will have another migration.&lt;/p&gt; &lt;p&gt;What about the integration tests? Well, those will be run solely by our build server, and we will set things up so we can automatically test when running from our own forks, not the main code line.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/QOzFLOprVdE" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/QOzFLOprVdE/fixing-up-the-build-process</link><guid isPermaLink="false">http://ayende.com/blog/162177/fixing-up-the-build-process?key=8e868876-2277-447f-9a99-0b68668bb112</guid><pubDate>Mon, 20 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/162177/fixing-up-the-build-process?key=8e868876-2277-447f-9a99-0b68668bb112</feedburner:origLink></item><item><title>RavenDB Webinar: Aggregation just jump a grade or two…</title><description>&lt;p&gt;In tomorrow’s Webinar, we will discuss how to handle dynamic aggregation using RavenDB. 
A new feature in 2.5, this is meant to give you more options for reporting queries, including complex aggregation, dynamic selection, etc.&lt;/p&gt; &lt;p&gt;You can register here: &lt;a href="https://www2.gotomeeting.com/register/789291530"&gt;https://www2.gotomeeting.com/register/789291530&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/V6wIx6qdcUc" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/V6wIx6qdcUc/ravendb-webinar-aggregation-just-jump-a-grade-or-two</link><guid isPermaLink="false">http://ayende.com/blog/162145/ravendb-webinar-aggregation-just-jump-a-grade-or-two?key=51d50d4e-ea7f-47a1-a2f1-90aae19ab135</guid><pubDate>Fri, 17 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/162145/ravendb-webinar-aggregation-just-jump-a-grade-or-two?key=51d50d4e-ea7f-47a1-a2f1-90aae19ab135</feedburner:origLink></item><item><title>The difference between Ordering &amp; Boosting</title><description>&lt;p&gt;This seems to be a pretty common issue: people get the two of them confused. As an example, let us take the users in Stack Overflow:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/86db27cbdef3_B46E/image_4.png"&gt;&lt;img title="image" style="border-top: 0px; border-right: 0px; background-image: none; border-bottom: 0px; padding-top: 0px; padding-left: 0px; margin: 0px; border-left: 0px; display: inline; padding-right: 0px" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/86db27cbdef3_B46E/image_thumb_1.png" width="694" height="132"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Here, we want to get the users &lt;strong&gt;in order&lt;/strong&gt;. We want to get all the users in descending order of reputation.&lt;/p&gt; &lt;p&gt;But what happens when we want to do an actual &lt;strong&gt;search&lt;/strong&gt;? For example, we want to get users by tag. Perhaps we want to get someone that knows some ravendb. 
&lt;/p&gt; &lt;p&gt;Here is the data that we have to work with:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/86db27cbdef3_B46E/image_6.png"&gt;&lt;img title="image" style="border-top: 0px; border-right: 0px; background-image: none; border-bottom: 0px; padding-top: 0px; padding-left: 0px; margin: 0px; border-left: 0px; display: inline; padding-right: 0px" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/86db27cbdef3_B46E/image_thumb_2.png" width="487" height="535"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Now, when searching, we want to be able to do the following. Find users that match the tags that we specified and are relevant, and have them show up in reputation order.&lt;/p&gt; &lt;p&gt;And that is where it kills us. Relevancy &amp;amp; order are pretty much mutually exclusive. Before we can explain that, we need to understand that order is absolute, but relevancy is not. If I have 10,000 tags, there is very little meaning to me having a tag or not. But if I have 10 tags, me having a tag or not is a lot more important. You want to talk with an expert in a specific field, not just someone who is a jack of all trades.&lt;/p&gt; &lt;p&gt;Now, it might be that you want to apply some boost factor to users with high reputation, because there are people who are jacks of all trades and masters of most. 
That is the difference between boosting and ordering.&lt;/p&gt; &lt;p&gt;Ordering is absolute, while boosting is a factor applied against the relative relevancy of the current query.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/mU9QrnJsOWU" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/mU9QrnJsOWU/the-difference-between-ordering-boosting</link><guid isPermaLink="false">http://ayende.com/blog/162018/the-difference-between-ordering-boosting?key=346944eb-3860-4570-b9fb-1602491e8d1c</guid><pubDate>Thu, 16 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/162018/the-difference-between-ordering-boosting?key=346944eb-3860-4570-b9fb-1602491e8d1c</feedburner:origLink></item><item><title>How not to deal with Replication Lag</title><description>&lt;p&gt;Because RavenDB replication is async in nature, there is a period of time between a write being committed on the master and it becoming visible to the clients.&lt;/p&gt; &lt;p&gt;A user has requested that we provide a low latency solution to that. The idea was that the master server would report to the secondaries that a write happened, and then they would mark all reads from them for those documents as dirty, until replication caught up.&lt;/p&gt; &lt;p&gt;Implementation wise, this is all ready to happen. We have the Changes API, which is an easy way to get changes from a db. We have the ability to return a 203 Non Authoritative response, so it looks easy. &lt;/p&gt; &lt;p&gt;In theory, it sounds reasonable, but this idea just doesn’t hold water. Let us talk about normal operations. Even with the “low latency” notifications (and replication is already about as low latency as it gets), we have to deal with a window of time between the write completing on the master and the notification arriving on the secondaries. In fact, it is the &lt;em&gt;exact&lt;/em&gt; same window as with replication. 
Sure, if you have a high replication load, that might be different, but those tend to be rare (high write load, very big documents, etc).&lt;/p&gt; &lt;p&gt;But let us assume that this is really the case. What about failures? &lt;/p&gt; &lt;p&gt;Let us assume Server A &amp;amp; B and client C. Client C makes a write to A, A notifies B and when C reads from B, it would get a 203 response until A replicates to B. All nice &amp;amp; dandy. But what happens when A can’t talk to B? Remember, a server being down is the easiest scenario; the hard part is when both A &amp;amp; B are operational, but can’t talk to one another. RavenDB&amp;nbsp; is designed to gracefully handle network splits and merges, so what would happen in this case?&lt;/p&gt; &lt;p&gt;Client C writes to A, but A can’t notify B or replicate to it. Client C reads from B, but since B got no notification about a change, it returns a 200 OK response, which means that this is the latest version. Problem.&lt;/p&gt; &lt;p&gt;In this case, this is actually a bigger problem than you might consider. If we support the notifications under the standard scenario, users will make assumptions about this. They will have separate code paths for non authoritative responses, for example. But as we have seen, we have a window of time where the reply would say it is authoritative even though it isn’t (a very short one, sure, but still) and under failure scenarios we will outright lie.&lt;/p&gt; &lt;p&gt;It is better not to have this “feature” at all, and let the user handle that on his own (and there are ways to handle that, reading from the master for important stuff, for example). 
&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/Qhn1yoj1fhg" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/Qhn1yoj1fhg/how-not-to-deal-with-replication-lag</link><guid isPermaLink="false">http://ayende.com/blog/162017/how-not-to-deal-with-replication-lag?key=f682e2b0-eb63-4f23-b2fe-b57f3b9d9281</guid><pubDate>Wed, 15 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/162017/how-not-to-deal-with-replication-lag?key=f682e2b0-eb63-4f23-b2fe-b57f3b9d9281</feedburner:origLink></item><item><title>RavenDB Clusters &amp; Write Assurances</title><description>&lt;p&gt;RavenDB handles replication in an async manner. Let us say that you have 5 nodes in your cluster, set to use master/master replication.&lt;/p&gt; &lt;p&gt;That means that when you call SaveChanges(), the value is saved to one node, and then replicated to the other nodes. But what happens when you have safety requirements? What happens if a node goes down after the call to SaveChanges() has completed, but before it replicates the information out?&lt;/p&gt; &lt;p&gt;In other systems, you have the ability to specify a W factor: the number of nodes the value must be written to before it is considered “safe”. In RavenDB, we decided to go a similar route. 
Here is the code:&lt;/p&gt; &lt;blockquote&gt; &lt;div id="codeSnippetWrapper"&gt; &lt;div style="border-bottom-style: none; text-align: left; padding-bottom: 0px; line-height: 12pt; background-color: #f4f4f4; border-left-style: none; padding-left: 0px; width: 100%; padding-right: 0px; font-family: 'Courier New', courier, monospace; direction: ltr; border-top-style: none; color: black; border-right-style: none; font-size: 8pt; overflow: visible; padding-top: 0px" id="codeSnippet"&gt;&lt;pre style="border-bottom-style: none; text-align: left; padding-bottom: 0px; line-height: 12pt; background-color: white; margin: 0em; border-left-style: none; padding-left: 0px; width: 100%; padding-right: 0px; font-family: 'Courier New', courier, monospace; direction: ltr; border-top-style: none; color: black; border-right-style: none; font-size: 8pt; overflow: visible; padding-top: 0px"&gt;&lt;span style="color: #606060" id="lnum1"&gt;   &lt;/span&gt; await session.StoreAsync(user);&lt;/pre&gt;&lt;!--CRLF--&gt;&lt;pre style="border-bottom-style: none; text-align: left; padding-bottom: 0px; line-height: 12pt; background-color: #f4f4f4; margin: 0em; border-left-style: none; padding-left: 0px; width: 100%; padding-right: 0px; font-family: 'Courier New', courier, monospace; direction: ltr; border-top-style: none; color: black; border-right-style: none; font-size: 8pt; overflow: visible; padding-top: 0px"&gt;&lt;span style="color: #606060" id="lnum2"&gt;   &lt;/span&gt; await session.SaveChangesAsync(); &lt;span style="color: #008000"&gt;// save to one of the nodes&lt;/span&gt;&lt;/pre&gt;&lt;!--CRLF--&gt;&lt;pre style="border-bottom-style: none; text-align: left; padding-bottom: 0px; line-height: 12pt; background-color: white; margin: 0em; border-left-style: none; padding-left: 0px; width: 100%; padding-right: 0px; font-family: 'Courier New', courier, monospace; direction: ltr; border-top-style: none; color: black; border-right-style: none; font-size: 8pt; overflow: visible; 
padding-top: 0px"&gt;&lt;span style="color: #606060" id="lnum3"&gt;   &lt;/span&gt;&amp;nbsp; &lt;/pre&gt;&lt;!--CRLF--&gt;&lt;pre style="border-bottom-style: none; text-align: left; padding-bottom: 0px; line-height: 12pt; background-color: #f4f4f4; margin: 0em; border-left-style: none; padding-left: 0px; width: 100%; padding-right: 0px; font-family: 'Courier New', courier, monospace; direction: ltr; border-top-style: none; color: black; border-right-style: none; font-size: 8pt; overflow: visible; padding-top: 0px"&gt;&lt;span style="color: #606060" id="lnum4"&gt;   &lt;/span&gt; var userEtag = session.Advanced.GetEtagFor(user);&lt;/pre&gt;&lt;!--CRLF--&gt;&lt;pre style="border-bottom-style: none; text-align: left; padding-bottom: 0px; line-height: 12pt; background-color: white; margin: 0em; border-left-style: none; padding-left: 0px; width: 100%; padding-right: 0px; font-family: 'Courier New', courier, monospace; direction: ltr; border-top-style: none; color: black; border-right-style: none; font-size: 8pt; overflow: visible; padding-top: 0px"&gt;&lt;span style="color: #606060" id="lnum5"&gt;   &lt;/span&gt;&amp;nbsp; &lt;/pre&gt;&lt;!--CRLF--&gt;&lt;pre style="border-bottom-style: none; text-align: left; padding-bottom: 0px; line-height: 12pt; background-color: #f4f4f4; margin: 0em; border-left-style: none; padding-left: 0px; width: 100%; padding-right: 0px; font-family: 'Courier New', courier, monospace; direction: ltr; border-top-style: none; color: black; border-right-style: none; font-size: 8pt; overflow: visible; padding-top: 0px"&gt;&lt;span style="color: #606060" id="lnum6"&gt;   &lt;/span&gt; var replicas = await store.Replication.WaitAsync(etag: userEtag, replicas: 1);&lt;/pre&gt;&lt;!--CRLF--&gt;&lt;!--CRLF--&gt;&lt;/div&gt;&lt;/div&gt;&lt;/blockquote&gt;
&lt;p&gt;As you can see, we now have a way to actually wait until replication is completed. We will ping all of the replicas, waiting to see that replication has matched or exceeded the etag that we just wrote.&amp;nbsp; You can specify the number of replicas that are required for this to complete.&lt;/p&gt;
&lt;p&gt;Practically speaking, you can specify a timeout, and if the nodes aren’t reachable, you will get an error about that. &lt;/p&gt;
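&lt;p&gt;To make the waiting logic concrete, here is a minimal sketch of the idea, not RavenDB’s actual implementation. FakeReplica, WriteAssurance and WaitForReplicas are made-up names, etags are simplified to plain numbers, and the replicas are in-memory objects rather than servers. It polls until enough replicas have reached the etag, and gives up with an error once the timeout expires.&lt;/p&gt;

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;

// Hypothetical stand-in for a replica node; not a RavenDB type.
class FakeReplica
{
    // Highest etag this replica has received so far (simplified to a number).
    public long LastReplicatedEtag;
}

static class WriteAssurance
{
    // Polls the replicas until `required` of them have reached `etag`.
    // Returns how many replicas caught up; throws if the timeout expires first.
    public static int WaitForReplicas(FakeReplica[] replicas, long etag, int required, TimeSpan timeout)
    {
        var elapsed = Stopwatch.StartNew();
        while (true)
        {
            int caughtUp = replicas.Count(r => r.LastReplicatedEtag >= etag);
            if (caughtUp >= required)
                return caughtUp;
            if (elapsed.Elapsed >= timeout)
                throw new TimeoutException("Only " + caughtUp + " of " + required + " required replicas caught up.");
            Thread.Sleep(10); // back off briefly before asking again
        }
    }
}

static class Program
{
    static void Main()
    {
        var replicas = new[]
        {
            new FakeReplica { LastReplicatedEtag = 10 },
            new FakeReplica { LastReplicatedEtag = 3 },
            new FakeReplica { LastReplicatedEtag = 12 },
        };

        // Two replicas are already at or past etag 8, so this returns at once.
        Console.WriteLine(WriteAssurance.WaitForReplicas(replicas, 8, 2, TimeSpan.FromSeconds(1)));

        try
        {
            // The lagging replica never advances here, so asking for all three times out.
            WriteAssurance.WaitForReplicas(replicas, 8, 3, TimeSpan.FromMilliseconds(50));
        }
        catch (TimeoutException e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
```

&lt;p&gt;The point of the sketch is the trade-off: a higher replica count gives a stronger guarantee, but makes a timeout more likely when some nodes are slow or unreachable.&lt;/p&gt;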
&lt;p&gt;This gives you the ability to handle write assurances very easily. And you can choose how to handle this, on a case by case basis (you care to wait for users to be created, but not for new comments, for example) or globally.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/y9o7jNH6Pi4" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/y9o7jNH6Pi4/ravendb-clusters-write-assurances</link><guid isPermaLink="false">http://ayende.com/blog/161985/ravendb-clusters-write-assurances?key=34d0b26c-8251-486c-a5b4-705d6f94d984</guid><pubDate>Tue, 14 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/161985/ravendb-clusters-write-assurances?key=34d0b26c-8251-486c-a5b4-705d6f94d984</feedburner:origLink></item><item><title>RavenDB &amp; Locking indexes</title><description>&lt;p&gt;One of the things that we keep thinking about with RavenDB is how to make it easier for you to run in production.&lt;/p&gt; &lt;p&gt;To that end, we introduce a new feature in 2.5, Index Locking. This looks like this:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB--Locking-indexes_A5EC/image_2.png"&gt;&lt;img title="image" style="border-top: 0px; border-right: 0px; background-image: none; border-bottom: 0px; padding-top: 0px; padding-left: 0px; margin: 0px; border-left: 0px; display: inline; padding-right: 0px" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB--Locking-indexes_A5EC/image_thumb.png" width="478" height="307"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;But what does this mean, to lock an index?&lt;/p&gt; &lt;p&gt;Well, let us consider a production system, in which you have the following index:&lt;/p&gt; &lt;blockquote&gt;&lt;pre class="csharpcode"&gt;from u &lt;span class="kwrd"&gt;in&lt;/span&gt; docs.Users
select &lt;span class="kwrd"&gt;new&lt;/span&gt;
{
   Query = &lt;span class="kwrd"&gt;new&lt;/span&gt;[] { u.Name, u.Email, u.Email.Split(&lt;span class="str"&gt;'@'&lt;/span&gt;) }
}&lt;/pre&gt;
&lt;style type="text/css"&gt;.csharpcode, .csharpcode pre
{
	font-size: small;
	color: black;
	font-family: consolas, "Courier New", courier, monospace;
	background-color: #ffffff;
	/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt 
{
	background-color: #f4f4f4;
	width: 100%;
	margin: 0em;
}
.csharpcode .lnum { color: #606060; }
&lt;/style&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;After &lt;/em&gt;you go to production, you realize that you also needed to include the FullName in the search queries. You can, obviously, do a full deployment from scratch, but it is generally so much easier to just fix the index definition on the production server, update the index definition in the codebase, and wait for the next deploy for them to match.&lt;/p&gt;
&lt;p&gt;This works, except that in many cases, RavenDB applications call IndexCreation.CreateIndexes() on startup. That means that on the next startup of your application, the change you just made will be reverted. These options allow you to lock an index against changes, either in a way that silently ignores changes to the index, or by raising an error when someone tries to modify the index.&lt;/p&gt;
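&lt;p&gt;The two behaviors can be sketched roughly like this. This is an illustration only: IndexLock, IndexEntry and TryUpdate are hypothetical names and are not taken from the RavenDB API, and the index definition is reduced to a string.&lt;/p&gt;

```csharp
using System;

// Illustrative names only; they are not taken from the RavenDB API.
enum IndexLock { Unlocked, IgnoreChanges, RejectChanges }

class IndexEntry
{
    public string Definition;
    public IndexLock LockMode = IndexLock.Unlocked;

    // Returns true when the stored definition was replaced.
    public bool TryUpdate(string newDefinition)
    {
        switch (LockMode)
        {
            case IndexLock.IgnoreChanges:
                return false; // silently keep the definition fixed in production
            case IndexLock.RejectChanges:
                throw new InvalidOperationException("The index is locked.");
            default:
                Definition = newDefinition;
                return true;
        }
    }
}

static class Program
{
    static void Main()
    {
        var index = new IndexEntry { Definition = "Name, Email" };

        // An operator fixes the index on the production server, then locks it.
        index.TryUpdate("Name, Email, FullName");
        index.LockMode = IndexLock.IgnoreChanges;

        // On the next startup the application pushes its stale definition again.
        bool changed = index.TryUpdate("Name, Email");
        Console.WriteLine(changed);          // False
        Console.WriteLine(index.Definition); // Name, Email, FullName

        // With the stricter mode, the same attempt fails loudly instead.
        index.LockMode = IndexLock.RejectChanges;
        try { index.TryUpdate("Name, Email"); }
        catch (InvalidOperationException e) { Console.WriteLine(e.Message); }
    }
}
```

&lt;p&gt;Ignoring keeps the application starting normally until the next deploy catches up, while raising an error makes the mismatch visible right away.&lt;/p&gt;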
&lt;p&gt;It is important to note that this is not a security feature; you can unlock the index at any time. This is there to help operations, that is all.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/jLbh5N4JOj0" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/jLbh5N4JOj0/ravendb-locking-indexes</link><guid isPermaLink="false">http://ayende.com/blog/161954/ravendb-locking-indexes?key=4bf8b939-d018-4b12-a9b6-2855be3b2075</guid><pubDate>Mon, 13 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/161954/ravendb-locking-indexes?key=4bf8b939-d018-4b12-a9b6-2855be3b2075</feedburner:origLink></item><item><title>Better patching API for RavenDB: Creating New Documents</title><description>&lt;p&gt;A while ago we introduced the ability to &lt;a href="http://ayende.com/blog/157185/awesome-ravendb-feature-of-the-day-evil-patching"&gt;send js scripts to RavenDB for server side execution&lt;/a&gt;. And we have just recently completed a nice improvement on that feature, the ability to create new documents from existing ones. &lt;/p&gt; &lt;p&gt;Here is how it works:&lt;/p&gt; &lt;blockquote&gt;&lt;pre class="csharpcode"&gt;store.DatabaseCommands.UpdateByIndex(&lt;span class="str"&gt;"TestIndex"&lt;/span&gt;,
                                     &lt;span class="kwrd"&gt;new&lt;/span&gt; IndexQuery {Query = &lt;span class="str"&gt;"Exported:false"&lt;/span&gt;},
                                     &lt;span class="kwrd"&gt;new&lt;/span&gt; ScriptedPatchRequest { Script = script }
  ).WaitForCompletion();
&lt;/pre&gt;
&lt;style type="text/css"&gt;.csharpcode, .csharpcode pre
{
	font-size: small;
	color: black;
	font-family: consolas, "Courier New", courier, monospace;
	background-color: #ffffff;
	/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt 
{
	background-color: #f4f4f4;
	width: 100%;
	margin: 0em;
}
.csharpcode .lnum { color: #606060; }
&lt;/style&gt;
&lt;/blockquote&gt;
&lt;p&gt;Where the script looks like this:&lt;/p&gt;
&lt;blockquote&gt;&lt;pre class="csharpcode"&gt;
&lt;span class="kwrd"&gt;for&lt;/span&gt;(var i = 0; i &amp;lt; &lt;span class="kwrd"&gt;this&lt;/span&gt;.Comments.length; i++ ) {
   PutDocument(&lt;span class="str"&gt;'comments/'&lt;/span&gt;, {
    Title: &lt;span class="kwrd"&gt;this&lt;/span&gt;.Comments[i].Title,
    User: &lt;span class="kwrd"&gt;this&lt;/span&gt;.Comments[i].User.Name,
    By: &lt;span class="kwrd"&gt;this&lt;/span&gt;.Comments[i].User.Id
  });
}

&lt;span class="kwrd"&gt;this&lt;/span&gt;.Exported = &lt;span class="kwrd"&gt;true&lt;/span&gt;;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;This will create a set of documents for each of the embedded documents.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/s_A0gpugoU4" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/s_A0gpugoU4/better-patching-api-for-ravendb-creating-new-documents</link><guid isPermaLink="false">http://ayende.com/blog/161953/better-patching-api-for-ravendb-creating-new-documents?key=7c65cdc9-21ef-46a0-820c-571ede88b455</guid><pubDate>Fri, 10 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/161953/better-patching-api-for-ravendb-creating-new-documents?key=7c65cdc9-21ef-46a0-820c-571ede88b455</feedburner:origLink></item><item><title>RavenDB Map/Reduce optimizations</title><description>&lt;p&gt;So I was diagnosing a customer problem, which required me to write the following:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/d70d0fdd34fa_5ABE/image_4.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/d70d0fdd34fa_5ABE/image_thumb_1.png" width="825" height="409"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;This is working on a data set of about half a million records.&lt;/p&gt; &lt;p&gt;I took a peek at the stats and I saw this:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/d70d0fdd34fa_5ABE/image_2.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/d70d0fdd34fa_5ABE/image_thumb.png" width="541" height="572"&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;You 
can ignore everything before 03:23, this is the previous index run. I reset it to make sure that I have a clean test.&lt;/p&gt; &lt;p&gt;What you can see is that we start out mapping &amp;amp; reducing values. And you can see that initially this is quite expensive. But very quickly we recognize that we are reducing a single value, and we switch strategies to a more efficient method, and we suddenly have very little cost involved in here. In fact, you can see that the entire process took about 3 minutes from start to finish, and very quickly we got to the point where our bottleneck was actually the maps pushing data our way.&lt;/p&gt; &lt;p&gt;That is pretty cool.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/V-uYKeR5YUQ" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/V-uYKeR5YUQ/ravendb-map-reduce-optimizations</link><guid isPermaLink="false">http://ayende.com/blog/161890/ravendb-map-reduce-optimizations?key=4eef0e0e-a4dc-4e01-a3af-62fd6f74429f</guid><pubDate>Thu, 09 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/161890/ravendb-map-reduce-optimizations?key=4eef0e0e-a4dc-4e01-a3af-62fd6f74429f</feedburner:origLink></item><item><title>The state of Rhino Mocks</title><description>&lt;p&gt;I was asked to comment on the current state of Rhino Mocks. The current codebase is located here: &lt;a title="https://github.com/hibernating-rhinos/rhino-mocks" href="https://github.com/hibernating-rhinos/rhino-mocks"&gt;https://github.com/hibernating-rhinos/rhino-mocks&lt;/a&gt;&lt;/p&gt; &lt;p&gt;The last commit was 2 years ago. And I am no longer actively / passively monitoring the mailing list.&lt;/p&gt; &lt;p&gt;From my perspective, Rhino Mocks is done. 
Done in the sense that I don’t have any interest in extending it, done in the sense that I don’t really use mocking any longer.&lt;/p&gt; &lt;p&gt;If there is anyone in the community that wants to step in and take charge as the Rhino Mocks project leader, I would love that. Failing that, the code is there, it works quite nicely, but that is all I am going to be doing with this for the time being and the foreseeable future.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/6tWKxvE88Ww" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/6tWKxvE88Ww/the-state-of-rhino-mocks</link><guid isPermaLink="false">http://ayende.com/blog/161826/the-state-of-rhino-mocks?key=972aa9a4-53d4-466a-abd8-20c5e1363db9</guid><pubDate>Wed, 08 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/161826/the-state-of-rhino-mocks?key=972aa9a4-53d4-466a-abd8-20c5e1363db9</feedburner:origLink></item><item><title>RavenDB 2.5 Features: Import data to Excel</title><description>&lt;p&gt;I wonder what it says about RavenDB that we spend time doing excel integration &lt;img style="border-bottom-style: none; border-left-style: none; border-top-style: none; border-right-style: none" class="wlEmoticon wlEmoticon-smile" alt="Smile" src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/wlEmoticon-smile_2.png"&gt;.&lt;/p&gt; &lt;p&gt;At any rate, we have the following documents inside RavenDB:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_2.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" 
src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_thumb.png" width="615" height="356"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;And we want to get this data into Excel. Not only that, but we want this to be something more than just a flat file. We want something that will auto update itself.&lt;/p&gt; &lt;p&gt;We start by defining the shape of the output, using a transformer.&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_4.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_thumb_1.png" width="445" height="337"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Then we go an visit the following url:&lt;/p&gt; &lt;p&gt;&lt;a title="http://localhost:8080/databases/MusicBox/streams/query/Raven/DocumentsByEntityName?query=Tag:Albums&amp;amp;resultsTransformer=Albums/ShapedForExcel" href="http://localhost:8080/databases/MusicBox/streams/query/Raven/DocumentsByEntityName?query=Tag:Albums&amp;amp;resultsTransformer=Albums/ShapedForExcel&amp;amp;format=excel"&gt;http://localhost:8080/databases/MusicBox/streams/query/Raven/DocumentsByEntityName?query=Tag:Albums&amp;amp;resultsTransformer=Albums/ShapedForExcel&amp;amp;format=excel&lt;/a&gt;&lt;/p&gt; &lt;ul&gt; &lt;li&gt;http://localhost:8080/databases/MusicBox – The server &amp;amp; database that we are querying.&lt;/li&gt; &lt;li&gt;streams/query/Raven/DocumentsByEntityName?query=Tag:Albums – Stream the results of querying the index Raven/DocumentsByEntityName for all Tag:Albums (effectively, give me all the albums).&lt;/li&gt; &lt;li&gt;resultsTransformer=Albums/ShapedForExcel – transform the results using the 
specified transformer.&lt;/li&gt; &lt;li&gt;format=excel – output this in a format that excel will find easy to understand&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;The output looks like this:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_6.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_thumb_2.png" width="679" height="173"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Now, let us take this baby and push this to Excel. We create a new document, and then go to the Data tab, and then to From Text:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_8.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_thumb_3.png" width="306" height="250"&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;We have a File Open Dialog, and we paste the previous URL as the source, then hit enter.&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_10.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" 
src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_thumb_4.png" width="850" height="438"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;We have to deal with the import wizard, just hit next on the first page.&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_12.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_thumb_5.png" width="589" height="424"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;We mark the input as comma delimited, and then hit finish.&lt;/p&gt;  &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_14.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_thumb_6.png" width="589" height="424"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;We now need to select where it would go on the document:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_16.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_thumb_7.png" width="427" 
height="404"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;And now we have the data inside Excel:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_18.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_thumb_8.png" width="795" height="437"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;We aren’t done yet, we have the data in, now we need to tell Excel to refresh it:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_20.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_thumb_9.png" width="484" height="172"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Click on the connections button, where you’ll see something like this:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_24.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_thumb_11.png" width="573" height="397"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Go to Properties:&lt;/p&gt; &lt;p&gt;&lt;a 
href="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_26.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-.5-Features-Import-data-to-Excel_DE4C/image_thumb_12.png" width="423" height="510"&gt;&lt;/a&gt;&lt;/p&gt; &lt;ul&gt; &lt;li&gt;&lt;strong&gt;Uncheck&lt;/strong&gt; Prompt for file name on refresh&lt;/li&gt; &lt;li&gt;&lt;strong&gt;Check&lt;/strong&gt; Refresh data when opening the file&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;Close the file, go to your database and change something. Open the file again, and you can see the new values in there.&lt;/p&gt; &lt;p&gt;You have now created an Excel file that can automatically pull data from RavenDB and give your users immediate access to the data in a format that they are very comfortable with.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/ySy1wFOJ0S8" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/ySy1wFOJ0S8/ravendb-2-5-features-import-data-to-excel</link><guid isPermaLink="false">http://ayende.com/blog/161825/ravendb-2-5-features-import-data-to-excel?key=75dfb771-111f-4f69-8a61-804593516436</guid><pubDate>Tue, 07 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/161825/ravendb-2-5-features-import-data-to-excel?key=75dfb771-111f-4f69-8a61-804593516436</feedburner:origLink></item><item><title>Raven’s Storage: Memtables are tough</title><description>&lt;p&gt;Memtables are conceptually a very simple thing. 
You have the list of values that you were provided, as well as a skip list for searches.&lt;/p&gt; &lt;p&gt;Complications:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Memtables are meant to be used concurrently.&lt;/li&gt; &lt;li&gt;We are going to have to hold all of our values in memory. And I am really not sure that I want to be doing that.&lt;/li&gt; &lt;li&gt;When we switch between memtables (and under heavy write load, we might be doing that a lot), I want to immediately clear the used memory, not wait for the GC to kick in.&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;The first thing to do was to port the actual SkipList from the leveldb codebase. That isn’t really hard, but I had to make sure that assumptions made for the C++ memory model are valid for the CLR memory model. In particular, .NET doesn’t have AtomicPointer, but Volatile.Read / Volatile.Write are a good replacement, it seems. I decided to port the one from leveldb because I don’t know what assumptions other list implementations have made. That was the first step in order to create a memtable. 
The second was to decide where to actually store the data.&lt;/p&gt; &lt;p&gt;Here is the most important method for that part:&lt;/p&gt; &lt;blockquote&gt; &lt;div id="codeSnippetWrapper"&gt;&lt;pre style="border-bottom-style: none; text-align: left; padding-bottom: 0px; line-height: 12pt; background-color: white; margin: 0em; border-left-style: none; padding-left: 0px; width: 100%; padding-right: 0px; font-family: 'Courier New', courier, monospace; direction: ltr; border-top-style: none; color: black; border-right-style: none; font-size: 8pt; overflow: visible; padding-top: 0px"&gt;&lt;span style="color: #606060" id="lnum1"&gt;   &lt;/span&gt;&lt;span style="color: #0000ff"&gt;public&lt;/span&gt; &lt;span style="color: #0000ff"&gt;void&lt;/span&gt; Add(&lt;span style="color: #0000ff"&gt;ulong&lt;/span&gt; seq, ItemType type, Slice key, Stream &lt;span style="color: #0000ff"&gt;val&lt;/span&gt;)&lt;/pre&gt;&lt;/div&gt;&lt;/blockquote&gt;
&lt;p&gt;The problem is that we &lt;em&gt;cannot&lt;/em&gt; just reference this. We &lt;em&gt;have&lt;/em&gt; to copy those values into memory that we control. Why is that? Because the user is free to change the Stream contents or the Slice’s array as soon as we return from this method. By the same token, we can’t just batch this stuff in managed memory, because of the Large Object Heap (LOH). The way this is handled in leveldb never made much sense to me, so I am going to drastically change that behavior. &lt;/p&gt;
&lt;p&gt;In my implementation, I decided to do the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Copy the keys to our own buffer, and keep them inside the skip list. This is what we will use for actually doing searches.&lt;/li&gt;
&lt;li&gt;Change the SkipList to keep track of values, as well as the key.&lt;/li&gt;
&lt;li&gt;Keep the actual values in unmanaged memory, instead of managed memory. That avoids the whole LOH issue, and gives me immediate control over when the memory is disposed.&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;This took some careful coding, because I want to explicitly give up on the GC for this. That means that I need to make &lt;em&gt;damn&lt;/em&gt; sure that I don’t have bugs that would cause a memory leak. &lt;/p&gt;
&lt;p&gt;Each memtable would allocate 4MB of unmanaged memory, and would write the values to it. Note that you can write over 4MB (for example, by writing a very large value, or by writing a value that pushes the total past the 4MB limit). At that point, we would allocate more unmanaged memory, and hand over the memtable to compaction.&lt;/p&gt;
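&lt;p&gt;A rough sketch of the unmanaged buffer idea, under stated assumptions: UnmanagedArena, TryWrite and Read are hypothetical names, not the actual memtable code, and this version simply refuses a value that does not fit instead of allocating more memory as described above. It relies only on the standard Marshal.AllocHGlobal, Marshal.Copy and Marshal.FreeHGlobal calls.&lt;/p&gt;

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical helper, not the author's code: a fixed-size block of unmanaged memory.
sealed class UnmanagedArena : IDisposable
{
    private IntPtr buffer;
    private readonly int capacity;
    private int used;

    public UnmanagedArena(int capacity)
    {
        this.capacity = capacity;
        buffer = Marshal.AllocHGlobal(capacity); // outside the GC heap, so never on the LOH
    }

    // Copies the value into the arena. Returns false when it does not fit,
    // which is the caller's cue to switch to a fresh memtable.
    public bool TryWrite(byte[] value, out int offset)
    {
        offset = -1;
        if (value.Length > capacity - used)
            return false;
        Marshal.Copy(value, 0, IntPtr.Add(buffer, used), value.Length);
        offset = used;
        used += value.Length;
        return true;
    }

    // Copies a previously written value back into managed memory.
    // No bounds checks here; a real implementation would need them.
    public byte[] Read(int offset, int length)
    {
        var result = new byte[length];
        Marshal.Copy(IntPtr.Add(buffer, offset), result, 0, length);
        return result;
    }

    // Releases the memory immediately instead of waiting for the GC.
    public void Dispose()
    {
        if (buffer == IntPtr.Zero)
            return;
        Marshal.FreeHGlobal(buffer);
        buffer = IntPtr.Zero;
    }
}

static class Program
{
    static void Main()
    {
        using (var arena = new UnmanagedArena(4 * 1024 * 1024))
        {
            byte[] value = Encoding.UTF8.GetBytes("values/1");
            int offset;
            if (arena.TryWrite(value, out offset))
                Console.WriteLine(Encoding.UTF8.GetString(arena.Read(offset, value.Length)));
        } // the 4MB block is freed here, without involving the GC
    }
}
```

&lt;p&gt;The sketch has no finalizer, so forgetting to call Dispose leaks the whole block, which is exactly the kind of bug the careful coding above is meant to prevent.&lt;/p&gt;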
&lt;p&gt;The whole thing is pretty neat, even if I say so myself &lt;img style="border-bottom-style: none; border-left-style: none; border-top-style: none; border-right-style: none" class="wlEmoticon wlEmoticon-smile" alt="Smile" src="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Memtable-are_9283/wlEmoticon-smile_2.png"&gt;.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/NqRgjda6j_0" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/NqRgjda6j_0/ravens-storage-memtables-are-tough</link><guid isPermaLink="false">http://ayende.com/blog/161793/ravens-storage-memtables-are-tough?key=b6df41f0-0576-44a5-a0bc-e8dc3c527aeb</guid><pubDate>Mon, 06 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/161793/ravens-storage-memtables-are-tough?key=b6df41f0-0576-44a5-a0bc-e8dc3c527aeb</feedburner:origLink></item><item><title>Raven’s Storage: Understanding the SST file format</title><description>&lt;p&gt;This is an example of an SST that stores:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;tests/0000 –&amp;gt; values/0&lt;/li&gt; &lt;li&gt;tests/0001 –&amp;gt; values/1&lt;/li&gt; &lt;li&gt;tests/0002 –&amp;gt; values/2&lt;/li&gt; &lt;li&gt;tests/0003 –&amp;gt; values/3&lt;/li&gt; &lt;li&gt;tests/0004 –&amp;gt; values/4&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;As well as a bloom filter for rapid checks.&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_6.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_thumb_2.png" width="747" height="273"&gt;&lt;/a&gt;&lt;/p&gt;   &lt;p&gt;If you are wondering about the binary format, 
that is what this post is all about. We actually start from the end. The last 48 bytes of the file are dedicated to the footer.&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_10.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_thumb_4.png" width="742" height="282"&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;The footer format is:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;The last 8 bytes of the file are a magic number: 0xdb4775248b80fb57ul – this means that we can quickly identify whether this is an SST file or not.&lt;br&gt;Here is what this number looks like broken down to bytes:&lt;br&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_12.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_thumb_5.png" width="71" height="140"&gt;&lt;/a&gt;&lt;/li&gt; &lt;li&gt;The other 40 bytes are dedicated to the metadata handle and the index handle.&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;Those are two pairs of longs, encoded using 7 bit encoding. In our case, here is what they look like:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_16.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; 
display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_thumb_7.png" width="322" height="56"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Let us see if we can parse them properly:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;100&lt;/li&gt; &lt;li&gt;38&lt;/li&gt; &lt;li&gt;143, 1 = 143&lt;/li&gt; &lt;li&gt;14&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;Note that in order to encode 143 properly, we needed two bytes, because it is higher than 127 (and we use the high bit of each byte to indicate if there are more bytes to read). The first two values are actually the metadata handle (position: 100, count: 38), the second two are the index handle (position: 143, count: 14).&lt;/p&gt; &lt;p&gt;We will start by parsing the metadata block:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_18.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_thumb_8.png" width="738" height="115"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;You can see the relevant portions in the image.&lt;/p&gt; &lt;p&gt;The interesting bits here are the first three bytes. 
Here we have:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;0 – the number of shared bytes with the previous key (there is none, which is why it is zero).&lt;/li&gt; &lt;li&gt;25 – the number of non shared bytes (in this case, the full key, which is 25 bytes long).&lt;/li&gt; &lt;li&gt;2 – the size of the value. In this case, the value is the handle in the file of the data for the filter.&lt;/li&gt;&lt;/ul&gt;  &lt;p&gt;You can read the actual key name on the left, “filter.BuiltinBloomFilter”, and then we have the actual filter data:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_23.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_thumb_10.png" width="149" height="61"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;You can probably guess that this is the filter handle (position: 82, count: 18).&lt;/p&gt; &lt;p&gt;The rest of the data is two 4 byte integers. Those are the restart array (position 130 – 133) and the restart count (position 134 – 137). 
Restarts are a very important concept for reducing the size of the SST, but I’ll cover them in depth when talking about the actual data, not the metadata.&lt;/p&gt; &lt;p&gt;Next, we have the actual filter data itself, which looks like this:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_25.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_thumb_11.png" width="605" height="53"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;This is actually a serialized bloom filter, which allows us to quickly decide if a specified key is here or not. There is a chance of errors, but errors can only be false positives, never false negatives. This turns out to be quite useful down the road, when we have multiple SST files and need to work with them in tandem. Even so, we can break it down in more detail:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;The first 8 bytes are the actual serialized bloom filter bits.&lt;/li&gt; &lt;li&gt;The 9th byte is the k value in the bloom filter algorithm. The default value is 6.&lt;/li&gt; &lt;li&gt;The last value (11) is the lg value (also used in the bloom filter algorithm).&lt;/li&gt; &lt;li&gt;The rest of the data is a bit more interesting. The 4 bytes preceding the 11 (those are 9,0,0,0) are the offset of valid data inside the filter data. &lt;/li&gt; &lt;li&gt;The four zeros preceding that are the position of the relevant data in the file.&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;Honestly, you can think about this as a black box. Knowing that it is filter data is probably enough.&lt;/p&gt; &lt;p&gt;Now that we are done with the filter, we have to look at the actual index data. 
This is located at 143, but we know that the metadata block actually ended at 100 + 38 = 138, so why the gap? The answer is that after each block, we have a block type (compressed or not, basically) and the block CRC, which is used to determine if the file has been corrupted. Together they account for the 5 byte gap.&lt;/p&gt; &lt;p&gt;Back to the index block, it starts at 143 and goes for 14 bytes, looking like this:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_27.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_thumb_12.png" width="608" height="59"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;The last 4 bytes (32 bit int equal to 1) are the number of restarts that we have for this block. And the 4 bytes preceding them (32 bit int equal to 0) are the offset of the restarts in the index block.&lt;/p&gt; &lt;p&gt;In this case, we have just one restart, and the offset of that is 0. Hold on about restarts, I’ll get there. Now let us move to the beginning of the index block. We have the following three bytes: 0,1,2.&lt;/p&gt; &lt;p&gt;Just like in the meta block case, those fall under (shared, non shared, value size) and are all 7 bit encoded ints. That means that there is no shared data with the previous key (because there &lt;em&gt;isn’t&lt;/em&gt; a previous key), the non shared data is 1 byte and the value size is 2. If you memorized your ASCII table, 117 is lower case ‘u’. The actual value is a block handle, this time for the actual data associated with this index. 
In this case, a block with position: 0 and count: 70.&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_29.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_thumb_13.png" width="730" height="138"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Let us start analyzing this data. 0, 10, 8 tells us shared, non shared, value size. Indeed, the next 10 bytes spell out ‘tests/0000’ and the 8 after that are ‘values/0’. And what about the rest?&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_31.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Understanding-the-SST-fil_105E/image_thumb_14.png" width="728" height="48"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Now we have 9, 1, 8. Shared is 9, non shared is 1 and value size is 8. We take the first 9 bytes of the previous key, giving us ‘tests/000’, and append to it the non shared data, in this case a byte with value 49 (‘1’ in ASCII), giving us a full key of ‘tests/0001’. The next 8 bytes after that spell out ‘values/1’. And the rest is pretty much just like it.&lt;/p&gt; &lt;p&gt;Now, I promised that I would talk about restarts. You see how we can use the shared/non shared data to effectively compress data. However, that has a major hidden cost. 
In order to figure out what the key is, we need to read all the previous keys.&lt;/p&gt; &lt;p&gt;In order to avoid this, we use the notion of restarts. Every N keys (by default, 16), we will have a restart point and put the full key into the file. That means that we can skip ahead based on the position specified in the restart offset, and that in turn is governed by the number of restarts that we have in the index block.&lt;/p&gt; &lt;p&gt;And… that is pretty much it. This is the full and unvarnished binary dump of an SST. Obviously real ones are going to be more complex, but they all follow the same structure.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/9KBpGVXlLY4" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/9KBpGVXlLY4/ravens-storage-understanding-the-sst-file-format</link><guid isPermaLink="false">http://ayende.com/blog/161764/ravens-storage-understanding-the-sst-file-format?key=1b4fb4bf-8201-4ee9-acfe-b30d556f4a59</guid><pubDate>Fri, 03 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/161764/ravens-storage-understanding-the-sst-file-format?key=1b4fb4bf-8201-4ee9-acfe-b30d556f4a59</feedburner:origLink></item><item><title>Raven’s Storage: Reading a Sorted String Table</title><description>&lt;p&gt;When reading an SST, we have to deal with values of potentially large sizes. I want to avoid loading anything into managed memory if I can possibly avoid it. 
That, along with other considerations, has led me to use memory mapped files as the underlying abstraction for reading from the table.&lt;/p&gt; &lt;p&gt;Along the way, I think that I made the following assumptions:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Works in little endian only.&lt;/li&gt; &lt;li&gt;Can work in 32 bits, but 64 bits is preferred.&lt;/li&gt; &lt;li&gt;Doesn’t put pressure on the .NET memory manager.&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;In particular, I am worried about users wanting to store large values, or a large number of keys.&lt;/p&gt; &lt;p&gt;As it turned out, .NET’s memory mapped files are a pretty good answer for what I wanted to do. Sure, it is a bit of a pain with regard to how to handle things like WinRT / Silverlight, etc. But I am mostly focused on server side for now. And I have some ideas on how to provide the mmap illusion on top of regular streams for platforms that don’t support it.&lt;/p&gt; &lt;p&gt;The fact that an SST is written by a single thread, and that once it is written it is immutable, has drastically simplified the codebase. I have to admit that looking at a hex dump to figure out that I wrote to the wrong position is a bit of a bother, but more on that later. A lot of the code is basically the leveldb code, tweaked for .NET use. One important difference that I made was with regard to the actual API.&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Keys are assumed to be small. Most of the time, less than 2KB in size, and there are optimizations in place to take advantage of that. 
(It will still work with keys bigger than that, but will consume more memory).&lt;/li&gt; &lt;li&gt;In general, throughout the codebase I tried to put major emphasis on performance and memory use even at this early stage.&lt;/li&gt; &lt;li&gt;Values are &lt;em&gt;not&lt;/em&gt; assumed to be small.&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;What does the last one mean?&lt;/p&gt; &lt;p&gt;Well, to start with, we aren’t going to map the entire file into our memory space. For one thing, it might be big enough to start fragmenting our virtual address space, but mostly it is because there is no need. We always map just a single block at a time, and usually we never bother to read the values into managed memory, instead just accessing the data directly from the memory mapped file.&lt;/p&gt; &lt;p&gt;Here is an example of reading the key for an entry from the SST:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Reading-a-Sorted-String-T_BBCC/image_2.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Reading-a-Sorted-String-T_BBCC/image_thumb.png" width="658" height="255"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;As you can see, we need to read just the key itself into memory; the value itself is not even touched. Also, there is some buffer management going on to make sure that we don’t need to re-allocate buffers as we are scanning through the table.&lt;/p&gt; &lt;p&gt;When you want to get a value, you call CreateValueStream, which gives you a Stream that you can work with. 
Here is how you use the API:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Reading-a-Sorted-String-T_BBCC/image_4.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Ravens-Storage-Reading-a-Sorted-String-T_BBCC/image_thumb_1.png" width="733" height="204"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;This is actually part of the internal codebase. We are storing data inside the SST that will later help us optimize things, but that is something I’ll touch on at a later point in time.&lt;/p&gt; &lt;p&gt;Except for the slight worry that I am going to have to change the underlying approach from memory mapped files to streams if I need to run it outside the server/client, this is &lt;em&gt;very&lt;/em&gt; cool.&lt;/p&gt; &lt;p&gt;Next on my list is to think about how to implement the memtable, again, without impacting too much on the managed memory.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/BvUHoj1xJuk" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/BvUHoj1xJuk/ravens-storage-reading-a-sorted-string-table</link><guid isPermaLink="false">http://ayende.com/blog/161763/ravens-storage-reading-a-sorted-string-table?key=9dbbf44f-12a2-470c-9fec-2c96a45f2643</guid><pubDate>Thu, 02 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/161763/ravens-storage-reading-a-sorted-string-table?key=9dbbf44f-12a2-470c-9fec-2c96a45f2643</feedburner:origLink></item><item><title>This is debugging, old school</title><description>&lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/This-is-debugging-old-school_982/image_2.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 
0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/This-is-debugging-old-school_982/image_thumb.png" width="772" height="265"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Dear god in heaven, how much I did not miss that.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/_zVFkeC5Tuc" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/_zVFkeC5Tuc/this-is-debugging-old-school</link><guid isPermaLink="false">http://ayende.com/blog/161762/this-is-debugging-old-school?key=7a5c7af6-1e66-448e-9d77-832bd1859ca3</guid><pubDate>Wed, 01 May 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/161762/this-is-debugging-old-school?key=7a5c7af6-1e66-448e-9d77-832bd1859ca3</feedburner:origLink></item><item><title>Raven.Storage just passed its first “test”</title><description>&lt;p&gt;Take a look at the code below. This actually completed as expected, and was working beautifully. As I probably mentioned, the architecture of this is really nice, and I think I was able to translate this into .NET code in a way that is both idiomatic and useful. 4:30 AM now, and I think that this is bed time for me now. 
But I just couldn’t leave this alone.&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Raven.Storage-just_3FDB/image_2.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Raven.Storage-just_3FDB/image_thumb.png" width="853" height="555"&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/AyendeRahien/~4/hv0sQZ7cEhA" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/AyendeRahien/~3/hv0sQZ7cEhA/raven-storage-just-passed-its-first-test</link><guid isPermaLink="false">http://ayende.com/blog/161761/raven-storage-just-passed-its-first-test?key=4ffaf842-107b-40e9-b7c2-b32c806686f2</guid><pubDate>Tue, 30 Apr 2013 09:00:00 GMT</pubDate><feedburner:origLink>http://ayende.com/blog/161761/raven-storage-just-passed-its-first-test?key=4ffaf842-107b-40e9-b7c2-b32c806686f2</feedburner:origLink></item></channel></rss>
