<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Snail in a Turtleneck</title>
	
	<link>http://www.snailinaturtleneck.com/blog</link>
	<description>Kristina Chodorow's Blog</description>
	<lastBuildDate>Thu, 02 Feb 2012 21:59:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/kchodorow" /><feedburner:info uri="kchodorow" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>The Comments Conundrum</title>
		<link>http://feedproxy.google.com/~r/kchodorow/~3/H-BMt86jrkA/</link>
		<comments>http://www.snailinaturtleneck.com/blog/2012/02/02/the-comments-conundrum/#comments</comments>
		<pubDate>Thu, 02 Feb 2012 21:49:53 +0000</pubDate>
		<dc:creator>Kristina Chodorow</dc:creator>
				<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[aggregation]]></category>
		<category><![CDATA[pipeline]]></category>

		<guid isPermaLink="false">http://www.snailinaturtleneck.com/blog/?p=1730</guid>
		<description><![CDATA[One of the most common questions we get is: I have a collection of blog posts and each post has an array of comments. How do I get&#8230; &#8230;all comments by a given author &#8230;the most recent comments &#8230;the most popular commenters? And so on. The answer to this has always been &#8220;Well, you can&#8217;t&#8230;]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2012/02/th_robot_confused.png" alt="" title="" width="160" height="137" class="alignright size-full wp-image-1814" /></p>
<p>One of the most common questions we get is:</p>
<blockquote><p>
I have a collection of blog posts and each post has an array of comments.  How do I get&#8230;<br />
&#8230;all comments by a given author<br />
&#8230;the most recent comments<br />
&#8230;the most popular commenters?
</p></blockquote>
<p>And so on.  The answer to this has always been &#8220;Well, you can&#8217;t do that on the server side&#8230;&#8221;  You can either do it on the client side or store comments in their own collection. What you really want is the ability to treat embedded documents like a &#8220;real&#8221; collection.</p>
<p>The aggregation pipeline gives you this ability by letting you &#8220;unwind&#8221; arrays into separate documents, then doing whatever else you need to do in subsequent pipeline operators.</p>
<p>For example&#8230;</p>
<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2012/02/seriouscatcover.jpg" alt="" title="" width="297" height="297" class="alignright size-full wp-image-1815" /></p>
<p><b>Getting all comments by Serious Cat</b></p>
<p>Serious Cat&#8217;s comments are scattered between post documents, so there wasn&#8217;t a good way of querying for just those embedded documents.  Now there is.</p>
<p>Let&#8217;s assume we want each comment by Serious Cat, along with the title and url of the post Serious Cat was commenting on.  So, the steps we need to take are:</p>
<ol>
<li>Extract the fields we want (title, url, comments)
<li>Unwind the comments field: make each comment into a &#8220;real&#8221; document
<li>Query our new &#8220;comments collection&#8221; for &#8220;Serious Cat&#8221;
</ol>
<p>Using the aggregation pipeline, this looks like:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> db.<span style="color: #660066;">runCommand</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span>aggregate<span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;posts&quot;</span><span style="color: #339933;">,</span> pipeline<span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
<span style="color: #009900;">&#123;</span>
   <span style="color: #006600; font-style: italic;">// extract the fields </span>
   $project<span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        title <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
        url <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
        comments <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #006600; font-style: italic;">// explode the &quot;comments&quot; array into separate documents</span>
    $unwind<span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$comments&quot;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #006600; font-style: italic;">// query like a boss</span>
    $match<span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span><span style="color: #3366CC;">&quot;comments.author&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Serious Cat&quot;</span><span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span></pre></div></div>

<p>Now, this works well for something like a blog, where you have human-generated (small) data.  If you&#8217;ve got gigs of comments to go through, you probably want to filter out as many documents as possible (e.g., with <code>$match</code> or <code>$limit</code>) before sending them to the &#8220;everything-in-memory&#8221; parts of the pipeline.</p>
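<p>One way to do that filtering, sketched here, is to put a <code>$match</code> in front of the <code>$unwind</code> so that posts with no comments by the author are dropped before their arrays get exploded.  The second <code>$match</code> is still needed, because a matching post can also contain comments from other people.  This only builds the command document; the <code>catCommand</code> name is made up, and how much it helps will depend on your data.</p>

```javascript
// Sketch: filter whole posts first, then unwind, then filter individual comments.
const catCommand = {aggregate: "posts", pipeline: [
    // keep only posts that have at least one comment by Serious Cat
    {$match: {"comments.author": "Serious Cat"}},
    {$project: {title: 1, url: 1, comments: 1}},
    {$unwind: "$comments"},
    // drop the other commenters' comments from those posts
    {$match: {"comments.author": "Serious Cat"}}
]};

// In the mongo shell this would be passed to db.runCommand(catCommand).
const stageNames = catCommand.pipeline.map(stage => Object.keys(stage)[0]);
console.log(stageNames.join(" -> "));
```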
<p><b>Getting the most recent comments</b></p>
<p>Let&#8217;s assume our site lists the 10 most recent comments across all posts, with links back to the posts they appeared on, e.g.,</p>
<blockquote>
<ol>
<li>Great post! -Jerry (February 2nd, 2012) from <a>This is a Great Post</a>
<li>What does batrachophagous mean? -Fred (February 2nd, 2012) from <a>Fun with Crosswords</a>
<li>Where can I get discount Prada shoes? -Tom (February 1st, 2012) from <a>Rant about Spam</a><br />
&#8230;
</ol>
</blockquote>
<p>To extract these comments from a collection of posts, you could do something like:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> db.<span style="color: #660066;">runCommand</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span>aggregate<span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;posts&quot;</span><span style="color: #339933;">,</span> pipeline<span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
<span style="color: #009900;">&#123;</span>
   <span style="color: #006600; font-style: italic;">// extract the fields</span>
   $project<span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        title <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
        url <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
        comments <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #006600; font-style: italic;">// explode &quot;comments&quot; array into separate documents</span>
    $unwind<span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$comments&quot;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #006600; font-style: italic;">// sort newest first</span>
    $sort<span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;comments.date&quot;</span> <span style="color: #339933;">:</span> <span style="color: #339933;">-</span><span style="color: #CC0000;">1</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #006600; font-style: italic;">// get the 10 newest</span>
    $limit<span style="color: #339933;">:</span> <span style="color: #CC0000;">10</span>
<span style="color: #009900;">&#125;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span></pre></div></div>

<p>Let&#8217;s take a moment to look at what <code>$unwind</code> does to a sample document.  </p>
<p>Suppose you have a document that looks like this after the <code>$project</code>:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #009900;">&#123;</span>
    <span style="color: #3366CC;">&quot;url&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;/blog/spam&quot;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;title&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Rant about Spam&quot;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;comments&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
        <span style="color: #009900;">&#123;</span>text <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Where can I get discount Prada shoes?&quot;</span><span style="color: #339933;">,</span> ...<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        <span style="color: #009900;">&#123;</span>text <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;First!&quot;</span><span style="color: #339933;">,</span> ...<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        <span style="color: #009900;">&#123;</span>text <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;I hate spam, too!&quot;</span><span style="color: #339933;">,</span> ...<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        <span style="color: #009900;">&#123;</span>text <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;I love spam.&quot;</span><span style="color: #339933;">,</span> ...<span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#93;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>Then, after unwinding the <code>comments</code> field, you&#8217;d have:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #009900;">&#123;</span>
    <span style="color: #3366CC;">&quot;url&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;/blog/spam&quot;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;title&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Rant about Spam&quot;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;comments&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>text <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Where can I get discount Prada shoes?&quot;</span><span style="color: #339933;">,</span> ...<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #3366CC;">&quot;url&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;/blog/spam&quot;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;title&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Rant about Spam&quot;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;comments&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>text <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;First!&quot;</span><span style="color: #339933;">,</span> ...<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #3366CC;">&quot;url&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;/blog/spam&quot;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;title&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Rant about Spam&quot;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;comments&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>text <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;I hate spam, too!&quot;</span><span style="color: #339933;">,</span> ...<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #3366CC;">&quot;url&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;/blog/spam&quot;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;title&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Rant about Spam&quot;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;comments&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>text <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;I love spam.&quot;</span><span style="color: #339933;">,</span> ...<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>Then we <code>$sort</code>, <code>$limit</code>, and Bob&#8217;s your uncle.</p>
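<p>In plain JavaScript terms, those last two stages amount to a sort followed by a slice.  The sketch below assumes each unwound document has a <code>comments.date</code> field, as the pipeline above does; the sample dates and the limit of 2 (instead of 10) are invented to keep it short.</p>

```javascript
// Hypothetical unwound documents: "comments" holds one comment each.
const unwound = [
    {title: "Rant about Spam", comments: {text: "First!", date: new Date("2012-01-30")}},
    {title: "This is a Great Post", comments: {text: "Great post!", date: new Date("2012-02-02")}},
    {title: "Rant about Spam", comments: {text: "I love spam.", date: new Date("2012-02-01")}}
];

// $sort: {"comments.date": -1} orders newest first; copy so the input is untouched.
const newestFirst = [...unwound].sort((a, b) => b.comments.date - a.comments.date);

// $limit: N keeps the first N documents (2 here instead of 10).
const recent = newestFirst.slice(0, 2);

console.log(recent.map(doc => doc.comments.text + " from " + doc.title));
```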
<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2012/02/1291131680_two-thumbs-up.jpg" alt="" title="" width="248" height="320" class="alignright size-full wp-image-1817" /></p>
<p><b>Rank commenters by popularity</b></p>
<p>Suppose we allow users to upvote comments and we want to see who the most popular commenters are.</p>
<p>The steps we want to take are:</p>
<ol>
<li>Project out the fields we need (similar to above)
<li>Unwind the comments array (similar to above)
<li>Group by author, summing the votes across all of each author&#8217;s comments
<li>Sort authors to find the most popular commenters
</ol>
<p>Using the pipeline, this would look like:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> db.<span style="color: #660066;">runCommand</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span>aggregate<span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;posts&quot;</span><span style="color: #339933;">,</span> pipeline<span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
<span style="color: #009900;">&#123;</span>
   <span style="color: #006600; font-style: italic;">// extract the fields we'll need</span>
   $project<span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        title <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
        url <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
        comments <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #006600; font-style: italic;">// explode &quot;comments&quot; array into separate documents</span>
    $unwind<span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$comments&quot;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #006600; font-style: italic;">// count up votes by author</span>
    $group <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        _id <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$comments.author&quot;</span><span style="color: #339933;">,</span>
        popularity <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$sum <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$comments.votes&quot;</span><span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #006600; font-style: italic;">// sort by the new popularity field</span>
    $sort<span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;popularity&quot;</span> <span style="color: #339933;">:</span> <span style="color: #339933;">-</span><span style="color: #CC0000;">1</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span></pre></div></div>

<p>As I mentioned before, there are a couple of downsides to using the aggregation pipeline: a lot of the work is done in memory, so it can be very CPU- and memory-intensive.  However, used judiciously, it gives you a lot more freedom to mush around your embedded documents.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.snailinaturtleneck.com/blog/2012/02/02/the-comments-conundrum/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.snailinaturtleneck.com/blog/2012/02/02/the-comments-conundrum/</feedburner:origLink></item>
		<item>
		<title>Hacking Chess: Data Munging</title>
		<link>http://feedproxy.google.com/~r/kchodorow/~3/ThA7BKU0qPU/</link>
		<comments>http://www.snailinaturtleneck.com/blog/2012/01/27/hacking-chess-data-munging/#comments</comments>
		<pubDate>Fri, 27 Jan 2012 15:33:25 +0000</pubDate>
		<dc:creator>Kristina Chodorow</dc:creator>
				<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[aggregation]]></category>
		<category><![CDATA[pipeline]]></category>

		<guid isPermaLink="false">http://www.snailinaturtleneck.com/blog/?p=1796</guid>
		<description><![CDATA[This is a supplement to the Hacking Chess with the MongoDB Pipeline. This post has instructions for rolling your own data sets from chess games. Download a collection of chess games you like. I&#8217;m using 1132 wins in less than 10 moves, but any of them should work. These files are in a format called&#8230;]]></description>
			<content:encoded><![CDATA[<p>This is a supplement to the <a href="http://www.snailinaturtleneck.com/blog/2012/01/26/hacking-chess-with-the-mongodb-pipeline/">Hacking Chess with the MongoDB Pipeline</a>.  This post has instructions for rolling your own data sets from chess games.</p>
<p>Download a <a href="http://www.chessopolis.com/chessfiles/pgn_collections.htm">collection of chess games</a> you like.  I&#8217;m using <a href="ftp://ftp.pitt.edu/group/student-activities/chess/PGN/Collections/ten-pg.zip">1132 wins in less than 10 moves</a>, but any of them should work.</p>
<p>These files are in a format called portable game notation (.PGN), which is a human-readable notation for chess games.  For example, the first game in <em>TEN.PGN</em> (helloooo 80s filenames) looks like:</p>
<pre>
[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "Gedult D"]
[Black "Kohn V"]
[Result "1-0"]
[ECO "B33/09"]

1.e4 c5 2.Nf3 Nc6 3.d4 cxd4 4.Nxd4 Nf6
5.Nc3 e5 6.Ndb5 d6 7.Nd5 Nxd5 8.exd5 Ne7
9.c4 a6 10.Qa4  1-0
</pre>
<p>This represents a 10-turn win at an unknown event.  The &#8220;ECO&#8221; field shows which <a href="http://www.chessville.com/misc/misc_codes_ecocodes.htm">opening</a> was used (a Sicilian in the game above).</p>
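<p>The bracketed lines at the top of a game are tag pairs: a name followed by a quoted value.  As a rough illustration of the format (this is not the converter used below), here is a sketch that pulls the tag pairs out of a header with a regular expression.  <code>parseTags</code> is a made-up helper; lower-casing the names is just a choice made here, and escaped quotes inside values are not handled.</p>

```javascript
// Collect [Name "Value"] tag pairs from PGN text into a plain object.
function parseTags(pgnText) {
    const tags = {};
    pgnText.split("\n").forEach(line => {
        // matches lines like [White "Gedult D"]; move text does not match
        const m = /^\[(\w+)\s+"(.*)"\]$/.exec(line.trim());
        if (m) {
            tags[m[1].toLowerCase()] = m[2];
        }
    });
    return tags;
}

const header = [
    '[White "Gedult D"]',
    '[Black "Kohn V"]',
    '[Result "1-0"]',
    '[ECO "B33/09"]',
    '',
    '1.e4 c5 2.Nf3 Nc6'
].join("\n");

const tags = parseTags(header);
console.log(tags);
```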
<p>Unfortunately for us, MongoDB doesn&#8217;t import PGNs in their native format, so we&#8217;ll need to convert them to JSON.  I found a PGN->JSON converter in PHP that did the job <a href="http://www.dhtmlgoodies.com/index.html?whichScript=dhtml-chess">here</a>.  Scroll down to the &#8220;download&#8221; section to get the .zip.  </p>
<p>It&#8217;s one of those zips that vomits its contents into whatever directory you unzip it in, so create a new directory for it.  </p>
<p>So far, we have:</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">$ <span style="color: #c20cb9; font-weight: bold;">mkdir</span> chess
$ <span style="color: #7a0874; font-weight: bold;">cd</span> chess
$
$ <span style="color: #c20cb9; font-weight: bold;">ftp</span> <span style="color: #c20cb9; font-weight: bold;">ftp</span>:<span style="color: #000000; font-weight: bold;">//</span>ftp.pitt.edu<span style="color: #000000; font-weight: bold;">/</span>group<span style="color: #000000; font-weight: bold;">/</span>student-activities<span style="color: #000000; font-weight: bold;">/</span>chess<span style="color: #000000; font-weight: bold;">/</span>PGN<span style="color: #000000; font-weight: bold;">/</span>Collections<span style="color: #000000; font-weight: bold;">/</span>ten-pg.zip .<span style="color: #000000; font-weight: bold;">/</span>
$ <span style="color: #c20cb9; font-weight: bold;">unzip</span> ten-pg.zip
$
$ <span style="color: #c20cb9; font-weight: bold;">wget</span> http:<span style="color: #000000; font-weight: bold;">//</span>www.dhtmlgoodies.com<span style="color: #000000; font-weight: bold;">/</span>scripts<span style="color: #000000; font-weight: bold;">/</span>dhtml-chess<span style="color: #000000; font-weight: bold;">/</span>dhtml-chess.zip
$ <span style="color: #c20cb9; font-weight: bold;">unzip</span> dhtml-chess.zip</pre></div></div>

<p>Now, create a simple script, say <em>parse.php</em>, to run through the chess matches and output them in JSON, one per line:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">&lt;?php</span>
&nbsp;
<span style="color: #b1b100;">require</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;PgnParser.class.php&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$parser</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> PgnParser<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;/path/to/chess/TEN.PGN&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$total</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$parser</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">getNumberOfGames</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span><span style="color: #339933;">=</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">&lt;</span><span style="color: #000088;">$total</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #b1b100;">echo</span> <span style="color: #000088;">$parser</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">getGameDetailsAsJson</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">.</span><span style="color: #0000ff;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">?&gt;</span></pre></div></div>

<p>Run <em>parse.php</em> and dump the results into a file:</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">$ php parse.php <span style="color: #000000; font-weight: bold;">&gt;</span> games.json</pre></div></div>

<p>Now you&#8217;re ready to import <em>games.json</em>.</p>
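<p>Before importing, it can be worth checking that every line of the file really is a standalone JSON document, since one-per-line is the layout <em>parse.php</em> is meant to produce.  A minimal sketch, with a made-up <code>parseJsonLines</code> helper and a two-line sample string (placeholder players, not real data) standing in for the file&#8217;s contents:</p>

```javascript
// Parse newline-delimited JSON, failing loudly on the first bad line.
function parseJsonLines(text) {
    return text.split("\n")
        .filter(line => line.trim() !== "")
        .map((line, i) => {
            try {
                return JSON.parse(line);
            } catch (err) {
                // i counts non-blank lines, starting from zero
                throw new Error("document " + (i + 1) + " is not valid JSON: " + err.message);
            }
        });
}

// Placeholder content; the real input would be the text of games.json.
const sample = '{"white": "Player A", "result": "1-0"}\n{"white": "Player B", "result": "0-1"}\n';
const games = parseJsonLines(sample);
console.log(games.length + " games parsed");
```

<p>Under Node, the real file could be read with <code>require("fs").readFileSync("games.json", "utf8")</code> and passed to the same function.</p>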
<p><a href="http://www.snailinaturtleneck.com/blog/2012/01/26/hacking-chess-with-the-mongodb-pipeline/">Back to the original &#8220;hacking&#8221; post</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.snailinaturtleneck.com/blog/2012/01/27/hacking-chess-data-munging/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		<feedburner:origLink>http://www.snailinaturtleneck.com/blog/2012/01/27/hacking-chess-data-munging/</feedburner:origLink></item>
		<item>
		<title>Hacking Chess with the MongoDB Pipeline</title>
		<link>http://feedproxy.google.com/~r/kchodorow/~3/7UswGJOI5-I/</link>
		<comments>http://www.snailinaturtleneck.com/blog/2012/01/26/hacking-chess-with-the-mongodb-pipeline/#comments</comments>
		<pubDate>Thu, 26 Jan 2012 13:58:37 +0000</pubDate>
		<dc:creator>Kristina Chodorow</dc:creator>
				<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[aggregation]]></category>
		<category><![CDATA[pipeline]]></category>

		<guid isPermaLink="false">http://www.snailinaturtleneck.com/blog/?p=1711</guid>
		<description><![CDATA[MongoDB&#8217;s new aggregation framework is now available in the nightly build! This post demonstrates some of its capabilities by using it to analyze chess games. Make sure you have the &#8220;Development Release (Unstable)&#8221; nightly running before trying out the stuff in this post. The aggregation framework will be in 2.1.0, but as of this&#8230;]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2010/03/spockAndKirk.jpg" alt="" title="spockAndKirk" width="300" height="219" class="alignleft size-full wp-image-374" /></p>
<p>MongoDB&#8217;s new aggregation framework is now available in the nightly build! This post demonstrates some of its capabilities by using it to analyze chess games.</p>
<p><b><em>Make sure you have the &#8220;<a href="http://www.mongodb.org/downloads">Development Release (Unstable)</a>&#8221; nightly</em></b> running before trying out the stuff in this post. The aggregation framework will be in 2.1.0, but as of this writing it&#8217;s <em>only</em> in the nightly build.</p>
<p>First, we need some chess games to analyze.  Download <a href="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/11/games.json">games.json</a>, which contains 1132 games that were won in 10 moves or less (crush their soul and do it quick).</p>
<p>You can use <em>mongoimport</em> to import <em>games.json</em> into MongoDB:</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">$ mongoimport <span style="color: #660033;">--db</span> chess <span style="color: #660033;">--collection</span> fast_win games.json
connected to: 127.0.0.1
imported <span style="color: #000000;">1132</span> objects</pre></div></div>

<p>We can take a look at our chess games in the Mongo shell:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> <span style="color: #003366; font-weight: bold;">use</span> chess
switched to db chess
<span style="color: #339933;">&gt;</span> db.<span style="color: #660066;">fast_win</span>.<span style="color: #660066;">count</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>
<span style="color: #CC0000;">1132</span>
<span style="color: #339933;">&gt;</span> db.<span style="color: #660066;">fast_win</span>.<span style="color: #660066;">findOne</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ed3965bf86479436d6f1cd7&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;event&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;?&quot;</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;site&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;?&quot;</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;date&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;????.??.??&quot;</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;round&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;?&quot;</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;white&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Gedult D&quot;</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;black&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Kohn V&quot;</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;result&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;1-0&quot;</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;eco&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;B33/09&quot;</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;moves&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #3366CC;">&quot;1&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;white&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #3366CC;">&quot;move&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;e4&quot;</span>
			<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;black&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #3366CC;">&quot;move&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;c5&quot;</span>
			<span style="color: #009900;">&#125;</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #3366CC;">&quot;2&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;white&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #3366CC;">&quot;move&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Nf3&quot;</span>
			<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;black&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #3366CC;">&quot;move&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Nc6&quot;</span>
			<span style="color: #009900;">&#125;</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
                ...
		<span style="color: #3366CC;">&quot;10&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;white&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #3366CC;">&quot;move&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Qa4&quot;</span>
			<span style="color: #009900;">&#125;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>Not exactly the greatest schema, but that&#8217;s how the chess format exporter munged it.  Regardless, now we can use aggregation pipelines to analyze these games.</p>
<p><b><a name="pipeline1">Experiment #1: First Mover Advantage</a></b></p>
<p>White has a slight advantage in chess because white moves first (Wikipedia puts it at a <a href="http://en.wikipedia.org/wiki/First-move_advantage_in_chess">52%-56% chance</a> of winning). I&#8217;d hypothesize that, in a short game, going first matters even more.</p>
<p>Let&#8217;s find out.</p>
<p>The &#8220;result&#8221; field in these docs is &#8220;1-0&#8221; if white wins and &#8220;0-1&#8221; if black wins. So, we want to divide our docs into two groups based on the &#8220;result&#8221; field and count how many docs are in each group. Using the aggregation pipeline, this looks like:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> db.<span style="color: #660066;">runCommand</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span>aggregate <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;fast_win&quot;</span><span style="color: #339933;">,</span> pipeline <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
... <span style="color: #009900;">&#123;</span>
...    $group <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
...        _id <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$result&quot;</span><span style="color: #339933;">,</span>      <span style="color: #006600; font-style: italic;">// group by 'result' field</span>
...        <span style="color: #660066;">numGames</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$sum <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #009900;">&#125;</span> <span style="color: #006600; font-style: italic;">// add 1 for every document in the group</span>
...    <span style="color: #009900;">&#125;</span>
... <span style="color: #009900;">&#125;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #3366CC;">&quot;result&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;0-1&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">435</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;1-0&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">697</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;ok&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>That means white won about 62% of these games (697 wins out of 1132 total games). Pretty good (although, of course, this isn&#8217;t a very large sample).</p>
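<p>If you want to see what that <code>$group</code> stage is doing without a database handy, here is a rough plain-JavaScript sketch of the same counting and percentage arithmetic. The helper names <code>countBy</code> and <code>percent</code> and the three-document sample are made up for illustration; they aren&#8217;t part of MongoDB or of <em>games.json</em>.</p>

```javascript
// Plain-JavaScript sketch of what
//   {$group : {_id : "$result", numGames : {$sum : 1}}}
// computes. countBy and percent are hypothetical helpers, not MongoDB APIs.
function countBy(docs, field) {
    const counts = {};
    for (const doc of docs) {
        const key = doc[field];                  // the group key, like _id : "$result"
        counts[key] = (counts[key] || 0) + 1;    // add 1 per document, like {$sum : 1}
    }
    return counts;
}

function percent(part, total) {
    return 100 * part / total;
}

// A tiny made-up sample (not taken from games.json).
const sample = [{result: "1-0"}, {result: "0-1"}, {result: "1-0"}];
console.log(countBy(sample, "result"));

// The counts reported by the aggregation above: 697 white wins, 435 black wins.
console.log(percent(697, 697 + 435).toFixed(1));  // prints 61.6
```

<p>Rounded to a whole number, that is the 62% quoted above.</p>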
<div id="attachment_1717" class="wp-caption alignright" style="width: 260px"><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/11/chsbrd.jpg" alt="" title="Chessboard" width="250" height="254" class="size-full wp-image-1717" /><p class="wp-caption-text">In case you're not familiar with it, a reference chessboard with 1-8, a-h marked.</p></div>
<p><b><a name="pipeline2">Experiment #2: Best Starting Move</a></b></p>
<p>Given a starting move, what percent of the time will that move lead to victory?  This probably depends on whether you&#8217;re playing white or black, so we&#8217;ll just focus on white&#8217;s opening move. </p>
<p>First, we&#8217;ll just determine what starting moves white uses with this series of steps:</p>
<ul>
<li><em>project</em> all of white&#8217;s first moves (the <code>moves.1.white.move</code> field)</li>
<li><em>group</em> all docs with the same starting move together</li>
<li>and count how many documents (games) used that move.</li>
</ul>
<p>These steps look like:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> db.<span style="color: #660066;">runCommand</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span>aggregate<span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;fast_win&quot;</span><span style="color: #339933;">,</span> pipeline<span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
... <span style="color: #006600; font-style: italic;">// '$project' is used to extract all of white's opening moves</span>
... <span style="color: #009900;">&#123;</span>
...     $project <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
...         <span style="color: #006600; font-style: italic;">// extract moves.1.white.move into a new field, firstMove</span>
...         <span style="color: #660066;">firstMove</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$moves.1.white.move&quot;</span>
...     <span style="color: #009900;">&#125;</span>
... <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
... <span style="color: #006600; font-style: italic;">// use '$group' to calculate the number of times each move occurred</span>
... <span style="color: #009900;">&#123;</span>
...     $group <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span> 
...         _id <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$firstMove&quot;</span><span style="color: #339933;">,</span>
...         <span style="color: #660066;">numGames</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$sum <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #009900;">&#125;</span>
...     <span style="color: #009900;">&#125;</span>
... <span style="color: #009900;">&#125;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #3366CC;">&quot;result&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;d3&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">2</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;e4&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">696</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;b4&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">17</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;g3&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">3</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;e3&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">2</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;c4&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">36</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;b3&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">4</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;g4&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">11</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;h4&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Nf3&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">37</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;f3&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;f4&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">25</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Nc3&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">14</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;d4&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">283</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;ok&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span>
<span style="color: #009900;">&#125;</span></pre></div></div>
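<p>As a rough mental model (not how the server actually implements it), the <code>$project</code> and <code>$group</code> stages above behave something like the plain-JavaScript sketch below. The names <code>getPath</code> and <code>countFirstMoves</code> are hypothetical, and the three sample documents are invented to match the shape of the <code>findOne()</code> output shown earlier.</p>

```javascript
// Sketch of resolving a dotted field path such as "moves.1.white.move".
// getPath and countFirstMoves are hypothetical names used only for illustration.
function getPath(doc, path) {
    let value = doc;
    for (const key of path.split(".")) {
        if (value === null || typeof value !== "object") {
            return undefined;   // the path runs past a missing or scalar field
        }
        value = value[key];
    }
    return value;
}

function countFirstMoves(games) {
    const counts = {};
    for (const game of games) {
        // roughly the $project step: pull out white's first move
        const firstMove = getPath(game, "moves.1.white.move");
        // roughly the $group step with numGames : {$sum : 1}
        counts[firstMove] = (counts[firstMove] || 0) + 1;
    }
    return counts;
}

// Made-up documents in the same shape as the games in fast_win.
const games = [
    {moves: {"1": {white: {move: "e4"}, black: {move: "c5"}}}},
    {moves: {"1": {white: {move: "d4"}, black: {move: "d5"}}}},
    {moves: {"1": {white: {move: "e4"}, black: {move: "e5"}}}}
];
console.log(countFirstMoves(games));
```

<p>The dotted path works here because the exporter stored each move under a string key (&#8220;1&#8221;, &#8220;2&#8221;, and so on) rather than in an array.</p>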

<p>Now let&#8217;s compare those numbers with whether white won or lost.</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> db.<span style="color: #660066;">runCommand</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span>aggregate<span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;fast_win&quot;</span><span style="color: #339933;">,</span> pipeline<span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
... <span style="color: #006600; font-style: italic;">// extract the first move</span>
... <span style="color: #009900;">&#123;</span>
...    $project <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
...        <span style="color: #660066;">firstMove</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$moves.1.white.move&quot;</span><span style="color: #339933;">,</span>
...        <span style="color: #006600; font-style: italic;">// create a new field, &quot;win&quot;, which is 1 if white won and 0 if black won</span>
...        <span style="color: #660066;">win</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$cond <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
...            <span style="color: #009900;">&#123;</span>$eq <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span><span style="color: #3366CC;">&quot;$result&quot;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;1-0&quot;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span> <span style="color: #CC0000;">0</span>
...        <span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span>
...    <span style="color: #009900;">&#125;</span>
... <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
... <span style="color: #006600; font-style: italic;">// group by the move and count up how many winning games used it</span>
... <span style="color: #009900;">&#123;</span>
...     $group <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
...         _id <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$firstMove&quot;</span><span style="color: #339933;">,</span>
...         <span style="color: #660066;">numGames</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$sum <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
...         <span style="color: #660066;">numWins</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$sum <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$win&quot;</span><span style="color: #009900;">&#125;</span>
...     <span style="color: #009900;">&#125;</span>
... <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
... <span style="color: #006600; font-style: italic;">// calculate the percent of games won with this starting move</span>
... <span style="color: #009900;">&#123;</span>
...     $project <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
...         _id <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
...         <span style="color: #660066;">numGames</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
...         <span style="color: #660066;">percentWins</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
...             $multiply <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span><span style="color: #CC0000;">100</span><span style="color: #339933;">,</span> <span style="color: #009900;">&#123;</span>
...                 $divide <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span><span style="color: #3366CC;">&quot;$numWins&quot;</span><span style="color: #339933;">,</span><span style="color: #3366CC;">&quot;$numGames&quot;</span><span style="color: #009900;">&#93;</span>
...             <span style="color: #009900;">&#125;</span><span style="color: #009900;">&#93;</span>
...         <span style="color: #009900;">&#125;</span>
...     <span style="color: #009900;">&#125;</span>
... <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
... <span style="color: #006600; font-style: italic;">// discard moves that were used in fewer than 10 games (probably not representative)</span>
... <span style="color: #009900;">&#123;</span>
...     $match <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
...         <span style="color: #660066;">numGames</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$gte <span style="color: #339933;">:</span> <span style="color: #CC0000;">10</span><span style="color: #009900;">&#125;</span>
...     <span style="color: #009900;">&#125;</span>
... <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
... <span style="color: #006600; font-style: italic;">// order from worst to best</span>
... <span style="color: #009900;">&#123;</span>
...     $sort <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
...         <span style="color: #660066;">percentWins</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span>
...     <span style="color: #009900;">&#125;</span>
... <span style="color: #009900;">&#125;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #3366CC;">&quot;result&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;f4&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">25</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;percentWins&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">24</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;b4&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">17</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;percentWins&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">35.294117647058826</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;c4&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">36</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;percentWins&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">50</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;d4&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">283</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;percentWins&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">50.53003533568905</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;g4&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">11</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;percentWins&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">63.63636363636363</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Nf3&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">37</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;percentWins&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">67.56756756756756</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;e4&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">696</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;percentWins&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">68.24712643678161</span>
		<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Nc3&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;numGames&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">14</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;percentWins&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">78.57142857142857</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;ok&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>Pawn to e4 seems like the most dependable winner here. Knight to c3 also seems like a good choice (at a nearly 80% win rate), but it was only used in 14 games.</p>
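<p>To make the last three stages of that pipeline concrete, here is a hedged plain-JavaScript sketch of the percentage calculation, the <code>$match</code> filter, and the <code>$sort</code>. <code>rankOpenings</code> is a hypothetical helper, not a MongoDB function. The win counts for f4, c4, and Nc3 are worked backwards from the output above (for example, 24% of 25 games is 6 wins); the win count for d3 is an assumption, since the pipeline filtered that move out before reporting it.</p>

```javascript
// Sketch of the second $project, the $match, and the $sort on already-grouped rows.
// rankOpenings is a hypothetical helper used only for illustration.
function rankOpenings(rows, minGames) {
    return rows
        .map(row => ({
            _id: row._id,
            numGames: row.numGames,
            // same arithmetic as $multiply : [100, {$divide : ["$numWins", "$numGames"]}]
            percentWins: 100 * (row.numWins / row.numGames)
        }))
        .filter(row => row.numGames >= minGames)          // like $match with $gte
        .sort((a, b) => a.percentWins - b.percentWins);   // like $sort : {percentWins : 1}
}

const grouped = [
    {_id: "Nc3", numGames: 14, numWins: 11},  // 11 / 14 is about 78.57%
    {_id: "f4", numGames: 25, numWins: 6},    // 6 / 25 is 24%
    {_id: "c4", numGames: 36, numWins: 18},   // 18 / 36 is 50%
    {_id: "d3", numGames: 2, numWins: 1}      // numWins here is ASSUMED, not from the data
];

for (const row of rankOpenings(grouped, 10)) {
    console.log(row._id, row.numGames, row.percentWins.toFixed(2));
}
```

<p>With a cutoff of 10 games, d3 drops out and the remaining moves come back ordered from worst to best, the same way the pipeline orders them.</p>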
<p><b><a name="pipeline3">Experiment #3: Best and Worst Moves for Black</a></b></p>
<p>We want a pipeline similar to the one in Experiment 2, but for black. At the end, we want to find the best and worst win percentages.</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> db.<span style="color: #660066;">runCommand</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span>aggregate<span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;fast_win&quot;</span><span style="color: #339933;">,</span> pipeline<span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
... <span style="color: #006600; font-style: italic;">// extract the first move</span>
... <span style="color: #009900;">&#123;</span>
...    $project <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
...        <span style="color: #660066;">firstMove</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$moves.1.black.move&quot;</span><span style="color: #339933;">,</span>
...        <span style="color: #660066;">win</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$cond <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
...            <span style="color: #009900;">&#123;</span>$eq <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span><span style="color: #3366CC;">&quot;$result&quot;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;0-1&quot;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span> <span style="color: #CC0000;">0</span>
...        <span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span>
...    <span style="color: #009900;">&#125;</span>
... <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
... <span style="color: #006600; font-style: italic;">// group by the move and count up how many winning games used it</span>
... <span style="color: #009900;">&#123;</span>
...     $group <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
...         _id <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$firstMove&quot;</span><span style="color: #339933;">,</span>
...         <span style="color: #660066;">numGames</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$sum <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
...         <span style="color: #660066;">numWins</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$sum <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$win&quot;</span><span style="color: #009900;">&#125;</span>
...     <span style="color: #009900;">&#125;</span>
... <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
... <span style="color: #006600; font-style: italic;">// calculate the percent of games won with this starting move</span>
... <span style="color: #009900;">&#123;</span>
...     $project <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
...         _id <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
...         <span style="color: #660066;">numGames</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
...         <span style="color: #660066;">percentWins</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
...             $multiply <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span><span style="color: #CC0000;">100</span><span style="color: #339933;">,</span> <span style="color: #009900;">&#123;</span>
...                 $divide <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span><span style="color: #3366CC;">&quot;$numWins&quot;</span><span style="color: #339933;">,</span><span style="color: #3366CC;">&quot;$numGames&quot;</span><span style="color: #009900;">&#93;</span>
...             <span style="color: #009900;">&#125;</span><span style="color: #009900;">&#93;</span>
...         <span style="color: #009900;">&#125;</span>
...     <span style="color: #009900;">&#125;</span>
... <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
... <span style="color: #006600; font-style: italic;">// discard moves that were used in fewer than 10 games (probably not representative)</span>
... <span style="color: #009900;">&#123;</span>
...     $match <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
...         <span style="color: #660066;">numGames</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$gte <span style="color: #339933;">:</span> <span style="color: #CC0000;">10</span><span style="color: #009900;">&#125;</span>
...     <span style="color: #009900;">&#125;</span>
... <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
... <span style="color: #006600; font-style: italic;">// get the best and worst</span>
... <span style="color: #009900;">&#123;</span>
...     $group <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
...          _id <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
...          <span style="color: #660066;">best</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$max <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$_id&quot;</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
...          <span style="color: #660066;">worst</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$min <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$_id&quot;</span><span style="color: #009900;">&#125;</span>
...     <span style="color: #009900;">&#125;</span>
... <span style="color: #009900;">&#125;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #3366CC;">&quot;result&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
		<span style="color: #009900;">&#123;</span>
			<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;best&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;g6&quot;</span><span style="color: #339933;">,</span>
			<span style="color: #3366CC;">&quot;worst&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Nc6&quot;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;ok&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>&#8220;Nc6&#8221; means &#8220;move the knight to c6.&#8221;  Or, rather, don&#8217;t, because it doesn&#8217;t tend to work out that well.</p>
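<p>One thing to watch: <code>$max</code> and <code>$min</code> applied to <code>"$_id"</code> order the moves by their names, as strings. To rank by <code>percentWins</code> instead, one option is to sort on that field and take both ends. Here is a minimal sketch in plain JavaScript (runnable under Node.js). The rows and their numbers are made up for illustration, and <code>bestAndWorst</code> is a hypothetical helper, not part of MongoDB:</p>

```javascript
// Hypothetical per-move rows, shaped like the documents leaving the $match stage.
// The numbers are invented for illustration; they are not from the real data set.
var rows = [
    {_id: "e5", numGames: 40, percentWins: 55},
    {_id: "g6", numGames: 12, percentWins: 30},
    {_id: "Nc6", numGames: 15, percentWins: 60}
];

// Rank moves by percentWins (not by name) and return both ends of the ranking.
function bestAndWorst(rows) {
    // Copy before sorting so the caller's array is left untouched.
    var sorted = rows.slice().sort(function (a, b) {
        return a.percentWins - b.percentWins;
    });
    return {best: sorted[sorted.length - 1]._id, worst: sorted[0]._id};
}

var summary = bestAndWorst(rows);
console.log("best: " + summary.best + ", worst: " + summary.worst);
```

<p>With these invented rows the ranking comes out the opposite way from a name-based comparison, which is why the field being compared matters.</p>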
<p>I like this new aggregation functionality because it feels simpler than MapReduce.  You can start with a one-operation pipeline and build it up, step-by-step, seeing exactly what a given operation does to your output.  (And no JavaScript required, which is always a plus.)</p>
<p>There&#8217;s lots more documentation on aggregation pipelines in <a href="http://www.mongodb.org/display/DOCS/Aggregation+Framework">the docs</a> and I&#8217;ll be doing a couple more posts on it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.snailinaturtleneck.com/blog/2012/01/26/hacking-chess-with-the-mongodb-pipeline/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		<feedburner:origLink>http://www.snailinaturtleneck.com/blog/2012/01/26/hacking-chess-with-the-mongodb-pipeline/</feedburner:origLink></item>
		<item>
		<title>And now, for something completely different</title>
		<link>http://feedproxy.google.com/~r/kchodorow/~3/-bmKc69CkVo/</link>
		<comments>http://www.snailinaturtleneck.com/blog/2012/01/17/and-now-for-something-completely-different/#comments</comments>
		<pubDate>Wed, 18 Jan 2012 00:38:38 +0000</pubDate>
		<dc:creator>Kristina Chodorow</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[women in tech]]></category>

		<guid isPermaLink="false">http://www.snailinaturtleneck.com/blog/?p=1776</guid>
		<description><![CDATA[Probably only relevant to a limited portion of my audience, but Silicon Valley Ryan Gosling is awesome. I have never seen anything like it and I&#8217;m not sure what the point is, but I know I&#8217;m a fan. Go forth and be sexy and supportive for the female programmers you know.]]></description>
			<content:encoded><![CDATA[<p><a href="http://siliconvalleyryangosling.tumblr.com/post/15941208165/hey-girl-oh-your-whiteboard-is-full-here-use"><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2012/01/svrg.jpg" alt="" title="Hey, girl. Oh, your whiteboard is full? Here, use my chest." width="400" height="404" class="alignleft size-full wp-image-1777" /></a></p>
<p>Probably only relevant to a limited portion of my audience, but <a href="http://siliconvalleyryangosling.tumblr.com">Silicon Valley Ryan Gosling</a> is awesome.  I have never seen anything like it and I&#8217;m not sure what the point is, but I know I&#8217;m a fan.</p>
<p>Go forth and be sexy and supportive for the female programmers you know.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.snailinaturtleneck.com/blog/2012/01/17/and-now-for-something-completely-different/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		<feedburner:origLink>http://www.snailinaturtleneck.com/blog/2012/01/17/and-now-for-something-completely-different/</feedburner:origLink></item>
		<item>
		<title>Replica Set Internals Bootcamp: Part I – Elections</title>
		<link>http://feedproxy.google.com/~r/kchodorow/~3/i5kHe-8vMoM/</link>
		<comments>http://www.snailinaturtleneck.com/blog/2012/01/04/replica-set-internals-bootcamp-part-i-elections/#comments</comments>
		<pubDate>Wed, 04 Jan 2012 22:22:47 +0000</pubDate>
		<dc:creator>Kristina Chodorow</dc:creator>
				<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[bootcamp]]></category>
		<category><![CDATA[elections]]></category>
		<category><![CDATA[internals]]></category>
		<category><![CDATA[replica sets]]></category>

		<guid isPermaLink="false">http://www.snailinaturtleneck.com/blog/?p=1752</guid>
		<description><![CDATA[I&#8217;ve been doing replica set &#8220;bootcamps&#8221; for new hires. They mainly focus on using knowledge of the internals to debug replica set issues and talk fluently about what&#8217;s happening, but it occurred to me that you (blog readers) might be interested in them, too. There are 8 subjects I cover in my bootcamp: Elections Creating&#8230;]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been doing replica set &#8220;bootcamps&#8221; for new hires. They mainly focus on using knowledge of the internals to debug replica set issues and talk fluently about what&#8217;s happening, but it occurred to me that you (blog readers) might be interested in them, too.  </p>
<p>There are 8 subjects I cover in my bootcamp: </p>
<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2012/01/boot-camp-300x203.jpg" alt="" title="boot-camp" width="300" height="203" class="alignright size-medium wp-image-1759" /></p>
<ol>
<li><a href="http://www.snailinaturtleneck.com/blog/?p=1752">Elections</a>
<li>Creating a set
<li>Reconfiguring
<li>Syncing
<li>Initial Sync
<li>Rollback
<li>Authentication
<li>Debugging
</ol>
<p>I&#8217;m going to do one subject per post; we&#8217;ll see how many I can get through.</p>
<p><em>Prerequisites: I&#8217;m assuming you know what replica sets are and you&#8217;ve configured a set, written data to it, read from a secondary, etc.  You understand the terms primary and secondary.</em>  </p>
<p>The most obvious feature of replica sets is their ability to elect a new primary, so the first thing we&#8217;ll cover is this election process.  </p>
<h3>Replica Set Elections</h3>
<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2012/01/heartbeat.gif" alt="" title="heartbeat" width="350" height="247" class="alignright size-full wp-image-1755" /></p>
<p>Let&#8217;s say we have a replica set with 3 members: <em>X</em>, <em>Y</em>, and <em>Z</em>.  Every two seconds, each server sends out a <em>heartbeat request</em> to the other members of the set.  So, if we wait a few seconds, <em>X</em> sends out heartbeats to <em>Y</em> and <em>Z</em>.  They respond with information about their current situation: the state they&#8217;re in (primary/secondary), if they are eligible to become primary, their current clock time, etc.</p>
<p><em>X</em> receives this info and updates its &#8220;map&#8221; of the set: if members have come up or gone down, changed state, and how long the roundtrip took.</p>
<p>At this point, if <em>X</em>&#8217;s map changed, <em>X</em> will check a couple of things: if <em>X</em> is primary and a member went down, it will make sure it can still reach a majority of the set.  If it cannot, it&#8217;ll demote itself to a secondary.</p>
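<p>The majority check can be pictured with a toy model. This is only a sketch in plain JavaScript (runnable under Node.js): the shape of the map and the name <code>canReachMajority</code> are assumptions made for illustration, not MongoDB&#8217;s actual data structures.</p>

```javascript
// Hypothetical model of X's "map": one entry per member of the set, X included.
function canReachMajority(members) {
    var up = members.filter(function (m) { return m.up; }).length;
    // A majority means more than half of ALL members, counting the ones that are down.
    return up * 2 > members.length;
}

var map = [
    {name: "X", up: true},
    {name: "Y", up: true},
    {name: "Z", up: false}
];

// X can still see itself and Y: 2 of 3 members, so it stays primary.
var withOneDown = canReachMajority(map);
console.log(withOneDown);

// If Y goes down too, X sees only itself: 1 of 3, so it would demote itself.
map[1].up = false;
var withTwoDown = canReachMajority(map);
console.log(withTwoDown);
```

<p>The first check prints <code>true</code> and the second prints <code>false</code>.</p>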
<h4>Demotions</h4>
<p>There is one wrinkle with <em>X</em> demoting itself: in MongoDB, writes default to fire-and-forget.  Thus, if people are doing fire-and-forget writes on the primary and it steps down, they might not realize <em>X</em> is no longer primary and keep sending writes to it. The secondary-formerly-known-as-primary will be like, &#8220;I&#8217;m a secondary, I can&#8217;t write that!&#8221;  But because the writes don&#8217;t get a response on the client, the client wouldn&#8217;t know.  </p>
<p>Technically, we could say, &#8220;well, they should use safe writes if they care,&#8221; but that seems dickish.  So, when a primary is demoted, it also closes all connections to clients so that they will get a socket error when they send the next message.  All of the client libraries know to re-check who is primary if they get an error.  Thus, they&#8217;ll be able to find who the new primary is and not accidentally send an endless stream of writes to a secondary.</p>
<h4>Elections</h4>
<p>Anyway, getting back to the heartbeats: if <em>X</em> is a secondary, it&#8217;ll occasionally check if it should elect itself, even if its map hasn&#8217;t changed.  First, it&#8217;ll do a sanity check: does another member think it&#8217;s primary? Does <b><em>X</em></b> think it&#8217;s already primary? Is <em>X</em> ineligible for election? If it fails any of the basic questions, it&#8217;ll continue puttering along as is.</p>
<p><a href="http://digitaldocsinabox.org/images/WomensSuffrage/ballots.html"><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2012/01/suffragist-casting-ballots-22-300x227.jpg" alt="" title="suffragist casting ballots" width="300" height="227" class="alignleft size-medium wp-image-1756" /></a></p>
<p>If it seems as though a new primary is needed, <em>X</em> will proceed to the first step in election: it sends a message to <em>Y</em> and <em>Z</em>, telling them &#8220;I am considering running for primary, can you advise me on this matter?&#8221;  </p>
<p>When <em>Y</em> and <em>Z</em> get this message, they quickly check their world view.  Do they already know of a primary?  Do they have more recent data than <em>X</em>? Does anyone they know of have more recent data than <em>X</em>?  They run through a huge list of sanity checks and, if everything seems satisfactory, they tentatively reply &#8220;go ahead.&#8221;  If they find a reason that <em>X</em> cannot be elected, they&#8217;ll reply &#8220;stop the election!&#8221;</p>
<p>If <em>X</em> receives any &#8220;stop the election!&#8221; messages, it cancels the election and goes back to life as a secondary.</p>
<p>If everyone says &#8220;go ahead,&#8221; <em>X</em> continues with the second (and final) phase of the election process.</p>
<p>For the second phase, <em>X</em> sends out a second message that is basically, &#8220;I am formally announcing my candidacy.&#8221;  At this point, <em>Y</em> and <em>Z</em> make a final check: do all of the conditions that held true before still hold?  If so, they allow <em>X</em> to take their <em>election lock</em> and send back a vote.  The election lock prevents them from voting for another candidate for 30 seconds.  </p>
<p>If one of the checks doesn&#8217;t pass the second time around (fairly unusual, at least in 2.0), they send back a veto.  If anyone vetoes, the election fails. </p>
<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2012/01/dog-boot-camp-2.jpg" alt="" title="dog-boot-camp" width="265" height="286" class="alignright size-full wp-image-1760" /></p>
<p>Suppose that <em>Y</em> votes for <em>X</em> and <em>Z</em> <b>vetoes</b> <em>X</em>.  At that point, <em>Y</em>&#8217;s election lock is taken: it cannot vote in another election for 30 seconds.  That means that, if <em>Z</em> wants to run for primary, it had better be able to get <em>X</em>&#8217;s vote.  That said, it should be able to if <em>Z</em> is a viable candidate: it&#8217;s not like the members hold grudges (except for <em>Y</em>, for 30 seconds).</p>
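<p>One way to picture the election lock is as a timestamp each member keeps. The following is a toy model in plain JavaScript (runnable under Node.js); <code>makeVoter</code> and <code>tryVote</code> are names made up for illustration, and the real server code may track the lock differently.</p>

```javascript
var LOCK_MS = 30 * 1000; // the 30-second election lock described above

// Hypothetical voter: remembers until when its election lock is held.
function makeVoter() {
    var lockedUntil = 0;
    return {
        // Returns true (and takes the lock) if the member is free to vote at nowMs.
        tryVote: function (nowMs) {
            if (nowMs >= lockedUntil) {
                lockedUntil = nowMs + LOCK_MS;
                return true;
            }
            return false;
        }
    };
}

var y = makeVoter();
console.log(y.tryVote(1000));  // true: Y votes for X and its lock is taken
console.log(y.tryVote(20000)); // false: Z asks 19 seconds later, Y is still locked
console.log(y.tryVote(31000)); // true: 30 seconds have passed, Y can vote again
```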
<p>If no one vetoes <b>and</b> the candidate member receives votes from a majority of the set, the candidate becomes primary.</p>
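<p>That final rule can be written down as a small function. This is a sketch in plain JavaScript (runnable under Node.js), not MongoDB&#8217;s implementation: <code>electionSucceeds</code> is a hypothetical name, and the examples assume the candidate&#8217;s own vote is included in the list of replies.</p>

```javascript
// Hypothetical helper: each reply is "vote" or "veto"; members that never
// answered are simply absent from the list.
function electionSucceeds(replies, setSize) {
    var vetoes = replies.filter(function (r) { return r === "veto"; }).length;
    var votes = replies.filter(function (r) { return r === "vote"; }).length;
    if (vetoes !== 0) {
        return false; // a single veto fails the election
    }
    // Otherwise the candidate needs votes from more than half of the whole set.
    return votes * 2 > setSize;
}

// Assumption: the first "vote" in each list is X's own.
console.log(electionSucceeds(["vote", "vote", "veto"], 3)); // false: Z vetoed
console.log(electionSucceeds(["vote", "vote"], 3));         // true: 2 of 3, Z silent
console.log(electionSucceeds(["vote"], 3));                 // false: no majority
```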
<h4>Confused?</h4>
<p>Feel free to ask questions in the comments below.  This is a loving, caring bootcamp (as bootcamps go).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.snailinaturtleneck.com/blog/2012/01/04/replica-set-internals-bootcamp-part-i-elections/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		<feedburner:origLink>http://www.snailinaturtleneck.com/blog/2012/01/04/replica-set-internals-bootcamp-part-i-elections/</feedburner:origLink></item>
		<item>
		<title>Popping Timestamps into ObjectIds</title>
		<link>http://feedproxy.google.com/~r/kchodorow/~3/nRKvpu3cYS4/</link>
		<comments>http://www.snailinaturtleneck.com/blog/2011/12/20/querying-for-timestamps-using-objectids/#comments</comments>
		<pubDate>Tue, 20 Dec 2011 22:02:45 +0000</pubDate>
		<dc:creator>Kristina Chodorow</dc:creator>
				<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[ObjectId]]></category>
		<category><![CDATA[querying]]></category>

		<guid isPermaLink="false">http://www.snailinaturtleneck.com/blog/?p=1742</guid>
		<description><![CDATA[ObjectIds contain a timestamp, which tells you when the document was created. Because the _id field is always indexed, that means you have a &#8220;free&#8221; index on your &#8220;created at&#8221; time (unless you have persnickety requirements for creation times, like resolutions of less than a second, synchronization across app servers, etc.). Actually using this index&#8230;]]></description>
			<content:encoded><![CDATA[<p>ObjectIds contain a timestamp, which tells you when the document was created.  Because the <em>_id</em> field is always indexed, that means you have a &#8220;free&#8221; index on your &#8220;created at&#8221; time (unless you have persnickety requirements for creation times, like resolutions of less than a second, synchronization across app servers, etc.).</p>
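<p>To see that timestamp directly, here is a minimal sketch in plain JavaScript (runnable under Node.js) that reads it out of an ObjectId&#8217;s 24-character hex string. <code>creationDate</code> is a hypothetical helper name, and the sample id is one of the ObjectIds printed later in this post.</p>

```javascript
// Read the creation time out of an ObjectId's 24-character hex string.
function creationDate(hex) {
    // The first 8 hex characters (4 bytes) hold seconds since the Unix epoch.
    var secs = parseInt(hex.substring(0, 8), 16);
    // JavaScript dates are in milliseconds, so scale up.
    return new Date(secs * 1000);
}

var created = creationDate("4ef100de7d435c39c3016405");
console.log(created.getTime() / 1000); // 1324417246
```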
<p>Actually using this index can seem daunting (how do you use an ObjectId to query for a certain date?) so let&#8217;s run through an example.</p>
<p>First, let&#8217;s insert 100 sample docs, 10 docs/second.</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> <span style="color: #000066; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span>i<span style="color: #339933;">=</span><span style="color: #CC0000;">0</span><span style="color: #339933;">;</span> i<span style="color: #339933;">&lt;</span><span style="color: #CC0000;">10</span><span style="color: #339933;">;</span> i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span> 
... <span style="color: #000066;">print</span><span style="color: #009900;">&#40;</span>i<span style="color: #339933;">+</span><span style="color: #3366CC;">&quot;: &quot;</span><span style="color: #339933;">+</span>Date.<span style="color: #660066;">now</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> 
... <span style="color: #000066; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span>j<span style="color: #339933;">=</span><span style="color: #CC0000;">0</span><span style="color: #339933;">;</span> j<span style="color: #339933;">&lt;</span><span style="color: #CC0000;">10</span><span style="color: #339933;">;</span> j<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span> 
...    <span style="color: #660066;">db</span>.<span style="color: #660066;">foo</span>.<span style="color: #660066;">insert</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span>x<span style="color: #339933;">:</span>i<span style="color: #339933;">,</span>y<span style="color: #339933;">:</span>j<span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> 
... <span style="color: #009900;">&#125;</span> 
... <span style="color: #660066;">sleep</span><span style="color: #009900;">&#40;</span><span style="color: #CC0000;">1000</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> 
... <span style="color: #009900;">&#125;</span>
<span style="color: #CC0000;">0</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">1324417241111</span>
<span style="color: #CC0000;">1</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">1324417242112</span>
<span style="color: #CC0000;">2</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">1324417243112</span>
<span style="color: #CC0000;">3</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">1324417244113</span>
<span style="color: #CC0000;">4</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">1324417245114</span>
<span style="color: #CC0000;">5</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">1324417246115</span>
<span style="color: #CC0000;">6</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">1324417247115</span>
<span style="color: #CC0000;">7</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">1324417248116</span>
<span style="color: #CC0000;">8</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">1324417249117</span>
<span style="color: #CC0000;">9</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">1324417250117</span></pre></div></div>

<p>Let&#8217;s find all entries created after 1324417246115 (when <em>i</em>=5).  </p>
<p>The time is currently in milliseconds (that&#8217;s how JavaScript does dates), so we&#8217;ll have to convert it to seconds:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> secs <span style="color: #339933;">=</span> Math.<span style="color: #660066;">floor</span><span style="color: #009900;">&#40;</span><span style="color: #CC0000;">1324417246115</span><span style="color: #339933;">/</span><span style="color: #CC0000;">1000</span><span style="color: #009900;">&#41;</span>
<span style="color: #CC0000;">1324417246</span></pre></div></div>

<p>(Your <code>secs</code> will be different from mine, of course.)  </p>
<p>ObjectIds can be constructed from a 24-character string, each two characters representing a byte (e.g., &#8220;ff&#8221; is 255).  So, we need to convert <code>secs</code> to hexadecimal, which luckily is super-easy in JavaScript:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> hexSecs <span style="color: #339933;">=</span> secs.<span style="color: #660066;">toString</span><span style="color: #009900;">&#40;</span><span style="color: #CC0000;">16</span><span style="color: #009900;">&#41;</span>
4ef100de</pre></div></div>

<p>Now, we create an ObjectId from this:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> id <span style="color: #339933;">=</span> ObjectId<span style="color: #009900;">&#40;</span>hexSecs<span style="color: #339933;">+</span><span style="color: #3366CC;">&quot;0000000000000000&quot;</span><span style="color: #009900;">&#41;</span>
ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100de0000000000000000&quot;</span><span style="color: #009900;">&#41;</span></pre></div></div>

<p>If you get the wrong number of zeros here, you&#8217;ll get an error message that is, er, hard to miss.</p>
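<p>To avoid counting zeros by hand, the steps so far can be wrapped in a helper. This is a sketch in plain JavaScript (runnable under Node.js); <code>objectIdHexFromMillis</code> is a made-up name, and in the shell you would pass its result to <code>ObjectId(...)</code>.</p>

```javascript
// Build the 24-character hex string for the earliest possible ObjectId
// in the second that contains the given millisecond timestamp.
function objectIdHexFromMillis(ms) {
    var hexSecs = Math.floor(ms / 1000).toString(16);
    // Left-pad to 8 hex characters (4 bytes) in case the value is small.
    while (hexSecs.length < 8) {
        hexSecs = "0" + hexSecs;
    }
    // The remaining 8 bytes (16 hex characters) are zeroed out.
    return hexSecs + "0000000000000000";
}

var hexId = objectIdHexFromMillis(1324417246115);
console.log(hexId); // 4ef100de0000000000000000
```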
<p>Now, we query for everything created after this timestamp:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> db.<span style="color: #660066;">foo</span>.<span style="color: #660066;">find</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span>_id <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$gt <span style="color: #339933;">:</span> id<span style="color: #009900;">&#125;</span><span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100de7d435c39c3016405&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">5</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">0</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100de7d435c39c3016406&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">5</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100de7d435c39c3016407&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">5</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">2</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100de7d435c39c3016408&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">5</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">3</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100de7d435c39c3016409&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">5</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">4</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100de7d435c39c301640a&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">5</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">5</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100de7d435c39c301640b&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">5</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">6</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100de7d435c39c301640c&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">5</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">7</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100de7d435c39c301640d&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">5</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">8</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100de7d435c39c301640e&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">5</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">9</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100df7d435c39c301640f&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">6</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">0</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100df7d435c39c3016410&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">6</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100df7d435c39c3016411&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">6</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">2</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100df7d435c39c3016412&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">6</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">3</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100df7d435c39c3016413&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">6</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">4</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100df7d435c39c3016414&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">6</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">5</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100df7d435c39c3016415&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">6</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">6</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100df7d435c39c3016416&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">6</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">7</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100df7d435c39c3016417&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">6</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">8</span> <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#123;</span> <span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100df7d435c39c3016418&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;x&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">6</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;y&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">9</span> <span style="color: #009900;">&#125;</span>
Type <span style="color: #3366CC;">&quot;it&quot;</span> <span style="color: #000066; font-weight: bold;">for</span> more</pre></div></div>

<p>If we look at the <em>explain</em> output for the query, we can see that it&#8217;s using the index:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> db.<span style="color: #660066;">foo</span>.<span style="color: #660066;">find</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span>_id<span style="color: #339933;">:</span><span style="color: #009900;">&#123;</span>$gt<span style="color: #339933;">:</span>id<span style="color: #009900;">&#125;</span><span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span>.<span style="color: #660066;">explain</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
	<span style="color: #3366CC;">&quot;cursor&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;BtreeCursor _id_&quot;</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;nscanned&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">50</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;nscannedObjects&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">50</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;n&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">50</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;millis&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">0</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;nYields&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">0</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;nChunkSkips&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">0</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;isMultiKey&quot;</span> <span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;indexOnly&quot;</span> <span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span><span style="color: #339933;">,</span>
	<span style="color: #3366CC;">&quot;indexBounds&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #3366CC;">&quot;_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
			<span style="color: #009900;">&#91;</span>
				ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;4ef100de0000000000000000&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
				ObjectId<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;ffffffffffffffffffffffff&quot;</span><span style="color: #009900;">&#41;</span>
			<span style="color: #009900;">&#93;</span>
		<span style="color: #009900;">&#93;</span>
	<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>We&#8217;re not quite done, because we&#8217;re actually not returning what we wanted: we&#8217;re getting all docs <em>greater than or equal to</em> the &#8220;created at&#8221; time, not just greater than.  To fix this, we&#8217;d just need to add 1 to the <code>secs</code> before doing anything else.  Or I can claim that we were querying for documents created after <em>i</em>=4 all along.</p>
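<p>As a minimal sketch of that fix, assuming the boundary is built from a plain hex string: the helper names <code>objectIdHexForSecs</code> and <code>objectIdHexAfter</code> below are hypothetical (they are not part of the shell), and the code relies only on the first four bytes of an ObjectId holding the creation time in seconds.</p>

```javascript
// Hypothetical helper: the 24-character hex string of the smallest ObjectId
// whose embedded timestamp is `secs` seconds since the Unix epoch.
function objectIdHexForSecs(secs) {
    // The first 4 bytes (8 hex digits) of an ObjectId are the timestamp.
    var prefix = ("00000000" + secs.toString(16)).slice(-8);
    // Zero out the remaining 8 bytes (16 hex digits).
    return prefix + "0000000000000000";
}

// Hypothetical helper: lower bound for "created strictly after `date`".
// Adding 1 to the seconds moves the bound past every id made in that second.
function objectIdHexAfter(date) {
    var secs = Math.floor(date.getTime() / 1000);
    return objectIdHexForSecs(secs + 1);
}
```

<p>In the shell, the result could be wrapped in <code>ObjectId()</code> and used as the bound, for example <code>db.foo.find({_id : {$gte : ObjectId(objectIdHexAfter(someDate))}})</code>, where <code>someDate</code> stands for whatever cutoff you choose.</p>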
]]></content:encoded>
			<wfw:commentRss>http://www.snailinaturtleneck.com/blog/2011/12/20/querying-for-timestamps-using-objectids/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		<feedburner:origLink>http://www.snailinaturtleneck.com/blog/2011/12/20/querying-for-timestamps-using-objectids/</feedburner:origLink></item>
		<item>
		<title>SQL to MongoDB: An Updated Mapping</title>
		<link>http://feedproxy.google.com/~r/kchodorow/~3/6NrTSqweP7s/</link>
		<comments>http://www.snailinaturtleneck.com/blog/2011/12/09/sql-to-mongodb-an-updated-mapping/#comments</comments>
		<pubDate>Fri, 09 Dec 2011 17:48:54 +0000</pubDate>
		<dc:creator>Kristina Chodorow</dc:creator>
				<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[aggregation]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[pipeline]]></category>

		<guid isPermaLink="false">http://www.snailinaturtleneck.com/blog/?p=1694</guid>
		<description><![CDATA[The aggregation pipeline code has finally been merged into the main development branch and is scheduled for release in 2.2. It lets you combine simple operations (like finding the max or min, projecting out fields, taking counts or averages) into a pipeline of operations, making a lot of things that were only possible by using&#8230;]]></description>
			<content:encoded><![CDATA[<div id="attachment_1738" class="wp-caption aligncenter" style="width: 910px"><a href="http://rickosborne.org/download/SQL-to-MongoDB.pdf"><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/12/SQL-to-MongoDB2.jpg" alt="" title="SQL-to-MongoDB2" width="900" height="695" class="aligncenter size-full wp-image-1738" /></a><p class="wp-caption-text">Rick Osborne&#039;s original chart.</p></div>
<p>The aggregation pipeline code has finally been merged into the main development branch and is scheduled for release in 2.2.  It lets you combine simple operations (like finding the max or min, projecting out fields, taking counts or averages) into a pipeline of operations, making a lot of things that were only possible by using MapReduce doable with a &#8220;normal&#8221; query.</p>
<p>In celebration of this, I thought I&#8217;d re-do the very popular <a href="http://rickosborne.org/download/SQL-to-MongoDB.pdf">MySQL to MongoDB</a> mapping using the aggregation pipeline, instead of MapReduce.</p>
<p>Here is the original SQL:</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">SELECT</span>
  Dim1<span style="color: #66cc66;">,</span> Dim2<span style="color: #66cc66;">,</span>
  <span style="color: #993333; font-weight: bold;">SUM</span><span style="color: #66cc66;">&#40;</span>Measure1<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> MSum<span style="color: #66cc66;">,</span>
  <span style="color: #993333; font-weight: bold;">COUNT</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">*</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> RecordCount<span style="color: #66cc66;">,</span>
  AVG<span style="color: #66cc66;">&#40;</span>Measure2<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> MAvg<span style="color: #66cc66;">,</span>
  <span style="color: #993333; font-weight: bold;">MIN</span><span style="color: #66cc66;">&#40;</span>Measure1<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> MMin<span style="color: #66cc66;">,</span>
  <span style="color: #993333; font-weight: bold;">MAX</span><span style="color: #66cc66;">&#40;</span><span style="color: #993333; font-weight: bold;">CASE</span>
    <span style="color: #993333; font-weight: bold;">WHEN</span> Measure2 <span style="color: #66cc66;">&lt;</span> <span style="color: #cc66cc;">100</span>
    <span style="color: #993333; font-weight: bold;">THEN</span> Measure2
  <span style="color: #993333; font-weight: bold;">END</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> MMax
<span style="color: #993333; font-weight: bold;">FROM</span> DenormAggTable
<span style="color: #993333; font-weight: bold;">WHERE</span> <span style="color: #66cc66;">&#40;</span>Filter1 <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #66cc66;">&#40;</span>'A'<span style="color: #66cc66;">,</span>'B'<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span>
  <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>Filter2 <span style="color: #66cc66;">=</span> 'C'<span style="color: #66cc66;">&#41;</span>
  <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>Filter3 <span style="color: #66cc66;">&gt;</span> <span style="color: #cc66cc;">123</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #993333; font-weight: bold;">GROUP</span> <span style="color: #993333; font-weight: bold;">BY</span> Dim1<span style="color: #66cc66;">,</span> Dim2
<span style="color: #993333; font-weight: bold;">HAVING</span> <span style="color: #66cc66;">&#40;</span>MMin <span style="color: #66cc66;">&gt;</span> <span style="color: #cc66cc;">0</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> RecordCount <span style="color: #993333; font-weight: bold;">DESC</span>
<span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">4</span><span style="color: #66cc66;">,</span> <span style="color: #cc66cc;">8</span></pre></div></div>

<p>We can break up this statement and replace each piece of SQL with the new aggregation pipeline syntax:</p>
<table>
<tr>
<th>MongoDB Pipeline</th>
<th>MySQL</th>
</tr>
<tr>
<td>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;">aggregate<span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;DenormAggTable&quot;</span></pre></div></div>

</td>
<td>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">FROM</span> DenormAggTable</pre></div></div>

</td>
</tr>
<tr>
<td>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #009900;">&#123;</span>
    $match <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        Filter1 <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$in <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span><span style="color: #3366CC;">'A'</span><span style="color: #339933;">,</span><span style="color: #3366CC;">'B'</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        Filter2 <span style="color: #339933;">:</span> <span style="color: #3366CC;">'C'</span><span style="color: #339933;">,</span>
        Filter3 <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$gt <span style="color: #339933;">:</span> <span style="color: #CC0000;">123</span><span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

</td>
<td>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">WHERE</span> <span style="color: #66cc66;">&#40;</span>Filter1 <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #66cc66;">&#40;</span>'A'<span style="color: #66cc66;">,</span>'B'<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span>
  <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>Filter2 <span style="color: #66cc66;">=</span> 'C'<span style="color: #66cc66;">&#41;</span>
  <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>Filter3 <span style="color: #66cc66;">&gt;</span> <span style="color: #cc66cc;">123</span><span style="color: #66cc66;">&#41;</span></pre></div></div>

</td>
</tr>
<tr>
<td>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #009900;">&#123;</span>
    $project <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        Dim1 <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
        Dim2 <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
        Measure1 <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
        Measure2 <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
        lessThanAHundred <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
            $cond<span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span> 
                <span style="color: #009900;">&#123;</span>$lt<span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span><span style="color: #3366CC;">&quot;$Measure2&quot;</span><span style="color: #339933;">,</span> <span style="color: #CC0000;">100</span><span style="color: #009900;">&#93;</span> <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
                <span style="color: #3366CC;">&quot;$Measure2&quot;</span><span style="color: #339933;">,</span> <span style="color: #006600; font-style: italic;">// then</span>
                <span style="color: #CC0000;">0</span><span style="color: #009900;">&#93;</span>           <span style="color: #006600; font-style: italic;">// else</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

</td>
<td>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">CASE</span>
  <span style="color: #993333; font-weight: bold;">WHEN</span> Measure2 <span style="color: #66cc66;">&lt;</span> <span style="color: #cc66cc;">100</span>
  <span style="color: #993333; font-weight: bold;">THEN</span> Measure2
<span style="color: #993333; font-weight: bold;">END</span></pre></div></div>

</td>
</tr>
<tr>
<td>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #009900;">&#123;</span>
    $group <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        _id <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>Dim1 <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span> Dim2 <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        MSum <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$sum <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$Measure1&quot;</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        RecordCount <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$sum <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        MAvg <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$avg <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$Measure2&quot;</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        MMin <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$min <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$Measure1&quot;</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        MMax <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$max <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$lessThanAHundred&quot;</span><span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

</td>
<td>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">SELECT</span>
  Dim1<span style="color: #66cc66;">,</span> Dim2<span style="color: #66cc66;">,</span>
  <span style="color: #993333; font-weight: bold;">SUM</span><span style="color: #66cc66;">&#40;</span>Measure1<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> MSum<span style="color: #66cc66;">,</span>
  <span style="color: #993333; font-weight: bold;">COUNT</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">*</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> RecordCount<span style="color: #66cc66;">,</span>
  AVG<span style="color: #66cc66;">&#40;</span>Measure2<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> MAvg<span style="color: #66cc66;">,</span>
  <span style="color: #993333; font-weight: bold;">MIN</span><span style="color: #66cc66;">&#40;</span>Measure1<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> MMin<span style="color: #66cc66;">,</span>
  <span style="color: #993333; font-weight: bold;">MAX</span><span style="color: #66cc66;">&#40;</span><span style="color: #993333; font-weight: bold;">CASE</span>
    <span style="color: #993333; font-weight: bold;">WHEN</span> Measure2 <span style="color: #66cc66;">&lt;</span> <span style="color: #cc66cc;">100</span>
    <span style="color: #993333; font-weight: bold;">THEN</span> Measure2
  <span style="color: #993333; font-weight: bold;">END</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> MMax
&nbsp;
<span style="color: #993333; font-weight: bold;">GROUP</span> <span style="color: #993333; font-weight: bold;">BY</span> Dim1<span style="color: #66cc66;">,</span> Dim2</pre></div></div>

</td>
</tr>
<tr>
<td>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #009900;">&#123;</span>
    $match <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>MMin <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$gt <span style="color: #339933;">:</span> <span style="color: #CC0000;">0</span><span style="color: #009900;">&#125;</span><span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

</td>
<td>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">HAVING</span> <span style="color: #66cc66;">&#40;</span>MMin <span style="color: #66cc66;">&gt;</span> <span style="color: #cc66cc;">0</span><span style="color: #66cc66;">&#41;</span></pre></div></div>

</td>
</tr>
<tr>
<td>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #009900;">&#123;</span>
    $sort <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>RecordCount <span style="color: #339933;">:</span> <span style="color: #339933;">-</span><span style="color: #CC0000;">1</span><span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

</td>
<td>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> RecordCount <span style="color: #993333; font-weight: bold;">DESC</span></pre></div></div>

</td>
</tr>
<tr>
<td>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #009900;">&#123;</span>
    $skip <span style="color: #339933;">:</span> <span style="color: #CC0000;">4</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#123;</span>
    $limit <span style="color: #339933;">:</span> <span style="color: #CC0000;">8</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

</td>
<td>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">4</span><span style="color: #66cc66;">,</span> <span style="color: #cc66cc;">8</span></pre></div></div>

</td>
</tr>
</table>
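<p>One thing worth checking with the last row: pipeline stages run in the order they are listed, and MySQL&#8217;s <code>LIMIT 4, 8</code> means skip four rows, then return the next eight. The sketch below uses plain JavaScript arrays as a stand-in for the server (the <code>skip</code> and <code>limit</code> functions are illustrative, not MongoDB APIs) to show how the two orderings differ.</p>

```javascript
// Twenty hypothetical result rows, already sorted: 1, 2, ..., 20.
var rows = Array.from({length: 20}, function (unused, i) { return i + 1; });

// Plain-array stand-ins for the $skip and $limit pipeline stages.
function skip(docs, n) { return docs.slice(n); }
function limit(docs, n) { return docs.slice(0, n); }

// Skip first, then limit: rows 5 through 12, which is what LIMIT 4, 8 returns.
var skipThenLimit = limit(skip(rows, 4), 8);

// Limit first, then skip: only rows 5 through 8 are left.
var limitThenSkip = skip(limit(rows, 8), 4);

console.log(skipThenLimit.length, limitThenSkip.length);
```

<p>So to page through results the way the SQL does, the skip stage goes before the limit stage.</p>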
<p>Putting all of these together gives you your pipeline:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #339933;">&gt;</span> db.<span style="color: #660066;">runCommand</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span>aggregate<span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;DenormAggTable&quot;</span><span style="color: #339933;">,</span> pipeline<span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
<span style="color: #009900;">&#123;</span>
    $match <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        Filter1 <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$in <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span><span style="color: #3366CC;">'A'</span><span style="color: #339933;">,</span><span style="color: #3366CC;">'B'</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        Filter2 <span style="color: #339933;">:</span> <span style="color: #3366CC;">'C'</span><span style="color: #339933;">,</span>
        Filter3 <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$gt <span style="color: #339933;">:</span> <span style="color: #CC0000;">123</span><span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#123;</span>
    $project <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        Dim1 <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
        Dim2 <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
        Measure1 <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
        Measure2 <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span>
        lessThanAHundred <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$cond<span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span><span style="color: #009900;">&#123;</span>$lt<span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span><span style="color: #3366CC;">&quot;$Measure2&quot;</span><span style="color: #339933;">,</span> <span style="color: #CC0000;">100</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
            <span style="color: #3366CC;">&quot;$Measure2&quot;</span><span style="color: #339933;">,</span>
            <span style="color: #CC0000;">0</span><span style="color: #009900;">&#93;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#123;</span>
    $group <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        _id <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>Dim1 <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #339933;">,</span> Dim2 <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        MSum <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$sum <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$Measure1&quot;</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        RecordCount <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$sum <span style="color: #339933;">:</span> <span style="color: #CC0000;">1</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        MAvg <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$avg <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$Measure2&quot;</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        MMin <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$min <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$Measure1&quot;</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
        MMax <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$max <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;$lessThanAHundred&quot;</span><span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#123;</span>
    $match <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>MMin <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>$gt <span style="color: #339933;">:</span> <span style="color: #CC0000;">0</span><span style="color: #009900;">&#125;</span><span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#123;</span>
    $sort <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>RecordCount <span style="color: #339933;">:</span> <span style="color: #339933;">-</span><span style="color: #CC0000;">1</span><span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#123;</span>
    $skip <span style="color: #339933;">:</span> <span style="color: #CC0000;">4</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
<span style="color: #009900;">&#123;</span>
    $limit <span style="color: #339933;">:</span> <span style="color: #CC0000;">8</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#93;</span><span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span></pre></div></div>

<p>As you can see, the SQL matches the pipeline operations pretty clearly.  If you want to play with it, it&#8217;ll be available soon in the development nightly build.</p>
<p>If you&#8217;re at MongoSV today (December 9th, 2011), check out Chris Westin&#8217;s talk on the new aggregation framework at 3:45 in room B4.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.snailinaturtleneck.com/blog/2011/12/09/sql-to-mongodb-an-updated-mapping/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		<feedburner:origLink>http://www.snailinaturtleneck.com/blog/2011/12/09/sql-to-mongodb-an-updated-mapping/</feedburner:origLink></item>
		<item>
		<title>On working at 10gen</title>
		<link>http://feedproxy.google.com/~r/kchodorow/~3/yvy50T5n6CQ/</link>
		<comments>http://www.snailinaturtleneck.com/blog/2011/10/18/on-working-at-10gen/#comments</comments>
		<pubDate>Tue, 18 Oct 2011 17:42:11 +0000</pubDate>
		<dc:creator>Kristina Chodorow</dc:creator>
				<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[10gen]]></category>
		<category><![CDATA[hiring]]></category>

		<guid isPermaLink="false">http://www.snailinaturtleneck.com/blog/?p=1655</guid>
		<description><![CDATA[10gen is trying to hire a gazillion people, so I&#8217;m averaging two interviews a day (bleh). A lot of people have asked what it&#8217;s like to work on MongoDB, so I thought I&#8217;d write a bit about it. A Usual Day Get in around 10am. Check if there are any commercial support questions that need&#8230;]]></description>
			<content:encoded><![CDATA[<!-- Start Shareaholic LikeButtonSetTop --><!-- End Shareaholic LikeButtonSetTop --><p>10gen is trying to hire <a href="http://it-jobs.fins.com/Articles/SBB0001424052970203388804576617333830584192/Database-Start-Up-10gen-to-Hire-100">a gazillion people</a>, so I&#8217;m averaging two interviews a day (bleh).  A lot of people have asked what it&#8217;s like to work on MongoDB, so I thought I&#8217;d write a bit about it. </p>
<p><b>A Usual Day</b></p>
<div id="attachment_1656" class="wp-caption alignright" style="width: 310px"><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/10/coffeart-300x225.jpg" alt="" title="coffeart" width="300" height="225" class="size-medium wp-image-1656" /><p class="wp-caption-text">Coffee: the lynchpin of my day.</p></div>
<ul>
<li>Get in around 10am.
<li>Check if there are any commercial support questions that need to be answered <em>right now</em>.
<li>Have a cup of coffee and code until lunch.
<li>Eat lunch.
<li>If nothing dire has happened, go out for coffee+writing.  This refuels my brain and is a creative outlet: that&#8217;s where I am now.  My coffee does not look nearly as awesome as the coffee on the right.
<li>Go back to the office, code all afternoon.
<li>Depending on the day, usually between 5:30 and 6:30 the programmers will naturally start discussing problems we had over the day, interviews, support, the latest geek news, etc.  Often beers are broken out.
<li>Wrap up, go home.
</ul>
<p>There are some variations on this: as I mentioned, a lot of time lately is taken up by interviewing.  Other coworkers spend a lot more time than I do at consults, trainings, speaking at conferences, etc.  </p>
<p><b>Other General Workday Stuff</b></p>
<p>On Fridays, we have lunch as a team.  After lunch, we have a tech talk where someone presents on what they&#8217;re working on (e.g., the inspiration for <a href="http://www.snailinaturtleneck.com/blog/2011/06/08/mongo-in-flatland/">my geospatial post</a>) or general info that&#8217;s good to know (e.g., the inspiration for <a href="http://www.snailinaturtleneck.com/blog/2011/08/30/playing-with-virtual-memory/">my virtual memory post</a>).  This is a nice way to end the week, especially since Fridays often wrap up earlier than other days.</p>
<p>A couple of people use OS X or Windows for development; most people use Linux.  You can use whatever you want.  I&#8217;d like to encourage emacs users, in particular, to apply, as we&#8217;re falling slightly behind vi in numbers.  </p>
<p>We sit in an open office plan, everyone at tables in a big room (including the CEO and CTO, who are both programmers).  The only people in separate rooms are the people who have to be on the phone all day (sales, marketers, basketweavers&#8230; I&#8217;m not really clear on what non-technical people do).</p>
<p>And speaking of what people actually do, here are three examples of my job (that are more specific than &#8220;coding&#8221;):</p>
<p><b>Fixing Other People&#8217;s Bugs</b></p>
<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/10/storage-unit-300x225.jpg" alt="" title="storage-unit" width="300" height="225" class="alignright size-medium wp-image-1666" /></p>
<p>Recently, a developer was using MongoDB and IBM&#8217;s DB2 with PHP.  After he installed the MongoDB driver, PHP started segfaulting all over the place.  I downloaded the ibm_db2 PHP extension to take a look.  </p>
<p>PHP keeps a &#8220;storage unit&#8221; for extensions&#8217; long-term memory use.  Every extension shares the space and can store things there.  </p>
<p>The DB2 extension was basically fire-bombing the storage unit.</p>
<p>It went through the storage, object by object, casting the objects into DB2 types and then freeing them.  This worked fine when DB2 was the only PHP extension being used, but broke down when anyone else tried to use that storage.  I gave the user a small patch that stopped the DB2 extension from destroying objects it didn&#8217;t create, and everything worked fine for them after that.  </p>
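<p>To make the idea behind that patch concrete, here is a minimal sketch in plain C.  It is not the actual ibm_db2 patch and it does not use PHP&#8217;s real data structures: <code>storage_entry</code>, <code>owner_type</code>, and <code>free_own_entries</code> are hypothetical names made up for illustration.  The only point is that a cleanup pass over shared storage should check who created each entry before freeing it.</p>

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared storage: each entry records
 * which extension created it. */
typedef struct {
    int owner_type;   /* id of the extension that stored the entry */
    void *data;       /* heap-allocated payload, or NULL once freed */
} storage_entry;

/* Free only the entries created by `my_type`; return how many were freed. */
static size_t free_own_entries(storage_entry *entries, size_t count, int my_type) {
    size_t freed = 0;
    for (size_t i = 0; i < count; i++) {
        if (entries[i].owner_type != my_type || entries[i].data == NULL) {
            continue;  /* someone else's object, or already freed: leave it alone */
        }
        free(entries[i].data);
        entries[i].data = NULL;
        freed++;
    }
    return freed;
}
```

<p>Calling <code>free_own_entries(entries, count, my_type)</code> with the extension&#8217;s own id leaves every other extension&#8217;s entries untouched, which is the behavior the buggy cleanup was missing.</p>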
<p><b>The Game is Afoot</b></p>
<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/10/sherlock-holmes-287x300.jpg" alt="" title="sherlock-holmes" width="287" height="300" class="alignleft size-medium wp-image-1658" /></p>
<p>A user reported that they couldn&#8217;t initialize their replica set: a member wasn&#8217;t coming online. The trick with this type of bug is to get enough evidence before the user wants to beat you over the head with the 800th log you&#8217;ve requested.</p>
<p>I asked them to send the first round of logs. It was weird: nothing was wrong from <em>server1</em>&#8217;s point of view.  It initialized properly and could connect to everyone in the set.  I puzzled over the messages, figuring out that once <em>server1</em> had created the set, <em>server2</em> had accepted the connection from <em>server1</em> but then somehow failed to connect back to <em>server1</em> and so couldn&#8217;t pick up the set config.  However, according to <em>server1</em>, it could connect fine to <em>server2</em> and thought it was perfectly healthy! </p>
<p>I finally realized what must be happening: &#8220;It looks like <em>server2</em> couldn&#8217;t connect to any of the others, but all of them could connect to it.  Could you check your firewall?&#8221;</p>
<p>&#8220;Oh, that server was blocking all outgoing connections!  Now it&#8217;s working fine.&#8221;  </p>
<p>Elementary, my dear Watson.</p>
<p><b>You know you&#8217;re not at a big company when&#8230;</b></p>
<div id="attachment_1657" class="wp-caption alignleft" style="width: 188px"><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/10/powerpc-178x300.jpg" alt="" title="powerpc" width="178" height="300" class="size-medium wp-image-1657" /><p class="wp-caption-text">At least it had &quot;handles.&quot;</p></div>
<p>Someone on Sparc complained that the Perl driver wasn&#8217;t working at all for them.  My first thought was that Sparc is big-endian, so maybe the Perl driver wasn&#8217;t flipping memory correctly.  I asked <a href="http://twitter.com/#!/eliothorowitz">Eliot</a> where our Power PC was, and he said we must have forgotten it when we moved: it was still in our old office around the corner.  </p>
<p>&#8220;Bring someone to help carry it,&#8221; he told me.  &#8220;It&#8217;s heavy.&#8221;  </p>
<p><em>Pshaw</em>, I thought.  <em>How heavy could an old desktop be?</em></p>
<p>I went around the corner and the other company graciously let me walk into their server room, choose a server, and walk out with it.  Unfortunately, it weighed about 50 pounds, and I have a traditional geek physique (no muscles). The trip back to our office involved me staggering a couple steps, putting it down, shaking out my arms, and repeating.</p>
<p>When I got to our office, I just dragged it down the hallway to our server closet.  Eliot saw me tugboating the thing down the hallway.</p>
<p>&#8220;You didn&#8217;t bring someone to help?&#8221;</p>
<p>&#8220;It&#8217;s <em>*oof*</em> fine!&#8221;</p>
<p>Unfortunately, once it was all set up, the Perl driver worked perfectly on it.  So it wasn&#8217;t big-endian specific.  </p>
<p>I was now pretty sure it was Sparc-specific (another person had reported the same problem on a Sparc), so I bought an elderly Sparc server for a couple hundred bucks off eBay.  When it arrived a couple days later, Eliot showed me how to rack it and I spent a day fighting with the Solaris/Oracle package manager.  However, it was all worth it: I tried running the Perl driver and it instantly failed (success!).  </p>
<p>After some debugging, I realized that Sparc was much more persnickety than Intel about byte alignment. The Perl driver was playing fast and loose with a byte buffer, casting pieces of it into other types (which Sparc didn&#8217;t like).  I changed some casts to <code>memcpy</code>s and the Perl driver started working beautifully.</p>
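<p>For anyone who hasn&#8217;t run into alignment problems before, here is a rough sketch of the kind of change involved.  It is not the actual Perl driver code: <code>read_int32</code> is a hypothetical helper, and it reads the value in host byte order, ignoring any byte swapping a real driver would need on a big-endian machine.</p>

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the Perl driver): read a 32-bit integer
 * that starts at an arbitrary offset in a byte buffer. */
static int32_t read_int32(const unsigned char *buf, size_t offset) {
    int32_t value;
    /* The risky version would be:
     *     value = *(const int32_t *)(buf + offset);
     * If buf + offset is not suitably aligned for an int32_t, that is
     * undefined behavior in C; Sparc traps on it, while x86 usually
     * lets it slide. */
    memcpy(&value, buf + offset, sizeof value);  /* safe at any alignment */
    return value;
}
```

<p>The <code>memcpy</code> copies the bytes one by one into a properly aligned local variable, so it doesn&#8217;t matter where in the buffer the integer starts.</p>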
<p><b>But every day is different</b></p>
<p>The episodes above are a very small sample of what I do: there are hundreds of other things I&#8217;ve worked on over the last few years from speaking to working on the database to writing a freakin Facebook app.</p>
<p>So, if this sounded interesting, please go to our <a href="http://www.10gen.com/careers/positions">jobs website</a> and submit an application!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.snailinaturtleneck.com/blog/2011/10/18/on-working-at-10gen/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		<feedburner:origLink>http://www.snailinaturtleneck.com/blog/2011/10/18/on-working-at-10gen/</feedburner:origLink></item>
		<item>
		<title>Getting Started with MMS</title>
		<link>http://feedproxy.google.com/~r/kchodorow/~3/pWubGWtNgJQ/</link>
		<comments>http://www.snailinaturtleneck.com/blog/2011/09/28/mms-the-mongo-monitoring-service/#comments</comments>
		<pubDate>Wed, 28 Sep 2011 17:39:09 +0000</pubDate>
		<dc:creator>Kristina Chodorow</dc:creator>
				<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[MMS]]></category>
		<category><![CDATA[monitoring]]></category>

		<guid isPermaLink="false">http://www.snailinaturtleneck.com/blog/?p=1631</guid>
		<description><![CDATA[Edit: since this was written, Sam has written some excellent documentation on using MMS. I recommend reading through it as you explore MMS. Telling someone &#8220;You should set up monitoring&#8221; is kind of like telling someone &#8220;You should exercise 20 minutes three times a week.&#8221; Yes, you know you should, but your chair is so&#8230;]]></description>
			<content:encoded><![CDATA[<!-- Start Shareaholic LikeButtonSetTop --><!-- End Shareaholic LikeButtonSetTop --><p><b>Edit: since this was written, <a href="http://www.tychoish.com/">Sam</a> has written some <a href="https://mms.10gen.com/help/">excellent documentation</a> on using MMS.  I recommend reading through it as you explore MMS.</b></p>
<p>Telling someone &#8220;You should set up monitoring&#8221; is kind of like telling someone &#8220;You should exercise 20 minutes three times a week.&#8221; Yes, you know you should, but your chair is so comfortable and you haven&#8217;t keeled over dead yet.</p>
<p>For years<a href="#mms-in-code">*</a>, 10gen has been planning to do monitoring &#8220;right,&#8221; making it painless to monitor your database.  Today, we released the MongoDB Monitoring Service: MMS.</p>
<p><a href="http://blog.10gen.com/post/10764757533/announcing-mongodb-monitoring-service-mms">MMS</a> is free hosted monitoring for MongoDB.  I&#8217;ve been using it to help out paying customers for a while, so I thought I&#8217;d do a quick post on useful stuff I&#8217;ve discovered (documentation is&#8230; uh&#8230; a little light, so far).  </p>
<p>So, first: you <a href="https://mms.10gen.com/user/register">sign up</a>.  </p>
<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/09/Screen-shot-2011-09-28-at-12.48.34-PM-300x95.png" alt="" title="MMS sign up" width="300" height="95" class="aligncenter size-medium wp-image-1633" /></p>
<p>There are two options: register a company or register another account for an existing company.  For example, let&#8217;s say I wanted to monitor the servers for Snail in a Turtleneck Enterprises.  I&#8217;ll create a new account and company group.  Then Andrew, sys admin of my heart, can create an account with Snail in a Turtleneck Enterprises and have access to all the same monitoring info.  </p>
<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/09/Screen-shot-2011-09-28-at-12.49.34-PM-300x182.png" alt="" title="MMS registration" width="300" height="182" class="aligncenter size-medium wp-image-1634" /></p>
<p>Once you&#8217;re registered, you&#8217;ll see a page encouraging you to download the MMS agent. Click on the &#8220;download the agent&#8221; link.</p>
<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/09/Screen-shot-2011-09-28-at-12.57.35-PM-300x90.png" alt="" title="Download agent" width="300" height="90" class="aligncenter size-medium wp-image-1635" /></p>
<p>This is a little Python program that collects stats from MongoDB, so you need to have pymongo installed, too.  Starting from scratch on Ubuntu, do:</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">$ <span style="color: #666666; font-style: italic;"># prereqs</span>
$ <span style="color: #c20cb9; font-weight: bold;">sudo</span> <span style="color: #c20cb9; font-weight: bold;">apt-get</span> <span style="color: #c20cb9; font-weight: bold;">install</span> python python-setuptools
$ <span style="color: #c20cb9; font-weight: bold;">sudo</span> easy_install pymongo
$
$ <span style="color: #666666; font-style: italic;"># set up agent</span>
$ <span style="color: #c20cb9; font-weight: bold;">unzip</span> name-of-agent.zip
$ <span style="color: #7a0874; font-weight: bold;">cd</span> name-of-agent
$ <span style="color: #c20cb9; font-weight: bold;">mkdir</span> logs
$
$ <span style="color: #666666; font-style: italic;"># start agent</span>
$ <span style="color: #c20cb9; font-weight: bold;">nohup</span> python agent.py <span style="color: #000000; font-weight: bold;">&gt;</span> logs<span style="color: #000000; font-weight: bold;">/</span>agent.log <span style="color: #000000;">2</span><span style="color: #000000; font-weight: bold;">&gt;&amp;</span><span style="color: #000000;">1</span> <span style="color: #000000; font-weight: bold;">&amp;</span></pre></div></div>

<p>Last step! Back to the website: see that &#8220;+&#8221; button next to the &#8220;Hosts&#8221; title?  </p>
<div id="attachment_1645" class="wp-caption aligncenter" style="width: 136px"><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/09/Screen-shot-2011-09-28-at-1.02.07-PM.png" alt="" title="Add host" width="126" height="54" class="size-full wp-image-1645" /><p class="wp-caption-text">Designed by programmers, for Vulcans</p></div>
<p>Click on that and type a hostname.  If you have a sharded cluster, add a mongos.  If you have a replica set, add any member.  </p>
<p>Now go have a nice cup of coffee.  This is an important part of the process.</p>
<p>When you get back, tada, you&#8217;ll have buttloads of graphs.  They probably won&#8217;t have much on them, since MMS will have been monitoring them for all of a few minutes. </p>
<p><b>Cool stuff to poke</b></p>
<p>This is the top bar of buttons:</p>
<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/09/Screen-shot-2011-09-28-at-1.06.30-PM-300x30.png" alt="" title="Top bar" width="300" height="30" class="aligncenter size-medium wp-image-1646" /></p>
<p>Of immediate interest: click &#8220;Hosts&#8221; to see a list of hosts.  </p>
<p>You&#8217;ll see hostname, role, and the last time the MMS agent was able to reach this host.  Hosts that it hasn&#8217;t reached recently will have a red ping time.</p>
<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/09/Screen-shot-2011-09-28-at-1.09.15-PM1.png" alt="" title="Agent ping" width="175" height="87" class="aligncenter size-full wp-image-1640" /></p>
<p>Now click on a server&#8217;s name to see all of the info about it.  Let&#8217;s look at a single graph.</p>
<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/09/Screen-shot-2011-09-28-at-1.16.07-PM-300x178.png" alt="" title="Connection chart" width="300" height="178" class="aligncenter size-medium wp-image-1643" /></p>
<p>You can click &#038; drag to see a smaller bit of time on the graph.  See those icons in the top right? Those give you:</p>
<dl>
<dt>+</dt>
<dd>Add to dashboard: you can create a custom dashboard with any charts you&#8217;re interested in.  Click on the &#8220;Dashboard&#8221; link next to &#8220;Hosts&#8221; to see your dashboard.</dd>
<dt>Link</dt>
<dd>Link to a <em>private</em> URL for this chart.  You&#8217;ll have to be logged in to see it.</dd>
<dt>Email</dt>
<dd>Email a jpg of this chart to someone.</dd>
<dt>i</dt>
<dd>This is maybe the most important one: a description of what this chart represents.</dd>
</dl>
<p>That&#8217;s the basics.  Some other points of interest:</p>
<ul>
<li>You can set up alerts by clicking on &#8220;Alerts&#8221; in the top bar.
<li>&#8220;Events&#8221; shows you when hosts went down or came up, became primary or secondary, or were upgraded.
<li>Arbiters don&#8217;t have their own chart, since they don&#8217;t have data.  However, there is an &#8220;Arbiters&#8221; tab that lists them if you have some.
<li>The &#8220;Last Ping&#8221; tab contains all of the info sent by MMS on the last ping, which I find interesting.
<li>If you are confused, there is an &#8220;FAQ&#8221; link in the top bar that answers some common questions.
</ul>
<p>If you have any problems with MMS, there&#8217;s a little form at the bottom to let you complain:</p>
<p><img src="http://www.snailinaturtleneck.com/blog/wp-content/uploads/2011/09/Screen-shot-2011-09-28-at-1.12.48-PM-300x69.png" alt="" title="Complain" width="300" height="69" class="aligncenter size-medium wp-image-1642" /></p>
<p>This will file a <a href="http://jira.mongodb.org/browse/MMS">bug report</a> for you.  This is a &#8220;private&#8221; bug tracker: only 10gen and people in your group will be able to see the bugs you file.</p>
<p><a name="mms-in-code">*</a> If you ran <code>mongod --help</code> using MongoDB version 1.0.0 or higher, you might have noticed some options that started with <a href="https://github.com/mongodb/mongo/blob/v1.0/db/db.cpp#L494"><code>--mms</code></a>.  In other words, we&#8217;ve been planning this for a little while.  </p>
]]></content:encoded>
			<wfw:commentRss>http://www.snailinaturtleneck.com/blog/2011/09/28/mms-the-mongo-monitoring-service/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.snailinaturtleneck.com/blog/2011/09/28/mms-the-mongo-monitoring-service/</feedburner:origLink></item>
		<item>
		<title>More PHP Internals: References</title>
		<link>http://feedproxy.google.com/~r/kchodorow/~3/s2NAEKTO4fA/</link>
		<comments>http://www.snailinaturtleneck.com/blog/2011/09/07/more-php-internals-references/#comments</comments>
		<pubDate>Thu, 08 Sep 2011 03:49:18 +0000</pubDate>
		<dc:creator>Kristina Chodorow</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[references]]></category>

		<guid isPermaLink="false">http://www.snailinaturtleneck.com/blog/?p=1615</guid>
		<description><![CDATA[By request, a quick post on using PHP references in extensions. To start, here&#8217;s an example of references in PHP we&#8217;ll be translating into C: &#60;?php &#160; // just for displaying output function display&#40;$x&#41; &#123; echo &#34;x is $x\n&#34;; &#125; &#160; // pass in an argument by making a copy of it function not_by_ref&#40;$arg&#41; &#123;&#8230;]]></description>
			<content:encoded><![CDATA[<!-- Start Shareaholic LikeButtonSetTop --><!-- End Shareaholic LikeButtonSetTop --><p>By request, a quick post on using PHP references in extensions.</p>
<p>To start, here&#8217;s an example of references in PHP we&#8217;ll be translating into C:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">&lt;?php</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// just for displaying output</span>
<span style="color: #000000; font-weight: bold;">function</span> display<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #b1b100;">echo</span> <span style="color: #0000ff;">&quot;x is <span style="color: #006699; font-weight: bold;">$x</span><span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// pass in an argument by making a copy of it</span>
<span style="color: #000000; font-weight: bold;">function</span> not_by_ref<span style="color: #009900;">&#40;</span><span style="color: #000088;">$arg</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #b1b100;">echo</span> <span style="color: #0000ff;">&quot;called not_by_ref(<span style="color: #006699; font-weight: bold;">$arg</span>)<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
    <span style="color: #000088;">$arg</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// pass in an argument by reference</span>
<span style="color: #000000; font-weight: bold;">function</span> by_ref<span style="color: #009900;">&#40;</span><span style="color: #339933;">&amp;</span><span style="color: #000088;">$arg</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #b1b100;">echo</span> <span style="color: #0000ff;">&quot;called by_ref(<span style="color: #006699; font-weight: bold;">$arg</span>)<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
    <span style="color: #000088;">$arg</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">3</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000088;">$x</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
display<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
not_by_ref<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
display<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #666666; font-style: italic;">// when x is passed by reference, the function can change the value</span>
by_ref<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
display<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">?&gt;</span></pre></div></div>

<p>This will print:</p>
<pre>
x is 1
called not_by_ref(1)
x is 1
called by_ref(1)
x is 3
</pre>
<p>If you want your C extension&#8217;s function to officially have a signature with ampersands in it, you have to declare to PHP that you want to pass in refs as arguments.  Remember how we declared functions in this struct?</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">zend_function_entry rlyeh_functions<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#123;</span>
  PHP_FE<span style="color: #009900;">&#40;</span>cthulhu<span style="color: #339933;">,</span> NULL<span style="color: #009900;">&#41;</span>
  <span style="color: #009900;">&#123;</span> NULL<span style="color: #339933;">,</span> NULL<span style="color: #339933;">,</span> NULL <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span></pre></div></div>

<p>The second argument to <code>PHP_FE</code>, <code>NULL</code>, can optionally be the argument spec.  For example, let&#8217;s say we&#8217;re implementing <code>by_ref()</code> in C.  We would add this to <em>php_rlyeh.c</em>:</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">// the 1 indicates pass-by-reference</span>
ZEND_BEGIN_ARG_INFO<span style="color: #009900;">&#40;</span>arginfo_by_ref<span style="color: #339933;">,</span> <span style="color: #0000dd;">1</span><span style="color: #009900;">&#41;</span>
ZEND_END_ARG_INFO<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
zend_function_entry rlyeh_functions<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#123;</span>
  PHP_FE<span style="color: #009900;">&#40;</span>cthulhu<span style="color: #339933;">,</span> NULL<span style="color: #009900;">&#41;</span>
  PHP_FE<span style="color: #009900;">&#40;</span>by_ref<span style="color: #339933;">,</span> arginfo_by_ref<span style="color: #009900;">&#41;</span>
  <span style="color: #009900;">&#123;</span> NULL<span style="color: #339933;">,</span> NULL<span style="color: #339933;">,</span> NULL <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span>
&nbsp;
PHP_FUNCTION<span style="color: #009900;">&#40;</span>by_ref<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
  zval <span style="color: #339933;">*</span>zptr <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span>zend_parse_parameters<span style="color: #009900;">&#40;</span>ZEND_NUM_ARGS<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> TSRMLS_CC<span style="color: #339933;">,</span> <span style="color: #ff0000;">&quot;z&quot;</span><span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span>zptr<span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> FAILURE<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #b1b100;">return</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  php_printf<span style="color: #009900;">&#40;</span><span style="color: #ff0000;">&quot;called (the c version of) by_ref(%d)<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">,</span> <span style="color: #009900;">&#40;</span><span style="color: #993333;">int</span><span style="color: #009900;">&#41;</span>Z_LVAL_P<span style="color: #009900;">&#40;</span>zptr<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  ZVAL_LONG<span style="color: #009900;">&#40;</span>zptr<span style="color: #339933;">,</span> <span style="color: #0000dd;">3</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>Suppose we also add <code>not_by_ref()</code>.  This might look something like:</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">ZEND_BEGIN_ARG_INFO<span style="color: #009900;">&#40;</span>arginfo_not_by_ref<span style="color: #339933;">,</span> <span style="color: #0000dd;">0</span><span style="color: #009900;">&#41;</span>
ZEND_END_ARG_INFO<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
zend_function_entry rlyeh_functions<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#123;</span>
  PHP_FE<span style="color: #009900;">&#40;</span>cthulhu<span style="color: #339933;">,</span> NULL<span style="color: #009900;">&#41;</span>
  PHP_FE<span style="color: #009900;">&#40;</span>by_ref<span style="color: #339933;">,</span> arginfo_by_ref<span style="color: #009900;">&#41;</span>
  PHP_FE<span style="color: #009900;">&#40;</span>not_by_ref<span style="color: #339933;">,</span> arginfo_not_by_ref<span style="color: #009900;">&#41;</span>
  <span style="color: #009900;">&#123;</span> NULL<span style="color: #339933;">,</span> NULL<span style="color: #339933;">,</span> NULL <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span>
&nbsp;
PHP_FUNCTION<span style="color: #009900;">&#40;</span>not_by_ref<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
  zval <span style="color: #339933;">*</span>zptr <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span>zend_parse_parameters<span style="color: #009900;">&#40;</span>ZEND_NUM_ARGS<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> TSRMLS_CC<span style="color: #339933;">,</span> <span style="color: #ff0000;">&quot;z&quot;</span><span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span>zptr<span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> FAILURE<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #b1b100;">return</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  php_printf<span style="color: #009900;">&#40;</span><span style="color: #ff0000;">&quot;called (the c version of) not_by_ref(%d)<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">,</span> <span style="color: #009900;">&#40;</span><span style="color: #993333;">int</span><span style="color: #009900;">&#41;</span>Z_LVAL_P<span style="color: #009900;">&#40;</span>zptr<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  ZVAL_LONG<span style="color: #009900;">&#40;</span>zptr<span style="color: #339933;">,</span> <span style="color: #0000dd;">2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>However, if we try running this, we&#8217;ll get:</p>
<pre>
x is 1
called (the c version of) not_by_ref(1)
x is 2
called (the c version of) by_ref(2)
x is 3
</pre>
<p>What happened?  <code>not_by_ref</code> used our variable like a reference!</p>
<p>This is really weird and annoying behavior (if anyone knows why PHP does this, please comment below).</p>
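<p>Here is a rough guess at what is going on, sketched as a toy model in plain C rather than real extension code.  Everything named <code>toy_*</code> is made up for illustration and is not part of the PHP API.  The assumption is that the caller and the function are handed the same value struct, so a write through the pointer is visible to the caller:</p>

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-in for a zval: NOT the real Zend struct, just enough to show sharing. */
typedef struct {
    long value;     /* the integer payload */
    int  refcount;  /* how many owners point at this struct */
} toy_zval;

/* Hypothetical "function call": the callee gets the caller's pointer, not a copy. */
static toy_zval *toy_pass_arg(toy_zval *caller_var) {
    assert(caller_var != NULL);
    caller_var->refcount++;   /* caller + callee now share one struct */
    return caller_var;
}

/* Writes through the shared pointer, in the spirit of ZVAL_LONG(zptr, 2). */
static void toy_write_in_place(toy_zval *arg, long new_value) {
    assert(arg != NULL);
    arg->value = new_value;
}

/* Runs the scenario and returns the value the caller sees afterwards. */
static long toy_demo_shared_write(void) {
    toy_zval x = { 1, 1 };
    toy_zval *arg = toy_pass_arg(&x);

    toy_write_in_place(arg, 2);
    arg->refcount--;          /* the callee is done with its reference */

    printf("x is %ld\n", x.value);
    return x.value;
}
```

<p>In this model, calling <code>toy_demo_shared_write()</code> prints &#8220;x is 2&#8221;: nothing made a private copy before the write, so the caller&#8217;s value changed.</p>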
<p>To work around it, if you want non-reference behavior, you have to manually make a copy of the argument.</p>
<p>Our <code>not_by_ref()</code> function becomes:</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">PHP_FUNCTION<span style="color: #009900;">&#40;</span>not_by_ref<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
  zval <span style="color: #339933;">*</span>zptr <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">,</span> <span style="color: #339933;">*</span>copy <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span>zend_parse_parameters<span style="color: #009900;">&#40;</span>ZEND_NUM_ARGS<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> TSRMLS_CC<span style="color: #339933;">,</span> <span style="color: #ff0000;">&quot;z&quot;</span><span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span>zptr<span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> FAILURE<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #b1b100;">return</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #666666; font-style: italic;">// make a copy</span>
  MAKE_STD_ZVAL<span style="color: #009900;">&#40;</span>copy<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  memcpy<span style="color: #009900;">&#40;</span>copy<span style="color: #339933;">,</span> zptr<span style="color: #339933;">,</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span>zval<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #666666; font-style: italic;">// set refcount to 1, as we're only using &quot;copy&quot; in this function</span>
  Z_SET_REFCOUNT_P<span style="color: #009900;">&#40;</span>copy<span style="color: #339933;">,</span> <span style="color: #0000dd;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  php_printf<span style="color: #009900;">&#40;</span><span style="color: #ff0000;">&quot;called (the c version of) not_by_ref(%d)<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">,</span> <span style="color: #009900;">&#40;</span><span style="color: #993333;">int</span><span style="color: #009900;">&#41;</span>Z_LVAL_P<span style="color: #009900;">&#40;</span>copy<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  ZVAL_LONG<span style="color: #009900;">&#40;</span>copy<span style="color: #339933;">,</span> <span style="color: #0000dd;">2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  zval_ptr_dtor<span style="color: #009900;">&#40;</span><span style="color: #339933;">&amp;</span>copy<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>Note that we set the refcount of <code>copy</code> to 1.  This is because the refcount for <code>zptr</code> is 2: 1 ref from the calling function + 1 ref from the <code>not_by_ref</code> function.  However, we don&#8217;t want the copy of <code>zptr</code> to have a refcount of 2, because it&#8217;s only being used by the current function.</p>
<p>Also note that <code>memcpy</code>-ing the zval only works because this is an integer, whose value lives entirely inside the zval: if this were a string, array, or object, we&#8217;d have to use PHP API functions (such as <code>zval_copy_ctor()</code>) to make a deep copy of the original.</p>
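<p>The copy-and-reset step can be pictured with a small toy model in plain C.  This is only a sketch: <code>toy_zval</code>, <code>toy_copy()</code> and <code>toy_demo_copy()</code> are hypothetical names, not PHP API, and the struct holds nothing but an integer and a refcount.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a zval holding an integer: NOT the real Zend struct. */
typedef struct {
    long value;
    int  refcount;
} toy_zval;

/* Shallow copy with a fresh refcount, in the spirit of
 * MAKE_STD_ZVAL + memcpy + Z_SET_REFCOUNT_P(copy, 1). */
static toy_zval *toy_copy(const toy_zval *src) {
    toy_zval *copy;

    assert(src != NULL);
    copy = malloc(sizeof *copy);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, src, sizeof *copy);  /* copies the value AND the old refcount */
    copy->refcount = 1;               /* only the current function owns the copy */
    return copy;
}

/* Writes to the copy and returns the value the caller still sees. */
static long toy_demo_copy(void) {
    toy_zval x = { 1, 2 };            /* refcount 2: caller + callee */
    toy_zval *copy = toy_copy(&x);

    if (copy == NULL) {
        return -1;
    }
    copy->value = 2;
    printf("x is %ld, copy is %ld\n", x.value, copy->value);
    free(copy);
    return x.value;
}
```

<p>Because the write lands on the copy, the caller&#8217;s value stays at 1.  The shallow <code>memcpy</code> is safe here only because the toy struct owns no pointers.</p>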
<p>If we run our PHP program again, it gives us:</p>
<pre>
x is 1
called (the c version of) not_by_ref(1)
x is 1
called (the c version of) by_ref(1)
x is 3
</pre>
<p>Okay, this is pretty good&#8230; but we&#8217;re actually missing a case.  What happens if we pass in a reference to <code>not_by_ref()</code>?  In PHP, this looks like:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">function</span> not_by_ref<span style="color: #009900;">&#40;</span><span style="color: #000088;">$arg</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
   <span style="color: #000088;">$arg</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">2</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000088;">$x</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
not_by_ref<span style="color: #009900;">&#40;</span><span style="color: #339933;">&amp;</span><span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
display<span style="color: #009900;">&#40;</span><span style="color: #000088;">$x</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>&#8230;which displays &#8220;x is 2&#8221;.  Unfortunately, we&#8217;ve overridden this behavior in our <code>not_by_ref()</code> C function, so we have to add a special case: if the argument is a reference, change its value; otherwise, make a copy and change the copy&#8217;s value.</p>

<div class="wp_syntax"><div class="code"><pre class="c" style="font-family:monospace;">PHP_FUNCTION<span style="color: #009900;">&#40;</span>not_by_ref<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
  zval <span style="color: #339933;">*</span>zptr <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">,</span> <span style="color: #339933;">*</span>copy <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span>zend_parse_parameters<span style="color: #009900;">&#40;</span>ZEND_NUM_ARGS<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> TSRMLS_CC<span style="color: #339933;">,</span> <span style="color: #ff0000;">&quot;z&quot;</span><span style="color: #339933;">,</span> <span style="color: #339933;">&amp;</span>zptr<span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> FAILURE<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #b1b100;">return</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #666666; font-style: italic;">// NEW CODE</span>
  <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span>Z_ISREF_P<span style="color: #009900;">&#40;</span>zptr<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #666666; font-style: italic;">// if this is a reference, make copy point to zptr</span>
    copy <span style="color: #339933;">=</span> zptr<span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #666666; font-style: italic;">// adding a reference so we can indiscriminately delete copy later</span>
    zval_add_ref<span style="color: #009900;">&#40;</span><span style="color: #339933;">&amp;</span>zptr<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
  <span style="color: #666666; font-style: italic;">// OLD CODE</span>
  <span style="color: #b1b100;">else</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #666666; font-style: italic;">// make a copy</span>
    MAKE_STD_ZVAL<span style="color: #009900;">&#40;</span>copy<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    memcpy<span style="color: #009900;">&#40;</span>copy<span style="color: #339933;">,</span> zptr<span style="color: #339933;">,</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span>zval<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #666666; font-style: italic;">// set refcount to 1, as we're only using &quot;copy&quot; in this function</span>
    Z_SET_REFCOUNT_P<span style="color: #009900;">&#40;</span>copy<span style="color: #339933;">,</span> <span style="color: #0000dd;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  php_printf<span style="color: #009900;">&#40;</span><span style="color: #ff0000;">&quot;called (the c version of) not_by_ref(%d)<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">,</span> <span style="color: #009900;">&#40;</span><span style="color: #993333;">int</span><span style="color: #009900;">&#41;</span>Z_LVAL_P<span style="color: #009900;">&#40;</span>copy<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  ZVAL_LONG<span style="color: #009900;">&#40;</span>copy<span style="color: #339933;">,</span> <span style="color: #0000dd;">2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  zval_ptr_dtor<span style="color: #009900;">&#40;</span><span style="color: #339933;">&amp;</span>copy<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>Now it&#8217;ll behave &#8220;properly.&#8221;</p>
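<p>To see why the extra reference keeps the cleanup balanced, here is the same branch as a toy model in plain C.  It is a sketch under assumptions: <code>toy_zval</code>, <code>toy_new()</code>, <code>toy_release()</code> and <code>toy_not_by_ref()</code> are invented for illustration and only imitate the refcount bookkeeping, not the real Zend engine.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for a zval: NOT the real Zend struct. */
typedef struct {
    long value;
    int  refcount;
    int  is_ref;    /* non-zero when the caller passed a reference */
} toy_zval;

/* Allocates a toy value with a refcount of 1. */
static toy_zval *toy_new(long value, int is_ref) {
    toy_zval *z = malloc(sizeof *z);
    if (z == NULL) {
        return NULL;
    }
    z->value = value;
    z->refcount = 1;
    z->is_ref = is_ref;
    return z;
}

/* Drops one reference and frees the struct when none are left
 * (a rough analogue of zval_ptr_dtor). */
static void toy_release(toy_zval **zp) {
    assert(zp != NULL && *zp != NULL);
    if (--(*zp)->refcount == 0) {
        free(*zp);
    }
    *zp = NULL;
}

/* Mirrors the branch above: alias references, copy everything else. */
static int toy_not_by_ref(toy_zval *arg) {
    toy_zval *target;

    assert(arg != NULL);
    if (arg->is_ref) {
        target = arg;
        target->refcount++;   /* so the release below does not free the caller's value */
    } else {
        target = toy_new(arg->value, 0);
        if (target == NULL) {
            return -1;
        }
    }

    printf("called toy_not_by_ref(%ld)\n", target->value);
    target->value = 2;

    toy_release(&target);     /* frees the copy, or just drops the extra ref */
    return 0;
}
```

<p>With a non-reference argument the caller&#8217;s value is left alone and the temporary copy is freed; with a reference the value becomes 2 and the refcount returns to what it was before the call.</p>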
<p>There may be a better way to do this; please leave a comment if you know of one.  However, as far as I know, this is the only way to emulate the PHP reference behavior.</p>
<p>If you would like to read more about PHP references, Derick Rethans wrote <a href="http://derickrethans.nl/php-references-article.html">a great article on them</a> for PHP Architect.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.snailinaturtleneck.com/blog/2011/09/07/more-php-internals-references/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://www.snailinaturtleneck.com/blog/2011/09/07/more-php-internals-references/</feedburner:origLink></item>
	</channel>
</rss>

