<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
 
  <title>Dev @ AboutUs.org </title>
  <link href="http://dev.aboutus.org/" />
  
  <updated>2011-08-22T11:47:06-07:00</updated>
  <id>http://dev.aboutus.org/</id>
  <author>
    <name>AboutUs</name>
    <email>info@aboutus.org</email>
  </author>

  
  <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/DevAboutusorg" /><feedburner:info uri="devaboutusorg" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><feedburner:browserFriendly></feedburner:browserFriendly><entry>
    <id>http://dev.aboutus.org/2011/08/22/cassandra-truncate-means-slow-test-suites</id>
    <link type="text/html" rel="alternate" href="http://dev.aboutus.org/2011/08/22/cassandra-truncate-means-slow-test-suites.html" />
    <title>Cassandra Truncate Means Slow Test Suites</title>
    <updated>2011-08-22T00:00:00-07:00</updated>
    <author>
      <name>sam</name>
      <uri>http://dev.aboutus.org/</uri>
    </author>
    <content type="html">&lt;p&gt;We&amp;#8217;ve been using Cassandra as the primary data store on several applications for a while. Of course, this means that it&amp;#8217;s integrated into our test and CI environments. A couple of days ago Brad and I noticed that one of our test suites, which had been running in about 1 minute, was now taking 10 minutes. It was painful.&lt;/p&gt;

&lt;p&gt;We tracked the slowness down to this line in our RSpec &lt;code&gt;spec_helper.rb&lt;/code&gt; file:&lt;/p&gt;
&lt;div class='highlight'&gt;&lt;pre&gt;&lt;code class='ruby'&gt;&lt;span class='n'&gt;config&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;before&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='ss'&gt;:each&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt; &lt;span class='p'&gt;{&lt;/span&gt; &lt;span class='vg'&gt;$cassandra&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;clear_keyspace!&lt;/span&gt; &lt;span class='p'&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;The intention here was to clear the database before each test ran, to ensure that tests were isolated from each other. Great, except that as we added more column families this got slower and slower, until the call was taking ~1 second per test case. About 90% of the test suite&amp;#8217;s time was being spent deleting data from Cassandra. Oh the pain&amp;#8230;&lt;/p&gt;

&lt;p&gt;Luckily we got a tip from &lt;strong&gt;thobbs&lt;/strong&gt; in the #cassandra IRC channel. He said:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;doing a get_range() while deleting everything is faster that&amp;#8217;s what I do for most of the pycassa test cases&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;He pointed me to &lt;a href='https://github.com/pycassa/pycassa/blob/master/tests/test_columnfamily.py#L37'&gt;an example in pycassa&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Translated into Ruby it looks like this:&lt;/p&gt;
&lt;div class='highlight'&gt;&lt;pre&gt;&lt;code class='ruby'&gt;&lt;span class='n'&gt;config&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;before&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='ss'&gt;:each&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt; &lt;span class='k'&gt;do&lt;/span&gt;
  &lt;span class='vg'&gt;$cassandra&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;schema&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;cf_defs&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;each&lt;/span&gt; &lt;span class='k'&gt;do&lt;/span&gt; &lt;span class='o'&gt;|&lt;/span&gt;&lt;span class='n'&gt;cf&lt;/span&gt;&lt;span class='o'&gt;|&lt;/span&gt;
    &lt;span class='vg'&gt;$cassandra&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;get_range&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='n'&gt;cf&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;name&lt;/span&gt;&lt;span class='p'&gt;,&lt;/span&gt; &lt;span class='ss'&gt;:key_count&lt;/span&gt; &lt;span class='o'&gt;=&amp;gt;&lt;/span&gt; &lt;span class='mi'&gt;10000&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;each&lt;/span&gt; &lt;span class='k'&gt;do&lt;/span&gt; &lt;span class='o'&gt;|&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='n'&gt;row_key&lt;/span&gt;&lt;span class='p'&gt;,&lt;/span&gt; &lt;span class='n'&gt;_&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt;&lt;span class='o'&gt;|&lt;/span&gt;
      &lt;span class='vg'&gt;$cassandra&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;remove&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='n'&gt;cf&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;name&lt;/span&gt;&lt;span class='p'&gt;,&lt;/span&gt; &lt;span class='n'&gt;row_key&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt;
    &lt;span class='k'&gt;end&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;
&lt;span class='k'&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;With this change the test suite was back down to ~1 minute run time. And the people rejoice.&lt;/p&gt;</content>
  </entry>
  
  <entry>
    <id>http://dev.aboutus.org/2011/07/18/managing-cassandra-s-schema-from-rails</id>
    <link type="text/html" rel="alternate" href="http://dev.aboutus.org/2011/07/18/managing-cassandra-s-schema-from-rails.html" />
    <title>Managing Cassandra's Schema from Rails</title>
    <updated>2011-07-18T00:00:00-07:00</updated>
    <author>
      <name>sam</name>
      <uri>http://dev.aboutus.org/</uri>
    </author>
    <content type="html">&lt;p&gt;The other day I was tracking down a bug on one of our Rails projects that uses Cassandra as its primary data store. I needed to find a way to manage changes to the Cassandra schema across all of our development and production boxes.&lt;/p&gt;

&lt;p&gt;Rails normally uses ActiveRecord Migrations to manage an application&amp;#8217;s schema. These don&amp;#8217;t work with Cassandra since it is not a relational database. Fortunately we were able to quickly create a system for managing Cassandra&amp;#8217;s schema from the Rails app. It&amp;#8217;s brand new so YMMV, but I thought the approach might be useful to others trying to use Cassandra from Rails.&lt;/p&gt;

&lt;h2 id='declarative_schema'&gt;Declarative Schema&lt;/h2&gt;

&lt;p&gt;Column families and their attributes are defined in a YAML file. Our project has a &lt;code&gt;schema.yml&lt;/code&gt; file that looks like this:&lt;/p&gt;
&lt;div class='highlight'&gt;&lt;pre&gt;&lt;code class='yaml'&gt;&lt;span class='nn'&gt;---&lt;/span&gt;
&lt;span class='l-Scalar-Plain'&gt;ReportHandles&lt;/span&gt;&lt;span class='p-Indicator'&gt;:&lt;/span&gt;
  &lt;span class='l-Scalar-Plain'&gt;comparator_type&lt;/span&gt;&lt;span class='p-Indicator'&gt;:&lt;/span&gt; &lt;span class='s'&gt;&amp;#39;UTF8Type&amp;#39;&lt;/span&gt;
&lt;span class='l-Scalar-Plain'&gt;Reports&lt;/span&gt;&lt;span class='p-Indicator'&gt;:&lt;/span&gt;
  &lt;span class='l-Scalar-Plain'&gt;column_type&lt;/span&gt;&lt;span class='p-Indicator'&gt;:&lt;/span&gt; &lt;span class='l-Scalar-Plain'&gt;Super&lt;/span&gt;
  &lt;span class='l-Scalar-Plain'&gt;comparator_type&lt;/span&gt;&lt;span class='p-Indicator'&gt;:&lt;/span&gt; &lt;span class='s'&gt;&amp;#39;UTF8Type&amp;#39;&lt;/span&gt;
&lt;span class='l-Scalar-Plain'&gt;Runs&lt;/span&gt;&lt;span class='p-Indicator'&gt;:&lt;/span&gt;
  &lt;span class='l-Scalar-Plain'&gt;column_type&lt;/span&gt;&lt;span class='p-Indicator'&gt;:&lt;/span&gt; &lt;span class='l-Scalar-Plain'&gt;Super&lt;/span&gt;
  &lt;span class='l-Scalar-Plain'&gt;default_validation_class&lt;/span&gt;&lt;span class='p-Indicator'&gt;:&lt;/span&gt; &lt;span class='s'&gt;&amp;#39;BytesType&amp;#39;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;This file defines all of our column families and the properties we care about, such as &lt;code&gt;comparator_type&lt;/code&gt;. This differs from Rails&amp;#8217; built-in approach to migrations. Instead of defining a series of migrations that are applied in order, we declare the schema that we want, and the system brings Cassandra in line with it. This felt like a simpler approach for our use case, and made sense since Cassandra (not being a relational db) has a simpler, more flexible schema system, and fewer dependencies between column families than, say, MySQL has between tables. You might look at &lt;a href='http://blog.carbonfive.com/2011/01/06/database-migrations-for-cassandra-with-activecolumn/'&gt;active_column&lt;/a&gt; if you want Cassandra migrations that are closer in style to ActiveRecord.&lt;/p&gt;
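
&lt;p&gt;To make the declarative idea concrete, here is a minimal sketch that is separate from the module shown in this post. It assumes the desired schema is a hash loaded from YAML and that the names of the existing column families are already known. &lt;code&gt;plan_changes&lt;/code&gt; and the inline &lt;code&gt;SCHEMA_YAML&lt;/code&gt; string are hypothetical names used only for illustration.&lt;/p&gt;

```ruby
require 'yaml'

# Inline copy of part of the schema file, so the sketch runs on its own.
SCHEMA_YAML = [
  "ReportHandles:",
  "  comparator_type: 'UTF8Type'",
  "Reports:",
  "  column_type: Super",
  "  comparator_type: 'UTF8Type'"
].join("\n")

# Split the declared column families into those that already exist
# (to be updated in place) and those that are missing (to be created).
# Hypothetical helper, not part of any Cassandra client library.
def plan_changes(desired, existing_names)
  to_update, to_create = desired.keys.partition do |name|
    existing_names.include?(name)
  end
  { :update => to_update, :create => to_create }
end

desired = YAML.safe_load(SCHEMA_YAML)
# Assumption for the example: only "Reports" exists in the keyspace so far.
plan = plan_changes(desired, ['Reports'])
puts "update: #{plan[:update].join(', ')}"
puts "create: #{plan[:create].join(', ')}"
```

&lt;p&gt;The plan is recomputed from the declared state on every run, so there is no ordered list of past migrations to keep track of.&lt;/p&gt;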

&lt;p&gt;To accomplish the actual schema changes we have a &lt;code&gt;Schema&lt;/code&gt; module that looks like this:&lt;/p&gt;
&lt;div class='highlight'&gt;&lt;pre&gt;&lt;code class='ruby'&gt;&lt;span class='k'&gt;module&lt;/span&gt; &lt;span class='nn'&gt;Schema&lt;/span&gt;
  &lt;span class='k'&gt;def&lt;/span&gt; &lt;span class='nc'&gt;self&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='nf'&gt;migrate&lt;/span&gt;
    &lt;span class='n'&gt;config&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;each&lt;/span&gt; &lt;span class='k'&gt;do&lt;/span&gt; &lt;span class='o'&gt;|&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='nb'&gt;name&lt;/span&gt;&lt;span class='p'&gt;,&lt;/span&gt; &lt;span class='n'&gt;properties&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt;&lt;span class='o'&gt;|&lt;/span&gt;
      &lt;span class='n'&gt;migrate_column_family&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='nb'&gt;name&lt;/span&gt;&lt;span class='p'&gt;,&lt;/span&gt; &lt;span class='n'&gt;properties&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt;
    &lt;span class='k'&gt;end&lt;/span&gt;
    &lt;span class='n'&gt;wait_for_schema_agreement&lt;/span&gt;
    &lt;span class='n'&gt;schema&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;

  &lt;span class='k'&gt;def&lt;/span&gt; &lt;span class='nc'&gt;self&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='nf'&gt;schema_agreement?&lt;/span&gt;
    &lt;span class='n'&gt;cassandra_client&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;schema_agreement?&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;

  &lt;span class='k'&gt;def&lt;/span&gt; &lt;span class='nc'&gt;self&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='nf'&gt;config&lt;/span&gt;
    &lt;span class='vi'&gt;@config&lt;/span&gt; &lt;span class='o'&gt;||=&lt;/span&gt;  &lt;span class='n'&gt;config!&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;

  &lt;span class='k'&gt;def&lt;/span&gt; &lt;span class='nc'&gt;self&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='nf'&gt;config!&lt;/span&gt;
    &lt;span class='no'&gt;YAML&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;load_file&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='no'&gt;File&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;join&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='no'&gt;Config&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;root&lt;/span&gt;&lt;span class='p'&gt;,&lt;/span&gt; &lt;span class='s1'&gt;&amp;#39;db&amp;#39;&lt;/span&gt;&lt;span class='p'&gt;,&lt;/span&gt; &lt;span class='s1'&gt;&amp;#39;schema.yml&amp;#39;&lt;/span&gt;&lt;span class='p'&gt;))&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;

  &lt;span class='k'&gt;def&lt;/span&gt; &lt;span class='nc'&gt;self&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='nf'&gt;wait_for_schema_agreement&lt;/span&gt;
    &lt;span class='k'&gt;return&lt;/span&gt; &lt;span class='k'&gt;if&lt;/span&gt; &lt;span class='n'&gt;schema_agreement?&lt;/span&gt;
    &lt;span class='n'&gt;secs&lt;/span&gt; &lt;span class='o'&gt;=&lt;/span&gt; &lt;span class='mi'&gt;90&lt;/span&gt;
    &lt;span class='no'&gt;Timeout&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;timeout&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='n'&gt;secs&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt; &lt;span class='k'&gt;do&lt;/span&gt;
      &lt;span class='nb'&gt;puts&lt;/span&gt; &lt;span class='s2'&gt;&amp;quot;waiting up to &lt;/span&gt;&lt;span class='si'&gt;#{&lt;/span&gt;&lt;span class='n'&gt;secs&lt;/span&gt;&lt;span class='si'&gt;}&lt;/span&gt;&lt;span class='s2'&gt; seconds for schema agreement&amp;quot;&lt;/span&gt;
      &lt;span class='k'&gt;until&lt;/span&gt; &lt;span class='n'&gt;schema_agreement?&lt;/span&gt;
        &lt;span class='nb'&gt;print&lt;/span&gt; &lt;span class='s1'&gt;&amp;#39;.&amp;#39;&lt;/span&gt;
      &lt;span class='k'&gt;end&lt;/span&gt;
      &lt;span class='nb'&gt;puts&lt;/span&gt;
      &lt;span class='nb'&gt;puts&lt;/span&gt; &lt;span class='s2'&gt;&amp;quot;done&amp;quot;&lt;/span&gt;
    &lt;span class='k'&gt;end&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;

  &lt;span class='c1'&gt;# Migrate a column family to a desired state.&lt;/span&gt;
  &lt;span class='c1'&gt;# NB. Only properties that are explicitly declared are set.&lt;/span&gt;
  &lt;span class='c1'&gt;# Removing a value&lt;/span&gt;
  &lt;span class='c1'&gt;# from properties will not reset it back to default, it will&lt;/span&gt;
  &lt;span class='c1'&gt;# leave it in its&lt;/span&gt;
  &lt;span class='c1'&gt;# current state.&lt;/span&gt;
  &lt;span class='k'&gt;def&lt;/span&gt; &lt;span class='nc'&gt;self&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='nf'&gt;migrate_column_family&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='n'&gt;column_family_name&lt;/span&gt;&lt;span class='p'&gt;,&lt;/span&gt; &lt;span class='n'&gt;properties&lt;/span&gt; &lt;span class='o'&gt;=&lt;/span&gt; &lt;span class='p'&gt;{})&lt;/span&gt;
    &lt;span class='n'&gt;wait_for_schema_agreement&lt;/span&gt;
    &lt;span class='n'&gt;cf_def&lt;/span&gt; &lt;span class='o'&gt;=&lt;/span&gt; &lt;span class='n'&gt;find_or_initialize_column_family&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='n'&gt;column_family_name&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt;
    &lt;span class='n'&gt;properties&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;each&lt;/span&gt; &lt;span class='k'&gt;do&lt;/span&gt; &lt;span class='o'&gt;|&lt;/span&gt;&lt;span class='n'&gt;property&lt;/span&gt;&lt;span class='p'&gt;,&lt;/span&gt; &lt;span class='n'&gt;value&lt;/span&gt;&lt;span class='o'&gt;|&lt;/span&gt;
      &lt;span class='n'&gt;cf_def&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;send&lt;/span&gt; &lt;span class='s2'&gt;&amp;quot;&lt;/span&gt;&lt;span class='si'&gt;#{&lt;/span&gt;&lt;span class='n'&gt;property&lt;/span&gt;&lt;span class='si'&gt;}&lt;/span&gt;&lt;span class='s2'&gt;=&amp;quot;&lt;/span&gt;&lt;span class='p'&gt;,&lt;/span&gt; &lt;span class='n'&gt;value&lt;/span&gt;
    &lt;span class='k'&gt;end&lt;/span&gt;

    &lt;span class='k'&gt;if&lt;/span&gt; &lt;span class='n'&gt;column_family_exists?&lt;/span&gt; &lt;span class='n'&gt;cf_def&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;name&lt;/span&gt;
      &lt;span class='n'&gt;cassandra_client&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;update_column_family&lt;/span&gt; &lt;span class='n'&gt;cf_def&lt;/span&gt;
    &lt;span class='k'&gt;else&lt;/span&gt;
      &lt;span class='n'&gt;cassandra_client&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;add_column_family&lt;/span&gt; &lt;span class='n'&gt;cf_def&lt;/span&gt;
    &lt;span class='k'&gt;end&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;

  &lt;span class='k'&gt;def&lt;/span&gt; &lt;span class='nc'&gt;self&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='nf'&gt;find_or_initialize_column_family&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='nb'&gt;name&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt;
    &lt;span class='n'&gt;cf_def&lt;/span&gt; &lt;span class='o'&gt;=&lt;/span&gt; &lt;span class='n'&gt;column_family&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='nb'&gt;name&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;to_s&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt; &lt;span class='o'&gt;||&lt;/span&gt; &lt;span class='p'&gt;(&lt;/span&gt;
      &lt;span class='n'&gt;cf_def&lt;/span&gt; &lt;span class='o'&gt;=&lt;/span&gt; &lt;span class='no'&gt;CassandraThrift&lt;/span&gt;&lt;span class='o'&gt;::&lt;/span&gt;&lt;span class='no'&gt;CfDef&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;new&lt;/span&gt;
      &lt;span class='n'&gt;cf_def&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;keyspace&lt;/span&gt; &lt;span class='o'&gt;=&lt;/span&gt; &lt;span class='n'&gt;cassandra_client&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;keyspace&lt;/span&gt;
      &lt;span class='n'&gt;cf_def&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;name&lt;/span&gt; &lt;span class='o'&gt;=&lt;/span&gt; &lt;span class='nb'&gt;name&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;to_s&lt;/span&gt;
      &lt;span class='n'&gt;cf_def&lt;/span&gt;
    &lt;span class='p'&gt;)&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;

  &lt;span class='c1'&gt;# SCHEMA INTROSPECTION&lt;/span&gt;
  &lt;span class='k'&gt;def&lt;/span&gt; &lt;span class='nc'&gt;self&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='nf'&gt;column_family_exists?&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='nb'&gt;name&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt;
    &lt;span class='n'&gt;column_families&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;map&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='o'&gt;&amp;amp;&lt;/span&gt;&lt;span class='ss'&gt;:name&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;include?&lt;/span&gt; &lt;span class='nb'&gt;name&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;to_s&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;

  &lt;span class='k'&gt;def&lt;/span&gt; &lt;span class='nc'&gt;self&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='nf'&gt;schema&lt;/span&gt;
    &lt;span class='n'&gt;cassandra_client&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;schema&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;

  &lt;span class='k'&gt;def&lt;/span&gt; &lt;span class='nc'&gt;self&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='nf'&gt;column_families&lt;/span&gt;
    &lt;span class='n'&gt;schema&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;cf_defs&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;

  &lt;span class='k'&gt;def&lt;/span&gt; &lt;span class='nc'&gt;self&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='nf'&gt;column_family&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='nb'&gt;name&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt;
    &lt;span class='n'&gt;column_families&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;detect&lt;/span&gt;&lt;span class='p'&gt;{&lt;/span&gt;&lt;span class='o'&gt;|&lt;/span&gt;&lt;span class='n'&gt;cf&lt;/span&gt;&lt;span class='o'&gt;|&lt;/span&gt; &lt;span class='n'&gt;cf&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;name&lt;/span&gt; &lt;span class='o'&gt;==&lt;/span&gt; &lt;span class='nb'&gt;name&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;to_s&lt;/span&gt;&lt;span class='p'&gt;}&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;

&lt;span class='k'&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;It assumes that you have a &lt;code&gt;cassandra_client&lt;/code&gt; method defined, and that the Keyspace you&amp;#8217;re managing exists. At some point I may clean this up and release it as a gem but, like I said, for now YMMV. I&amp;#8217;ve omitted the specs for brevity, but you can see them on &lt;a href='https://gist.github.com/1090146'&gt;gist&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id='cluster_schema_does_not_yet_agree'&gt;Cluster schema does not yet agree&amp;#8230;&lt;/h2&gt;

&lt;p&gt;When I first ran this against a clustered Cassandra setup I started getting &lt;em&gt;cluster schema does not yet agree&lt;/em&gt; messages, followed by a failure. It turns out Cassandra has the concept of &lt;em&gt;schema agreement&lt;/em&gt;. This makes perfect sense when you consider that Cassandra is distributed and fault tolerant. Schema changes have to be propagated through the cluster until eventually all nodes agree. It seems that until there is agreement, you can&amp;#8217;t make further schema changes. To deal with this, the &lt;code&gt;Schema&lt;/code&gt; manager waits for schema agreement before making changes to a column family:&lt;/p&gt;
&lt;div class='highlight'&gt;&lt;pre&gt;&lt;code class='ruby'&gt;&lt;span class='k'&gt;def&lt;/span&gt; &lt;span class='nc'&gt;self&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='nf'&gt;migrate_column_family&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='n'&gt;column_family_name&lt;/span&gt;&lt;span class='p'&gt;,&lt;/span&gt; &lt;span class='n'&gt;properties&lt;/span&gt; &lt;span class='o'&gt;=&lt;/span&gt; &lt;span class='p'&gt;{})&lt;/span&gt;
  &lt;span class='n'&gt;wait_for_schema_agreement&lt;/span&gt;
  &lt;span class='n'&gt;cf_def&lt;/span&gt; &lt;span class='o'&gt;=&lt;/span&gt; &lt;span class='n'&gt;find_or_initialize_column_family&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='n'&gt;column_family_name&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt;
&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;.&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Waiting for the schema to propagate usually just takes a second and is easy to do:&lt;/p&gt;
&lt;div class='highlight'&gt;&lt;pre&gt;&lt;code class='ruby'&gt;&lt;span class='k'&gt;def&lt;/span&gt; &lt;span class='nc'&gt;self&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='nf'&gt;wait_for_schema_agreement&lt;/span&gt;
  &lt;span class='k'&gt;return&lt;/span&gt; &lt;span class='k'&gt;if&lt;/span&gt; &lt;span class='n'&gt;schema_agreement?&lt;/span&gt;
  &lt;span class='n'&gt;secs&lt;/span&gt; &lt;span class='o'&gt;=&lt;/span&gt; &lt;span class='mi'&gt;90&lt;/span&gt;
  &lt;span class='no'&gt;Timeout&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;timeout&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='n'&gt;secs&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt; &lt;span class='k'&gt;do&lt;/span&gt;
    &lt;span class='nb'&gt;puts&lt;/span&gt; &lt;span class='s2'&gt;&amp;quot;waiting up to &lt;/span&gt;&lt;span class='si'&gt;#{&lt;/span&gt;&lt;span class='n'&gt;secs&lt;/span&gt;&lt;span class='si'&gt;}&lt;/span&gt;&lt;span class='s2'&gt; seconds for schema agreement&amp;quot;&lt;/span&gt;
    &lt;span class='k'&gt;until&lt;/span&gt; &lt;span class='n'&gt;schema_agreement?&lt;/span&gt;
      &lt;span class='nb'&gt;print&lt;/span&gt; &lt;span class='s1'&gt;&amp;#39;.&amp;#39;&lt;/span&gt;
    &lt;span class='k'&gt;end&lt;/span&gt;
    &lt;span class='nb'&gt;puts&lt;/span&gt;
    &lt;span class='nb'&gt;puts&lt;/span&gt; &lt;span class='s2'&gt;&amp;quot;done&amp;quot;&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;
&lt;span class='k'&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
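&lt;p&gt;One possible variation, which is not what the module above does, is to pause between checks so the loop does not call &lt;code&gt;schema_agreement?&lt;/code&gt; as fast as it can. This is a minimal sketch: &lt;code&gt;wait_until&lt;/code&gt; and its &lt;code&gt;interval&lt;/code&gt; parameter are hypothetical, and the block stands in for the real agreement check.&lt;/p&gt;

```ruby
require 'timeout'

# Poll the given block until it returns a truthy value, sleeping between
# attempts. Raises Timeout::Error if that takes longer than timeout_secs.
# Hypothetical helper, shown only to illustrate polling with a pause.
def wait_until(timeout_secs = 90, interval = 0.5)
  Timeout.timeout(timeout_secs) do
    sleep(interval) until yield
  end
  true
end

# Stand-in for schema_agreement?: reports agreement on the third check.
checks = 0
wait_until(5, 0.01) do
  checks += 1
  checks >= 3
end
puts "agreed after #{checks} checks"
```

&lt;p&gt;The timeout still bounds the total wait, while the pause keeps the number of round trips to the cluster small.&lt;/p&gt;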
&lt;h2 id='integrating_with_rake_and_capistrano'&gt;Integrating with Rake and Capistrano&lt;/h2&gt;

&lt;p&gt;I wanted to keep the workflow around schema management as close to the Rails conventions as possible. The schema can be brought up to date locally by running &lt;code&gt;rake cassandra:migrate&lt;/code&gt;, which is defined as:&lt;/p&gt;
&lt;div class='highlight'&gt;&lt;pre&gt;&lt;code class='ruby'&gt;&lt;span class='n'&gt;namespace&lt;/span&gt; &lt;span class='ss'&gt;:cassandra&lt;/span&gt; &lt;span class='k'&gt;do&lt;/span&gt;
  &lt;span class='n'&gt;desc&lt;/span&gt; &lt;span class='s2'&gt;&amp;quot;Migrate to the current cassandra schema&amp;quot;&lt;/span&gt;
  &lt;span class='n'&gt;task&lt;/span&gt; &lt;span class='ss'&gt;:migrate&lt;/span&gt; &lt;span class='k'&gt;do&lt;/span&gt;
    &lt;span class='nb'&gt;require&lt;/span&gt; &lt;span class='no'&gt;File&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;expand_path&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='no'&gt;File&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;join&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='no'&gt;Config&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;root&lt;/span&gt;&lt;span class='p'&gt;,&lt;/span&gt; &lt;span class='s1'&gt;&amp;#39;lib&amp;#39;&lt;/span&gt;&lt;span class='p'&gt;,&lt;/span&gt; &lt;span class='s1'&gt;&amp;#39;schema&amp;#39;&lt;/span&gt;&lt;span class='p'&gt;))&lt;/span&gt;
    &lt;span class='no'&gt;Schema&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;migrate&lt;/span&gt;
    &lt;span class='nb'&gt;puts&lt;/span&gt; &lt;span class='s2'&gt;&amp;quot;Migrated to this Schema:&amp;quot;&lt;/span&gt;
    &lt;span class='nb'&gt;puts&lt;/span&gt; &lt;span class='o'&gt;*&lt;/span&gt;&lt;span class='no'&gt;Schema&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;column_families&lt;/span&gt;&lt;span class='o'&gt;.&lt;/span&gt;&lt;span class='n'&gt;map&lt;/span&gt;&lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='o'&gt;&amp;amp;&lt;/span&gt;&lt;span class='ss'&gt;:inspect&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;
&lt;span class='k'&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Hooking it up to our Capistrano deploy was also easy. By overriding the &lt;code&gt;deploy:migrate&lt;/code&gt; task this gets hooked into our deploy process in place of ActiveRecord Migrations.&lt;/p&gt;
&lt;div class='highlight'&gt;&lt;pre&gt;&lt;code class='ruby'&gt;&lt;span class='n'&gt;namespace&lt;/span&gt; &lt;span class='ss'&gt;:deploy&lt;/span&gt; &lt;span class='k'&gt;do&lt;/span&gt;
  &lt;span class='n'&gt;task&lt;/span&gt; &lt;span class='ss'&gt;:migrate&lt;/span&gt; &lt;span class='k'&gt;do&lt;/span&gt;
    &lt;span class='n'&gt;run&lt;/span&gt; &lt;span class='s2'&gt;&amp;quot;cd &lt;/span&gt;&lt;span class='si'&gt;#{&lt;/span&gt;&lt;span class='n'&gt;release_path&lt;/span&gt;&lt;span class='si'&gt;}&lt;/span&gt;&lt;span class='s2'&gt; &amp;amp;&amp;amp; RAILS_ENV=&lt;/span&gt;&lt;span class='si'&gt;#{&lt;/span&gt;&lt;span class='n'&gt;stage&lt;/span&gt;&lt;span class='si'&gt;}&lt;/span&gt;&lt;span class='s2'&gt; rake cassandra:migrate&amp;quot;&lt;/span&gt;
  &lt;span class='k'&gt;end&lt;/span&gt;
&lt;span class='k'&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Now running the standard &lt;code&gt;cap deploy:migrations&lt;/code&gt; task deploys the codebase and brings the Cassandra schema up to date.&lt;/p&gt;</content>
  </entry>
  
  <entry>
    <id>http://dev.aboutus.org/2011/07/03/getting-started-exploratory-parsing</id>
    <link type="text/html" rel="alternate" href="http://dev.aboutus.org/2011/07/03/getting-started-exploratory-parsing.html" />
    <title>Getting Started Exploratory Parsing</title>
    <updated>2011-07-03T00:00:00-07:00</updated>
    <author>
      <name>ward</name>
      <uri>http://dev.aboutus.org/</uri>
    </author>
    <content type="html">&lt;p&gt;A parser reads text to discover structure and meaning. For example, a C language parser can read a C program and understand in a real sense everything that the program has to say. Contrast this to a pattern matcher, such as regular-expression matching, which can find fragments of a program useful in editing but can&amp;#8217;t keep track of enough context to make sense of a whole program.&lt;/p&gt;

&lt;p&gt;We often use the Unix grep utility to look through large files. By applying a regular-expression match to each line, grep is able to report just the lines of interest. When we grep repeatedly, driven by our curiosity, responding to each answer grep provides with another question, we are exploring.&lt;/p&gt;

&lt;p&gt;The internet is full of text that defies understanding in any sense with simple pattern matching. In response, AboutUs built an environment for exploring the internet interactively, using parsers constructed on a whim, returning matches within the context described by the explorer.&lt;/p&gt;

&lt;h2 id='exploring_the_world_fact_book'&gt;Exploring the World Factbook&lt;/h2&gt;

&lt;p&gt;The AboutUs exploratory parsing environment has been released as open source on GitHub. Included with the release are scripts to download The World Factbook and English Wikipedia as sample texts for exploring. Let&amp;#8217;s take a look at the Factbook.&lt;/p&gt;

&lt;p&gt;When exploring, we start by looking for something that we know is there. We&amp;#8217;ll start by looking for short strings of characters without any regard for where they are.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;char = &amp;lt;&amp;lt; ......... &amp;gt;&amp;gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Our parser uses a dot (.) to match any character. We say we&amp;#8217;re looking for a few of them. And when we find them we want to see them so we add &amp;#8220;eye-balls&amp;#8221; around the dots to tell the parser to remember some of the matches.&lt;/p&gt;

&lt;p&gt;&lt;img src='/images/parser/PastedGraphic-14.png' alt='' /&gt; &lt;img src='/images/parser/PastedGraphic-5.png' alt='' /&gt;&lt;/p&gt;

&lt;p&gt;This says that our parser found over a quarter million matches. When we ask to see some, it shows us the text on the right. This is just a sample match. When the sample was taken the text shown in green had been matched, and that highlighted in yellow is the specific match sampled. This data looks like keys and values separated by colons.&lt;/p&gt;

&lt;p&gt;Let&amp;#8217;s look for keys and values by describing what we think we know about the file. We&amp;#8217;ll offer the parser an alternative for text that isn&amp;#8217;t key-value pairs as we understand them.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;fact = key value | other-char
key = whitespace &amp;lt;&amp;lt; word+ &amp;gt;&amp;gt; &amp;#39;:&amp;#39;
value = &amp;lt;&amp;lt; ( !key . )+ &amp;gt;&amp;gt;
word = [A-Za-z]+ &amp;#39; &amp;#39;*
whitespace = &amp;#39;\n&amp;#39; &amp;#39; &amp;#39;*
other-char = &amp;lt;&amp;lt; . &amp;gt;&amp;gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now we&amp;#8217;re getting pretty specific as to what we mean by key and value. We say 1) a key starts with whitespace, 2) the key has one or more words, 3) the words end with a colon, and 4) we only care to have eyeballs on the words of the key. The parser is happy to read the whole file.&lt;/p&gt;

&lt;p&gt;&lt;img src='/images/parser/PastedGraphic-1.png' alt='' /&gt; &lt;img src='/images/parser/PastedGraphic-6.png' alt='' /&gt;&lt;/p&gt;

&lt;p&gt;The parser found 19,498 keys and an equal number of values. Makes sense. When we look at the sample keys we find a few surprises. Most are capitalized but not all. The subcategories of Imports are lower case. Interesting. Let&amp;#8217;s see how widespread this convention is.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;key = whitespace &amp;lt;&amp;lt; ( upper | lower ) &amp;gt;&amp;gt; &amp;#39;:&amp;#39;
upper = &amp;lt;&amp;lt; [A-Z] word+ &amp;gt;&amp;gt;
lower = &amp;lt;&amp;lt; word+ &amp;gt;&amp;gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;img src='/images/parser/PastedGraphic-2.png' alt='' /&gt; &lt;img src='/images/parser/PastedGraphic-10.png' alt='' /&gt;&lt;/p&gt;

&lt;p&gt;When we look at some lowers we see the expected &amp;#8220;commodities&amp;#8221; and &amp;#8220;partners&amp;#8221; and one more outlier, the login instructions to get the Factbook from Project Gutenberg. Let&amp;#8217;s separate out the familiar to see what other lower-case keywords might exist.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;lower = &amp;lt;&amp;lt; ( familiar | other-key ) &amp;gt;&amp;gt;
familiar = &amp;lt;&amp;lt; ( &amp;#39;commodities&amp;#39; | &amp;#39;partners&amp;#39; ) &amp;gt;&amp;gt;
other-key = &amp;lt;&amp;lt; word+ &amp;gt;&amp;gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;img src='/images/parser/PastedGraphic-13.png' alt='' /&gt; &lt;img src='/images/parser/PastedGraphic-11.png' alt='' /&gt;&lt;/p&gt;

&lt;p&gt;Now we&amp;#8217;re down to 77 out of 20 thousand keys. Sampling these we see more that make sense and should be added to the familiar category. We also see a few places where our parsing rules are clearly not working as intended. We could tighten up the rules by saying just how much whitespace we expect before a key, or how many words we expect, or just discover the words that should be familiar and ignore the rest. We have options.&lt;/p&gt;
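&lt;p&gt;The same narrowing-down can be imitated outside the parser. The Ruby sketch below sorts a list of keys into capitalized, familiar, and everything else, then counts each group. The names &lt;code&gt;classify&lt;/code&gt; and &lt;code&gt;count_kinds&lt;/code&gt; and the sample keys are assumptions for illustration only. The real experiment does this inside the grammar.&lt;/p&gt;

```ruby
# Lower-case keys we already recognize, as in the `familiar` rule.
FAMILIAR = %w[commodities partners].freeze

# Labels one key as :upper, :familiar, or :other.
def classify(key)
  return :upper if key.match?(/\A[A-Z]/)
  return :familiar if FAMILIAR.include?(key)

  :other
end

# Counts how many keys fall into each label.
def count_kinds(keys)
  keys.each_with_object(Hash.new(0)) { |key, counts| counts[classify(key)] += 1 }
end

# Hypothetical keys, not real parser output.
sample_keys = ["Climate", "partners", "commodities", "Terrain", "login"]
p count_kinds(sample_keys)
```

&lt;p&gt;Whatever lands in &lt;code&gt;:other&lt;/code&gt; is the pile worth sampling next, either to grow the familiar list or to spot rules that are not working as intended.&lt;/p&gt;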

&lt;p&gt;We also have other branches in our parse to explore. We haven&amp;#8217;t even begun to parse the values. We could, for example, select out &amp;#8220;Climate&amp;#8221; and see how many ways climate is described in the Factbook. Maybe we do the same for &amp;#8220;Terrain&amp;#8221;. Maybe we correlate phrases we find within the two and get some insight into how the two are related. We don&amp;#8217;t have to just sample parser matches. We can take the text of interesting matches and feed that into other programs.&lt;/p&gt;

&lt;h2 id='get_the_exploratory_parser'&gt;Get the Exploratory Parser&lt;/h2&gt;

&lt;p&gt;We&amp;#8217;ve been using a tool made out of two parts, both of them available to other programmers under AboutUs on GitHub. One is &lt;a href='https://github.com/AboutUs/pegleg'&gt;our fork&lt;/a&gt; of Ian Piumarta&amp;#8217;s peg/leg parser generator. The &lt;a href='https://github.com/AboutUs/exploratory-parsing'&gt;other&lt;/a&gt; is our parsing experiment management system.&lt;/p&gt;

&lt;p&gt;The parser generator is written in C and could be rough going for programmers who haven&amp;#8217;t studied compilers at some point in their lives. We&amp;#8217;ve only modified peg/leg as we found our unusual approach to parsing was not anticipated by Ian. Ian provides documentation on his website.&lt;/p&gt;

&lt;p&gt;The experiment manager is a web application written in Ruby to run under Mac or Unix. We run it on our laptops and in Amazon&amp;#8217;s EC2 cloud. We&amp;#8217;ve described how we install it in our GitHub ReadMe. Your Mileage May Vary.&lt;/p&gt;

&lt;h2 id='thinking_different'&gt;Thinking Different&lt;/h2&gt;

&lt;p&gt;We think we&amp;#8217;ve opened up a whole new way to use technology. This can happen when one takes some assumed requirement and reverses it. Wiki, for example, reversed the assumption that only the owner should edit the pages of a web site. Parser generators have traditionally been used to describe exactly what should be written and anything else is a &amp;#8220;syntax error&amp;#8221;. Wiki allows writers to write what they think makes sense. With exploratory parsing we now have a way for the parser writer to discover what has been written after the fact. This inversion of control mirrors the original thinking behind wiki. Let those who know write as they see fit. Trust people to be regular enough to create lasting value. Use the power of our modern computers and networks to organize that value.&lt;/p&gt;</content>
  </entry>
  
  <entry>
    <id>http://dev.aboutus.org/2011/05/17/agile-vs--open-source--they-re-actually-a-little-different</id>
    <link type="text/html" rel="alternate" href="http://dev.aboutus.org/2011/05/17/agile-vs--open-source--they-re-actually-a-little-different.html" />
    <title>Agile vs. Open Source: They're Actually a Little Different</title>
    <updated>2011-05-17T00:00:00-07:00</updated>
    <author>
      <name>sam</name>
      <uri>http://dev.aboutus.org/</uri>
    </author>
    <content type="html">&lt;p&gt;Last night I was hanging out at the North Portland Coder&amp;#8217;s Night (&lt;a href='http://calagator.org/events/search?tag=nopoconi'&gt;nopoconi&lt;/a&gt;) and Ward and I got to talking about the difference between Agile Software Development and Open Source Software Development. I&amp;#8217;d always thought of the two together, but Ward had some good insight into the differences in these two cultures and methodologies that I realized fit well with my own experience.&lt;/p&gt;

&lt;p&gt;While there are a lot of similarities between Agile and OSS philosophies, there&amp;#8217;s a fundamental difference. Agile methodologies like &lt;a href='http://c2.com/cgi/wiki?ExtremeProgramming'&gt;Extreme Programming&lt;/a&gt; are about getting the &lt;a href='http://c2.com/cgi/wiki?AllEngineersInOneRoom'&gt;right people in the room&lt;/a&gt;. Face to face communication is much more efficient, says XP, and the best way to get results is to have a small, empowered team of smart people in clear and constant communication.&lt;/p&gt;

&lt;p&gt;Open Source starts with a different idea; code should be available and editable by anyone. Letting the people in the room make all the decisions isn&amp;#8217;t fair to the people that &lt;em&gt;aren&amp;#8217;t&lt;/em&gt; in the room. What&amp;#8217;s most important is that the processes are in place that allow people all over the world to contribute to the codebase and improve the project. This means being set up to review patches, and have discussions (in IRC, forums, wikis) that anyone can participate in. While agile strives to put a tight-knit team in close quarters, OSS strives to create a community that anyone can participate in, despite their location or circumstance.&lt;/p&gt;

&lt;p&gt;While we were talking about this, I began to realize I&amp;#8217;d seen these two patterns in lots of software companies. For example, at AboutUs, we all work in one big room together (Ward once called it a Wiki space in a &lt;a href='http://www.youtube.com/watch?v=I_75NoC85TE&amp;amp;feature=player_embedded'&gt;video&lt;/a&gt;). We have a small team. When we have a question we can yell across the room. That&amp;#8217;s agile. A good friend of mine works on the Firefox team at Mozilla. His team members are all around the world and do most of their communication over IRC, Skype, and through issue trackers.&lt;/p&gt;

&lt;p&gt;A lot of times we talk about building software as a choice between &lt;a href='http://c2.com/cgi/wiki?BigDesignUpFront'&gt;Big Design Up Front&lt;/a&gt; and Agile methodologies, but in fact there are a lot more options out there that we can draw upon to build great software.&lt;/p&gt;</content>
  </entry>
  
  <entry>
    <id>http://dev.aboutus.org/2011/05/05/rodents-of-unusual-size</id>
    <link type="text/html" rel="alternate" href="http://dev.aboutus.org/2011/05/05/rodents-of-unusual-size.html" />
    <title>Rodents Of Unusual Size</title>
    <updated>2011-05-05T00:00:00-07:00</updated>
    <author>
      <name>matt</name>
      <uri>http://dev.aboutus.org/</uri>
    </author>
    <content type="html">&lt;h3 id='you_keep_using_that_word_i_do_not_think_it_means_what_you_think_it_means'&gt;&amp;#8220;You keep using that word. I do not think it means what you think it means.&amp;#8221;&lt;/h3&gt;
&lt;iframe src='http://www.youtube.com/embed/D58LpHBnvsI' frameborder='0' height='349' width='425'&gt;
&lt;/iframe&gt;&lt;br /&gt;&lt;br /&gt;
&lt;p&gt;At AboutUs there is no formal code review process. We collectively own the code and we pair a ton, so most code gets seen by at least two people before it hits production. We also have a &lt;a href='http://dev.aboutus.org/2011/04/19/our-continuous-integration-is-a-big-red-flashing-light.html'&gt;Big Red Light&lt;/a&gt; to keep us in line. To keep even better coverage, we have a daily habit of reviewing and commenting on the commits made by others in Github. This is incredibly useful. Most of the chatter is just run of the mill knowledge sharing, but sometimes some good conversations pop up. One of these conversations started the other day when one of our devs used that Princess Bride quote above in response to a commit.&lt;/p&gt;

&lt;p&gt;One of the benefits of working at AboutUs is that we have Ward Cunningham as an advisor and contributor. Ward was CTO when I came on board and although he&amp;#8217;s moved on to other projects he&amp;#8217;s kept a close relationship with AboutUs. Ward often chimes in on commits, and this conversation was no different. Along with his usual pearls of wisdom he mentioned this:&lt;/p&gt;

&lt;p&gt;&lt;img src='../../../images/princess_bride_xp.png' alt='Aside: Kent Beck declared Princess Bride to be the official movie of the Extreme Programming (agile) movement.' /&gt;&lt;/p&gt;

&lt;p&gt;You really can learn something new every day. Chatting in Github commits is a great way to do that.&lt;/p&gt;</content>
  </entry>
  
  <entry>
    <id>http://dev.aboutus.org/2011/05/03/one-and-a-half-minds-are-better-learning-by-pairing</id>
    <link type="text/html" rel="alternate" href="http://dev.aboutus.org/2011/05/03/one-and-a-half-minds-are-better-learning-by-pairing.html" />
    <title>One and a Half Minds are Better: Learning by Pairing</title>
    <updated>2011-05-03T00:00:00-07:00</updated>
    <author>
      <name>brad</name>
      <uri>http://dev.aboutus.org/</uri>
    </author>
    <content type="html">&lt;p&gt;My name is Brad Heller and I&amp;#8217;ve just started at AboutUs.org as an agile developer! AboutUs.org has a pretty interesting way of ramping up their new guys: Pairing! To be honest, it&amp;#8217;s not really all that different from how we &lt;em&gt;normally&lt;/em&gt; do our work, but it is a great way to learn about a company&amp;#8217;s technology if you&amp;#8217;re new.&lt;/p&gt;

&lt;p&gt;In reality, a lot of agile shops introduce people to their stack via pair programming, either formally or informally. Here are a few notes I&amp;#8217;ve compiled from my experience thus far.&lt;/p&gt;

&lt;h2 id='use_a_workstation_that_has_multiple_input_devices'&gt;Use a workstation that has multiple input devices.&lt;/h2&gt;

&lt;p&gt;This is good general advice for pairing. At AboutUs, we have pairing stations with two keyboards, two mice, and one gigantic monitor. This allows you to jump in quickly when you feel inspired, without having to shuffle hardware.&lt;/p&gt;

&lt;h2 id='use_a_tool_set_that_you_are_both_familiar_with'&gt;Use a tool set that you are both familiar with.&lt;/h2&gt;

&lt;p&gt;You want to learn the technology, not the tools (unless, of course, the tools are critical to the technology). Spending a lot of time explaining how the tools work interrupts the &amp;#8220;knowledge stream&amp;#8221; you build in doing the work together.&lt;/p&gt;

&lt;h2 id='figure_out_a_system_that_allows_you_to_take_notes_without_getting_lost'&gt;Figure out a system that allows you to take notes without getting lost.&lt;/h2&gt;

&lt;p&gt;This is actually a lot more difficult than it sounds. When you&amp;#8217;re pairing, the person in the know often moves pretty quickly, so taking notes can be hard. I don&amp;#8217;t know how many times I&amp;#8217;ve asked the same question for the Nth time because I didn&amp;#8217;t take good notes!&lt;/p&gt;

&lt;h2 id='be_aware_of_your_partners_working_style_and_pair_with_someone_whos_compatible'&gt;Be aware of your partner&amp;#8217;s working style and pair with someone who&amp;#8217;s compatible.&lt;/h2&gt;

&lt;p&gt;Avoid problems that stem from the different working styles by pairing with a mentor who works similarly to you. For example, I tend to want to move very quickly through problems by trying many solutions and rolling back anything that doesn&amp;#8217;t work. Then, when I do find a working solution, I reflect on it to figure out if it&amp;#8217;s optimal or not. I&amp;#8217;ve found that Thomas and I work well together (as far as I&amp;#8217;ve been able to tell, anyway) because his working style is similar to mine.&lt;/p&gt;

&lt;h2 id='dont_be_afraid_to_drive'&gt;Don&amp;#8217;t be afraid to drive.&lt;/h2&gt;

&lt;p&gt;Few people can learn by watching alone. You have to get your hands dirty if you want to learn! Besides, this is the best time to dive in, as you&amp;#8217;ve got a safety net in case you mess up (erm, that is &lt;em&gt;when&lt;/em&gt; you mess up).&lt;/p&gt;</content>
  </entry>
  
  <entry>
    <id>http://dev.aboutus.org/2011/05/02/agile-metrics</id>
    <link type="text/html" rel="alternate" href="http://dev.aboutus.org/2011/05/02/agile-metrics.html" />
    <title>Agile Metrics: Tracking Numbers that Matter</title>
    <updated>2011-05-02T00:00:00-07:00</updated>
    <author>
      <name>jd</name>
      <uri>http://dev.aboutus.org/</uri>
    </author>
    <content type="html">&lt;p&gt;Many agile teams keep track of &lt;a href='http://c2.com/cgi/wiki?ProjectVelocity'&gt;velocity&lt;/a&gt; as a measure of their performance.&lt;sup id='fnref:1'&gt;&lt;a href='#fn:1' rel='footnote'&gt;1&lt;/a&gt;&lt;/sup&gt; This is a useful measure, but it shouldn&amp;#8217;t be the only one. This past week we realized that another performance metric of an agile team should be how much work it &lt;em&gt;avoids&lt;/em&gt; doing.&lt;/p&gt;

&lt;p&gt;During a discussion with stakeholders last week, we realized that much of the work in our queue was a temporary fix for a problem we would ultimately solve at a later date. Thanks to the stakeholders keeping their own planning queue in a visible location, the development team was able to see that the proper fix for the problem was scheduled for just a few weeks later. We engaged the primary stakeholder in a discussion about the requirements of the proper fix, and deemed that it would be no more work than the temporary fix. Naturally, we started work on the proper fix, abandoning the temporary one &amp;#8212; and eliminated nearly a week’s worth of work!&lt;/p&gt;

&lt;p&gt;Velocity is a great metric to track, but perhaps agile teams should also get in the habit of tracking work avoided on account of successful communication. After all, the best code is no code at all.&lt;/p&gt;
&lt;div class='footnotes'&gt;&lt;hr /&gt;&lt;ol&gt;&lt;li id='fn:1'&gt;
&lt;p&gt;This, of course, is not the recommended use for velocity, but remains a method of measuring performance.&lt;/p&gt;
&lt;a href='#fnref:1' rev='footnote'&gt;&amp;#8617;&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;</content>
  </entry>
  
  <entry>
    <id>http://dev.aboutus.org/2011/04/29/meet-the-kanban-board</id>
    <link type="text/html" rel="alternate" href="http://dev.aboutus.org/2011/04/29/meet-the-kanban-board.html" />
    <title>Meet The Kanban Board</title>
    <updated>2011-04-29T00:00:00-07:00</updated>
    <author>
      <name>sam</name>
      <uri>http://dev.aboutus.org/</uri>
    </author>
    <content type="html">&lt;p&gt;A lot of people talk about agile. We try to be agile. Working with one of the &lt;a href='http://agilemanifesto.org/authors.html'&gt;signers&lt;/a&gt; of the Agile Manifesto doesn&amp;#8217;t hurt.&lt;/p&gt;

&lt;p&gt;Our day to day process revolves around a plexiglass kanban board. I&amp;#8217;d like to show it to you.&lt;/p&gt;

&lt;h4 id='meet_the_kanban_board'&gt;Meet the kanban board.&lt;/h4&gt;

&lt;p&gt;&lt;img src='/images/meet-kanban/kanban5.jpg' alt='aboutus kanban board' /&gt;&lt;/p&gt;

&lt;p&gt;The kanban board does a lot for us. It radiates information. It&amp;#8217;s been called an &lt;em&gt;agile innovation&lt;/em&gt;.&lt;/p&gt;

&lt;h4 id='it_helps_us_remember_what_were_working_on'&gt;It helps us remember what we&amp;#8217;re working on.&lt;/h4&gt;

&lt;p&gt;&lt;img src='/images/meet-kanban/kanban7.jpg' alt='kanban organize our work' /&gt;&lt;/p&gt;

&lt;h4 id='we_know_whos_working_on_what'&gt;We know who&amp;#8217;s working on what.&lt;/h4&gt;

&lt;p&gt;&lt;img src='/images/meet-kanban/kanban2.jpg' alt='kanban whos doing what' /&gt;&lt;/p&gt;

&lt;h4 id='we_can_see_whats_coming_up_too'&gt;We can see what&amp;#8217;s coming up too.&lt;/h4&gt;

&lt;p&gt;&lt;img src='/images/meet-kanban/kanban6.jpg' alt='kanban planned column' /&gt; &lt;br /&gt;&lt;br /&gt; We try to keep Martin on low hanging features.&lt;/p&gt;

&lt;h4 id='it_tells_us_when_too_much_is_going_on'&gt;It tells us when too much is going on.&lt;/h4&gt;

&lt;p&gt;&lt;img src='/images/meet-kanban/kanban4.jpg' alt='kanban too much going on' /&gt; &lt;br /&gt;&lt;br /&gt;&lt;/p&gt;

&lt;h4 id='and_sometimes_it_breaks_and_we_have_to_change_it'&gt;And sometimes it breaks and we have to change it.&lt;/h4&gt;

&lt;p&gt;&lt;img src='/images/meet-kanban/kanban1.jpg' alt='kanban sometimes we have to change it' /&gt; &lt;br /&gt;&lt;br /&gt;&lt;/p&gt;

&lt;h4 id='it_gives_us_some_time_to_decide_if_things_are_really_done'&gt;It gives us some time to decide if things are really done.&lt;/h4&gt;

&lt;p&gt;&lt;img src='/images/meet-kanban/kanban8.jpg' alt='kanban accept column' /&gt; &lt;br /&gt;&lt;br /&gt;&lt;/p&gt;

&lt;h4 id='and_it_asks_us_to_show_them_off_when_they_are'&gt;And it asks us to show them off when they are.&lt;/h4&gt;

&lt;p&gt;&lt;img src='/images/meet-kanban/kanban9.jpg' alt='kanban demo column' /&gt; &lt;br /&gt;&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;After spending so much time working with this board it&amp;#8217;s hard for me to imagine us doing our work without it. We&amp;#8217;ve had to change and tweak it many times. As the way we work evolves, and our team changes, we&amp;#8217;ve had to adapt it to better reflect our work. The goal is to be able to look at it, at any moment, and know exactly what the status of things is. It isn&amp;#8217;t always perfect, but it cuts down a lot on the effort we must put towards coordinating our efforts. We keep trying to get more aspects of our work reflected on the board, and that&amp;#8217;s almost never a mistake. Everyone in the company can use it to see and communicate their work, and move cards across.&lt;/p&gt;</content>
  </entry>
  
  <entry>
    <id>http://dev.aboutus.org/2011/04/22/playing-with-sedgewick's-data</id>
    <link type="text/html" rel="alternate" href="http://dev.aboutus.org/2011/04/22/playing-with-sedgewick%27s-data.html" />
    <title>Playing With Sedgewick's Data</title>
    <updated>2011-04-22T00:00:00-07:00</updated>
    <author>
      <name>ward</name>
      <uri>http://dev.aboutus.org/</uri>
    </author>
    <content type="html">&lt;p&gt;Princeton University CS Professor Robert Sedgewick suggests students will learn better and stay engaged when writing programs that make sense of real data. I suggested to the newly forming Portland Data Science meetup that we might likewise benefit from exploring data together and gave Sedgewick&amp;#8217;s state adjacencies as an example. Here is where I found the data and the first few lines of the file:&lt;/p&gt;

&lt;p&gt;&lt;a href='http://introcs.cs.princeton.edu/data/contiguous-usa.dat'&gt;http://introcs.cs.princeton.edu/data/contiguous-usa.dat&lt;/a&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;AL FL
AL GA
AL MS
AL TN
AR LA
AR MO
AR MS &lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I saw this when I was reading something else Sedgewick wrote and thought, what would graphviz do with this?&lt;/p&gt;
&lt;img src='/images/contiguous-usa.png' alt='Contiguous USA Graph' width='640' /&gt;
&lt;p&gt;It did pretty well, I&amp;#8217;d say. In fact it did too well. How did graphviz know that WA was in the northwest?&lt;/p&gt;

&lt;p&gt;You can see that it did get the northeast upside down. That&amp;#8217;s comforting. I know more about geography than graphviz.&lt;/p&gt;

&lt;p&gt;This raises the question, what little extra bit of information would allow a much better map? Some ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the length or orientation of the border&lt;/li&gt;

&lt;li&gt;the size of the state in square miles&lt;/li&gt;

&lt;li&gt;the state&amp;#8217;s voting record&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is the perl program I used to convert the dataset to dot format:&lt;/p&gt;

&lt;p&gt;&lt;div class='highlight'&gt;&lt;pre&gt;&lt;code class='perl'&gt;&lt;span class='nv'&gt;@lines&lt;/span&gt; &lt;span class='o'&gt;=&lt;/span&gt; &lt;span class='sb'&gt;`cat contiguous-usa.txt`&lt;/span&gt;&lt;span class='p'&gt;;&lt;/span&gt;
&lt;span class='nb'&gt;open&lt;/span&gt; &lt;span class='n'&gt;D&lt;/span&gt;&lt;span class='p'&gt;,&lt;/span&gt; &lt;span class='s'&gt;&amp;quot;&amp;gt;contiguous-usa.dot&amp;quot;&lt;/span&gt;&lt;span class='p'&gt;;&lt;/span&gt;
&lt;span class='k'&gt;print&lt;/span&gt; &lt;span class='n'&gt;D&lt;/span&gt; &lt;span class='s'&gt;&amp;quot;graph US {\nnode [style=filled,color=yellow]&amp;quot;&lt;/span&gt;&lt;span class='p'&gt;;&lt;/span&gt;
&lt;span class='k'&gt;for&lt;/span&gt; &lt;span class='p'&gt;(&lt;/span&gt;&lt;span class='nv'&gt;@lines&lt;/span&gt;&lt;span class='p'&gt;)&lt;/span&gt; &lt;span class='p'&gt;{&lt;/span&gt; &lt;span class='k'&gt;print&lt;/span&gt; &lt;span class='n'&gt;D&lt;/span&gt; &lt;span class='s'&gt;&amp;quot;$1 -- $2;\n&amp;quot;&lt;/span&gt; &lt;span class='k'&gt;if&lt;/span&gt; &lt;span class='sr'&gt;/(\w\w) (\w\w)/&lt;/span&gt;&lt;span class='p'&gt;;&lt;/span&gt; &lt;span class='p'&gt;}&lt;/span&gt;
&lt;span class='k'&gt;print&lt;/span&gt; &lt;span class='n'&gt;D&lt;/span&gt; &lt;span class='s'&gt;&amp;quot;}\n&amp;quot;&lt;/span&gt;&lt;span class='p'&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;&lt;/p&gt;
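&lt;p&gt;Since much of the code on this blog is Ruby, here is a rough Ruby sketch of the same conversion. It is not the program used for the picture. The method name &lt;code&gt;to_dot&lt;/code&gt; is made up, and the semicolon after the node attributes is an addition for readability.&lt;/p&gt;

```ruby
# Builds Graphviz dot text from lines such as "AL FL".
# Lines that do not look like a pair of two-letter codes are skipped.
def to_dot(lines)
  pairs = lines.map { |line| line.match(/(\w\w) (\w\w)/) }.compact
  edges = pairs.map { |pair| "#{pair[1]} -- #{pair[2]};\n" }.join
  "graph US {\nnode [style=filled,color=yellow];\n#{edges}}\n"
end

# A few sample lines in the same format as the dataset.
sample = ["AL FL", "", "AR MO"]
puts to_dot(sample)
```

&lt;p&gt;To work on the real data, pass it the result of &lt;code&gt;File.readlines&lt;/code&gt; on the downloaded file and write the returned string out with &lt;code&gt;File.write&lt;/code&gt;.&lt;/p&gt;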

&lt;p&gt;I consider this kind of programming a warm-up for serious work. But warming up is very important in a field where the opportunities are so diverse.&lt;/p&gt;</content>
  </entry>
  
  <entry>
    <id>http://dev.aboutus.org/2011/04/19/our-continuous-integration-is-a-big-red-flashing-light</id>
    <link type="text/html" rel="alternate" href="http://dev.aboutus.org/2011/04/19/our-continuous-integration-is-a-big-red-flashing-light.html" />
    <title>Our Continuous Integration Is A Big Red Flashing Light</title>
    <updated>2011-04-19T00:00:00-07:00</updated>
    <author>
      <name>sam</name>
      <uri>http://dev.aboutus.org/</uri>
    </author>
    <content type="html">&lt;p&gt;Continuous Integration is a pretty important part of any agile workflow. Having your ci server run your test suite whenever anyone pushes code means never having to argue about who broke the build.&lt;/p&gt;

&lt;p&gt;A couple months ago we took it a step further. Instead of just having the CI server send an email when the build breaks, we have a big red light in our office which starts flashing.&lt;/p&gt;

&lt;h2 id='the_ingredients'&gt;The ingredients:&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;A big red flashing light I found at Goodwill&lt;/li&gt;

&lt;li&gt;A &lt;a href='http://www.pjrc.com/teensy/'&gt;Teensy 2.0 microcontroller&lt;/a&gt; donated by Ward Cunningham&lt;/li&gt;

&lt;li&gt;A vacation light timer, hacked by Matt Youell&lt;/li&gt;

&lt;li&gt;A mac mini, running bash and curl&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id='how_we_made_it_work'&gt;How we made it work:&lt;/h2&gt;

&lt;p&gt;On my way back into Portland one weekend I stopped at the Goodwill and spotted a red light on a shelf with wires hanging out of it. I ponied up the $6.99 to buy it, having no idea if it actually functioned. I took it home and proceeded to test it.&lt;/p&gt;
&lt;iframe title='YouTube video player' src='http://www.youtube.com/embed/hwKzYv9IekI' frameborder='0' height='390' width='480'&gt;
&lt;/iframe&gt;&lt;br /&gt;&lt;br /&gt;
&lt;p&gt;Realizing this would be the perfect addition to our CI setup, I enlisted the help of active &lt;a href='http://dorkbotpdx.org/'&gt;DorkBotPDXer&lt;/a&gt; Ward Cunningham. He gave me a Teensy 2.0 that he&amp;#8217;d gotten from Teensy creator Paul Stoffregen. He also talked me into going to the next DorkBot meetup, to get some help using it.&lt;/p&gt;

&lt;p&gt;&lt;img src='/images/IMG_0537.jpg' alt='teensy 2.0' /&gt;&lt;/p&gt;

&lt;p&gt;I went to the meetup and sat down with Paul to get some help with the Teensy. I flashed his &lt;a href='http://www.pjrc.com/teensy/usb_serial.html'&gt;USB Serial&lt;/a&gt; shell onto the microcontroller. With this code installed on the Teensy I could plug it into a USB port, and toggle the built-in LED on or off with shell commands like this:&lt;/p&gt;
&lt;div class='highlight'&gt;&lt;pre&gt;&lt;code class='bash'&gt;&lt;span class='nb'&gt;echo&lt;/span&gt; &lt;span class='s2'&gt;&amp;quot;d6=1&amp;quot;&lt;/span&gt; &amp;gt; /dev/cu.usbmodem12341
&lt;span class='nb'&gt;echo&lt;/span&gt; &lt;span class='s2'&gt;&amp;quot;d6=0&amp;quot;&lt;/span&gt; &amp;gt; /dev/cu.usbmodem12341
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Progress.&lt;/p&gt;

&lt;p&gt;Now that I could control the Teensy&amp;#8217;s LED programmatically, the next challenge was hooking it up to the big red light. Matt Youell had done electrical work in the past. Apparently powering a tiny LED is dramatically different from running a 120V light, which opens the risk of frying your equipment, fires, and fire marshals. Despite this Matt offered to help. He knew we&amp;#8217;d need some type of relay. He took the setup home, and came back a few days later with it wired to a vacation light timer he had lying around.&lt;/p&gt;

&lt;p&gt;&lt;img src='/images/IMG_0523.jpg' alt='Vacation Light Timer' /&gt;&lt;/p&gt;

&lt;p&gt;For a while after he wired it we were nervous we would come in Monday morning to a burned down office but it&amp;#8217;s been working great for months now.&lt;/p&gt;

&lt;p&gt;The last step was getting the whole rig hooked up to our CI server. We use &lt;a href='https://github.com/thoughtworks/cruisecontrol.rb'&gt;CruiseControl.rb&lt;/a&gt; to run our builds. I wrote a simple bash script that hits the server and toggles pin 6 (the light) on the Teensy based on the response. It looks something like this:&lt;/p&gt;
&lt;div class='highlight'&gt;&lt;pre&gt;&lt;code class='bash'&gt;&lt;span class='c'&gt;#!/bin/sh&lt;/span&gt;
&lt;span class='c'&gt;# Monitor cruisecontrol and trigger red light when there&amp;#39;s a broken build.&lt;/span&gt;
&lt;span class='c'&gt;# Also turn the light on when we don&amp;#39;t get a 200 response from the server.&lt;/span&gt;

&lt;span class='nv'&gt;bad_requests&lt;/span&gt;&lt;span class='o'&gt;=&lt;/span&gt;0
&lt;span class='k'&gt;while&lt;/span&gt; &lt;span class='o'&gt;[&lt;/span&gt; &lt;span class='nb'&gt;true&lt;/span&gt; &lt;span class='o'&gt;]&lt;/span&gt;; &lt;span class='k'&gt;do&lt;/span&gt;
&lt;span class='k'&gt; &lt;/span&gt;&lt;span class='nv'&gt;ci_url&lt;/span&gt;&lt;span class='o'&gt;=&lt;/span&gt;http://ci.aboutus.com/XmlStatusReport.aspx
 &lt;span class='nv'&gt;response&lt;/span&gt;&lt;span class='o'&gt;=&lt;/span&gt;&lt;span class='sb'&gt;`&lt;/span&gt;curl -i --max-time 5 -s -u user:pw &lt;span class='nv'&gt;$ci_url&lt;/span&gt;&lt;span class='sb'&gt;`&lt;/span&gt;

 &lt;span class='c'&gt;# count how many times we&amp;#39;ve gotten a non-200 response from ci&lt;/span&gt;
 &lt;span class='k'&gt;if&lt;/span&gt; &lt;span class='o'&gt;[&lt;/span&gt; &lt;span class='sb'&gt;`&lt;/span&gt;&lt;span class='nb'&gt;echo&lt;/span&gt; &lt;span class='nv'&gt;$response&lt;/span&gt; | grep &lt;span class='s1'&gt;&amp;#39;HTTP/1.1 200 OK&amp;#39;&lt;/span&gt; | wc -l&lt;span class='sb'&gt;`&lt;/span&gt; -ne 1 &lt;span class='o'&gt;]&lt;/span&gt; ; &lt;span class='k'&gt;then&lt;/span&gt;
&lt;span class='k'&gt;   &lt;/span&gt;&lt;span class='nv'&gt;bad_requests&lt;/span&gt;&lt;span class='o'&gt;=&lt;/span&gt;&lt;span class='sb'&gt;`&lt;/span&gt;expr &lt;span class='nv'&gt;$bad_requests&lt;/span&gt; + 1&lt;span class='sb'&gt;`&lt;/span&gt;
 &lt;span class='k'&gt;else&lt;/span&gt;
&lt;span class='k'&gt;   &lt;/span&gt;&lt;span class='nv'&gt;bad_requests&lt;/span&gt;&lt;span class='o'&gt;=&lt;/span&gt;0
 &lt;span class='k'&gt;fi&lt;/span&gt;

 &lt;span class='c'&gt;# turn the light on when there&amp;#39;s a build failure or we&amp;#39;ve had 3 consecutive&lt;/span&gt;
 &lt;span class='c'&gt;# non-200 responses from the ci server.&lt;/span&gt;
 &lt;span class='k'&gt;if&lt;/span&gt; &lt;span class='o'&gt;[&lt;/span&gt; &lt;span class='sb'&gt;`&lt;/span&gt;&lt;span class='nb'&gt;echo&lt;/span&gt; &lt;span class='nv'&gt;$response&lt;/span&gt; | grep &lt;span class='s1'&gt;&amp;#39;lastBuildStatus=&amp;quot;Failure&amp;quot;&amp;#39;&lt;/span&gt; | wc -l&lt;span class='sb'&gt;`&lt;/span&gt; -gt 0 &lt;span class='o'&gt;]&lt;/span&gt; &lt;span class='se'&gt;\&lt;/span&gt;
      &lt;span class='o'&gt;||&lt;/span&gt; &lt;span class='o'&gt;[&lt;/span&gt; &lt;span class='nv'&gt;$bad_requests&lt;/span&gt; -gt 2 &lt;span class='o'&gt;]&lt;/span&gt;; &lt;span class='k'&gt;then&lt;/span&gt;
   &lt;span class='o'&gt;(&lt;/span&gt;sleep 1; &lt;span class='nb'&gt;echo&lt;/span&gt; &lt;span class='s2'&gt;&amp;quot;d6=1&amp;quot;&lt;/span&gt;&lt;span class='o'&gt;)&lt;/span&gt; &amp;gt; /dev/cu.usbmodem12341
 &lt;span class='k'&gt;else&lt;/span&gt;
   &lt;span class='o'&gt;(&lt;/span&gt;sleep 1; &lt;span class='nb'&gt;echo&lt;/span&gt; &lt;span class='s2'&gt;&amp;quot;d6=0&amp;quot;&lt;/span&gt;&lt;span class='o'&gt;)&lt;/span&gt; &amp;gt; /dev/cu.usbmodem12341
 &lt;span class='k'&gt;fi&lt;/span&gt;
&lt;span class='k'&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;I set this up as a startup item on our reception computer and bam!, we were done.&lt;/p&gt;
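&lt;p&gt;The decision the script makes on each pass is small enough to pull out and test on its own. Here is a rough Ruby sketch of just that logic, with no networking and no serial port. The method names &lt;code&gt;next_bad_requests&lt;/code&gt; and &lt;code&gt;light_command&lt;/code&gt; are made up for illustration, and the response strings at the bottom are stand-ins rather than real CruiseControl.rb output.&lt;/p&gt;

```ruby
# Returns the updated count of consecutive non-200 responses.
def next_bad_requests(response, bad_requests)
  response.include?("HTTP/1.1 200 OK") ? 0 : bad_requests + 1
end

# Returns the command to send to the Teensy: "d6=1" turns the light on
# for a failed build or after three consecutive bad responses.
def light_command(response, bad_requests)
  build_failed = response.include?('lastBuildStatus="Failure"')
  server_down = (bad_requests - 2).positive?
  (build_failed || server_down) ? "d6=1" : "d6=0"
end

# Stand-in responses for illustration only.
broken = 'HTTP/1.1 200 OK lastBuildStatus="Failure"'
healthy = 'HTTP/1.1 200 OK lastBuildStatus="Success"'
puts light_command(broken, next_bad_requests(broken, 0))
puts light_command(healthy, next_bad_requests(healthy, 0))
```

&lt;p&gt;Keeping the decision separate from the curl call and the device write means the rules can be checked without a CI server or a light attached.&lt;/p&gt;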

&lt;p&gt;The complete setup looks a little like this:&lt;/p&gt;

&lt;p&gt;&lt;img src='/images/IMG_0536.jpg' alt='Build Indicator Setup' /&gt;&lt;/p&gt;

&lt;h2 id='why_its_awesome'&gt;Why It&amp;#8217;s Awesome&lt;/h2&gt;

&lt;p&gt;Having this set up has been a big win for our team. It makes it even more obvious when someone has broken the build, and decreases the amount of focus we need to devote to monitoring the CI server. It&amp;#8217;s an in-your-face &amp;#8220;don&amp;#8217;t deploy now&amp;#8221; indicator which is great for a team that typically pushes code to production several times per day.&lt;/p&gt;

&lt;p&gt;It&amp;#8217;s also had the interesting effect of making the non-developers we work with aware of how continuous integration works and why it&amp;#8217;s important. Now they know when the build is broken as soon as we do. They know we&amp;#8217;re running tests, and that the test failures control the light; a big flashing red light never means good. Tests are important. They protect us, and they help us developers and the whole company be more agile.&lt;/p&gt;
&lt;iframe title='YouTube video player' src='http://www.youtube.com/embed/Sdsd2HwsfHs' frameborder='0' height='390' width='640'&gt;
&lt;/iframe&gt;</content>
  </entry>
  
  <entry>
    <id>http://dev.aboutus.org/2011/04/18/sort-==-srot</id>
    <link type="text/html" rel="alternate" href="http://dev.aboutus.org/2011/04/18/sort-%3D%3D-srot.html" />
    <title>Sort == Srot</title>
    <updated>2011-04-18T00:00:00-07:00</updated>
    <author>
      <name>thomas</name>
      <uri>http://dev.aboutus.org/</uri>
    </author>
    <content type="html">&lt;p&gt;This may be something well-known to a lot of people, but it&amp;#8217;s one that I just recently found out about. We were working with large volumes of data for our &lt;a href='http://www.aboutus.org/Learn/Keep-Track-of-Inbound-Links'&gt;redirectory&lt;/a&gt; feature on (http://www.aboutus.org). For part of this we had to sort about 100 GB of tab-separtated data. Because we wanted to do the simplest thing that would work, GNU &lt;code&gt;sort&lt;/code&gt; came to the rescue.&lt;/p&gt;

&lt;p&gt;For those of you who aren&amp;#8217;t familiar with &lt;code&gt;sort&lt;/code&gt;, it is a standard UNIX/Linux tool that reads lines on standard input and writes them, sorted, to standard output. It has all kinds of options to control the sort order, algorithms used, etc., but some of them aren&amp;#8217;t too obvious. For the basic list, check out the &lt;a href='http://linux.die.net/man/1/sort'&gt;man-page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Before I go on, a brief exercise. Lexically sort the following lines:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt; foo
 far
 f.o&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If you guessed &amp;#8220;f.o, far, foo&amp;#8221;, congratulations! You&amp;#8217;re a sane, normal person. If, on the other hand, you guessed &amp;#8220;far, f.o, foo&amp;#8221;, you are GNU &lt;code&gt;sort&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This was a serious problem for us because it meant that some of our data was out of order, making our processing scripts angry. We would have &lt;code&gt;go-daddy.com&lt;/code&gt; stuffed in the middle of a long list of &lt;code&gt;godaddy.com&lt;/code&gt;&amp;#8217;s. Not good.&lt;/p&gt;

&lt;p&gt;The solution lies in a little-known environment variable that overrides every locale setting, including character ordering: &amp;#8220;LC_ALL&amp;#8221; (strictly speaking, LC_COLLATE is the category that governs sort order, but LC_ALL works too). If you set that bit of magic to &amp;#8220;C&amp;#8221;, things work as you would expect.&lt;/p&gt;

&lt;p&gt;On most modern Linux installs, the system locale defaults to US English with the UTF-8 character set. Beyond setting language and such, the locale also determines how characters are ordered as far as a computer is concerned. For whatever reason, that locale mostly ignores punctuation when comparing strings, even though punctuation comes before letters in the standard character set charts. Because &lt;code&gt;sort&lt;/code&gt; compares lines using the locale&amp;#8217;s collation rules (&lt;code&gt;strcoll&lt;/code&gt; rather than a plain byte-wise &lt;code&gt;strcmp&lt;/code&gt;), that ordering is very important.&lt;/p&gt;
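
&lt;p&gt;Here is a minimal sketch of the fix, using the three lines from the exercise above. The &lt;code&gt;byte_sort&lt;/code&gt; helper name is made up for illustration, and the sketch assumes a POSIX shell and a &lt;code&gt;sort&lt;/code&gt; that honors &lt;code&gt;LC_ALL&lt;/code&gt;.&lt;/p&gt;

```shell
# Hypothetical helper: sort stdin by raw byte value,
# ignoring whatever collation rules the system locale defines.
byte_sort() {
  # The assignment prefix applies LC_ALL=C to this one command only.
  LC_ALL=C sort
}

printf 'foo\nfar\nf.o\n' | byte_sort
# prints, one per line: f.o, far, foo
```

&lt;p&gt;Prefixing the assignment scopes the override to a single &lt;code&gt;sort&lt;/code&gt; call, which may be preferable to exporting it when the rest of a script should keep the system locale.&lt;/p&gt;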

&lt;p&gt;So there you go. If you are trying to sort a ton of data, and it&amp;#8217;s not coming out sorted, give &lt;code&gt;export LC_ALL=C&lt;/code&gt; a try. It worked for me!&lt;/p&gt;</content>
  </entry>
  
  <entry>
    <id>http://dev.aboutus.org/2011/04/08/initial-commit</id>
    <link type="text/html" rel="alternate" href="http://dev.aboutus.org/2011/04/08/initial-commit.html" />
    <title>Initial Commit</title>
    <updated>2011-04-08T00:00:00-07:00</updated>
    <author>
      <name>sam</name>
      <uri>http://dev.aboutus.org/</uri>
    </author>
    <content type="html">&lt;p&gt;This is the first post for AboutUs&amp;#8217; programming blog. The blog&amp;#8217;s powered by git, jekyll and nginx. To create a post we create a text file in the git repository in our favorite editor (i.e. vim). The posts go in a &lt;code&gt;_posts&lt;/code&gt; directory and look like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;devblog [master*] $ ls _posts

_posts:
2011-04-08-initial-commit.markdown&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;a href='http://tom.preston-werner.com/2008/11/17/blogging-like-a-hacker.html'&gt;Jekyll&lt;/a&gt; converts the content (Textile, Markdown, and HTML templates) into a static HTML site.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;devblog [master*] $ jekyll
Configuration from /www/aboutus/devblog/_config.yml
Building site: . -&amp;gt; ./_site
Successfully generated site: . -&amp;gt; ./_site&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Jekyll&amp;#8217;s &lt;code&gt;--auto&lt;/code&gt; option is pretty nice for development. It regenerates the site every time you save a file.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;devblog [master*] $ jekyll --auto
Configuration from /www/aboutus/devblog/_config.yml
Auto-regenerating enabled: . -&amp;gt; ./_site
[2011-04-11 20:49:52] regeneration: 8 files changed
[2011-04-11 20:50:22] regeneration: 1 files changed
[2011-04-11 20:50:27] regeneration: 1 files changed&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;On the server, jekyll generates static HTML files for nginx to serve.&lt;/p&gt;

&lt;p&gt;It barely has &lt;em&gt;any&lt;/em&gt; dependencies.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;devblog # gem install jekyll --no-rdoc --no-ri
Building native extensions.  This could take a while...
Successfully installed liquid-2.2.2
Successfully installed fast-stemmer-1.0.0
Successfully installed classifier-1.3.3
Successfully installed directory_watcher-1.4.0
Successfully installed syntax-1.0.0
Successfully installed maruku-0.6.0
Successfully installed jekyll-0.10.0
7 gems installed&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;There&amp;#8217;s a git post-receive hook that regenerates the site whenever someone pushes a change.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;devblog # vi .git/hooks/post-receive&lt;/code&gt;&lt;/pre&gt;
&lt;div class='highlight'&gt;&lt;pre&gt;&lt;code class='bash'&gt;&lt;span class='c'&gt;#!/bin/sh&lt;/span&gt;
&lt;span class='c'&gt;#&lt;/span&gt;
&lt;span class='nv'&gt;GIT_REPO&lt;/span&gt;&lt;span class='o'&gt;=&lt;/span&gt;/www/aboutus/devblog.git
&lt;span class='nv'&gt;TMP_GIT_CLONE&lt;/span&gt;&lt;span class='o'&gt;=&lt;/span&gt;/tmp/devblog
&lt;span class='nv'&gt;PUBLIC_WWW&lt;/span&gt;&lt;span class='o'&gt;=&lt;/span&gt;/www/aboutus/devblog

git clone &lt;span class='nv'&gt;$GIT_REPO&lt;/span&gt; &lt;span class='nv'&gt;$TMP_GIT_CLONE&lt;/span&gt;
jekyll --no-auto &lt;span class='nv'&gt;$TMP_GIT_CLONE&lt;/span&gt; &lt;span class='nv'&gt;$PUBLIC_WWW&lt;/span&gt;
rm -Rf &lt;span class='nv'&gt;$TMP_GIT_CLONE&lt;/span&gt;
&lt;span class='nb'&gt;exit&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Nginx serves the static files in &lt;code&gt;/www/aboutus/devblog&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Deploying is just a &lt;code&gt;git push&lt;/code&gt;.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;devblog [master] $ git push
Counting objects: 23, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (9/9), done.
Writing objects: 100% (12/12), 925 bytes, done.
Total 12 (delta 6), reused 0 (delta 0)
remote: Initialized empty Git repository in /tmp/devblog/.git/
remote: Configuration from /tmp/devblog/_config.yml
remote: Building site: /tmp/devblog -&amp;gt; /www/aboutus/devblog
remote: Successfully generated site: /tmp/devblog -&amp;gt; /www/aboutus/devblog
Killed by signal 1.
To blog@devblog:/www/aboutus/devblog.git
   12aa781..ab54f06  master -&amp;gt; master&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id='adding_syntax_highlighting'&gt;Adding Syntax Highlighting&lt;/h2&gt;

&lt;p&gt;To add syntax highlighting I just had to install Pygments, which is a cool Python project.&lt;/p&gt;

&lt;p&gt;I did this with &lt;code&gt;easy_install&lt;/code&gt;.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ easy_install Pygments&lt;/code&gt;&lt;/pre&gt;</content>
  </entry>
  
 
</feed>

