<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">

  <title><![CDATA[Thoughts on Systems]]></title>
  
  <link href="http://www.emilsit.net/" />
  <updated>2012-11-23T00:48:16-05:00</updated>
  <id>http://www.emilsit.net/</id>
  <author>
    <name><![CDATA[Emil Sit]]></name>
    
  </author>
  <generator uri="http://octopress.org/">Octopress</generator>

  
  <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/EmilSitMainBlog" /><feedburner:info uri="emilsitmainblog" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry>
    <title type="html"><![CDATA[SCNA 2012 summary]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/8me5T7Z6iMA/" />
    <updated>2012-11-23T00:36:00-05:00</updated>
    <id>http://www.emilsit.net/blog/archives/scna-2012-summary</id>
    <content type="html">&lt;p&gt;&lt;a href="http://scna.softwarecraftsmanship.org/"&gt;Software Craftsmanship North America&lt;/a&gt; is an annual conference bringing
together software craftsmen&amp;#8212;developers who are interested in improving their
own ability to program. In his opening remarks at SCNA 2012, 8th Light&amp;#8217;s
co-founder, Micah Martin, described the conference as the &amp;#8220;yearly attitude
adjustment&amp;#8221; for software craftsmen.&lt;/p&gt;

&lt;p&gt;The speakers covered topics ranging from professional and product development,
to engineering practices like testing and architecture, to theoretical CS
concepts like monoids and logic programming. I have a complete-ish set of
&lt;a href="http://flic.kr/s/aHsjCRZ3nt"&gt;notes on my Flickr&lt;/a&gt; but here are some highlights.&lt;/p&gt;

&lt;p&gt;Cory Foy talked about a model for teaching programmers
(&lt;a href="http://www.slideshare.net/CoryFoy/when-code-cries"&gt;slides&lt;/a&gt;) that starts with
work that has low context and low cognitive demand (such as katas, koans) and
brings them up to doing work with high context, high cognitive demand (such as
adding features and listening to code). This mirrors closely the 8th Light
apprenticeship model.  He also talked about how we need to learn to listen to
the code and not try to force it to do things that it is not suited for; to
listen requires understanding, to understand requires practice, and to practice
requires context.&lt;/p&gt;

&lt;p&gt;There were several discussions about apprenticeship. My sense is that 3 months
is enough time to train people in basic craftsmanship suitable for basic web
development (the equivalent of a semester, so maybe 4 courses worth). It obviously
isn&amp;#8217;t the ten thousand hours necessary to produce a master.  The successes
described also suggest that apprenticeship is not necessarily good at
producing developers that can be hired by other companies.  Of the 20 or so
apprentices trained by &lt;a href="http://apprentice.io"&gt;apprentice.io&lt;/a&gt; (a program at Thoughtbot to try to
commercialize apprenticeships), only one has been actually placed in an
external company despite over a hundred companies interested in hiring out of
the apprentice pool. On the other hand, they&amp;#8217;ve hired about eight themselves.
8th Light has similarly grown much of its current 20+ craftsmen through its
internal apprenticeship program.&lt;/p&gt;

&lt;p&gt;8th Light has shared their &lt;a href="https://groups.google.com/d/msg/sc-mentors/ooxRAXaJnqE/RqLu9Uv70rUJ"&gt;internal syllabus&lt;/a&gt;
for training craftsmen.  Thoughtbot, the team behind &lt;a href="http://apprentice.io"&gt;apprentice.io&lt;/a&gt;, has
also produced a set of &lt;a href="http://learn.thoughtbot.com/"&gt;trailmaps for learning basic techniques&lt;/a&gt;
that the community can &lt;a href="https://github.com/thoughtbot/trail-map"&gt;contribute to on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I&amp;#8217;m curious about adopting a more formal apprenticeship/mentoring program at
places not primarily doing web app development, and in particular at systems-y
companies like &lt;a href="http://www.hadapt.com/"&gt;Hadapt&lt;/a&gt; (where time and money are
limited) and VMware (where there is more existing training and
resources are less scarce). Certainly, some of the basic skills and culture
do need to be acquired, but so does the knowledge necessary to build a
distributed query execution engine or a shadow page table walker.&lt;/p&gt;

&lt;p&gt;Uncle Bob’s talk
(&lt;a href="http://vimeo.com/54025415"&gt;video&lt;/a&gt;/&lt;a href="http://www.flickr.com/photos/emilsit/8192749682/in/set-72157632026546461"&gt;summary&lt;/a&gt;)
took a broader view. He argued that we need to behave professionally because,
one day, some software glitch will result in lots of deaths (think
&lt;a href="http://en.wikipedia.org/wiki/Therac-25"&gt;Therac-25&lt;/a&gt;) and the world will demand
an answer from the tech industry. If we don&amp;#8217;t want government regulation, we
had better behave professionally. As Uncle Bob put it, to be professional means
that we do not ship shit. That we say no. That we work as a team. And that we
learn continuously: Uncle Bob proposed upwards of 20 hours a week on our own.&lt;/p&gt;

&lt;p&gt;There were many talks about aspects of testing. Michael Feathers gave a
talk that questioned some of one&amp;#8217;s assumptions about testing by focusing
on the value delivered by tests. He talked about, for example, deleting
tests&amp;#8212;if they no longer provide value. Value of tests can come from many
places: guiding the design of objects, detecting changes in behavior, acting
as documentation, guiding acceptance criteria.  The value of a test can change
over time and we should not over-venerate any specific test. He argued
that it is more appropriate to set a time budget for testing.&lt;/p&gt;

&lt;p&gt;Gary Bernhardt gave a beautiful talk about mixing functional programming and
object oriented programming. He noted that mocks and stubs cause tests to
become isolated from reality but that purely functional code does not require
mocking: it always behaves the same way given the same inputs.  Thus, he argued
that code should be structured to have a functional core surrounded by a more
imperative/OO shell that sequences the results with actions, a style he called
&amp;#8220;Faux-O&amp;#8221;. By focusing on providing values (functional results), we free the
computation from the execution model (for example, how Java Callables can be
plugged into a variety of ExecutorServices).&lt;/p&gt;

&lt;p&gt;Justin Searls took a different tack to testing, bridging Michael and Gary&amp;#8217;s
talks in a sense. His big picture observation is that different kinds
of testing deliver different amounts of reality and we should choose tests
that give us the amount of reality we need. (He has a nice
&lt;a href="http://searls.testdouble.com/2012/04/01/types-of-tests/"&gt;taxonomy of tests&lt;/a&gt; on his blog.)
One takeaway from his talk is that we should adopt a standard
for what kind of testing we do and stick to it: he liked the &lt;a href="http://www.growing-object-oriented-software.com/"&gt;GOOS&lt;/a&gt;
style of using isolation tests to guide design and more end-to-end acceptance tests to
prove functionality, but listed a few others.&lt;/p&gt;

&lt;p&gt;Drilling down into more specific tools/techniques, Brian Marick gave a talk
about generating data for tests using logic programming, using an example in
Clojure. His goal was to ensure that he only says as much about the data used
for a test as is absolutely necessary for the test and to allow other aspects
of that data to vary; this can be achieved by writing a logic program to state
the test&amp;#8217;s requirements and allowing the runtime to solve for the right data.
In fact, you could imagine automatically testing all valid values that the
logic program generated, instead of just one (much like Guava&amp;#8217;s
&lt;a href="http://www.gamlor.info/wordpress/2012/09/google-guava-collection-test-suite/"&gt;Collections test suite builder&lt;/a&gt;
does more imperatively).  We have explored this idea for system-level testing
at both VMware and Hadapt, where it would be useful for tests to declare their
dependencies on the system (e.g., requires a system configured in a particular
way) and have the test framework automatically satisfy those dependencies in
some way that the test does not care about. Logic programming would provide
a way to bind the resulting dependencies to variables that could be used
by the test.&lt;/p&gt;

&lt;p&gt;Susan Potter gave a &lt;a href="http://www.flickr.com/photos/emilsit/8192749760/in/set-72157632026546461/"&gt;talk about monoids&lt;/a&gt;
at a very theoretical level, but they have a practical impact on code expressiveness.
A nice way to understand monoids is to see how
&lt;a href="http://dave.fayr.am/posts/2012-10-4-finding-fizzbuzz.html"&gt;monoids apply to FizzBuzz&lt;/a&gt;.  At a
more systems level, monoids are used by Twitter in their
&lt;a href="http://monkey.org/~marius/talks/twittersystems/#30"&gt;services stack&lt;/a&gt; to compose
asynchronous results.  As we develop tools at Hadapt for provisioning systems
or manipulating internal plan trees, I expect to apply monoids to help ensure
composable abstractions.&lt;/p&gt;

&lt;p&gt;The last talk of the conference was by Leon Gersing and was a great motivational
talk about personal development. You should &lt;a href="http://vimeo.com/54042336"&gt;watch it&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The talks were only half the time at SCNA. Networking with other developers made
up the rest, intermixed with fun activities like kata battles
(wherein two developers race to complete a basic coding kata live on screen
in front of the audience) and Jeopardy. There was also a refactoring kata fishbowl
where I narrowly missed an opportunity to pair with Uncle Bob.  While I got a lot of
value from the talks, I wished there had been more time for pairing and working
on code with the other developers there. On the last day, I
got a tutorial from &lt;a href="https://twitter.com/randycoulman"&gt;Randy Coulman&lt;/a&gt;, who has been
programming in Smalltalk for 10 years, as he did the coin changer kata in Smalltalk.
More explicit time for that sort of impromptu practice (not just chatting about
work) would have made the conference even better.&lt;/p&gt;

&lt;p&gt;Overall, SCNA was a great conference and I hope to be able to spend more time with
software craftsmen in the future.&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/EmilSitMainBlog?a=8me5T7Z6iMA:wRb5WTrM-TY:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/EmilSitMainBlog?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/EmilSitMainBlog/~4/8me5T7Z6iMA" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/scna-2012-summary/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Growing a Software Craftsman Engineering Organization]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/agEgyPIaW_w/" />
    <updated>2012-11-09T08:23:00-05:00</updated>
    <id>http://www.emilsit.net/blog/archives/growing-a-software-craftsman-engineering-organization</id>
    <content type="html">&lt;p&gt;One of the hallmarks of a software craftsman is the desire to
improve and hone one&amp;#8217;s abilities.  Certainly, this is one of the
reasons that I am attending &lt;a href="http://scna.softwarecraftsmanship.org/"&gt;Software Craftsmanship North America
(SCNA)&lt;/a&gt; this year.  As a leader in an engineering
organization, however, I am also curious about how to grow an
engineering organization that is focused on not only delivering
value, but doing so in a way that values well-crafted software.&lt;/p&gt;

&lt;p&gt;The population of people who are already craftsmen (outside of
conferences such as this) is somewhat limited, so hiring solely
craftsmen is not likely to be scalable.  At the SCNA mixer last
night, I heard two basic approaches to developing a team of
craftsmen.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.8thlight.com/"&gt;8th Light&lt;/a&gt; uses an apprenticeship model. 8th Light hires
people as apprentices: there is a clear understanding that an
apprentice is learning about the craft and how they work at 8th
Light. There is a good ratio of craftsmen to apprentices and
everyone is invested in teaching and learning. During the
apprentice period, the apprentice may be unpaid or paid at
below-market rates as they finish training/learning. (I&amp;#8217;m sure this
is done in a fair way and everyone gets value from the
arrangement.) What was surprising to me was that not only do they
hire in experienced developers (who have self-selected as being
interested in improving/learning), but they hire people with
aptitude but relatively little programming experience. Over the
course of a year, these true apprentices grow into journeymen and
craftsmen.  It appears one successful model is to budget time and
money into training up your own pipeline of craftsmen.&lt;/p&gt;

&lt;p&gt;A second approach I heard about was the
injection of a leader/manager who drives craftsmanship into the
organization.  I spoke with people at a financial
services company and at a publishing company;
in both cases, about a year ago, someone was brought
in who drove the engineers in the direction of craftsmanship.
Today, those teams practice TDD/BDD, watch &lt;a href="http://www.cleancoders.com/"&gt;Clean
Coders&lt;/a&gt; videos to learn, and attend conferences like
SCNA.&lt;/p&gt;

&lt;p&gt;I hope over the next few days, and through continuing conversations
afterwards, to get more insight into organizations that
successfully balance the need for delivery with training their teams
to deliver high quality code, and what principles and tactics
they use to transition to a high productivity state.&lt;/p&gt;

&lt;p&gt;If you have any thoughts, please share them!&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/EmilSitMainBlog?a=agEgyPIaW_w:tIU4ltM_cRE:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/EmilSitMainBlog?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/EmilSitMainBlog/~4/agEgyPIaW_w" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/growing-a-software-craftsman-engineering-organization/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Developing Cloudera Applications with Gradle and Eclipse]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/TCctC1C6wD8/" />
    <updated>2012-09-02T00:08:00-04:00</updated>
    <id>http://www.emilsit.net/blog/archives/developing-cloudera-applications-with-gradle-and-eclipse</id>
    <content type="html">&lt;p&gt;This post is a translation/knock-off of Cloudera&amp;#8217;s post on &lt;a href="http://www.cloudera.com/blog/2012/08/developing-cdh-applications-with-maven-and-eclipse/"&gt;developing CDH
applications with Maven and Eclipse&lt;/a&gt; for &lt;a href="http://gradle.org/"&gt;Gradle&lt;/a&gt;.  It should help you get started using Gradle
with Cloudera&amp;#8217;s Hadoop. &lt;a href="http://www.hadapt.com/"&gt;Hadapt&lt;/a&gt; makes significant use of Gradle for exactly this purpose.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://gradle.org/"&gt;Gradle&lt;/a&gt; is a build automation tool that can be used for Java projects.
Since nearly all the Apache Hadoop ecosystem is written in Java, Gradle is a
great tool for managing projects that build on top of the Hadoop APIs. In this
post, we’ll configure a basic Gradle project that will be able to build
applications against CDH (Cloudera’s Distribution including Apache Hadoop)
binaries.&lt;/p&gt;

&lt;p&gt;Gradle projects are defined using a file called &lt;code&gt;build.gradle&lt;/code&gt;, which describes
things like the project&amp;#8217;s dependencies on other modules, the build order, and any other
plugins that the project uses. The &lt;a href="https://gist.github.com/3606537"&gt;complete &lt;code&gt;build.gradle&lt;/code&gt;&lt;/a&gt; described below,
which can be used with CDH, is available as a &lt;a href="https://gist.github.com/3606537"&gt;gist&lt;/a&gt;. Gradle&amp;#8217;s build files are short and simple,
combining the power of &lt;a href="http://maven.apache.org/"&gt;Apache Maven&lt;/a&gt;&amp;#8217;s configuration by convention
with the ability to customize that convention easily (and in enterprise friendly ways).&lt;/p&gt;

&lt;p&gt;The most basic Java project can be compiled with a simple &lt;code&gt;build.gradle&lt;/code&gt; that contains
the one line:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;apply plugin: "java"
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;While optional, it is helpful to start off your &lt;code&gt;build.gradle&lt;/code&gt; declaring project
metadata as well:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Set up group and version info for the artifact
group = "com.mycompany.hadoopproject"
version = "1.0"
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Since we want to use this project for Hadoop development, we need to add some
dependencies on the Hadoop libraries. Gradle resolves dependencies by
downloading jar files from remote repositories. This must be configured, so we
add both the &lt;a href="http://search.maven.org/"&gt;Maven Central Repository&lt;/a&gt;
(that contains useful things like JUnit)
and the CDH repository. This is done in the &lt;code&gt;build.gradle&lt;/code&gt; like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;repositories {
    // Maven Central (JUnit and other common libraries)
    mavenCentral()
    // Cloudera repository (CDH builds of Hadoop)
    maven {
        url "https://repository.cloudera.com/artifactory/cloudera-repos/"
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The second repository enables us to add a Hadoop dependency in the dependencies section.
The first repository enables us to add a JUnit dependency.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;dependencies {
    compile "org.apache.hadoop:hadoop-client:2.0.0-mr1-cdh4.0.1"
    testCompile "junit:junit:4.8.2"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A project with the above dependency would compile against the CDH4 MapReduce v1
library.  Cloudera provides a &lt;a href="https://ccp.cloudera.com/display/CDH4DOC/Using+the+CDH4+Maven+Repository"&gt;list of Maven artifacts included in CDH4&lt;/a&gt; for finding HBase and other components.&lt;/p&gt;

&lt;p&gt;Since Hadoop requires at least Java 1.6, we should also specify the compiler
version for Gradle:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Java version selection
sourceCompatibility = 1.6
targetCompatibility = 1.6
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This gets us to a point where we’ve got a fully functional project, and we can
build a jar by running &lt;code&gt;gradle build&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In practice, it’s good to declare the version string as a property,
since there is a high likelihood of dependencies on more than one artifact
with the same version.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;ext.hadoopVersion = "2.0.0-mr1-cdh4.0.1"
dependencies {
    compile "org.apache.hadoop:hadoop-client:${hadoopVersion}"
    testCompile "junit:junit:4.8.2"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now, whenever we want to upgrade our code to a new CDH version, we only need to
change the version string in one place.&lt;/p&gt;
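
&lt;p&gt;As a hedged sketch of why the property helps, suppose the project also needs a second
Hadoop artifact at the same version. The &lt;code&gt;hadoop-test&lt;/code&gt; coordinates below are an
assumption for illustration (check Cloudera&amp;#8217;s artifact list for the modules your CDH release
actually ships); the point is only that both lines pick up &lt;code&gt;hadoopVersion&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;ext.hadoopVersion = "2.0.0-mr1-cdh4.0.1"
dependencies {
    compile "org.apache.hadoop:hadoop-client:${hadoopVersion}"
    // Assumed second artifact, shown only to illustrate sharing the version
    testCompile "org.apache.hadoop:hadoop-test:${hadoopVersion}"
    testCompile "junit:junit:4.8.2"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The interpolation relies on Groovy&amp;#8217;s double-quoted strings; a single-quoted string
would pass &lt;code&gt;${hadoopVersion}&lt;/code&gt; through literally.&lt;/p&gt;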

&lt;p&gt;Note that the configuration here produces a jar that does not
contain the project dependencies within it. This is fine, so long as we only
require Hadoop dependencies, since the Hadoop daemons will include all the
Hadoop libraries in their own classpaths. If the Hadoop dependencies are not
sufficient, it will be necessary to package the other dependencies into the
jar. We can configure Gradle to package a jar with dependencies included
by adding the following block:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Emulate Maven shade plugin with a fat jar.
// http://docs.codehaus.org/display/GRADLE/Cookbook#Cookbook-Creatingafatjar
jar {
    from configurations.compile.collect { it.isDirectory() ? it : zipTree(it) }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Unfortunately, the jar now contains all the Hadoop libraries, which would conflict
with the Hadoop daemons’ classpaths. We can tell Gradle that certain
dependencies are needed for compilation but will be provided to
the application at runtime by declaring them in a separate &lt;code&gt;provided&lt;/code&gt;
configuration, which is added to the compile classpath but kept out of the jar. The code then
looks like this, with an added dependency on Guava:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Provided configuration as suggested in GRADLE-784
configurations {
    provided
}
sourceSets {
    main {
        compileClasspath += configurations.provided
    }
}

ext.hadoopVersion = "2.0.0-mr1-cdh4.0.1"
dependencies {
    provided "org.apache.hadoop:hadoop-client:${hadoopVersion}"

    compile "com.google.guava:guava:11.0.2"

    testCompile "junit:junit:4.8.2"
}
&lt;/code&gt;&lt;/pre&gt;
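
&lt;p&gt;One possible extension, offered as a sketch rather than part of the original build: the
block above adds &lt;code&gt;provided&lt;/code&gt; only to the main source set&amp;#8217;s compile classpath.
Assuming your unit tests exercise code that touches Hadoop classes, the test source set would
likely need the same jars to compile and run:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Assumption: tests also need the provided (Hadoop) jars on their classpaths
sourceSets {
    test {
        compileClasspath += configurations.provided
        runtimeClasspath += configurations.provided
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Because the fat jar block only collects &lt;code&gt;configurations.compile&lt;/code&gt;, this does not
pull the Hadoop libraries back into the jar.&lt;/p&gt;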

&lt;p&gt;Gradle also has integration with a number of IDEs, such as Eclipse
and IntelliJ IDEA.  The default integrations can be provided by adding&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;apply plugin: "eclipse"
apply plugin: "idea"
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;to add support for generating Eclipse &lt;code&gt;.classpath&lt;/code&gt; and &lt;code&gt;.project&lt;/code&gt; files and
IntelliJ &lt;code&gt;.iml&lt;/code&gt; files.  The default build output locations may not be desirable,
so we configure Eclipse as follows:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;eclipse {
    // Ensure Eclipse build output appears in build directory
    classpath {
        defaultOutputDir = file("${buildDir}/eclipse-classes")
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;For Eclipse, simply run &lt;code&gt;gradle eclipse&lt;/code&gt; and then import the project into Eclipse.
As you update/add dependencies, re-run &lt;code&gt;gradle eclipse&lt;/code&gt; to update the &lt;code&gt;.classpath&lt;/code&gt;
file and refresh in Eclipse.  Gradle automatically handles generating a classpath,
including linking to source jars.&lt;/p&gt;

&lt;p&gt;Recent versions of
&lt;a href="http://www.jetbrains.com/idea/webhelp/gradle-2.html"&gt;IntelliJ&lt;/a&gt; and the
&lt;a href="http://static.springsource.org/sts/docs/latest/reference/html/gradle/"&gt;SpringSource Tool Suite&lt;/a&gt;
also support direct import of Gradle projects.  When using this integration,
the &lt;code&gt;apply plugin&lt;/code&gt; lines are not necessary.&lt;/p&gt;

&lt;p&gt;Gradle represents a &lt;a href="http://gradle.org/documentation"&gt;well-documented&lt;/a&gt; and
powerful alternative to developing projects in Maven.  While not without its quirks,
I am significantly happier maintaining an enterprise build in Gradle at &lt;a href="http://www.hadapt.com/"&gt;Hadapt&lt;/a&gt;,
compared to the complex Maven build I maintained at VMware. Give it a try.&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/EmilSitMainBlog?a=TCctC1C6wD8:tTOIMSYPMUg:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/EmilSitMainBlog?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/EmilSitMainBlog/~4/TCctC1C6wD8" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/developing-cloudera-applications-with-gradle-and-eclipse/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Let's improve our code]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/dcQ0VJZd3j8/" />
    <updated>2012-01-01T23:00:00-05:00</updated>
    <id>http://www.emilsit.net/blog/archives/lets-improve-our-code</id>
    <content type="html">&lt;p&gt;New Year&amp;#8217;s is a good time to set intentions for the coming year.
Many people come off the holidays with the intention to exercise more, but
if you&amp;#8217;re reading this blog, you&amp;#8217;re probably a programmer (if you&amp;#8217;re not, consider
signing up for &lt;a href="http://www.codeyear.com"&gt;Code Year&lt;/a&gt;&amp;#8230;), so let&amp;#8217;s set an intention
about our programming. But first, a musical interlude.&lt;/p&gt;

&lt;p&gt;Earl Hines was a jazz pianist; in this 9 minute video, he describes how his early playing evolved.&lt;/p&gt;

&lt;iframe width="420" height="315" src="http://www.youtube-nocookie.com/embed/hgWvggDY2qA?rel=0" frameborder="0" allowfullscreen&gt;&lt;/iframe&gt;


&lt;p&gt;As you watch it, notice how he not only describes and demonstrates how his
style evolved, he also describes why. For example, he talks about how his
melodic line was drowned out in the larger bands so he picks up playing in
octaves (doubling up the notes).&lt;/p&gt;

&lt;p&gt;In his &lt;a href="http://youtu.be/Se8kcnU-uZw"&gt;TED talk, David Byrne&lt;/a&gt; generalizes the idea of environment
influencing music by talking about how music has always evolved to fit the
architecture in which it was performed: from how the ethereal sounds of early
church music were driven by the open acoustics of churches to how the smaller
rooms of the 18th and 19th centuries allowed for the more complex rhythms and
patterns of classical music to be heard. (Watch it &lt;a href="http://youtu.be/Se8kcnU-uZw"&gt;here&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;Can we as programmers reflect similarly about our programming styles? What
influences the way our programs look? And more
importantly, perhaps, why should we care?&lt;/p&gt;

&lt;p&gt;For music, Byrne argues that the evolution of styles was driven by the needs of
the audience and the acoustics of the performance hall. Understanding these
consciously allows contemporary musicians to make more informed choices about
what and how they perform.&lt;/p&gt;

&lt;p&gt;As programmers, our programs must communicate: with the compiler, of course, so
that it will render our code executable, but also with the human readers of our
code, be that our future selves or our colleagues. So to write better
programs&amp;#8212;programs that &lt;em&gt;communicate&lt;/em&gt; their intent more concisely and clearly,
as opposed to those that execute more efficiently or that are more clever&amp;#8212;we
should consider what affects the structure and readability of the programs we
write.&lt;/p&gt;

&lt;p&gt;The frameworks and mechanisms available to us most obviously affect the
structure of code. Write a program in a system based on callbacks, such as the
async XML HTTP request that underlies AJAX, and you will find yourself with
code that chains callbacks together, preserves state in various heap objects,
and requires that callbacks be called from the right contexts to work
properly. Write code for a threaded system and your code will have all manner
of locks and constructs to control memory write visibility. Regular expressions
can be called from Perl with the overhead of only &lt;code&gt;m//&lt;/code&gt; so it is easier to
write text munging code in Perl than almost any other language.&lt;/p&gt;

&lt;p&gt;Our methodologies, tools, and processes&amp;#8212;&lt;em&gt;how&lt;/em&gt; we program&amp;#8212;also determine how
our code looks. Test-driven development will tend to produce stronger and more
usable abstractions. Stream of consciousness programming results in a mess.
Using an editor that supports refactoring patterns will make it more likely
that you will refactor. Code review or pair programming will similarly result
in code improvements, simply because you had to communicate while writing the
code. (Even just commenting your code helps in this regard.) The end result of
these practices is code that is more understandable.&lt;/p&gt;

&lt;p&gt;Our audience (that is, our teammates) also affects our code. This is the role
of engineering culture. What will your teammates accept versus some ideal? To
get code committed to the Linux kernel requires detailed commit messages, a
well structured patch series and surviving code review on the kernel mailing
list. To get code committed to your personal project requires nothing outside
of what you ask of yourself.&lt;/p&gt;

&lt;p&gt;We have control over these factors. We can vary our tools, our practices, our
choice of frameworks, and influence our team culture. If we are framework or
API developers, we can consciously evaluate what code we induce our users to
write and improve on what we provide to simplify their lives, and facilitate
their communication and self-expression.&lt;/p&gt;

&lt;p&gt;This year, let’s set an intention to examine our code and improve how it reads.
Let&amp;#8217;s experiment and play with the factors under our control to see which choices
work better for our teams. Ask your teammates whether one way or another works
better for them. Spend some time analyzing your own code and consider how it
got that way.&lt;/p&gt;

&lt;p&gt;I&amp;#8217;ll try to share some of what I learn from my team at &lt;a href="http://www.hadapt.com/"&gt;Hadapt&lt;/a&gt;
and I&amp;#8217;m curious to hear what you learn from yours.&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/EmilSitMainBlog?a=dcQ0VJZd3j8:0eJZPUxkZHA:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/EmilSitMainBlog?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/EmilSitMainBlog/~4/dcQ0VJZd3j8" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/lets-improve-our-code/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Git is more usable than Mercurial]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/cBvkU_xG0rw/" />
    <updated>2011-12-05T22:49:00-05:00</updated>
    <id>http://www.emilsit.net/blog/archives/git-is-more-usable-than-mercurial</id>
    <content type="html">&lt;p&gt;Once upon a time, I used Mercurial for development.  When I moved to
VMware, people there seemed to favor Git and so I spent the past few years
learning Git and helping to evangelize its use within VMware.  I have written
about &lt;a href="http://www.emilsit.net/blog/archives/choosing-mercurial-for-chord/"&gt;why I chose Mercurial&lt;/a&gt;, as
well as my initial &lt;a href="http://www.emilsit.net/blog/archives/experiences-with-mercurial-and-git/"&gt;reactions upon starting to use Git&lt;/a&gt;.
Hadapt happens to be using Mercurial today and so I have been re-visiting
Git and Mercurial.&lt;/p&gt;

&lt;p&gt;What I &lt;a href="http://www.emilsit.net/blog/archives/experiences-with-mercurial-and-git/"&gt;wrote about Git and Mercurial&lt;/a&gt; in 2008 is still true: Git
and Mercurial are similar in many respects&amp;#8212;for example, you can represent the
same commit graph structure in both&amp;#8212;and they are both certainly better than
Subversion and CVS.  However, there are a lot of differences to appreciate in
terms of user experience that I am now in a better position to evaluate.&lt;/p&gt;

&lt;p&gt;In using Mercurial, I find myself oddly hobbled in my ability to do things.  At
first, I thought that this might be because some things are simply done
differently in Mercurial, but at this point I think that Git&amp;#8217;s design
and attention to detail make it more usable than Mercurial.&lt;/p&gt;

&lt;p&gt;There are three &amp;#8220;philosophical&amp;#8221; distinctions that are in Git&amp;#8217;s favor:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Git has &lt;a href="http://eagain.net/articles/git-for-computer-scientists/"&gt;one branching model&lt;/a&gt;.
Mercurial has several that have evolved over time; Steve Losh has
a comprehensive essay &lt;a href="http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-mercurial/"&gt;describing ways to branch in Mercurial&lt;/a&gt;.  The effect of this is that
different Mercurial users branch in different ways and the different
styles don&amp;#8217;t really mix well in one repo.  Git users, once they
learn how branching works, are unlikely to be confused by branches.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Git has names (refs) that don&amp;#8217;t change unexpectedly.  Every Git commit you care
about has a name that you can choose.  Some Mercurial commits that you
might care about do not have a name.  For example, the &lt;code&gt;default&lt;/code&gt; branch
in Mercurial can have multiple heads, so it interprets &lt;code&gt;-r default&lt;/code&gt; as
the tip-most commit.  Unfortunately, that commit will vary depending on
who has committed what to which head (and when you see it).&lt;/p&gt;

&lt;p&gt;Further, Git exposes relative naming by allowing you to refer to the
branches in remote repositories by name, without affecting your own names.&lt;/p&gt;

&lt;p&gt;Putting this together, consider what happens after you pull in Mercurial.
Your last commit used to be called &lt;code&gt;default&lt;/code&gt;
but after the pull, &lt;code&gt;default&lt;/code&gt; is something from the upstream.  Your commit
is a separate head that now has no name.  In Git, your &lt;code&gt;master&lt;/code&gt; doesn&amp;#8217;t
move after a fetch and the remote&amp;#8217;s branch is called &lt;code&gt;origin/master&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Git even tracks changes to which commit each name refers to, in a reflog.
You can easily refer to things that the name used to refer to.  In Mercurial,
branch names don&amp;#8217;t have reliable meanings, and it doesn&amp;#8217;t track them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Git commands operate in a local context by default. Mercurial commands
often operate on a repository context.  For example, &lt;code&gt;git grep&lt;/code&gt; operates
on the current sub-directory of your work tree, whereas &lt;code&gt;hg grep&lt;/code&gt; operates on your history.
The Git analog of &lt;code&gt;hg grep&lt;/code&gt; is using the log pick-axe; the Mercurial analog of
&lt;code&gt;git grep&lt;/code&gt; is to use &lt;a href="http://betterthangrep.com/"&gt;ack&lt;/a&gt;, or if you must, something
like &lt;code&gt;hg grep -r reverse(::.) pattern .&lt;/code&gt; (Seriously?)&lt;/p&gt;

&lt;p&gt;Another example is the &lt;code&gt;log&lt;/code&gt; command. Git&amp;#8217;s log command shows you the history
of the commit you are on right now. Mercurial&amp;#8217;s log command shows
you something about the whole repository unless you restrict with some
combination of &lt;code&gt;-b&lt;/code&gt; and &lt;code&gt;-f&lt;/code&gt;. Combined with Mercurial&amp;#8217;s way of
resolving branch names to commits, it becomes very difficult to use &lt;code&gt;hg log&lt;/code&gt;
to compare two heads or explore what has changed in another
head of the same branch.&lt;/p&gt;

&lt;p&gt;More often than not, I care about things in my current tree more
than how things are in some random other branch that I am not
working on, and Mercurial makes it hard to do that.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
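&lt;p&gt;To make the second point concrete, here is a minimal sketch using two throwaway repositories. The directory names &lt;code&gt;upstream&lt;/code&gt; and &lt;code&gt;work&lt;/code&gt;, the temporary directory, and the commit identity are assumptions made up for illustration. It shows that a fetch moves &lt;code&gt;origin/master&lt;/code&gt; while the local &lt;code&gt;master&lt;/code&gt; stays where it was:&lt;/p&gt;

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"

# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

# An "upstream" repository with a single commit on master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream commit -q --allow-empty -m "first commit"

# Clone it, then commit on both sides so the histories diverge.
git clone -q upstream work
git -C work commit -q --allow-empty -m "my local change"
git -C upstream commit -q --allow-empty -m "upstream change"

before=$(git -C work rev-parse master)
git -C work fetch -q origin
after=$(git -C work rev-parse master)
remote=$(git -C work rev-parse origin/master)
upstream_head=$(git -C upstream rev-parse master)

echo "master before fetch: $before"
echo "master after fetch:  $after"
echo "origin/master:       $remote"
```

&lt;p&gt;Both names keep pointing at exactly one commit each, so there is never a question of which head a name means.&lt;/p&gt;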


&lt;p&gt;There are other usability issues that I&amp;#8217;ve found that are more detail-oriented
than philosophical.  I&amp;#8217;ll note a few here.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;hg log&lt;/code&gt; doesn&amp;#8217;t display the full text of the commit message unless you
&lt;code&gt;hg log --debug&lt;/code&gt;. This is an unfortunate disincentive to
&lt;a href="http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html"&gt;writing good commit messages&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;hg log -p&lt;/code&gt; doesn&amp;#8217;t pay as much attention to merge commits as Git does; the
help for &lt;code&gt;hg log&lt;/code&gt; reads:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;log -p/&amp;#8211;patch may generate unexpected diff output for merge
changesets, as it will only compare the merge changeset against its
first parent. Also, only files different from BOTH parents will appear
in files:.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;whereas &lt;code&gt;git log&lt;/code&gt; has a variety of options to control how the merge
diff is displayed, including showing diffs to both parents, removing
&amp;#8220;uninteresting&amp;#8221; changes that did not conflict, or showing the full merge
against either just the first or all parents of the merge commit.&lt;/p&gt;
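&lt;p&gt;As a rough sketch of those options, the script below builds a tiny repository with one merge and counts the per-file diffs that &lt;code&gt;git log -p&lt;/code&gt; prints for the merge commit under &lt;code&gt;-m&lt;/code&gt;, &lt;code&gt;-m --first-parent&lt;/code&gt; and &lt;code&gt;--cc&lt;/code&gt;. The repository layout, file names, commit identity and the &lt;code&gt;count_diffs&lt;/code&gt; helper are all made up for illustration:&lt;/p&gt;

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"

# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "base"

# One file added on a topic branch, another on master, then a real merge.
git checkout -q -b topic
touch b.txt
git add b.txt
git commit -q -m "add b.txt on topic"

git checkout -q master
touch a.txt
git add a.txt
git commit -q -m "add a.txt on master"

git merge -q --no-edit topic

# Helper (hypothetical): count the per-file diff headers on stdin.
count_diffs() { grep -c '^diff ' || true; }

# -m: one full diff of the merge against each parent.
all_parents=$(git log -1 -p -m --format= | count_diffs)
# -m --first-parent: only the diff against the first parent.
first_parent=$(git log -1 -p -m --first-parent --format= | count_diffs)
# --cc: combined diff that omits files matching one of the parents.
combined=$(git log -1 -p --cc --format= | count_diffs)

echo "diffs with -m:                $all_parents"
echo "diffs with -m --first-parent: $first_parent"
echo "diffs with --cc:              $combined"
```

&lt;p&gt;Because neither file conflicted, the combined diff has nothing &amp;#8220;interesting&amp;#8221; to show, while &lt;code&gt;-m&lt;/code&gt; shows what each side brought in.&lt;/p&gt;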

&lt;p&gt;Both Mercurial and Git have lots of configurable options; Git has a thin veneer
over editing a config file in the form of the &lt;code&gt;git config&lt;/code&gt; sub-command.
Mercurial involves editing a file even if just setting up your initial username
or enabling extensions.  I often wound up editing Git config files directly,
but having the commands was nice for sharing instructions with others.&lt;/p&gt;
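&lt;p&gt;For instance, instructions shared with a colleague can be a few commands rather than &amp;#8220;open this file and add these lines&amp;#8221;. The sketch below writes to a scratch file via &lt;code&gt;--file&lt;/code&gt; so that it does not touch a real configuration; the user name, e-mail address and alias are placeholder values:&lt;/p&gt;

```shell
set -eu
cfg=$(mktemp)

# --file points git config at an explicit file instead of ~/.gitconfig.
git config --file "$cfg" user.name "Example User"
git config --file "$cfg" user.email "user@example.com"
git config --file "$cfg" alias.lg "log --oneline --graph"

# Reading a value back uses the same sub-command.
name=$(git config --file "$cfg" user.name)
echo "user.name is $name"

# The result is an ordinary INI-style file that can also be edited by hand.
cat "$cfg"
```

&lt;p&gt;In everyday use you would pass &lt;code&gt;--global&lt;/code&gt; (or nothing, for the current repository) in place of &lt;code&gt;--file&lt;/code&gt;.&lt;/p&gt;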

&lt;p&gt;Git support for working with patches natively is better.  Mercurial supports
e-mailing and applying patches, but oddly the extension for sending out patches
is built in (patchbomb) but the extension for importing from an mbox (mbox) is
not.  There&amp;#8217;s no direct analog of &lt;code&gt;git apply&lt;/code&gt;; instead you have to use a patch
queue.  Patch queues are okay, but branches and well-integrated
rebase/e-mail/apply support are much nicer than patch queues: you don&amp;#8217;t need to
manually find some &lt;code&gt;.hg/patches/series&lt;/code&gt; file and edit it to re-order stuff.&lt;/p&gt;

&lt;p&gt;I could write more and indeed many people have written about Git and
Mercurial&amp;#8212;you can explore my &lt;a href="http://pinboard.in/u:emilsit/t:git/"&gt;bookmarks about git&lt;/a&gt;
for some of the better ones.  Let me
close here with three interesting features in Mercurial 2.0:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the new &lt;a href="http://mercurial.selenic.com/wiki/LargefilesExtension"&gt;largefiles extension&lt;/a&gt; allows users to not transfer large files down until they are needed;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://mercurial.selenic.com/wiki/Subrepository"&gt;subrepos&lt;/a&gt; can be Git or Subversion in addition to Mercurial;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://selenic.com/hg/help/revsets"&gt;revsets&lt;/a&gt; allow you to search your history in very flexible ways.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Overall, I feel that Git is significantly more usable for day-to-day development
than Mercurial.  I&amp;#8217;d be curious to hear if you think the opposite is true.&lt;/p&gt;
</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/git-is-more-usable-than-mercurial/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[A new adventure]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/HYUTuQm_Ars/" />
    <updated>2011-11-06T21:10:54-05:00</updated>
    <id>http://www.emilsit.net/blog/archives/a-new-adventure</id>
    <content type="html">&lt;p&gt;Friday, 4 November, was my last day at VMware.&lt;/p&gt;

&lt;p&gt;I started at VMware in 2008, working on a project that has now become &lt;a href="http://www.vmware.com/mobile"&gt;VMware&amp;#8217;s Horizon Mobile&lt;/a&gt;.  Last year, I switched to working on the latest release of &lt;a href="http://www.vmware.com/products/vcloud-director/overview.html"&gt;VMware&amp;#8217;s vCloud Director&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;VMware has a lot going for it as a place to work. For example, it has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Executive leadership with a clear vision to make &lt;a href="http://blogs.vmware.com/console/2011/08/an-oasis-of-innovation-in-the-desert.html"&gt;VMware a leading player in the cloud era&lt;/a&gt;;&lt;/li&gt;
&lt;li&gt;Experienced and friendly engineers to work with: within 50 feet of my VMware office there were two (other) MIT PhDs, numerous people from IBM and Sun with deep Java experience, and a former Usenix tutorial instructor;&lt;/li&gt;
&lt;li&gt;An R&amp;amp;D organization dedicated to &lt;a href="https://twitter.com/#!/herrod/status/68451855025455104"&gt;internal innovation&lt;/a&gt; and to making some of the results public (with &lt;a href="http://labs.vmware.com/flings"&gt;VMware Flings&lt;/a&gt;); not to mention,&lt;/li&gt;
&lt;li&gt;Great perks (comparable to &lt;a href="http://mashable.com/2011/10/17/google-facebook-twitter-linkedin-perks-infographic/"&gt;perks at other top companies&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;It wasn&amp;#8217;t an easy choice to leave.&lt;/p&gt;

&lt;p&gt;Last month, I became aware that a startup in the big data space was moving to Boston.  I&amp;#8217;d been wondering about life outside VMware and this opportunity seemed just about perfect.  So I&amp;#8217;m beginning a new adventure at &lt;a href="http://www.hadapt.com/"&gt;Hadapt&lt;/a&gt;.  As an early employee, I imagine I&amp;#8217;ll be doing a little bit of everything. I hope to combine the skills and knowledge I built up from my graduate work and the practical experience of delivering enterprise software at VMware to help Hadapt build a powerful, scalable, data analytics platform and make Hadapt a successful company.&lt;/p&gt;

&lt;p&gt; I&amp;#8217;m excited to get started and I hope to share here with you some of my experiences as I go.&lt;/p&gt;
</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/a-new-adventure/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Rules for Development Happiness]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/pNOYsoOZZJY/" />
    <updated>2011-10-20T09:20:49-04:00</updated>
    <id>http://www.emilsit.net/blog/archives/rules-for-development-happiness</id>
    <content type="html">&lt;p&gt;Inspired by Alex Payne&amp;#8217;s &lt;a href="http://al3x.net/2008/09/08/al3xs-rules-for-computing-happiness.html"&gt;Rules for Computing Happiness&lt;/a&gt;, some rules for having happy developers and being happy as a developer.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use version control. (See &lt;a href="http://www.joelonsoftware.com/articles/fog0000000043.html"&gt;The Joel Test&lt;/a&gt;.)  In particular, use a distributed version control system (like &lt;a href="http://mercurial.selenic.com/"&gt;Mercurial&lt;/a&gt; or &lt;a href="http://gitref.org/"&gt;Git&lt;/a&gt;).  This ensures you can commit offline and also conduct &lt;a href="http://blogs.oracle.com/tor/entry/source_code_archaeology"&gt;code archaeology&lt;/a&gt; offline.&lt;/li&gt;
&lt;li&gt;Have a correct and fast incremental build (e.g., &lt;a href="http://evbergen.home.xs4all.nl/nonrecursive-make.html"&gt;non-recursive Make&lt;/a&gt; or &lt;a href="http://gradle.org/"&gt;Gradle&lt;/a&gt;) to avoid &lt;a href="http://xkcd.com/303/"&gt;this&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Have a system for testing your changes in a safe environment &lt;a href="http://googletesting.blogspot.com/2008/09/presubmit-and-performance.html"&gt;&lt;em&gt;prior&lt;/em&gt; to code submission&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Avoid dependencies on system tools. Different developers tend to have different systems and hence different versions of tools.&lt;/li&gt;
&lt;li&gt;Be able to work offline. Offline may mean when you&amp;#8217;re on a plane, but it may also happen when the office network goes down. Both happen. Notably, the latter happens even when you work on a desktop with a wired connection. (It&amp;#8217;s been pointed out to me that the network going down can be a good team-building experience.)

&lt;ul&gt;
&lt;li&gt;Be able to build offline.  That means having all build dependencies cached locally.&lt;/li&gt;
&lt;li&gt;Have all e-mail cached locally. Don&amp;#8217;t get stuck without those key instructions someone mailed you just because &lt;a href="http://gmailblog.blogspot.com/2011/02/gmail-back-soon-for-everyone.html"&gt;GMail is restoring your mail from tape&lt;/a&gt;.  Helpful tools here are &lt;a href="http://isync.sourceforge.net/"&gt;isync&lt;/a&gt; or &lt;a href="http://offlineimap.org/"&gt;offlineimap&lt;/a&gt;.  Index your mail with &lt;a href="http://code.google.com/p/mu0/"&gt;mu&lt;/a&gt;. (Or configure Thunderbird/Apple Mail/etc to keep &lt;em&gt;everything&lt;/em&gt; offline.)&lt;/li&gt;
&lt;li&gt;Be able to send mail offline; e.g., have it queue locally for delivery when the network comes back.  But make sure you keep a copy locally in case the hotel&amp;#8217;s WiFi is transparently re-directing out-bound SMTP connections to &lt;code&gt;/dev/null&lt;/code&gt;. (This really happened to me.)&lt;/li&gt;
&lt;li&gt;Have other documentation cached locally. (Use something like &lt;a href="https://github.com/github/gollum"&gt;gollum&lt;/a&gt; for your wiki.)&lt;/li&gt;
&lt;li&gt;If you work somewhere with a shared-storage home directory, make sure you can login when the network is down!&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/rules-for-development-happiness/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Store Hudson configuration in Git]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/OfriucGfkq8/" />
    <updated>2011-01-14T15:13:40-05:00</updated>
    <id>http://www.emilsit.net/blog/archives/store-hudson-configuration-in-git</id>
    <content type="html">&lt;p&gt;For any kind of server, it&amp;#8217;s a good idea to keep its configuration in some sort of version control system.  &lt;a href="http://hudson-ci.org/"&gt;Hudson&lt;/a&gt; is a pluggable continuous integration system.  Recently, I was trying to set one up and was wondering the best way to &lt;a href="http://stackoverflow.com/questions/2087142/is-there-a-way-to-keep-hudson-configuration-files-in-source-control"&gt;store Hudson&amp;#8217;s configuration in version control (StackOverflow summary)&lt;/a&gt;.  The most complete answer is a post on the Hudson blog about how to &lt;a href="http://www.hudson-labs.org/content/keeping-your-configuration-and-data-subversion"&gt;keep Hudson&amp;#8217;s configuration in Subversion&lt;/a&gt;; there are also plugins like a nascent &lt;a href="http://wiki.hudson-ci.org/display/HUDSON/SCM+Sync+configuration+plugin"&gt;SCM Sync configuration&lt;/a&gt; plugin.  But, the former is very Subversion specific and the latter does not seem particularly mature.  So, to understand how to do it in your workflow, there are two things to consider.&lt;/p&gt;

&lt;p&gt;First, which files are relevant? Hudson puts configuration, run-time state, source code and build output all in the same sub-directory (called &lt;code&gt;HUDSON_HOME&lt;/code&gt;).  Second, relatedly, since normally you edit Hudson&amp;#8217;s configuration through the GUI, when should you commit changes? Should it be automated (e.g., nightly at midnight) or manual (e.g., &lt;code&gt;ssh&lt;/code&gt; into the server and manually commit)?  I&amp;#8217;ll answer those questions with an implementation in &lt;a href="http://git-scm.com/"&gt;Git&lt;/a&gt; but you can translate the information easily to your preferred VCS.&lt;/p&gt;

&lt;p&gt;Identify relevant files by using the following &lt;code&gt;.gitignore&lt;/code&gt; file:&lt;/p&gt;

&lt;script src="https://gist.github.com/780105.js?file=.gitignore"&gt;&lt;/script&gt;


&lt;p&gt;This ignores the uninteresting files and will allow &lt;code&gt;git status&lt;/code&gt; to show you interesting new files.  Note that I prefer to actually commit the binaries of plugins since I don&amp;#8217;t want to rely on outside sources (namely, the mirror network) having the particular version of the plugin that I was using for the given configuration files.  To use this if you are installing a new Hudson server, you can just&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;cd $HUDSON_HOME/.. # Default is /var/lib
rm -r hudson
git clone git://gist.github.com/780105.git hudson
# Don't forget to chown hudson hudson as appropriate for your environment
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;before starting Hudson for the first time. Then once it has started, run &lt;code&gt;git commit&lt;/code&gt; to track the default config that Hudson creates.&lt;/p&gt;

&lt;p&gt;The second question is when. The Hudson blog&amp;#8217;s &lt;a href="http://www.hudson-labs.org/content/keeping-your-configuration-and-data-subversion"&gt;recommendation&lt;/a&gt; is to create a Hudson job that runs nightly at midnight to check for differences and automatically commit them.  I prefer manually committing the changes on the server and then pushing them. This allows me to identify specific functional changes (using &lt;a href="http://www.kernel.org/pub/software/scm/git/docs/git-add.html"&gt;&lt;code&gt;git add -p&lt;/code&gt;&lt;/a&gt;) and commit them individually.  If you want to do it automatically, simply write a script or add a job that will&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;git commit -a -m "Automated commit of Hudson configuration"
git push
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;once you set up an appropriate &lt;code&gt;origin&lt;/code&gt;.&lt;/p&gt;
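&lt;p&gt;If you go the automated route, a slightly more defensive version of those two commands might look like the sketch below. The function name &lt;code&gt;commit_hudson_config&lt;/code&gt; is made up, and it assumes the directory passed in is already a Git work tree with a remote called &lt;code&gt;origin&lt;/code&gt;. Unlike &lt;code&gt;git commit -a&lt;/code&gt;, it also stages new files that are not ignored, and it skips the commit when nothing has changed:&lt;/p&gt;

```shell
# Commit and push any pending Hudson configuration changes.
# $1 is the Hudson home directory (assumed to already be a Git work tree
# with a remote named "origin").
commit_hudson_config() {
  (
    set -e
    cd "$1"
    # Stage modifications, deletions and new, non-ignored files.
    git add -A
    # "git diff --cached --quiet" exits 0 when nothing is staged.
    if git diff --cached --quiet; then
      echo "No configuration changes to commit"
    else
      git commit -q -m "Automated commit of Hudson configuration"
      git push -q origin HEAD
    fi
  )
}
```

&lt;p&gt;A nightly cron entry or Hudson job could then source this and call &lt;code&gt;commit_hudson_config "$HUDSON_HOME"&lt;/code&gt;.&lt;/p&gt;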

&lt;p&gt;Once you have this set up, you can even use something like &lt;a href="http://opscode.com/chef/"&gt;Chef&lt;/a&gt; to automatically pull down updated configuration that you manage and test elsewhere and restart the Hudson server when necessary.  Then you can re-create your Hudson server in case of failure at any time!&lt;/p&gt;
</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/store-hudson-configuration-in-git/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Programming without fear]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/fViGENTb9-0/" />
    <updated>2010-11-16T01:06:39-05:00</updated>
    <id>http://www.emilsit.net/blog/archives/programming-without-fear</id>
    <content type="html">&lt;p&gt;This past weekend, I attended &lt;a href="http://www.3pvantage.com/"&gt;Gil Broza&lt;/a&gt;&amp;#8217;s seminar on &lt;a href="http://www.gbcacm.org/journey/withoutfear2010/"&gt;Programming Without Fear&lt;/a&gt;, organized by the Greater Boston Chapter of the ACM&amp;#8217;s &lt;a href="http://www.thejourneymanprogrammer.org/"&gt;Journeyman Programmer initiative&lt;/a&gt;.  The seminar was as advertised, and covered:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.codinghorror.com/blog/2006/05/code-smells.html"&gt;Code smells&lt;/a&gt;;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.agiledata.org/essays/tdd.html"&gt;Test-driven development&lt;/a&gt;, with a focus on micro-testing (aka unit-testing);&lt;/li&gt;
&lt;li&gt;Common &lt;a href="http://refactoring.com/"&gt;refactoring patterns&lt;/a&gt; and techniques for refactoring using a unit-test base for safety;&lt;/li&gt;
&lt;li&gt;Testing in isolation with &lt;a href="http://martinfowler.com/articles/injection.html"&gt;dependency injection&lt;/a&gt; with &lt;a href="http://martinfowler.com/articles/mocksArentStubs.html"&gt;stubs and mocks&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;For anyone with more than a passing interest in agile, the material Gil presented (covered in the links above) will not be new.&lt;/p&gt;

&lt;p&gt;The benefits of the seminar came from two things.  First, Gil presented the information in a somewhat &amp;#8220;formal&amp;#8221; framework: a taxonomy of code smells, a set of refactoring patterns, a pair of mnemonics (PRICELESS unit tests and TRUST your refactoring process) to help remember basic techniques.  This gives someone new to the material an organized set of knowledge to internalize.  Second, Gil has prepared a series of exercises, interspersed with the lecture-y sections, that seminar participants work through in pairs, designed to reinforce the theoretical frameworks with practical experience.  Even as someone moderately experienced with these concepts, the exercises are useful in that they focus on the fundamentals and force you to actively strengthen those fundamentals.  (The weakest section, I thought, was the one on mocking which received insufficient exposition and dropped the class directly into jMock, which was a bit opaque.)&lt;/p&gt;

&lt;p&gt;Gil is not the most exciting or funny teacher but he kept the attendees engaged by teaching with a Socratic flavor&amp;#8212;he presented examples and solicited audience evaluations, allowing the audience to interact to reach conclusions.  The practical exercises were followed by group de-briefs. This encouraged the audience to stay engaged and better absorb the material.&lt;/p&gt;

&lt;p&gt;My main worry about the techniques is the overall reliance on Eclipse (or another IDE) as a developer&amp;#8217;s assistant: while the tooling is certainly convenient, it makes me worry about Java and whether the &lt;a href="http://pragprog.com/the-pragmatic-programmer/extracts/wizards"&gt;use of tools and wizards&lt;/a&gt; weakens developers who may never learn how to do things themselves.&lt;/p&gt;

&lt;p&gt;What I really enjoyed was the experience of actually developing and refactoring with the protection of a unit test suite and learning techniques to perform refactoring without more than a moment or two of compiler errors.  This was in sharp contrast to my normal refactoring experience of making a top-level change and then following all the compiler warnings until the work is done.  Now if only every codebase I worked on came with such a set of tests&amp;#8230;&lt;/p&gt;
</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/programming-without-fear/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[How to install ThinkUp on NearlyFreeSpeech]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/P3w1RQQg4FU/" />
    <updated>2010-09-03T23:50:21-04:00</updated>
    <id>http://www.emilsit.net/blog/archives/how-to-install-thinkup-on-nearlyfreespeech</id>
    <content type="html">&lt;p&gt;&lt;a href="http://smarterware.org"&gt;Gina Trapani&lt;/a&gt; and ExpertLabs have put &lt;a href="http://expertlabs.org/thinkup.html"&gt;ThinkUp&lt;/a&gt;, a cool tool for tracking replies to your posts on Twitter.  As of September 2010, ThinkUp has a nifty drop-in web-based installer, much like WordPress.  Simply &lt;a href="http://github.com/ginatrapani/ThinkUp/downloads"&gt;grab ThinkUp 0.007 or later&lt;/a&gt;, unzip it somewhere that your PHP/MySQL-enabled web server can get at and it&amp;#8217;ll prompt you through the installation.&lt;/p&gt;

&lt;p&gt;In response to Gina&amp;#8217;s post &lt;a href="http://smarterware.org/6638/long-weekend-hacking-how-to-help-out-with-thinkup"&gt;asking for help testing/hacking this long-weekend&lt;/a&gt;, I ran through this in about 10 minutes on &lt;a href="http://www.nearlyfreespeech.net/"&gt;NearlyFreeSpeech&lt;/a&gt;.  Here are some quick tips where NFSN&amp;#8217;s set-up is a bit different than what is expected by the default installer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;After you unzip the ThinkUp dist, run &lt;code&gt;chgrp -R web _lib/view/compiled_view&lt;/code&gt; and &lt;code&gt;chmod -R g+w _lib/view/compiled_view&lt;/code&gt; so that the templating engine can cache its views.&lt;/li&gt;
&lt;li&gt;Make sure you have a MySQL process enabled in your NearlyFreeSpeech control panel.  A basic &lt;a href="https://www.nearlyfreespeech.net/services/mysql"&gt;MySQL process costs $0.02/day&lt;/a&gt; but you can share the process with your WordPress database.  Spin up phpMyAdmin in the right-hand sidebar and create a user called &lt;code&gt;thinkup&lt;/code&gt; and make sure to check off the option to create a database with the same name and grant all rights to that user.  Generate a random password, and copy that password.&lt;/li&gt;
&lt;li&gt;In the ThinkUp database configuration section, enter &lt;code&gt;thinkup&lt;/code&gt; as the user name and database name and paste your generated password.  Open the advanced section and change the database host from &lt;code&gt;localhost&lt;/code&gt; to your database host name.  It&amp;#8217;ll be something like &lt;em&gt;username&lt;/em&gt;.db; mine, for example, is &lt;code&gt;sit.db&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;ThinkUp will fail to write the configuration file due to perms but helpfully offers the ability to copy and paste a file.  Select the text in the config text box and go back to the terminal where you unzipped the dist.  In the &lt;code&gt;thinkup&lt;/code&gt; directory, &lt;code&gt;cat &amp;gt; config.inc.php&lt;/code&gt;, paste and then Ctrl-D to save the file.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Check your e-mail for the activation link and configure your account.  You&amp;#8217;ll need to register your installation as a Twitter application and paste in the consumer key and consumer secret.  The config page will send you to the Twitter registration page and tell you the callback URL to provide.  Leave ThinkUp as a read-only application and leave the &amp;#8216;Use Twitter for login&amp;#8217; unchecked.&lt;/p&gt;

&lt;p&gt;That should do it!&lt;/p&gt;

&lt;p&gt;For more details and more up-to-date instructions, check out the &lt;a href="http://wiki.github.com/ginatrapani/ThinkUp/"&gt;ThinkUp wiki&lt;/a&gt;.&lt;/p&gt;
</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/how-to-install-thinkup-on-nearlyfreespeech/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Understanding SpringSource and the Spring Framework]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/IbegRpLJBMs/" />
    <updated>2010-06-01T22:35:44-04:00</updated>
    <id>http://www.emilsit.net/blog/archives/understanding-springsource-and-the-spring-framework</id>
    <content type="html">&lt;p&gt;In light of recent announcements like &lt;a href="http://www.vmforce.com"&gt;vmForce&lt;/a&gt; or the &lt;a href="http://blog.springsource.com/2010/05/19/spring-google-appengine/"&gt;SpringSource/Google App Engine integration&lt;/a&gt;, you may be wondering, what the &lt;a href="http://www.springsource.com/products/enterprise"&gt;Spring Framework&lt;/a&gt; is, precisely.  What does the SpringSource company provide?&lt;/p&gt;

&lt;p&gt;According to their &lt;a href="http://www.springsource.com/"&gt;homepage&lt;/a&gt;, SpringSource is in the business of &amp;#8220;eliminating enterprise Java complexity&amp;#8221; and is a leader in Java application infrastructure and management.  That&amp;#8217;s not very concrete, so I don&amp;#8217;t find it helpful, particularly if you are not a J2EE/JEE (Java Enterprise Edition) developer. In this post, I&amp;#8217;ll talk about SpringSource in general and focus on the Spring Framework.  Note that while I work for VMware (which owns SpringSource) and use the Spring Framework (commonly referred to as Spring) at work, I am not part of our SpringSource division nor do I have any special access to the innards of SpringSource.  I did get to take the &lt;a href="http://mylearn.vmware.com/mgrReg/courses.cfm?ui=S2&amp;amp;a=one&amp;amp;id_subject=17750"&gt;Core Spring training&lt;/a&gt; for free, but it is only after 5 months of programming with Spring that I&amp;#8217;ve started to understand the SpringSource philosophy.&lt;/p&gt;

&lt;p&gt;SpringSource products &lt;em&gt;let you write code that is as focused as possible on the needs of your application&lt;/em&gt;, and as little as possible on the boilerplate or hassle of dealing with different underlying environments (e.g., dev, test, production may have different database backends) or infrastructures (e.g., GAE, vCloud).  This is the core value that underlies SpringSource, but it is only explored indirectly via its various instantiations in the SpringSource literature and product line.&lt;/p&gt;

&lt;p&gt;The &lt;a href="http://www.springsource.com/products/enterprise"&gt;Spring Framework (aka Spring)&lt;/a&gt; provides glue.  Spring &lt;em&gt;provides glue in a relatively uniform manner&lt;/em&gt;, so that once you understand the basic approach(es), you can apply it to interfacing with different components.  From the documentation, Spring seems to do everything, but at the same time, when you try to use it, you may feel that it seems to do almost nothing.  It may be useful to compare the Spring Framework to the &lt;a href="http://www.debian.org"&gt;Debian Linux distribution&lt;/a&gt;: Debian provides a nice out-of-the-box experience with a &lt;a href="http://en.wikipedia.org/wiki/Dpkg"&gt;uniform mechanism for managing software&lt;/a&gt;, and in particular, &lt;a href="http://www.debian-administration.org/articles/91"&gt;alternative software packages&lt;/a&gt; that can provide a common service. But to get at the power of the underlying packages, you must learn how to configure and use them.  Likewise, Spring does not actually provide many services on its own.  It does not free you from having to learn how to write a unit test, access a database, manage a messaging system, or implement security.  Instead, it makes it possible for you to write code to do these things in a somewhat generic manner, so that your code can be as generic as possible.&lt;/p&gt;

&lt;p&gt;Understanding these two key points will help you make sense of the variety of things written about Spring.&lt;/p&gt;

&lt;p&gt;The core glue provided by Spring is its &lt;a href="http://martinfowler.com/articles/injection.html"&gt;dependency injection&lt;/a&gt; support, also known as the &amp;#8220;inversion of control&amp;#8221; or &lt;a href="http://static.springsource.org/spring/docs/3.0.x/spring-framework-reference/html/beans.html"&gt;IoC container&lt;/a&gt;.  This means that, instead of class Foo explicitly instantiating an object implementing interface Bar, Foo will have a constructor argument or setter that accepts a Bar.  The inversion of control container lets you specify the right kind of Bar for Foo in a given environment and handles constructing that Bar and injecting it through the constructor or setter.  This makes code less fragile because it no longer needs special-casing for testing (e.g., a stub Bar) or anything else.  The mapping of a particular Bar to Foo becomes part of a configuration file that also captures all of Foo&amp;#8217;s other dependencies.&lt;/p&gt;
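
&lt;p&gt;To make the Foo and Bar description concrete, here is a minimal sketch of constructor injection wired by hand, without Spring.  The names RealBar, StubBar, fetch, describe and ManualWiring are made up for illustration; in a Spring application, the container would perform the wiring done in main, driven by configuration rather than code.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
```java
// Illustrative sketch only: every name below is hypothetical, not part of Spring.
interface Bar {
    String fetch();
}

// The implementation you might use in production.
class RealBar implements Bar {
    public String fetch() {
        return "real data";
    }
}

// A stand-in implementation for tests.
class StubBar implements Bar {
    public String fetch() {
        return "stub data";
    }
}

// Foo never instantiates a Bar itself; it accepts whichever one it is given.
class Foo {
    private final Bar bar;

    Foo(Bar bar) {
        this.bar = bar;
    }

    String describe() {
        return "Foo using " + bar.fetch();
    }
}

class ManualWiring {
    public static void main(String[] args) {
        // This wiring is the part an IoC container would take over.
        Foo production = new Foo(new RealBar());
        Foo underTest = new Foo(new StubBar());
        System.out.println(production.describe());
        System.out.println(underTest.describe());
    }
}
```
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Foo contains no special case for testing: swapping RealBar for StubBar happens entirely in the wiring code, which is the part that would move into a configuration file.&lt;/p&gt;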

&lt;p&gt;Spring also provides more special-purpose glue; the Spring Framework page writes:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;Spring provides the ultimate programming model for modern enterprise Java applications by &lt;strong&gt;insulating business objects from the complexities of platform services&lt;/strong&gt; for application component management, Web services, transactions, security, remoting, messaging, data access, aspect-oriented programming and more.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;For example, your application can use the Spring Framework with straight JDBC, or with generic &lt;a href="http://en.wikipedia.org/wiki/Object-relational_mapping"&gt;object-relational mappers (ORMs)&lt;/a&gt; like JPA through to highly specific ones like iBatis or Hibernate.  You can configure your application to then talk to a variety of database back-ends, with minimal changes in the actual configuration files, and write minimal code related to setting up database connections and processing error cases.  Spring provides wrappers and translators to help unify service-specific method names, such as the one that causes an ORM system to generate database tables, and exceptions into more generic expressions of those concepts.  This means you might be able to switch between JPA providers, for example, without changing too much configuration.  However, you still have to configure your JPA provider correctly.&lt;/p&gt;

&lt;p&gt;In line with allowing you flexibility from the infrastructure, Spring also provides flexibility of mechanism, so that the code and configuration that you write to integrate with Spring&amp;#8217;s services are at a level that you are comfortable with.   You can &lt;a href="http://static.springsource.org/spring/docs/3.0.x/spring-framework-reference/html/beans.html#beans-factory-metadata"&gt;configure the IoC&lt;/a&gt; with XML or with Java; you can use annotations or you can use explicit configuration.  To specify which of your business methods should &lt;a href="http://static.springsource.org/spring/docs/3.0.x/spring-framework-reference/html/transaction.html"&gt;be in a database transaction&lt;/a&gt;, you can annotate with &lt;code&gt;@Transactional&lt;/code&gt; in your source code, or you can use an aspect-oriented programming filter to tag the relevant methods in an external configuration file.&lt;/p&gt;

&lt;p&gt;All of this glue and the flexibility of mechanism contribute to making Spring hard to understand; however, their presence emphasizes the idea that Spring wants to get out of your way so that you can focus on application development.  Other SpringSource tools such as &lt;a href="http://www.springsource.org/roo"&gt;Roo&lt;/a&gt; and &lt;a href="http://static.springsource.com/projects/tc-server/6.0/devedition/"&gt;Insight&lt;/a&gt; work similarly: they simplify development and debugging (respectively) without requiring that you make extensive changes to your source, while respecting current best practices.&lt;/p&gt;

&lt;p&gt;I&amp;#8217;ve left out various components of Spring to focus on the core philosophy of SpringSource, but this background should help you make sense of resources like the &lt;a href="http://en.wikipedia.org/wiki/Spring_Framework"&gt;Wikipedia article on Spring&lt;/a&gt;, the &lt;a href="http://www.springsource.org/documentation"&gt;Spring Framework documentation&lt;/a&gt;, and books like &lt;a href="http://www.manning.com/walls2/"&gt;Spring in Action&lt;/a&gt;.  If you&amp;#8217;re running into problems, however, the best place to get concrete questions about Spring answered is &lt;a href="http://stackoverflow.com/questions/tagged/spring"&gt;Stack Overflow&lt;/a&gt;.&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/EmilSitMainBlog?a=IbegRpLJBMs:fghIW1AVmCQ:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/EmilSitMainBlog?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/EmilSitMainBlog/~4/IbegRpLJBMs" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/understanding-springsource-and-the-spring-framework/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Examining your personal programming style]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/5TmX-Zy6Luo/" />
    <updated>2010-05-02T23:23:29-04:00</updated>
    <id>http://www.emilsit.net/blog/archives/examining-your-personal-programming-style</id>
    <content type="html">&lt;p&gt;When I was growing up, we would listen to classical music stations in the car and try to figure out the composer and sometimes even the performer.  Both musicians and composers often have their own distinctive &lt;em&gt;style&lt;/em&gt;: you can hear the mathematical precision of Gould, or the clarity of Horowitz, whether they are interpreting Bach or Mozart.  My last post started me thinking about a musician or composer&amp;#8217;s style and drawing a parallel in the context of computer programming.&lt;/p&gt;

&lt;p&gt;When thinking about music, one&amp;#8217;s style is a matter of personal expression, but if you say &amp;#8220;coding style&amp;#8221; to a programmer (or really, to &lt;a href="http://www.google.com/search?q=coding+style"&gt;Google&lt;/a&gt;), you&amp;#8217;ll find rules about whitespace, variable naming, plus some proverbs about how to write maintainable code (e.g., &amp;#8220;avoid global variables&amp;#8221;).  Overall, I don&amp;#8217;t think these are particularly relevant to the &lt;em&gt;art&lt;/em&gt; of programming.&lt;/p&gt;

&lt;p&gt;For example, formatting and naming conventions are important in a codebase only in that a properly followed convention becomes invisible&amp;#8212;just like your nose becomes acclimated to a smell, your brain quickly learns to recognize a formatting convention and ignore it.  Having a convention (any convention!) allows you to focus on what the source code is really doing.  Following a convention is good for everyone reading your code, even you.  &lt;a href="http://www.gnu.org/software/indent/"&gt;Automate your coding conventions&lt;/a&gt; and forget about it.  (In the extreme, check out what the &lt;a href="http://research.swtch.com/2009/12/gofmt.html"&gt;Go Language formatter&lt;/a&gt; can do.)&lt;/p&gt;

&lt;p&gt;Similarly, coding style proverbs, like &amp;#8220;write tests before code&amp;#8221; or &amp;#8220;keep code in a function at one level of abstraction&amp;#8221;, are like any other proverb: these statements capture an element of experience from programmers past, but are often blindly followed by people new to the practice.  It takes significant time before a programmer can truly internalize the reasons for and nuance behind any proverb.  (Incidentally, if you are interested in studying proverbs, I highly recommend you examine the &lt;a href="http://senseis.xmp.net/?GoProverbs"&gt;game of Go&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;What I am interested in exploring is personal expression and style in programming, outside of language/library/tools, proverbs or code formatting. Having a personal style is not a concept that we as computer programmers are generally exposed to.  School focuses almost exclusively on the technical, ignoring both the practice (i.e., the stuff of proverbs) and the art (the subject of this post).  Indeed, I am only beginning to be able to express what my personal style might be.&lt;/p&gt;

&lt;p&gt;So, how do you express yourself in code?  To begin exploring our artistic programming style, let&amp;#8217;s continue to draw an analogy from the arts&amp;#8212;instead of music, let&amp;#8217;s look at the process of &lt;a href="http://www.luminous-landscape.com/columns/aesthetics9.shtml"&gt;establishing a personal photographic style&lt;/a&gt;.  The author talks about the importance of the choices&amp;#8212;the choice of equipment (camera), subject matter, the approach to &lt;em&gt;making&lt;/em&gt; a picture.  You come upon a way of doing things that you believe is right, that supports your personal values.  When looking at myself:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Equipment: Linux/Vim/Mutt/Xmonad/Git.&lt;/li&gt;
&lt;li&gt;Subject matter: I care strongly about the process with which you build programs, and so I wind up working a lot on tools and scripting, but I&amp;#8217;m also interested in distributed systems problems.&lt;/li&gt;
&lt;li&gt;Approach: I like (re)using what is present; I strive for consistency, simplicity, elegance. I am somewhat inclined towards using functional constructs in imperative languages (though I never did like OCaml&amp;#8217;s syntax). I always look at a diff of my code to ensure it is minimal before committing it, and I like to write verbose log messages.&lt;/li&gt;
&lt;li&gt;Examples: Some things I&amp;#8217;ve worked on that you can look at include of course &lt;a href="http://hg.pdos.csail.mit.edu/hg/"&gt;Chord&lt;/a&gt;, and some contributions to Gina Trapani&amp;#8217;s &lt;a href="http://github.com/ginatrapani/todo.txt-cli"&gt;todo.txt&lt;/a&gt; tool.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;I&amp;#8217;m not entirely happy with this &amp;#8220;approach&amp;#8221; because it feels like mostly a list of platitudes.  But some of my difficulty, I think, comes from not having thought about this specifically as I look at other people&amp;#8217;s code, and from not even having the words to compare and contrast my approach with that of others.  So, I&amp;#8217;d like this post to be a start for each of us to explore our own style.&lt;/p&gt;

&lt;p&gt;Spend a few minutes thinking about what you value in your own code and how you define yourself as an artistic programmer, then write about it in a comment, perhaps using the template I&amp;#8217;ve set for myself above.  I hope we&amp;#8217;ll each learn something!&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/EmilSitMainBlog?a=5TmX-Zy6Luo:Z2Xw14IBJtQ:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/EmilSitMainBlog?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/EmilSitMainBlog/~4/5TmX-Zy6Luo" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/examining-your-personal-programming-style/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Music education versus computer science education]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/w6j4FzXtFMc/" />
    <updated>2010-04-24T23:41:58-04:00</updated>
    <id>http://www.emilsit.net/blog/archives/music-education-versus-computer-science-education</id>
    <content type="html">&lt;p&gt;My mother recently forwarded me this interview of the pianist &lt;a href="http://en.wikipedia.org/wiki/Glenn_Gould"&gt;Glenn Gould&lt;/a&gt;:&lt;/p&gt;

&lt;object width="425" height="324"&gt;&lt;param name="movie" value="http://www.youtube.com/v/thPd18JGvbE&amp;hl=en_US&amp;fs=1&amp;"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/thPd18JGvbE&amp;hl=en_US&amp;fs=1&amp;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"&gt;&lt;/embed&gt;&lt;/object&gt;


&lt;p&gt;I encourage you to watch this (and all 6 parts), even if you know nothing about classical music.&lt;/p&gt;

&lt;p&gt;What stood out to me in viewing this series of videos was the fluidity with which Gould is able to discuss a piece of music, in its historical context, and to simply jump in to play a phrase from a different piece to call out a point for discussion.  This demonstrates an incredible mastery of the subject matter, unifying history and context, theory, and practical implementation.&lt;/p&gt;

&lt;p&gt;From &lt;a href="http://www.msmnyc.edu/precollege/curriculum/"&gt;my experience at the Manhattan School of Music&lt;/a&gt;, musical training seeks precisely to bring this unification into its students. As you study, you learn to play a variety of pieces or styles, from different time periods.  You are taught some history, to be able to understand the evolution of styles; you are taught the underlying theory, to be able to discuss this evolution using precise terms; and then you practice the mechanics needed to actually play the pieces.&lt;/p&gt;

&lt;p&gt;How does this compare to computer programming? Computer &amp;#8220;science&amp;#8221;, as you may know, has a fair amount of artistry to it.&lt;/p&gt;

&lt;p&gt;We are simply not trained to have these kinds of discussions. How many people do you know who, during a code or design discussion, might say, &amp;#8220;Oh, this is very similar to System X, in contrast to how System Y did things,&amp;#8221; and then pull up the source code (or architecture diagram) for System X and Y and compare them with the relevant pieces under discussion?&lt;/p&gt;

&lt;p&gt;At MIT, the classes are (were?) organized around ideas and then around technical implementation (leading, hopefully, to understanding).  Little emphasis is placed on &lt;em&gt;how&lt;/em&gt; to be a programmer, such as prototyping, testing, revision control, or code-review; that is, these things rarely factor significantly into your grade.  Even less emphasis is given to the ability to discuss ideas in context, even at the graduate level.  Undergraduate classes at most schools seem to focus on learning best practices for a particular programming language.  As graduate students, only the extremely motivated would explore beyond the papers presented in the course syllabus, or the immediate related work for a given project; tracking down the source code of other systems is almost never done (perhaps simply because it is not often available).&lt;/p&gt;

&lt;p&gt;In the professional world, no one teaches you how to do code or design reviews at this level either.  Just like in school, professional programmers are constantly subject to deadlines which override just about any extra-curricular work.  Reviews are often focused on mechanics or on vague, unsubstantiated worries.  Again, extreme personal motivation is required to move beyond this.&lt;/p&gt;

&lt;p&gt;How can we improve this situation?  Is there room at the undergraduate level for &lt;a href="http://ucosp.wordpress.com/"&gt;more capstone projects&lt;/a&gt; that unify the theory, the history, and the mechanics with the craft of programming? What about at the graduate level?  How about in a professional environment? What has been your experience?&lt;/p&gt;

&lt;p&gt;I&amp;#8217;d like to find programmers that work this way, that are excited and passionate about the craft of programming. Are you one? Get in touch.&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/EmilSitMainBlog?a=w6j4FzXtFMc:xAsJaBM_2L8:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/EmilSitMainBlog?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/EmilSitMainBlog/~4/w6j4FzXtFMc" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/music-education-versus-computer-science-education/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Repeated and Reproducible Systems Research]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/iO3ca4t-Y2w/" />
    <updated>2010-04-20T14:52:36-04:00</updated>
    <id>http://www.emilsit.net/blog/archives/repeated-and-reproducible-systems-research</id>
    <content type="html">&lt;p&gt;&lt;a href="http://www.daniel-lemire.com/"&gt;Daniel Lemire&lt;/a&gt; (&lt;a href="http://twitter.com/lemire"&gt;@lemire&lt;/a&gt;) recently posted on the &lt;a href="http://www.daniel-lemire.com/blog/archives/2010/04/20/the-mythical-reproducibility-of-science/"&gt;Mythical Reproducibility of Science&lt;/a&gt;, noting that sharing code also makes it easier to spread your ideas:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;The reproducibility that matters is getting people to use your ideas. Merely proving you are honest falls short of your potential!&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;He&amp;#8217;s also written on this in the past with &lt;a href="http://www.daniel-lemire.com/blog/archives/2010/02/10/open-sourcing-your-software-hurts-your-competitiveness-as-a-researcher/"&gt;statistics on downloads&lt;/a&gt; of his software.&lt;/p&gt;

&lt;p&gt;Here are some more writings on reproducibility:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://doi.acm.org/10.1145/991130.991131"&gt;The case for repeated research in operating systems&lt;/a&gt; from ACM SIGOPS 2004.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.usenix.org/event/hotos05/final_papers/red_team.html"&gt;The Many Faces of Systems Research&amp;#8212;And How to Evaluate Them&lt;/a&gt; from HotOS 2005.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.emilsit.net/blog/archives/tools-for-repeatable-research/"&gt;Tools for repeatable research&lt;/a&gt;, that I wrote four (!) years ago, has some pointers.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;At &lt;a href="http://pdos.csail.mit.edu/"&gt;PDOS&lt;/a&gt;, Frans Kaashoek and Robert Morris definitely encourage us to build real systems and make them available.  I like this approach and never found that publishing the full source to &lt;a href="http://pdos.csail.mit.edu/chord/"&gt;Chord&lt;/a&gt;, including our work-in-progress/submission, caused any problems.  It has also meant that a lot of people still play with Chord, even if I no longer actively maintain it.&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/EmilSitMainBlog?a=iO3ca4t-Y2w:G9_cJnMGWms:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/EmilSitMainBlog?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/EmilSitMainBlog/~4/iO3ca4t-Y2w" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/repeated-and-reproducible-systems-research/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Interview non-questions]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/jYEHo54RH2A/" />
    <updated>2010-04-18T23:42:24-04:00</updated>
    <id>http://www.emilsit.net/blog/archives/interview-non-questions</id>
    <content type="html">&lt;p&gt;Once you get a job at a company, you move from one side of the interview table to the other.  My ideal candidate for just about any engineering position:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;has the ability to present technical ideas on the fly;&lt;/li&gt;
&lt;li&gt;has practical Unix knowledge;&lt;/li&gt;
&lt;li&gt;can write clearly and concisely in English and in code;&lt;/li&gt;
&lt;li&gt;has a strong technical background.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Knowledge of particular technologies or programming languages is generally not interesting.  Rather, the candidate should be smart and passionate.&lt;/p&gt;

&lt;p&gt;One way I&amp;#8217;ve started looking for passion is to see if the candidate has been involved in any volunteer work or open source projects.  But it can be hard to assess the other qualities, even in an hour-long interview.  Typically, an interview assesses your ability to solve a particular problem, possibly in code, but reveals little about what it would actually be like to work with you.&lt;/p&gt;

&lt;p&gt;As we hire some more people for MVP, I&amp;#8217;m considering changing up our standard &amp;#8220;bring in candidate for a series of 45m one on ones&amp;#8221; to include some ways to probe for my desired qualities before the interview.  I&amp;#8217;d like to have  candidates perhaps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Send in a dot-file of some sort (e.g., .bashrc, .vimrc, .emacs, httpd.conf, etc.)  That is: does the candidate use Unix and customize it?  Does the candidate comment dot-files?&lt;/li&gt;
&lt;li&gt;Prepare and deliver to the interview panel a 5-minute presentation on some (any!) technical topic.  This checks the ability to communicate ideas clearly and answer questions.&lt;/li&gt;
&lt;li&gt;Provide some samples of bug reports the candidate has filed or technical discussions that the candidate has had on a mailing list.  (Say, 3 from the past 3 years, ideally from an open source project.) Alternately, provide a pointer to the candidate&amp;#8217;s blog.  That is, can this person write cogently?  &amp;#8220;Excellent communication skills&amp;#8221; anyone?&lt;/li&gt;
&lt;li&gt;Provide a code sample, something the candidate has had primary responsibility for developing, on the order of 100&amp;#8211;500 lines of code.  Two interviewers will review the code with the candidate.&lt;/li&gt;
&lt;li&gt;Provide a commit, i.e., a diff to existing code (perhaps the code sample provided) and a commit message.  This would demonstrate the ability to provide a clean functional change and document it for the team.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Most companies I interviewed with (at the PhD level) required a presentation, but only one asked for code samples.  I&amp;#8217;ve not seen any requests personally for anything else.  Have you?&lt;/p&gt;

&lt;p&gt;Incidentally, the Mobile Virtualization Platform team at VMware is hiring (mostly for our Cambridge office).  Get in touch if you&amp;#8217;re interested.&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/EmilSitMainBlog?a=jYEHo54RH2A:YTtcxzP8VoE:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/EmilSitMainBlog?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/EmilSitMainBlog/~4/jYEHo54RH2A" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/interview-non-questions/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Getting started with Git]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/NmfwIezrsgI/" />
    <updated>2010-03-18T11:12:11-04:00</updated>
    <id>http://www.emilsit.net/blog/archives/getting-started-with-git</id>
    <content type="html">&lt;p&gt;Because of our work with the Linux kernel and with Android, we have started using &lt;a href="http://git-scm.com/"&gt;Git&lt;/a&gt; more extensively at work, and my colleagues often have questions about how to get things done with Git.  While the &lt;a href="http://www.kernel.org/pub/software/scm/git-core/docs/everyday.html"&gt;every-day command lists&lt;/a&gt; are helpful, most of the time, people would benefit more from &lt;a href="http://blog.nelhage.com/2010/01/on-git-and-usability/"&gt;getting a fundamental understanding of how git works&lt;/a&gt;.
Here is a brief list of useful resources to help achieve that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://whygitisbetterthanx.com"&gt;Why Git is Better than X&lt;/a&gt;, where X is one of hg, bzr, svn, perforce.  Also, why git can be more confusing.&lt;/li&gt;
&lt;li&gt;Git&amp;#8217;s &lt;a href="http://eagain.net/articles/git-for-computer-scientists/"&gt;data model in a nutshell&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Scott Chacon&amp;#8217;s &lt;a href="http://progit.org/book/"&gt;ProGit book&lt;/a&gt; is a good tutorial and reference.&lt;/li&gt;
&lt;li&gt;Scott Chacon&amp;#8217;s &lt;a href="http://gitcasts.com/"&gt;Git screencasts&lt;/a&gt;, if you&amp;#8217;re a visual learner.&lt;/li&gt;
&lt;li&gt;Like presentations? Slides from an MIT SIPB &lt;a href="http://web.mit.edu/cluedumps/slides/understanding-git-2008.pdf"&gt;Understanding Git&lt;/a&gt; class [PDF, 1MB], if you&amp;#8217;re in a hurry, or &lt;a href="http://www.slideshare.net/chacon/getting-git"&gt;Getting Git&lt;/a&gt; from Scott Chacon if you&amp;#8217;re not.&lt;/li&gt;
&lt;li&gt;Use Mark Lodato&amp;#8217;s &lt;a href="http://marklodato.github.com/visual-git-guide/"&gt;Visual Git Reference&lt;/a&gt; to help you understand how commands interact with history, the staging area and your working directory.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;For me, the core difficulty is that people have to explicitly think about the history of their code and how they would like to share it with others.  Git gives you many options whereas tools like Subversion and Perforce don&amp;#8217;t; this plethora of options can make things confusing.  In fact, it can lead to very philosophically different approaches for all aspects of your development process, ranging from &lt;a href="http://whygitisbetterthanx.com/#any-workflow"&gt;shared-repository vs integrator&lt;/a&gt; to whether or not to merge frequently (&lt;a href="http://robey.lag.net/2009/11/29/more-git.html"&gt;yes?&lt;/a&gt; or &lt;a href="http://gitster.livejournal.com/42247.html"&gt;no!&lt;/a&gt;).  Here are a few useful readings on how people actually can &lt;em&gt;use&lt;/em&gt; git:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Oliver Steele&amp;#8217;s &lt;a href="http://osteele.com/archives/2008/05/my-git-workflow"&gt;Git Workflow&lt;/a&gt; and &lt;a href="http://osteele.com/archives/2008/05/commit-policies"&gt;commit policies&lt;/a&gt;. Helpful for understanding how people use the index and remotes.&lt;/li&gt;
&lt;li&gt;Ryan Tomayko has another &lt;a href="http://tomayko.com/writings/the-thing-about-git"&gt;way to use the index&lt;/a&gt; to help you split logically separate work into separate commits.  Some people, though, &lt;a href="http://fourkitchens.com/blog/2009/02/03/importance-atomicity-or-why-git-staging-area-bad"&gt;don&amp;#8217;t like the index&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Once you&amp;#8217;re ready for more &amp;#8220;tips&amp;#8221;, be sure to follow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://gitster.livejournal.com"&gt;Junio Hamano&amp;#8217;s blog&lt;/a&gt;.  Junio is the maintainer of git and writes deep and useful technical posts.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://progit.org/"&gt;Scott Chacon&amp;#8217;s ProGit blog&lt;/a&gt;.  Scott has done more for advocacy of Git than anyone else.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://gitready.com"&gt;Git Ready&lt;/a&gt;; an inactive but very useful set of tips from Nick Quaranto.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;For more regularly updated and possibly esoteric resources, check out my &lt;a href="http://delicious.com/sit/git"&gt;git bookmarks on delicious&lt;/a&gt;.  Do you have some useful resources for getting people started with Git? Leave them in the comments.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Update:&lt;/em&gt; Added a link to the Visual Git Reference.&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/EmilSitMainBlog?a=NmfwIezrsgI:gJ_ViTQmyXw:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/EmilSitMainBlog?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/EmilSitMainBlog/~4/NmfwIezrsgI" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/getting-started-with-git/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[StackOverflow DevDays Boston 2009, Afternoon]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/m3V54piYH-8/" />
    <updated>2009-10-12T07:21:55-04:00</updated>
    <id>http://www.emilsit.net/blog/archives/stackoverflow-devdays-boston-2009-afternoon</id>
    <content type="html">&lt;p&gt;The afternoon of &lt;a href="http://stackoverflow.carsonified.com/events/boston"&gt;Boston DevDays&lt;/a&gt; 2009 was, in my opinion, not as broadly appealing as the morning sessions (see my writeup of the morning &lt;a href="http://www.emilsit.net/blog/archives/stackoverflow-devdays-boston-2009-morning/"&gt;here&lt;/a&gt;).  However, there was still a lot of interesting material presented.&lt;/p&gt;

&lt;p&gt;Joel welcomed us back from lunch by plugging
&lt;a href="http://stackexchange.com"&gt;StackExchange&lt;/a&gt; and how it&amp;#8217;ll mean the end of &amp;#8220;crappy
old copies of Usenet&amp;#8221; (by which he meant phpBB).  He showed a pretty graph of
StackOverflow edging out ExpertsExchange in traffic. He also announced a new job search
site called &lt;a href="http://careers.stackoverflow.com/"&gt;careers.stackoverflow.com&lt;/a&gt; that
charges job seekers some money and asks you what your favorite editor is.
There was also a video ad for the FogCreek training videos.  This man knows how
to monetize.&lt;/p&gt;

&lt;h3&gt;Patrick Hynds and Chris Bowen&lt;/h3&gt;

&lt;p&gt;The first technical session of the afternoon was on &lt;a href="http://asp.net/mvc/"&gt;ASP.NET MVC&lt;/a&gt;.
Patrick started the session with an explanation of ASP.NET MVC&amp;#8217;s history relative to
ASP Classic and ASP.NET, and why one might want to use a &lt;a href="http://en.wikipedia.org/wiki/Model-view-controller"&gt;model-view-controller (MVC)&lt;/a&gt;
architecture for a website: for example, much finer control over generated HTML compared to
traditional ASP, test-driven development, and better URLs for SEO.&lt;/p&gt;

&lt;p&gt;The rest of the talk was a demo of creating a hello world MVC
application in Visual Studio.  The presenters walked through
updating models, views, and controllers, and setting up some basic
routing.  It seems that ASP.NET MVC is a fine re-implementation of
Ruby on Rails or Django for the Microsoft world.  One concrete
tip I learned was that in Visual Studio, Control-. will offer you some
completions or other shortcuts.&lt;/p&gt;

&lt;p&gt;Reception to this talk was somewhat mixed, at least as far as I
can tell from the blogs and Tweets about it. The talk itself could have been
improved, of course; for example it
would have helped for Patrick to have explained what MVC
stood for (with a few architecture diagrams) before plugging its
advantages for ten minutes.  My take is that if
you knew nothing about MVC, it was a straightforward talk that
gave an introduction to the concepts and the implementation in
.NET.  If you were already familiar with MVC, I think you
would have thought it pretty content-free as there wasn&amp;#8217;t a tremendous
amount of focus on the ASP.NET side of things.&lt;/p&gt;

&lt;h3&gt;John Resig&lt;/h3&gt;

&lt;p&gt;&lt;a href="http://ejohn.org/"&gt;John Resig&lt;/a&gt;, the creator and lead developer
of &lt;a href="http://jquery.org"&gt;jQuery&lt;/a&gt;, a very popular JavaScript library,
next took the stage to talk about JavaScript testing.&lt;/p&gt;

&lt;p&gt;&amp;#8220;Developing for JavaScript is a lot like whack-a-mole,&amp;#8221; John
reported.  The large space of operating systems, browsers,
browser versions, JavaScript engines, and browser plugins means
that typically if you fix one thing you&amp;#8217;re more than likely
to break something else.  And so, in some informal studies,
John found that people just don&amp;#8217;t test.  This is something
John would like to change.&lt;/p&gt;

&lt;p&gt;A unit test suite for JavaScript apparently isn&amp;#8217;t that hard to
write.  John threw up a bunch of increasingly feature-rich test
harnesses&amp;#8212;with asserts, grouping by role, and a test runner
web-page&amp;#8212;using a few dozen lines of JavaScript.  The hardest
part of writing a test suite is likely to be adding support for
asynchronous events (e.g. XMLHttpRequests).  Fortunately, there
are several pre-built suites such as
&lt;a href="http://docs.jquery.com/QUnit"&gt;QUnit&lt;/a&gt;,
&lt;a href="http://www.jsunit.net"&gt;JSUnit&lt;/a&gt;, &lt;a href="http://developer.yahoo.com/yui/yuitest/"&gt;YUI
Test&lt;/a&gt;, and
&lt;a href="http://seleniumhq.com/"&gt;Selenium&lt;/a&gt;.  John spent a bit of time
talking about the differences between these frameworks, and
particularly plugged QUnit and YUI Test.&lt;/p&gt;

&lt;p&gt;Selenium is of particular interest since, unlike the others,
it is not just a unit test framework.  It also has plugins
to allow recording and scripting events to a browser, so you
can do whole site testing.  It even comes with &lt;a href="http://selenium-grid.seleniumhq.org/"&gt;Selenium
Grid&lt;/a&gt; which will let
you distribute and automate testing.  This seems like a big win.&lt;/p&gt;

&lt;p&gt;There are also JavaScript engines, like
&lt;a href="http://www.mozilla.org/rhino/"&gt;Rhino&lt;/a&gt;, that run outside of a browser.  To help test code
in a browserless environment, John wrote &lt;a href="http://groups.google.com/group/envjs"&gt;env.js&lt;/a&gt;, which is a
browser-like environment that runs in pure JavaScript.  He talked about how this could be used for
screen scraping.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.flickr.com/photos/emilsit/3999843934/"
title="John Resig at DevDays Boston by Emil Sit, on Flickr"&gt;
&lt;img class="right" src="http://farm3.static.flickr.com/2582/3999843934_61ceeb4fce_m.jpg" width="240" height="160" title="&amp;#34;John Resig plugs testswarm.com for distributed JavaScript testing.&amp;#34;" alt="&amp;#34;John Resig plugs testswarm.com for distributed JavaScript testing.&amp;#34;"&gt;
&lt;/a&gt;
Finally, John introduced &lt;a href="http://testswarm.com/"&gt;testswarm.com&lt;/a&gt;.
This is a SETI@Home style site where anyone can visit the
site, download some tests to run and report back the results.
This should give very broad coverage and allow developers to
get feedback from a wide range of real environments (e.g.
mobile!).&lt;/p&gt;

&lt;p&gt;Overall, John&amp;#8217;s talk was a rapid-fire overview of JavaScript
testing resources from a JavaScript ninja.  It was very practical,
easy to follow and probably great for anyone who does JavaScript
development.  However, it lacked the &amp;#8220;Python is awesome!&amp;#8221; feel of
Ned Batchelder&amp;#8217;s morning talk and so for a non-JavaScript
developer such as myself, it was not as appealing.&lt;/p&gt;

&lt;h3&gt;Miguel De Icaza&lt;/h3&gt;

&lt;p&gt;&lt;a href="http://tirania.org/blog/"&gt;Miguel De Icaza&lt;/a&gt; closed out the day
with a talk on Mono.  Miguel explained that he wasn&amp;#8217;t really sure
what he should talk about&amp;#8212;Mono is a giant universe and
explaining &amp;#8220;Mono&amp;#8221; is &amp;#8220;kind of like explaining God&amp;#8221;&amp;#8212;and his
&lt;a href="http://tirania.org/blog/archive/2009/Sep-30.html"&gt;informal survey&lt;/a&gt; didn&amp;#8217;t
really provide a mandate.  He wound up giving some nice technical
demonstrations of some recent Mono developments, with a fair amount
of personal flair to keep the audience engaged.&lt;/p&gt;

&lt;p&gt;The core of Mono is an implementation of the Common Language
Runtime for Linux.  One of the goals of this project was
to bring the best development tools to Linux.&lt;/p&gt;

&lt;p&gt;The first demonstration Miguel gave was an impressive combination
of tools.  Using a &lt;a href="http://www.go-mono.com/monovs/"&gt;plugin to Visual
Studio&lt;/a&gt;, Miguel demonstrated that
you could develop a Linux port of a .NET application entirely using
Visual Studio on Windows, and seamlessly test and package it
on a Linux machine (or VM), by walking through a live example
of porting BlogEngine.NET.  This made use of Bonjour to
dynamically discover the Linux machines, pushing execution
to the selected machine, and viewing the debugging results
in Visual Studio.&lt;/p&gt;

&lt;p&gt;Miguel then decided that a developer might want to publish their
application as a &lt;a href="http://nat.org/blog/2009/07/suse-studio-10/"&gt;software
appliance&lt;/a&gt;, so he
walked through a complete demonstration of using &lt;a href="http://susestudio.com/"&gt;SUSE
Studio&lt;/a&gt;.  He seamlessly built an RPM on
his Linux box from Visual Studio and pushed it into &amp;#8220;the cloud&amp;#8221; of
SUSE Studio.  From his browser, he configured an appliance with
that RPM, baked it the way they do on cooking shows, booted the
virtual machine in the cloud, accessed it using a Flash-based console in
the web browser, and accessed the port of BlogEngine.NET that he
had just booted.&lt;/p&gt;

&lt;p&gt;For his second major demonstration, Miguel moved over to
&lt;a href="http://monotouch.net"&gt;MonoTouch&lt;/a&gt;.  This showed using
&lt;a href="http://monodevelop.com/"&gt;MonoDevelop&lt;/a&gt;, an IDE for Mono
developers, running on a Mac, to work with the iPhone interface
builder application, to build a simple flashlight application
(i.e., a giant white button) for the iPhone simulator.  He
also talked a little about the technical work involved here,
which was to compile the developer&amp;#8217;s Mono code into ARM assembly
and link it into the Mono runtime, to create an iPhone
application.  This gets around Apple&amp;#8217;s &amp;#8220;no interpreters&amp;#8221; rule.&lt;/p&gt;

&lt;p&gt;Miguel&amp;#8217;s talk was easily the most entertaining one of the day.  It
was perhaps most entertaining because, in addition to his wry
humor (check out &lt;a href="http://twitter.com/irobinson/status/4691930719"&gt;the pictures from Ian Robinson&amp;#8217;s
Tweet&lt;/a&gt;, for
example), as he performed his live demonstrations, things would
break, whereupon Miguel would think, realize what was wrong,
pop open a Terminal and fix it.  That&amp;#8217;s not something you
see in the usual carefully scripted demos at most shows.
Of course, Miguel was also demonstrating some interesting
technical features and giving an advertisement for a wide range
of Mono-related tools as well, so there was something for
everyone.&lt;/p&gt;

&lt;h3&gt;Wrap-up&lt;/h3&gt;

&lt;p&gt;After Miguel&amp;#8217;s talk, Joel suggested that the audience break
up into informal groups and get dinner, loosely organized
around seven areas that he suggested.  I hadn&amp;#8217;t planned for
that and had to get home; not sure how many people went.&lt;/p&gt;

&lt;p&gt;Overall, I think DevDays was well worth attending.  On the networking side, I got
to meet a few local developers (of whom I&amp;#8217;ve posted a few &lt;a href="http://www.flickr.com/photos/emilsit/sets/72157622558403216/"&gt;pictures on Flickr&lt;/a&gt;)
and catch up briefly with some acquaintances from school.  On the technical side, I got a broad overview of several popular technical areas from leading
figures in those areas.&lt;/p&gt;
</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/stackoverflow-devdays-boston-2009-afternoon/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[StackOverflow DevDays Boston 2009, Morning]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/DaIgUTmUb2M/" />
    <updated>2009-10-08T09:47:44-04:00</updated>
    <id>http://www.emilsit.net/blog/archives/stackoverflow-devdays-boston-2009-morning</id>
    <content type="html">&lt;p&gt;&lt;a href="http://stackoverflow.carsonified.com/events/boston"&gt;Boston DevDays&lt;/a&gt; kicked
off a month-long tour of technical talks aimed at programmers, organized
by &lt;a href="http://stackoverflow.com/"&gt;StackOverflow&lt;/a&gt; and &lt;a href="http://carsonified.com"&gt;Carsonified&lt;/a&gt;.
I had the good fortune to attend, meet a few interesting people and see
some fun talks.  I tried to write a bit in real-time &lt;a href="http://search.twitter.com/search?q=from:emilsit+%23devdays"&gt;(search Twitter here)&lt;/a&gt;
but the WiFi was pretty over-subscribed and there was no cell coverage
to speak of so eventually I gave up.  Here are some more detailed notes, starting with
the morning sessions.&lt;/p&gt;

&lt;p&gt;The day ran very much on schedule, and after some very loud music, DevDays opened with a funny video demonstrating the
competent technical leadership of the keynote speaker&amp;#8230;&lt;/p&gt;

&lt;h3&gt;Joel Spolsky&lt;/h3&gt;

&lt;p&gt;Joel opened the day by giving a demonstration of the tyranny of computers:
you are constantly interrupted with questions asking for decisions, like &amp;#8220;Do you want to install these
10 updates to Windows?&amp;#8221; or &amp;#8220;Enable caret browsing?&amp;#8221;, that can be really hard to answer.
He argued that one of the reasons that computers ask us so many questions is that programmers want to give
users the power (and control) to do what they want.  But that&amp;#8217;s another way of
saying that programmers don&amp;#8217;t want to make the decisions to keep things simple.  The
rest of Joel&amp;#8217;s somewhat meandering but always entertaining talk was a discussion
of how programmers (us!) should approach decision making, framed as a trade-off
between simplicity and power.&lt;/p&gt;

&lt;p&gt;Decisions are ultimately hard to make&amp;#8212;there have been many studies that
demonstrate that when people have too many choices, they freeze up and choose
nothing.  Thus, we&amp;#8217;ve seen a strong push towards simplicity in recent years; one
clear example of that has been the &lt;a href="http://gettingreal.37signals.com/"&gt;Getting Real book from 37signals&lt;/a&gt;.  Joel
(jokingly?) points out that three of the four 37signals apps are in fact just
one big textarea tag that you type into.  Other examples, given later, include
of course Apple&amp;#8217;s products and Google.&lt;/p&gt;

&lt;p&gt;But why do we wind up with programs with lots of options?  Well, if you have a
simple program, you find that most people won&amp;#8217;t buy your product if it doesn&amp;#8217;t
have feature X (or Y or Z or &amp;#8230;).  So you wind up adding features over
time, as you get more customers and more experience.  Thus, removing simplicity
often happens as a side-effect of making money.&lt;/p&gt;

&lt;p&gt;How to balance these?  If we don&amp;#8217;t want to take away all decisions from the
user, we need a rule to guide us in what to remove and what to keep.  One rule
to follow is that &lt;em&gt;the computer does not get to set the agenda&lt;/em&gt;.  Good decision
points are those that should help the user achieve what they want to do.  Bad
decision points are those that interrupt the user, that the user really isn&amp;#8217;t
equipped to answer (e.g., should GMail automatically display inline images in
HTML?), or that are things that only the programmer cares about.&lt;/p&gt;

&lt;p&gt;To decide what is good or bad, developers need a good model to understand what the user is trying
to do&amp;#8212;Joel says every user is ultimately trying to replicate their DNA, but
you may have some more refined model.  Joel gave the example of Amazon&amp;#8217;s
1-Click purchasing where the user should just be able to buy something with
a single click.  Apparently, early drafts of 1-Click really weren&amp;#8217;t
one click: programmers kept wanting to put in things like confirmation pages.
Eventually, they arrived at just one click&amp;#8212;by not immediately starting the
order processing, but holding it for a few minutes to consolidate related
1-Click orders and allow for cancellation of errors.  This was more work for
the developers, but simpler for the user.  This is what we want to see happen.&lt;/p&gt;

&lt;p&gt;Overall, I think Joel&amp;#8217;s talk set a nice tone for how we should think as developers
but didn&amp;#8217;t offer anything particularly ground-breaking.&lt;/p&gt;

&lt;h3&gt;Ned Batchelder on Python&lt;/h3&gt;

&lt;p&gt;&lt;a href="http://nedbatchelder.com/"&gt;Ned Batchelder&lt;/a&gt; presented Python as a &amp;#8220;Clean noise free environment to type your ideas.&amp;#8221;
Or, alternatively, &amp;#8220;Python is awesome.&amp;#8221;  With a few bullets to lead-off (e.g.,
the REPL, duck typing, and &amp;#8220;batteries included&amp;#8221; nature of Python), we dove into code to
really understand what Python is capable of doing.
&lt;a href="http://nedbatchelder.com/text/devdays.html"&gt;Ned&amp;#8217;s slides&lt;/a&gt; are online if you want to
take a look.  I won&amp;#8217;t cover his talk in much detail since it was largely
explaining the Python language but will list some high points.&lt;/p&gt;

&lt;p&gt;The first example was Peter Norvig&amp;#8217;s concise &lt;a href="http://norvig.com/spell-correct.html"&gt;spell corrector&lt;/a&gt;.
This code makes ample use of &lt;a href="http://docs.python.org/tutorial/datastructures.html#list-comprehensions"&gt;list comprehensions&lt;/a&gt; and &lt;a href="http://docs.python.org/tutorial/datastructures.html#sets"&gt;sets&lt;/a&gt;, plus functions like &lt;a href="http://docs.python.org/library/re.html#re.findall"&gt;&lt;code&gt;re.findall&lt;/code&gt;&lt;/a&gt;,
&lt;a href="http://docs.python.org/library/collections.html#collections.defaultdict"&gt;&lt;code&gt;collections.defaultdict&lt;/code&gt;&lt;/a&gt;, and &lt;a href="http://docs.python.org/library/functions.html#max"&gt;parameterized &lt;code&gt;max&lt;/code&gt;&lt;/a&gt;, which were all new to me.&lt;/p&gt;
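
&lt;p&gt;To give a flavor of those three features, here is a minimal sketch in the spirit of the spell corrector. It is not Norvig&amp;#8217;s code: the toy corpus and the helper names &lt;code&gt;words&lt;/code&gt;, &lt;code&gt;train&lt;/code&gt;, and &lt;code&gt;best&lt;/code&gt; are my own assumptions, chosen only to show &lt;code&gt;re.findall&lt;/code&gt;, &lt;code&gt;collections.defaultdict&lt;/code&gt;, and &lt;code&gt;max&lt;/code&gt; with a &lt;code&gt;key&lt;/code&gt; working together.&lt;/p&gt;

```python
import re
from collections import defaultdict


def words(text):
    # re.findall returns every non-overlapping match as a list of strings.
    return re.findall(r"[a-z]+", text.lower())


def train(features):
    # defaultdict(int) supplies 0 for unseen keys, so no membership check is needed.
    model = defaultdict(int)
    for word in features:
        model[word] += 1
    return model


# Hypothetical toy corpus; a real corrector would train on a large text file.
COUNTS = train(words("the quick brown fox jumps over the lazy dog and the cat"))


def best(candidates):
    # Keep only candidates seen in the corpus, then let max() rank them by
    # count via key= (the "parameterized max") rather than alphabetically.
    known = {c for c in candidates if c in COUNTS}
    return max(known, key=COUNTS.get) if known else None
```

&lt;p&gt;Calling &lt;code&gt;best(["fox", "the", "zebra"])&lt;/code&gt; picks the candidate with the highest count in the toy corpus, and an empty result is signalled with &lt;code&gt;None&lt;/code&gt;.&lt;/p&gt;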

&lt;p&gt;The second example was a custom build-up of a simple Python
templating engine, loosely based on &lt;a href="http://www.djangoproject.com/"&gt;Django&lt;/a&gt;&amp;#8217;s template styles.
This example demonstrated simple formatting with the &lt;a href="http://docs.python.org/tutorial/inputoutput.html#fancier-output-formatting"&gt;percent operator&lt;/a&gt;, but then quickly moved into more advanced features
like duck-typing (by implementing &lt;a href="http://docs.python.org/reference/datamodel.html#object.__getitem__"&gt;&lt;code&gt;__getitem__&lt;/code&gt;&lt;/a&gt;) and &lt;a href="http://docs.python.org/library/functions.html#callable"&gt;&lt;code&gt;callable&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
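
&lt;p&gt;The duck-typing trick can be sketched roughly as follows. This is my own reconstruction of the idea, not the code from Ned&amp;#8217;s slides: the &lt;code&gt;Context&lt;/code&gt; class and &lt;code&gt;render&lt;/code&gt; function are hypothetical names. The point is that the percent operator only needs an object that supports &lt;code&gt;__getitem__&lt;/code&gt;, so the lookup can do extra work, such as calling a value when it is &lt;code&gt;callable&lt;/code&gt;.&lt;/p&gt;

```python
class Context(object):
    """Mapping-like wrapper; the % operator only needs __getitem__."""

    def __init__(self, values):
        self.values = values

    def __getitem__(self, key):
        value = self.values[key]
        # If the stored value is callable, call it to get the value to insert.
        if callable(value):
            value = value()
        return value


def render(template, values):
    # Each "%(name)s" placeholder is looked up through Context.__getitem__.
    return template % Context(values)


greeting = render(
    "Hello, %(name)s! You have %(count)d items.",
    {"name": "Ned", "count": lambda: 3},  # count is computed lazily
)
```

&lt;p&gt;A fuller engine would presumably add loops, conditionals, and dotted lookups, but the basic substitution is already done by the language itself.&lt;/p&gt;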

&lt;p&gt;One of the nice things about Ned&amp;#8217;s presentation is that he demonstrated the
power of Python in two short but extremely powerful examples that left people
who didn&amp;#8217;t know Python thinking, &amp;#8220;Wow! That is really amazing!&amp;#8221;&lt;/p&gt;

&lt;p&gt;Ned recommended reading &lt;a href="http://diveintopython.org"&gt;Dive into Python&lt;/a&gt;
to learn more and, if you are in the Boston area, checking out the &lt;a href="http://python.meetup.com/181/"&gt;Cambridge Python meetup&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Dan Pilone&lt;/h3&gt;

&lt;p&gt;&lt;a href="http://twitter.com/danpilone"&gt;Dan Pilone&lt;/a&gt;&amp;#8217;s talk was an overview of iPhone development,
first giving a quick market overview, then giving a broad overview
of the nitty-gritty of developing in Objective C, and finally diving
into the practical economics and realities of selling iPhone apps.
The iPhone app market (as of Apple&amp;#8217;s September numbers), consists of
something like 85,000 applications, of which 75% are pay
applications.  Some 35% of applications are games, whereas 6% fall
into more social applications.  There are 5 or 6 different
iPhone/iTouch hardware platforms, and something like 20 million
plus devices sold.  The iPhone has a great user
experience and comes with a great delivery model (the iTunes app
store).  This combined with some of the numbers makes it a
good platform to develop for, with significant amounts of money
that can be earned.  (Dan emphasized that you really have to work
to develop and market the application, hence the &amp;#8220;earned&amp;#8221;.)&lt;/p&gt;

&lt;p&gt;The iPhone development environment is very shiny.  I&amp;#8217;m sure
&lt;a href="http://arstechnica.com/apple/news/2008/10/iPhone-SDK.ars"&gt;Ars Technica&lt;/a&gt;
has a much better overview but in short Dan demonstrated some
tools like Xcode, CoreData (a graphical SQLite data modeler),
reference counting support (aka retain), and Instruments (a memory
profiler).  Dan suggested that this shininess is to make up
for some of the oddities of &lt;a href="http://en.wikipedia.org/wiki/Objective-C"&gt;Objective C&lt;/a&gt;
that you&amp;#8217;ll have to live with.  He also demonstrated some of
the interface builder tools and how they link up.&lt;/p&gt;

&lt;p&gt;Testing turns out to be quite interesting; the simulator
is okay but limited and often your app will work in the simulator
but fail on real devices.  For example, your app on a real device
might &amp;#8220;run so slow you wish it had crashed&amp;#8221;.  The simulator also doesn&amp;#8217;t
enforce sandboxing as strictly as real devices, where
each app has its own uid and single directory where it
can store data.  There are also many different hardware
variants that you have to support that limit you in different ways: for example, early iPhones only give you
40MB of memory to play with whereas newer ones give you almost 120MB.  This is not reflected well in the simulator either.&lt;/p&gt;

&lt;p&gt;Shipping an iPhone app on the app store requires approval,
a process that can take two weeks &lt;em&gt;per round-trip&lt;/em&gt;.
There&amp;#8217;s no way to get around it so you must budget in time
for that in development.  The approval process helps guarantee
a minimal level of quality of apps&amp;#8212;they will verify that
your app indeed works on all different hardware, and they will
(eventually) catch any licensing violations but they&amp;#8217;re
overall pretty reasonable.&lt;/p&gt;

&lt;p&gt;Once you get approval, you show up in the recent releases section
of the app store, and there you have about 24 hours to get popular or
else you will fade into the long tail of oblivion. In fact, if you are in the top 50 apps,
you will easily get double the sales (presumably relative to the 51st app); if you
are in the top 10, you&amp;#8217;ll be getting an order-of-magnitude more sales. So, make
sure you get your approval/release-date lined up with your marketing blitz.
The alternative is to charge a bit more than $0.99, and go for slow but steady sales.&lt;/p&gt;

&lt;p&gt;As a Blackberry owner and Linux user, I found Dan&amp;#8217;s talk to be a great introduction
to iPhone development.  Presumably his new ORA book, &lt;a href="https://www.amazon.com/dp/0596803540?tag=indashe-20&amp;amp;camp=0&amp;amp;creative=0&amp;amp;linkCode=as4&amp;amp;creativeASIN=0596803540&amp;amp;adid=11M4W940YFT2A0HNB3MM&amp;amp;"&gt;Head First iPhone Development&lt;/a&gt;, would be a good buy if you are into that sort of thing.&lt;/p&gt;

&lt;h3&gt;Joel on FogBugz&lt;/h3&gt;

&lt;p&gt;Before lunch, Joel took the opportunity to give a pitch for his company&amp;#8217;s FogBugz product,
and announce some new features.  He gave us a walk-through of its capabilities, from
organizing documentation, to project planning, to bug tracking, to user support.  New features announced
are a rich plugin architecture, plus support for Mercurial and code reviews
in a new hosted plug-in to FogBugz called &lt;a href="http://fogcreek.com/kiln/"&gt;Kiln&lt;/a&gt;.  He spent a fair amount of time
on that, demonstrating calling &lt;code&gt;hg push&lt;/code&gt; from the command line.  He also demonstrated the &lt;a href="http://www.joelonsoftware.com/items/2007/10/26.html"&gt;evidence-based scheduling&lt;/a&gt; features of FogBugz.&lt;/p&gt;

&lt;p&gt;Nothing too exciting for me working in a big company using Perforce, but a good marketing opportunity for FogCreek
and a nice chance to see how some other people do scheduling and bug tracking.  I was a bit disappointed that
there&amp;#8217;s no direct way to do pre-commit (i.e. pre-push) reviews a la &lt;a href="http://code.google.com/p/gerrit/"&gt;Gerrit&lt;/a&gt;, but &lt;a href="http://twitter.com/jasonrr/status/4689929004"&gt;@jasonrr&lt;/a&gt; says you can set up a branch repo, push to that and then review there before merging to main.  I expect this means
that &lt;a href="http://github.com"&gt;GitHub&lt;/a&gt; will be getting code review support soon.&lt;/p&gt;

&lt;h3&gt;Lunch!&lt;/h3&gt;

&lt;p&gt;With that, we broke for lunch. &lt;a href="http://twitter.com/jldio"&gt;@jldio&lt;/a&gt; has the &lt;a href="http://www.jimdio.net/?p=23"&gt;scoop on lunch&lt;/a&gt;, and his own write-up of the day too.
More to come later, thanks for reading.&lt;/p&gt;
</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/stackoverflow-devdays-boston-2009-morning/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Systems Researchers: Mike Freedman]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/hKptx3wifVY/" />
    <updated>2009-03-25T06:00:49-04:00</updated>
    <id>http://www.emilsit.net/blog/archives/systems-researchers-mike-freedman</id>
    <content type="html">&lt;p&gt;&lt;a href="http://www.cs.princeton.edu/~mfreed/"&gt;Mike Freedman&lt;/a&gt; and I have known each other since we were Masters students at MIT, working on things like the Tarzan anonymizing network (a parallel, pre-cursor to Tor).  He went on to build the hugely successful (&amp;#8220;as seen on Slashdot&amp;#8221;) Coral content distribution network, which figured largely in his dissertation.  It&amp;#8217;s a great treat to have him talk here about how Coral was built and deployed.  Be sure also to check out his &lt;a href="http://sns.cs.princeton.edu/blog/"&gt;research group blog&lt;/a&gt; for more interesting thoughts from him and his students!&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What did you build?&lt;/em&gt;&lt;br/&gt;
&lt;a href="http://www.coralcdn.org/"&gt;CoralCDN&lt;/a&gt; is a semi-open content distribution network.  Our stated goal with &lt;a href="http://www.coralcdn.org/"&gt;CoralCDN&lt;/a&gt; was to
&amp;#8220;democratize content publication&amp;#8221;: namely, allow websites to scale by
demand, without requiring special provisioning or commercial CDNs to
provide the &amp;#8220;heavy lifting&amp;#8221; of serving their bits.  Publishing with
CoralCDN is as easy as slightly modifying a URL to include &lt;code&gt;.nyud.net&lt;/code&gt;
in its hostname (e.g., &lt;a
href="http://www.cnn.com.nyud.net/"&gt;http://www.cnn.com.nyud.net/&lt;/a&gt;),
and the content is subsequently requested from and served by
CoralCDN&amp;#8217;s network of caching web proxies.&lt;/p&gt;
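
&lt;p&gt;(Editor&amp;#8217;s aside: the URL rewrite described above can be sketched in a few lines. This is only an illustration, assuming the suffix is simply appended to the hostname as in the example; the function name &lt;code&gt;coralize&lt;/code&gt; is hypothetical, and explicit ports or credentials in the URL are not handled.)&lt;/p&gt;

```python
from urllib.parse import urlsplit, urlunsplit

CORAL_SUFFIX = ".nyud.net"


def coralize(url):
    """Return url with CORAL_SUFFIX appended to its hostname."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Leave URLs without a host, or already rewritten ones, untouched.
    if not host or host.endswith(CORAL_SUFFIX):
        return url
    # Assumption: only the bare hostname matters; ports and userinfo are dropped.
    return urlunsplit(
        (parts.scheme, host + CORAL_SUFFIX, parts.path, parts.query, parts.fragment)
    )
```

&lt;p&gt;Under these assumptions, &lt;code&gt;coralize("http://www.cnn.com/")&lt;/code&gt; yields the URL shown in the example above, and applying it twice changes nothing further.&lt;/p&gt;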

&lt;p&gt;Our initial goal for deploying &lt;a href="http://www.coralcdn.org/"&gt;CoralCDN&lt;/a&gt; was a network of volunteer
sites that would cooperate to provide such &amp;#8220;automated mirroring&amp;#8221;
functionality, much like sites do somewhat manually with open-source
software distribution.  As we progressed, I also imagined that small
groups of users could cooperate in a form of time-sharing for
network bandwidth: they each would provide some relatively constant
amount of upload capacity, with the goal of being able to then handle
any sudden spikes (from the Slashdot effect, for example) to any
participant.  This model fits well with how 95th-percentile billing
works for hosting and network providers, as it then becomes very
important to flatten out bandwidth spikes.  We started a deployment of
CoralCDN on &lt;a href="http://www.planet-lab.org/"&gt;PlanetLab&lt;/a&gt;, although it never really migrated off that
network.  (We did have several hundred users, and even some major
Internet exchange points, initially contact us to run CoralCDN nodes,
but we didn&amp;#8217;t go down that path, both for manageability and security
reasons.)&lt;/p&gt;

&lt;p&gt;CoralCDN consists of three main components, all written from scratch:
a special-purpose web proxy, nameserver, and distributed hash table
(DHT) indexing node.  CoralCDN&amp;#8217;s proxy and nameserver are what they
sound like, although they have some differences given that they are
specially designed for our setting.  The proxy has a number of design
choices and optimizations well-suited for interacting with websites
that are on their last legs&amp;#8212;CoralCDN is designed for dealing with
&amp;#8220;Slashdotted&amp;#8221; websites, after all&amp;#8212;as well as being part of a big
cooperative caching network.  The nameserver, on the other hand, is
designed to dynamically synthesize DNS names (of the form &lt;code&gt;.nyud.net&lt;/code&gt;),
provide some locality and load balancing properties when selecting
proxies (address records it returns), and ensure that the returned
proxies are actually alive (as the proxy network itself is comprised
of unreliable servers).  The indexing node forms a DHT-based
structured routing and lookup structure that exposes a put/get
interface for finding other proxies caching a particular web object.
Coral&amp;#8217;s indexing layer differs from traditional DHTs (such as
&lt;a href="http://pdos.csail.mit.edu/chord/"&gt;MIT&amp;#8217;s Chord/DHash&lt;/a&gt;) in that it creates a hierarchy of locality-based
clusters, each of which maintains a separate DHT routing structure and
put/get table, and it provides weaker consistency properties within
each DHT structure.  These latter guarantees are possible because
Coral only needs to find &lt;em&gt;some&lt;/em&gt; proxies (preferably nearby ones)
caching a particular piece of content, not &lt;em&gt;all&lt;/em&gt; such proxies.&lt;/p&gt;
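
&lt;p&gt;(Editor&amp;#8217;s aside: a toy model of that lookup, to make the &amp;#8220;some, not all&amp;#8221; idea concrete. Plain in-memory dictionaries stand in for the per-cluster DHTs, and the names &lt;code&gt;ClusterIndex&lt;/code&gt; and &lt;code&gt;find_proxies&lt;/code&gt; are hypothetical; Coral&amp;#8217;s real indexing layer is a distributed C++ system and certainly differs in the details.)&lt;/p&gt;

```python
class ClusterIndex(object):
    """Stand-in for one cluster's put/get table (a dict, not a real DHT)."""

    def __init__(self):
        self.table = {}

    def put(self, key, proxy):
        # Several proxies may cache the same object, so keep a list per key.
        self.table.setdefault(key, []).append(proxy)

    def get(self, key, limit=2):
        # Weak guarantee: return at most `limit` proxies, not every one.
        return self.table.get(key, [])[:limit]


def find_proxies(levels, key, limit=2):
    # `levels` is ordered from the nearest cluster to the global one;
    # stop at the first level that knows of any proxy caching the key.
    for index in levels:
        found = index.get(key, limit)
        if found:
            return found
    return []


local, regional, world = ClusterIndex(), ClusterIndex(), ClusterIndex()
levels = [local, regional, world]
world.put("http://example.com/", "far-proxy")
far_only = find_proxies(levels, "http://example.com/")
local.put("http://example.com/", "near-proxy")
near_first = find_proxies(levels, "http://example.com/")
```

&lt;p&gt;Once a nearby cluster has an entry, the wider clusters are never consulted, which is the locality preference described above.&lt;/p&gt;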

&lt;p&gt;&lt;em&gt;Tell us about what you built it with.&lt;/em&gt;&lt;br/&gt;
CoralCDN is built in C++ using &lt;a href="http://www.scs.stanford.edu/~dm/"&gt;David Mazieres&lt;/a&gt;&amp;#8217;
&lt;a href="http://www.okws.org/doku.php?id=sfslite"&gt;libasync and libarpc&lt;/a&gt;
libraries, originally built for the Self-certifying File System (SFS).
This came out of my own academic roots in MIT&amp;#8217;s PDOS group, where SFS
was developed by David and its libraries are widely used.  (David was
my PhD advisor at NYU/Stanford, and I got my MEng degree in PDOS.)
Some of the HTTP parsing libraries used by CoralCDN&amp;#8217;s web proxy were
from &lt;a href="http://www.okws.org/"&gt;OKWS&lt;/a&gt;, &lt;a href="http://maxk.org/"&gt;Max Krohn&lt;/a&gt;&amp;#8217;s webserver also written using SFS libraries.
Max was research staff with David at NYU during part of the time I was
there.  It&amp;#8217;s always great to use libraries written by people you know
and can bother when you find a bug (although for those two, that was a
rare occurrence indeed!).&lt;/p&gt;

&lt;p&gt;When I started building CoralCDN in late 2002, I initially attempted
to build its hierarchical indexing layer on top of the &lt;a href="http://pdos.csail.mit.edu/chord/"&gt;MIT Chord/DHash&lt;/a&gt;
implementation, which also used SFS libraries.  This turned out to be
a mistake (dare I say nightmare?), as there was a layering mismatch
between the two systems: I wanted to build distinct, localized DHT
clusters in a certain way, while Chord/DHash sought to build a single,
robust, global system.  It was thus rather promiscuous in maintaining
group membership, and I was really fighting the way it was designed.
Plus, &lt;a href="http://pdos.csail.mit.edu/chord/"&gt;MIT Chord&lt;/a&gt; was still research-quality code at the time, so bugs
naturally existed, and it was really difficult to debug the resulting
system with large portions of complex, distributed systems code that I
hadn&amp;#8217;t written myself.  Finally, we initially thought that the &amp;#8220;web
proxy&amp;#8221; part of the system would be really simple, so our original
proxy implementation was just in Python.  CoralCDN&amp;#8217;s first
implementation was scrapped after about 6 months of work, and I
restarted by writing my own DHT layer and proxy (in C++ now) from
scratch.  It turns out that the web proxy has actually become the
largest code base of the three, continually expanded during the
system&amp;#8217;s deployment to add security, bandwidth shaping and
fair-sharing, and various other robustness mechanisms.&lt;/p&gt;

&lt;p&gt;Anyway, back to development libraries. I think the SFS libraries
provide a powerful development library that makes it easy to build
flexible, robust, fast distributed services&amp;#8230;provided that one spends
time overcoming their higher learning curve.  Once you learn them,
however, they make it really easy to program in an event-based style,
and the RPC libraries prevent many of the silly bugs normally
associated with writing your own networking protocols.  I think Max&amp;#8217;s
&lt;a href="http://www.okws.org/doku.php?id=sfslite:tame2"&gt;tame&lt;/a&gt; libraries significantly improve the readability and (hopefully)
lessen the learning curve of doing such event-based programming, as
tame &lt;a href="http://www.usenix.org/events/usenix07/tech/krohn.html"&gt;removes the &amp;#8220;stack-ripping&amp;#8221;&lt;/a&gt; that one normally sees associated
with events.  Perhaps I&amp;#8217;ll use tame in future projects, but as I&amp;#8217;ve
already climbed the learning curve of &lt;a href="http://www.okws.org/doku.php?id=sfslite"&gt;libasync&lt;/a&gt; myself, I haven&amp;#8217;t yet.&lt;/p&gt;

&lt;p&gt;That said, one of my PhD students at Princeton, &lt;a
href="http://www.cs.princeton.edu/~jterrace/"&gt;Jeff Terrace&lt;/a&gt;, is
building a high-throughput, strongly-consistent object-based
(key/value) storage system called CRAQ using tame.  He seems to
really like it.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How did you test your system for correctness?&lt;/em&gt;&lt;br/&gt;
I deploy it?  In seriousness, it&amp;#8217;s very difficult to test web proxies,
especially ones deployed in chaotic environments and interacting with
poorly-behaving clients and servers.&lt;/p&gt;

&lt;p&gt;I did most of my testing during initial closed experiments on about
150-300 PlanetLab servers, which is a distributed testbed deployed at
a few hundred universities and other institutions that each operate
two or more servers.  Testing that the DHT &amp;#8220;basically&amp;#8221; worked was
relatively easy: see if you actually get() what you put().  There are
a lot of corner cases here, however, especially when one encounters
weird network conditions, some of which only became apparent after we
moved Coral from the network-friendly North American and European
nodes to those PlanetLab servers in China, India, and Australia.
Always be suspicious of systems papers that describe the authors&amp;#8217;
&amp;#8220;wide deployment&amp;#8221; on &amp;#8220;selected&amp;#8221; (i.e., cherry-picked) U.S. PlanetLab
servers.&lt;/p&gt;
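
&lt;p&gt;As a rough illustration of that put()/get() round-trip check, here is a
minimal Python sketch.  It is not Coral code: &lt;code&gt;InMemoryDHT&lt;/code&gt; and
&lt;code&gt;roundtrip_failures&lt;/code&gt; are hypothetical names, and the in-memory
stand-in only marks the place where a client for a real deployment would plug in.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import os


class InMemoryDHT:
    """Hypothetical stand-in; a real test would wrap a client for live nodes."""

    def __init__(self):
        self._store = {}

    def put(self, key, value):
        self._store[key] = value

    def get(self, key):
        # Returns None when the key is unknown.
        return self._store.get(key)


def roundtrip_failures(dht, trials=100):
    """Put random key/value pairs, then list the keys whose get() disagrees."""
    expected = {}
    for _ in range(trials):
        key, value = os.urandom(20), os.urandom(32)
        dht.put(key, value)
        expected[key] = value
    return [key for key, value in expected.items() if dht.get(key) != value]


if __name__ == "__main__":
    failures = roundtrip_failures(InMemoryDHT())
    print("mismatched keys:", len(failures))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A check like this only covers the easy case.  The corner cases described
above show up when the puts and gets are issued from different nodes under
poor network conditions.&lt;/p&gt;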

&lt;p&gt;Much of the testing was just writing the appropriate level of debug
information so we could trace requests through the system.  I got
really tired of staring at routing table dumps at that time.  Last
year I worked with Rodrigo Fonseca to integrate &lt;a
href="http://x-trace.net"&gt;X-Trace&lt;/a&gt; into CoralCDN, which would have
made it &lt;em&gt;significantly&lt;/em&gt; easier to trace transactions through the DHT
and the proxy network.  I&amp;#8217;m pretty excited about such tools for
debugging and monitoring distributed systems in a fine-grained
fashion.&lt;/p&gt;

&lt;p&gt;Testing all the corner cases for the proxy turned out to be another
level of frustration.  There&amp;#8217;s really no good way to completely debug
these systems without rolling them out into production deployments,
because there&amp;#8217;s no good suite of possible test cases: The potential
&amp;#8220;space&amp;#8221; of inputs is effectively unlimited.  You constantly run into
clients and servers that completely break the HTTP spec, and you just
need to write your server to deal with these appropriately.  Writing a
proxy thus becomes a little bit of learning to &amp;#8220;guess&amp;#8221; what developers
mean.  I think this actually has become worse with time.  Your basic
browser (Firefox, IE) or standard webserver (Apache) is going to be
quite spec-compliant.  The problem is that you now have random
developers writing client software (like podcasters, RSS readers,
etc.) or generating Ajax-y XMLHttpRequests.  Or casual developers
dynamically generating HTTP on the server-side via some scripting
language like PHP.  Because who needs to generate vaguely
spec-compliant HTTP if you are writing both the client and server?
(Hint: there might be a middlebox on path.)  And as it continues to
become even easier to write Web services, you&amp;#8217;ll probably continue to
see lots of messy inputs and outputs from both sides.&lt;/p&gt;

&lt;p&gt;So while I originally tested CoralCDN using its own controlled
PlanetLab experiments, after the system went live, I started testing
new versions by just pushing them out to one or a few nodes in the
live deployment.  Then I just monitor these new versions carefully
and, if things seem to work, slowly push them out across the entire
network.  Coral nodes include a shared secret in their packet headers,
which excludes random people from joining our deployment.  I also use
these shared secrets to deploy new (non-backwards-compatible) versions
of the software, as the new version (with a new secret nonce) won&amp;#8217;t
link up with DHT nodes belonging to previous versions.&lt;/p&gt;
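
&lt;p&gt;The effect of the shared secret can be sketched in a few lines of Python.
Coral&amp;#8217;s actual header format is not described here, so the packet layout,
the &lt;code&gt;Node&lt;/code&gt; and &lt;code&gt;Packet&lt;/code&gt; classes, and the secrets below
are all assumptions made for illustration.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    secret: bytes   # deployment/version secret carried in the header
    payload: bytes


class Node:
    """Hypothetical node that only peers with senders sharing its secret."""

    def __init__(self, name, secret):
        self.name = name
        self.secret = secret
        self.peers = set()

    def make_packet(self, payload):
        return Packet(self.secret, payload)

    def receive(self, sender_name, packet):
        # Drop anything whose header secret differs from ours, so strangers
        # and nodes running an incompatible release never become peers.
        if not hmac.compare_digest(packet.secret, self.secret):
            return False
        self.peers.add(sender_name)
        return True


if __name__ == "__main__":
    old_a = Node("old-a", b"secret-v1")
    old_b = Node("old-b", b"secret-v1")
    new_c = Node("new-c", b"secret-v2")
    print(old_b.receive("old-a", old_a.make_packet(b"ping")))
    print(old_b.receive("new-c", new_c.make_packet(b"ping")))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In this sketch, nodes started with a new secret simply form their own
network, which is what lets an incompatible release roll out next to the old one.&lt;/p&gt;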

&lt;p&gt;&lt;em&gt;How did you deploy your system? How big of a deployment?&lt;/em&gt;&lt;br/&gt;
CoralCDN has been running 24/7 on 200-400 PlanetLab servers since
March 2004.  I manage the network using &lt;a href="http://appmanager.berkeley.intel-research.net/"&gt;AppManager&lt;/a&gt;, built by Ryan
Huebsch from Berkeley, which provides a SQL server that keeps a record
of current node run state, requested run state, install state, etc.
So AppManager gives me a Web interface to control the desired runstate
of nodes, and all nodes &amp;#8220;call home&amp;#8221; to the AppManager server to
determine updated runstate.  You write a bunch of shell scripts to
actually use these run states to start or stop nodes, manage logs,
etc.  This &amp;#8220;bunch of shell scripts&amp;#8221; eventually grew to be about 3000
lines of &lt;code&gt;bash&lt;/code&gt;, which was somewhat unexpected.  While AppManager is a
single server (although nodes are configured with a backup host for
failover), CoralCDN&amp;#8217;s scripts are designed for nodes to &amp;#8220;fail same&amp;#8221;.
That is, requested runstate is stored durably on each node, so if the
management server is offline or returns erroneous data (which it has
in the past), the nodes will maintain their last requested runstate
until the management server comes back online and provides a valid
status update.&lt;/p&gt;
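
&lt;p&gt;A minimal sketch of the &amp;#8220;fail same&amp;#8221; idea follows.  The real scripts are
&lt;code&gt;bash&lt;/code&gt;; this Python version only illustrates the logic, and the file
layout, the state names, and the &lt;code&gt;fetch&lt;/code&gt; callback are hypothetical.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import json
import os
import tempfile

VALID_STATES = ("running", "stopped")  # assumed set of run states


def load_runstate(path, default="stopped"):
    """Read the last requested runstate stored on this node."""
    try:
        with open(path) as f:
            return json.load(f)["runstate"]
    except (OSError, ValueError, KeyError, TypeError):
        return default


def save_runstate(path, state):
    """Write to a temp file and rename it, so a crash cannot leave a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"runstate": state}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def update_runstate(path, fetch):
    """Ask the management server for the requested state; keep the old one on failure."""
    current = load_runstate(path)
    try:
        requested = fetch()
    except Exception:
        return current          # server unreachable: fail same
    if requested not in VALID_STATES:
        return current          # erroneous answer: fail same
    if requested != current:
        save_runstate(path, requested)
    return requested


if __name__ == "__main__":
    state_file = os.path.join(tempfile.mkdtemp(), "runstate.json")
    print(update_runstate(state_file, lambda: "running"))
    print(update_runstate(state_file, lambda: "not-a-state"))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The point of the design is that the node never derives its state from a
missing or malformed answer.  Only a valid update changes what is stored on disk.&lt;/p&gt;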

&lt;p&gt;&lt;em&gt;How did you evaluate your system?&lt;/em&gt;&lt;br/&gt;
We performed all the experiments one might expect in an academic
evaluation on an initial test deployment on PlanetLab. Our &lt;a
href="http://www.cs.princeton.edu/~mfreed/docs/coral-nsdi04.pdf"&gt;NSDI
&amp;#8216;04&lt;/a&gt; paper discusses these experiments.&lt;/p&gt;

&lt;p&gt;After that stage, CoralCDN just runs &amp;#8211; people continue to use it, so
it provides some useful functionality.  My interest transitioned from
providing great service to just keeping it running (while I moved on to
other research).&lt;/p&gt;

&lt;p&gt;I probably spend about 10 minutes a week &amp;#8220;keeping CoralCDN running&amp;#8221;,
which is typically spent answering abuse complaints, rather than
actually managing the system.  This is largely because the system&amp;#8217;s
algorithms were designed to be completely self-organizing &amp;#8211; as we
initially thought of CoralCDN as a peer-to-peer system &amp;#8211; as opposed
to a centrally-managed system designed for PlanetLab.  System
membership, fault detection and recovery, etc., is all completely
automated.&lt;/p&gt;

&lt;p&gt;Unfortunately, dynamic membership and failover don&amp;#8217;t extend to the
primary nameservers we have registered for &lt;code&gt;.nyud.net&lt;/code&gt; with the &lt;code&gt;.net&lt;/code&gt;
gTLD servers.  These 10-12 nameservers also run on PlanetLab servers,
so if one of these servers goes offline, our users experience bad DNS
timeouts until I manually remove that server from the list registered
with Network Solutions.  (PlanetLab doesn&amp;#8217;t provide any IP-layer
virtualization that would allow us to failover to alternate physical
servers without modifying the registered IP addresses.)  And I have to
admit I&amp;#8217;m pretty lazy about updating the DNS registry, especially
given the rather painful web UI that Network Solutions provides.  (In
fairness, the UIs for GoDaddy and other registrars I&amp;#8217;ve used are
similarly painful.)  I think registrars should really provide a
programmatic API for updating entries, but I haven&amp;#8217;t found one for
low-cost registrars yet.  Anyway, offline nameservers are probably the
biggest performance problem with CoralCDN, and probably the main
reason it seems slow at times.  This is partly a choice I made,
however, in not becoming a vigilant system operator for which managing
CoralCDN becomes a full-time job.&lt;/p&gt;

&lt;p&gt;There&amp;#8217;s a lesson to be had here, I think, for academic systems that
somehow &amp;#8220;escape the lab&amp;#8221; but don&amp;#8217;t become commercial services: either
promote lessened expectations for your users (and accept that reality
yourself), build up a full-time developer/operations staff (a funding
quandary), or expect the project to die off soon after its initial
developers lose interest or incentives.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Anything you’d like to add?&lt;/em&gt;&lt;br/&gt;
My research group actually just launched a &lt;a
href="http://sns.cs.princeton.edu/blog/"&gt;blog&lt;/a&gt;.  In the next few
weeks, I&amp;#8217;ll be writing a series about some of the lessons I&amp;#8217;ve learned
from building and deploying CoralCDN.  I want to invite all your
readers to look out for those posts and really welcome any comments or
discussion around them.&lt;/p&gt;
</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/systems-researchers-mike-freedman/</feedburner:origLink></entry>
  
  <entry>
    <title type="html"><![CDATA[Systems Researchers: Justin Cappos]]></title>
    <link href="http://feedproxy.google.com/~r/EmilSitMainBlog/~3/PcFnGO5uB8g/" />
    <updated>2009-03-16T06:00:22-04:00</updated>
    <id>http://www.emilsit.net/blog/archives/systems-researchers-justin-cappos</id>
    <content type="html">&lt;p&gt;&lt;a href="http://www.cs.arizona.edu/~justin/"&gt;Justin Cappos&lt;/a&gt; received his PhD from the &lt;a href="http://www.cs.arizona.edu/"&gt;University of Arizona&lt;/a&gt; under the supervision of &lt;a href="http://www.cs.arizona.edu/people/jhh/"&gt;John Hartman&lt;/a&gt;.  I met Justin several years ago at a PlanetLab Consortium meeting when he was starting to work on Stork, a system to simplify package deployment.  He is currently a Post Doc at the University of Washington working with &lt;a href="http://www.cs.washington.edu/homes/tom"&gt;Tom Anderson&lt;/a&gt; and &lt;a href="http://www.cs.washington.edu/homes/arvind"&gt;Arvind Krishnamurthy&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What did you build?&lt;/em&gt;&lt;br/&gt;
The most relevant / longest term projects are: &lt;a href="http://www.cs.arizona.edu/stork"&gt;Stork&lt;/a&gt;, &lt;a href="http://www.usenix.org/events/nsdi08/tech/cappos.html"&gt;San Fermin&lt;/a&gt;, and &lt;a href="https://seattle.cs.washington.edu"&gt;Seattle&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Stork is a &lt;a href="http://www.cs.arizona.edu/stork"&gt;package manager&lt;/a&gt; that has security and functionality
improvements over existing Linux package managers.   Some of the
advances we made in Stork have been adapted by APT, YUM, YaST and
other popular Linux package managers.&lt;/p&gt;

&lt;p&gt;San Fermin is a system for &lt;a href="http://www.usenix.org/events/nsdi08/tech/cappos.html"&gt;aggregating large quantities of data from
computers&lt;/a&gt;.   San Fermin provides the result faster and with better
fault tolerance than existing techniques.&lt;/p&gt;

&lt;p&gt;Seattle is an &lt;a href="https://seattle.cs.washington.edu"&gt;educational testbed&lt;/a&gt; built from resources donated by
universities all around the world.   The universities run a safe,
lightweight VM that students from other universities can run code in.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Tell us about what you built it with.&lt;/em&gt;&lt;br/&gt;
I used Java for &lt;a href="http://www.usenix.org/events/nsdi08/tech/cappos.html"&gt;San Fermin&lt;/a&gt; because I needed to leverage existing
Pastry code that was in Java.   I used Python for &lt;a href="http://www.cs.arizona.edu/stork"&gt;Stork&lt;/a&gt; and &lt;a href="https://seattle.cs.washington.edu"&gt;Seattle&lt;/a&gt;.
I found Python to be far superior for large projects (other languages
I&amp;#8217;ve used are Java, C, C++, QBASIC, and Pascal).   Python has been a
dream come true because it&amp;#8217;s great for prototyping, easy to learn, and
the resulting code is readable (so long as you have sensible style
constraints on the written code).&lt;/p&gt;

&lt;p&gt;Perhaps the most useful thing is getting other developers involved.
I like to do the initial prototyping myself, but after that it is
great to have others helping out.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How did you test your system for correctness?&lt;/em&gt;&lt;br/&gt;
There are whitebox / unit tests as well as blackbox / integration
tests for most parts of the systems.   The time that we spent building
thorough test cases really paid off because it simplifies debugging.&lt;/p&gt;

&lt;p&gt;I like to use my systems in real world environments with outside
users, so the system is never done; thus correctness is an iterative
process.   If I&amp;#8217;m aware of bugs, we fix them.   If I&amp;#8217;m not aware of
bugs, we&amp;#8217;re adding features based upon user requests (and therefore
may be adding more bugs for us to fix later).   In general, the fact
that we have users that rely on the software over long time periods is
a testament to its stability, which is related to correctness.&lt;/p&gt;

&lt;p&gt;To more specifically answer the question you are really asking, I
usually run my code by hand and evaluate it in small / constrained
environments (turning these test runs into unit tests).   I also
follow a philosophy where I try to &amp;#8220;detect and fail&amp;#8221; as much as
possible.   I care more about correctness than performance (at least
initially) and so add many redundant checks in my code to catch errors
as soon as possible.&lt;/p&gt;
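
&lt;p&gt;To make &amp;#8220;detect and fail&amp;#8221; concrete, here is a small Python sketch.  It is
not taken from Stork or Seattle: the &lt;code&gt;install&lt;/code&gt; function, the
&lt;code&gt;IntegrityError&lt;/code&gt; class, and the digest check are hypothetical and
only show the style of checking inputs early and re-checking results.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import hashlib


class IntegrityError(Exception):
    """Raised when package data does not match its expected digest."""


def install(installed, name, data, expected_sha256):
    """Record a package in `installed`, failing loudly on any bad input."""
    # Check inputs up front so a failure points at the caller's mistake.
    if not isinstance(name, str) or not name:
        raise ValueError("package name must be a non-empty string")
    if not isinstance(data, bytes):
        raise TypeError("package data must be bytes")
    if name in installed:
        raise ValueError("package %r is already installed" % name)

    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise IntegrityError("digest mismatch for %r" % name)

    installed[name] = actual
    # Redundant on purpose: a cheap sanity check on the state just written.
    assert installed[name] == expected_sha256
    return actual


if __name__ == "__main__":
    contents = b"example package contents"
    digest = hashlib.sha256(contents).hexdigest()
    database = {}
    install(database, "demo", contents, digest)
    print(sorted(database))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each check costs a little performance, but a failure then surfaces at the
line that detected it rather than somewhere downstream.&lt;/p&gt;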

&lt;p&gt;I find that if I&amp;#8217;m careful and thorough when writing my code, I spend
very little time debugging.   I probably spend about 30% of the time
writing code, 40% writing comments / docs (which I normally write
before / during coding), 20% of the time writing test code for
individual modules, and about 10% debugging after the fact.   I think
part of this is I&amp;#8217;m really careful about checking input and boundary
conditions and so I can normally pin-point the exact cause of a
failure.&lt;/p&gt;

&lt;p&gt;In terms of problems when writing code, I generally only use standard
libraries and code I&amp;#8217;ve written.   I don&amp;#8217;t depend on third party code
because I don&amp;#8217;t know what level of support it will have.   Also, since
I usually code in Python, it&amp;#8217;s easy for me to add functionality to do
whatever I need.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How did you deploy your system?  How big of a deployment?&lt;/em&gt;&lt;br/&gt;
Stork has been deployed on &lt;a href="http://www.planet-lab.org/"&gt;PlanetLab&lt;/a&gt; for about 6 years.   For the
majority of the time we&amp;#8217;ve been deployed on every working PlanetLab
node.   Stork has managed more than 500K VMs and, when I last checked, was used
daily by users on around two dozen sites.   We initially used
AppManager to deploy Stork, but have since been using PlanetLab&amp;#8217;s
initscript functionality.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.usenix.org/events/nsdi08/tech/cappos.html"&gt;San Fermin&lt;/a&gt; was deployed on PlanetLab for use in combining and managing
logs from &lt;a href="http://www.cs.arizona.edu/stork"&gt;Stork&lt;/a&gt;.   However, we found that &lt;a href="http://www.usenix.org/events/nsdi08/tech/cappos.html"&gt;San Fermin&lt;/a&gt; was dysfunctional
due to difficulties in getting Pastry to start reliably and work when
non-transitive connectivity occurs.   As a result, we mainly ended up
using San Fermin as a publication vehicle.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://seattle.cs.washington.edu"&gt;Seattle&lt;/a&gt; is currently deployed on more than 1100 computers around the
world.   We have done our initial deployment by encouraging educators
to use our platform in networking and distributed systems classes.
Our longer term plans involve using Seattle to build a research
testbed of 1 million nodes.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How did you evaluate your system?&lt;/em&gt;&lt;br/&gt;
With &lt;a href="http://www.usenix.org/events/nsdi08/tech/cappos.html"&gt;San Fermin&lt;/a&gt; (and other systems research I haven&amp;#8217;t mentioned),
there is fairly clear related work to compare against.   In some
cases, the biggest challenge has been getting the existing research
prototype code from another project to run well enough for comparison.&lt;/p&gt;

&lt;p&gt;For the work that I&amp;#8217;ve done where I&amp;#8217;ve focused on practical impact over
publication impact (&lt;a href="https://seattle.cs.washington.edu"&gt;Seattle&lt;/a&gt; / &lt;a href="http://www.cs.arizona.edu/stork"&gt;Stork&lt;/a&gt;), evaluation is much more
difficult because these aren&amp;#8217;t incremental improvements over existing
models and systems.   These systems break the mold in terms of
security and / or functionality so sometimes it&amp;#8217;s difficult to know
how to compare them to existing work.&lt;/p&gt;
</content>
  <feedburner:origLink>http://www.emilsit.net/blog/archives/systems-researchers-justin-cappos/</feedburner:origLink></entry>
  
</feed>
