Ruby Pond

Default environment specific values for DATABASE_URL

2012-04-11T16:12:00+00:00

If you've been following the intelligent advice from 12 Factor when building your application then you'll know that environment variables are the correct place to specify settings for your environment. Crazy eh?

However this is at odds with the assumptions and conventions in Rails that database.yml is the canonical source of all your database configs. Having to export your environment variables any time you want to change the environment of the app, or over-riding in test_helper/spec_helper is needless pain.

Keeping DATABASE_URL and database.yml consistent

Instead of having to manage all this manually, let's just let Rails win for now by setting DATABASE_URL to the value(s) defined within database.yml. No more having to set default values in your helpers, no more re-exporting your environment locally when you want to change modes, no more exports if you're using libraries that assume you're configuring your app in an intelligent and scalable way (like queue_classic).

Add this to your Gemfile:

gem "rails-database-url"

That's it. Source is on GitHub, pull requests gladly accepted if you find any bugs.
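To give a feel for the idea, here is a minimal sketch of turning a database.yml style Hash into a URL. It is not the gem's actual source: database_url_from and the sample settings are made up for illustration, and mapping the postgresql adapter to a postgres scheme is an assumption.

```ruby
require "erb"

# Hypothetical helper (not taken from the gem): build a URL string from
# one environment's database.yml settings, given as a Hash.
def database_url_from(config)
  adapter = config.fetch("adapter")
  # Assumption: map the "postgresql" adapter to the "postgres" scheme.
  scheme = adapter == "postgresql" ? "postgres" : adapter

  # Percent-encode credentials so characters like "@" stay unambiguous.
  credentials = [config["username"], config["password"]].compact
  userinfo = credentials.map { |c| ERB::Util.url_encode(c.to_s) }.join(":")

  host = config.fetch("host", "localhost")
  host = "#{host}:#{config["port"]}" if config["port"]
  authority = userinfo.empty? ? host : "#{userinfo}@#{host}"

  "#{scheme}://#{authority}/#{config.fetch("database")}"
end

# Made-up settings standing in for the development block of database.yml.
development = {
  "adapter" => "postgresql",
  "database" => "myapp_development",
  "username" => "myapp",
  "password" => "s3cr@t"
}

puts database_url_from(development)
```

Assigning the result to ENV["DATABASE_URL"] only when it isn't already set would keep an explicitly exported value in charge.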

Simplifying ActiveRecord Connection Strings

2011-07-24T21:00:00+00:00

Over the years I've gone to great lengths to avoid committing production passwords into version control. With Rails this has inevitably meant a number of different ways to ensure the database.yml was where it needed to be on the production servers, with the right credentials. Custom Capistrano recipes to create a new config when setting up a server, moving it into place after a deploy: it worked well enough, but it has always felt a little heavy.

There's got to be a better way

There are a whole range of configuration settings that are environment dependent. Which database to connect to, what queues Resque workers should listen to, the Rack/Application environment mode. What we need is a standard way on each server to set these environment variables. Oh hang on, environment variables!

It's easy enough to either export an environment variable in a shell, or pass it in with the command like RACK_ENV=production rackup config.ru but a more complicated configuration like ActiveRecord requires doesn't quite work like that. It's expecting a Hash that defines the database adapter to use, username, password, the database, a host, and possibly a bunch of other options. We could probably serialise that back and forth between a Hash and something like JSON but it feels like a bit of a hack.

What would be better is if we could come up with a simple and uniform string format that could define how to locate any resource, including a database. I propose we call such a string a Uniform Resource Locator, or URL for short ;)

Using URLs to connect to databases

This isn't a new technique, other ORMs like Sequel have been doing this for years. It's just that ActiveRecord hasn't supported it, until now. I've created a gem that adds support for URL based connections to ActiveRecord, it's called activerecord_url_connections.

Add it to your Gemfile:

gem "activerecord_url_connections"

And now you can connect to your database by adding the following to an initializer (e.g., config/initializers/activerecord.rb):

ActiveRecord::Base.establish_connection(ENV["DATABASE_URL"])

Now you can either export the DATABASE_URL environment variable in your app environment or set it when you start your app like so:

DATABASE_URL=postgres://localhost/myapp_production

No more juggling database.yml on production servers, no risk of checking sensitive credentials into version control.
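For the curious, going from a URL back to the Hash that ActiveRecord expects can be sketched roughly like this. It's an illustration under assumptions, not the gem's code: connection_hash_from is a hypothetical name, the postgres to postgresql mapping is assumed, and percent-encoded credentials are left untouched.

```ruby
require "uri"

# Hypothetical sketch: convert a database URL into the kind of Hash
# that ActiveRecord::Base.establish_connection accepts.
def connection_hash_from(url)
  uri = URI.parse(url)
  settings = {
    # Assumption: the "postgres" scheme means the "postgresql" adapter.
    adapter: uri.scheme == "postgres" ? "postgresql" : uri.scheme,
    host: uri.host,
    port: uri.port,
    username: uri.user,
    password: uri.password,
    # Drop the leading "/" from the path to get the database name.
    database: uri.path.sub(%r{\A/}, "")
  }
  # Leave out anything the URL did not specify.
  settings.reject { |_key, value| value.nil? }
end

p connection_hash_from("postgres://localhost/myapp_production")
```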

Convention Over Configuration

Something still doesn't feel quite right about this approach though. Creating an initializer just to have that one line seems a bit needless, especially when it will be exactly the same in almost every app. So, in much the same way that Rails assumes a default connection to use if you don't provide one, the gem will look to see if you've set DATABASE_URL and use it when available.

Stick with the conventions, and you'll not need to do anything beyond adding the gem and setting DATABASE_URL. Sweet!

But wait, there's more!

There's some further good news too. Firstly, for anyone using Heroku for hosting Rack apps that use ActiveRecord this means connecting to your database will "just work" (it's not needed for Rails as Heroku creates a database.yml file so the existing Rails behaviour will work). Secondly, this change has made it into ActiveRecord for the 3.2 release so you'll only need this gem to backport the behaviour to previous releases.

Slaying dragons with git, bash, and ruby

2010-09-07T13:45:00+00:00

An often overlooked feature of git is the set of hooks you have available. They cover pre-applypatch, post-update, and anything between or beyond. I suspect a lot of people may have first been introduced to them when integrating with a Continuous Integration server as a means of telling it to test a new build, but they work equally well as a hidden monkey saving you from showing the world some of your more embarrassing mistakes.

Getting started with git hooks

Within your cloned git repository you'll most likely be aware of the .git/ directory. Within there you'll have another directory called hooks/ which is, surprise surprise, where your git hooks live. You'll probably have a bunch of existing hooks in there with .sample as the extension to stop them being executed; it's worth taking a look at them to get an overview of the various hooks and what is possible.

To get a hook to fire you need a file with the appropriate name (remove the .sample extension on each file you want to run), and it needs to be granted execute permissions:

$ chmod +x .git/hooks/hook-name-here
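If you keep your hooks somewhere else in the project, the copy and chmod steps can be scripted. Here's a minimal sketch, assuming a hypothetical install_hook helper and a standard .git/hooks layout:

```ruby
require "fileutils"

# Hypothetical helper: copy a script into a repository's hooks
# directory under the given hook name and make it executable.
def install_hook(script_path, hook_name, repo_root = ".")
  target = File.join(repo_root, ".git", "hooks", hook_name)
  FileUtils.mkdir_p(File.dirname(target))
  FileUtils.cp(script_path, target)
  # Same effect as chmod +x on a file created with default permissions.
  FileUtils.chmod(0o755, target)
  target
end
```

Calling install_hook("script/pre-commit.rb", "pre-commit") from the project root would do the job, where the script path is again just an example.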

Joining forces for real ultimate power

Coding in ruby most of the day makes it the quickest language for me to use to throw together a script. Thankfully you can write your hooks in ruby, or just about any language really, just change the shebang line accordingly:

#!/usr/bin/env ruby

However there are lots of things that are much easier to do from a command line than they are in a ruby script, and so we will stand on the shoulders of giants and use the underlying *nix tools to do what they're best at, and use ruby to keep things re-usable and readable.

Catching out bad habits

One thing I've been guilty of in the past is hastily trying to fix a bug, and then accidentally leaving a debug breakpoint in the committed code. If that ever made it onto a production system it would leave it hanging and unresponsive. Even on other developer machines it causes enough confusion. So to make me look much more reliable than I really am, enter the git pre-commit hook:

#!/usr/bin/env ruby
if `grep -rls "require 'ruby-debug'; debugger" *` != ""
  puts "You twit, you've left a debugger in!"
  exit(1)
end

Now whenever I try to commit code, it will first run a recursive grep over the codebase to ensure I've not left my debug statement in (I can be sure it always looks like "require 'ruby-debug'; debugger" as I have it bound to a shortcut).
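The same check can be written in plain ruby if you'd rather hand it a specific list of files than grep the whole tree. This is a sketch under assumptions: files_with_debugger is a hypothetical helper, and files are read as raw bytes so binary content doesn't trip it up.

```ruby
DEBUGGER_SNIPPET = "require 'ruby-debug'; debugger"

# Hypothetical helper: return the paths whose contents include the
# debugger snippet. Anything that is not a regular file is skipped.
def files_with_debugger(paths)
  paths.select do |path|
    File.file?(path) && File.binread(path).include?(DEBUGGER_SNIPPET)
  end
end
```

Feeding it the output of git diff --cached --name-only would limit the check to the files that are actually staged for the commit.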

Stopping an incomplete merge

There have been occasions where a particularly large rebase or merge creates a lot of conflicts in a file, and one of those has snuck through: rather than being fixed, the inline diff has actually been committed. Time to add another check to pre-commit, using egrep to scan recursively for the 3 different line markers that git uses to indicate a merge conflict:

#!/usr/bin/env ruby

if `egrep -rls "^<<<<<<< |^>>>>>>> |^=======$" *` != ""
  puts "Dang, looks like you screwed the merge!"
  exit(1)
end

If you try this though you'll probably discover that it doesn't quite work as expected, because there are some binary files that happen to include these characters. More shell scripting to the rescue then, we will pipe the results into a couple of other commands to filter it out. First it goes via xargs to allow us to take the input from STDIN and pass each line recursively into file to find out what type of file we are dealing with. We then pipe that into egrep again to select only the script and text files:

#!/usr/bin/env ruby

if `egrep -rls "^<<<<<<< |^>>>>>>> |^=======$" * | xargs file | egrep 'script|text'` != ""
  puts "Dang, looks like you screwed the merge!"
  exit(1)
end

It would be nice at this point to actually know what files have been affected, without needing to commit the above series of commands to memory, so we can output it again this time passing the result into awk to strip out just the filename:

#!/usr/bin/env ruby

if `egrep -rls "^<<<<<<< |^>>>>>>> |^=======$" * | xargs file | egrep 'script|text'` != ""
  puts "Dang, looks like you screwed the merge!"
  puts `egrep -rls "^<<<<<<< |^>>>>>>> |^=======$" * | xargs file | egrep 'script|text' | awk -F: '{print $1}'`
  exit(1)
end
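If you'd rather not run the pipeline twice, the detection can also be sketched in ruby alone. The conflicted_files helper below is hypothetical, and treating any file containing a NUL byte as binary is a rough assumption standing in for the file command:

```ruby
# The three markers git writes into a conflicted file, matching the
# egrep pattern used in the hook.
CONFLICT_MARKER = /^<<<<<<< |^>>>>>>> |^=======$/

# Hypothetical helper: return the paths that look like text files and
# still contain a conflict marker.
def conflicted_files(paths)
  paths.select do |path|
    next false unless File.file?(path)

    content = File.binread(path)
    # Assumption: a NUL byte means binary content, so skip the file.
    next false if content.include?("\0")

    CONFLICT_MARKER.match?(content)
  end
end
```

The list it returns can be checked for emptiness and printed, so the file names come for free.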

Helping your workflow

I'm a big fan of committing regularly in manageable amounts, but I want to ensure each commit is self-contained and has all the tests passing. I don't want to be in a state where I revert a commit and end up with a broken app. However, there are times where I'll be spiking something or refactoring a class and I'd like a temporary save point in case I make a mess of things and want to step back. To do that, I typically commit with a message like "WiP: Got Foo working, about to fix Bar." with the intention of coming back when it's complete and amending that commit to include the additional changes and have a more meaningful message. Sometimes I forget to use --amend though and things don't go to plan. That's another one that is easy to avoid:

#!/usr/bin/env ruby

if `git log --oneline --author=\`git config --get-all user.email | sed s/@.*//g\` -n 5 | grep -i wip` != ""
  puts "You've left a WiP commit message behind"
end

You might need to do a little tweaking on that one depending on your setup, so I'll break it out in the order the commands will be executed to help you modify to your needs. First, I use git-config to return the email address of the current user:

$ git config --get-all user.email

I then pipe that into sed to return just the bit before the @ sign:

$ sed s/@.*//g

That's all been executed in a sub-process (I've backslash escaped the back tick characters at each end of the command: "git config --get-all user.email | sed s/@.*//g"). The result of that command is passed into git-log to return the last 5 commits for that author:

$ git log --oneline --author=username_here -n 5

And finally, grep is called on the result to ensure I haven't left the string "wip" in any of the commits:

$ grep -i wip
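If the nested backticks get hard to tweak, the sed and grep steps can be mirrored in ruby on plain strings. A minimal sketch, where author_name and wip_commits are hypothetical helpers and the sample values are made up:

```ruby
# "jane@example.com" -> "jane", the same job as `sed s/@.*//g`.
def author_name(email)
  email.strip.sub(/@.*/, "")
end

# Pick out the lines of `git log --oneline` output that mention "wip",
# ignoring case in the same way as `grep -i wip`.
def wip_commits(log_output)
  log_output.lines.map(&:chomp).grep(/wip/i)
end

log = "a1b2c3d WiP: Got Foo working, about to fix Bar.\n9f8e7d6 Add Bar\n"
puts author_name("jane@example.com")
puts wip_commits(log)
```

In a hook the email would come from git config and the log from git log, exactly as in the commands above.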

Ensuring you don't break the build

The hook that kicked it all off for me was to ensure that I didn't break the build, mostly as an attempt to claim moral superiority over anyone else who was found guilty of doing it themselves. Little did they know I had a secret weapon to protect my perfect performance ;)

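#!/usr/bin/env ruby

# Run the unit tests, then the cucumber features, discarding the output.
puts "Running tests..."
`rake test > /dev/null 2>&1 && bundle exec cucumber features > /dev/null 2>&1`

# A non-zero exit status from the hook aborts the commit.
unless $?.success?
  puts "Tests failed"
  exit(1)
end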

Making it more self-aware

This approach worked great for a couple of days, but I quickly got frustrated because I'd have to add the --no-verify parameter to commits quite regularly. I really only wanted to run all the tests when I was committing on master before I pushed changes upstream to everyone else. The other problem was that my "WiP" workflow meant I'd have to use --no-verify whenever I was amending a commit and it struck me the script should be intelligent enough to know I was trying to do the right thing.

Detecting master

Determining if the current branch was master was relatively straight-forward:

`git symbolic-ref HEAD | grep master` != ""

So just wrap that as part of the if statements you only want to be executed when you're on the master branch.
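One thing to watch: grep master also matches a branch called something like master-fixes. A stricter comparison could look like this sketch, where on_master? is a hypothetical helper that takes the output of git symbolic-ref HEAD:

```ruby
# Hypothetical helper: true only when the ref printed by
# `git symbolic-ref HEAD` is exactly the master branch.
def on_master?(head_ref)
  head_ref.strip == "refs/heads/master"
end

puts on_master?("refs/heads/master\n")
puts on_master?("refs/heads/master-fixes\n")
```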

Detecting commit amend

Working out if you are amending a commit is a little trickier. The options passed to commit aren't passed through to your script, so it requires a bit of process hackery in both ruby and bash to find out if --amend was used. First we use the built in $$ variable in ruby to return the process ID of the ruby process, and use it with ps and grep to return all matching processes:

`ps -f | grep #{$$}`

We then pass that into awk to extract the parent's process ID, and make an assumption that the first line is the parent:

`ps -f | grep #{$$} | awk '{print $3}' | head -n 1`

Back into ps and grep again: now that we have the process ID of the parent, we use it to return the full command and options that were passed to git-commit, and then grep again to see if --amend was passed in:

`ps | grep \`ps -f | grep #{$$} | awk '{print $3}' | head -n 1\` | grep -e "--amend"`

Phew!
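If that pipeline feels fragile, a shorter route is to ask ruby for the parent's process ID directly with Process.ppid. This is a sketch under the same assumption as above, namely that the hook's parent is the git commit process; amend? and parent_command_line are hypothetical helpers, and the ps -o args= -p form may behave differently on your platform.

```ruby
# Hypothetical helper: does a command line include the --amend flag?
def amend?(command_line)
  command_line.split.include?("--amend")
end

# Assumption: the hook's parent process is the `git commit` that
# started it, and `ps -o args= -p PID` prints its full command line.
def parent_command_line
  `ps -o args= -p #{Process.ppid}`.strip
end

puts amend?("git commit --amend -m 'Got Foo and Bar working'")
```

Inside a hook, amend?(parent_command_line) would replace the nested ps and grep calls.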

Wrapping it all up

All that would create a mess of if statements and duplication throughout your git pre-commit hook, and any other hook you might want to apply this logic to, so I've bundled it all up in a reusable class that I include in any project. I'll keep updating it as my needs develop, feel free to fork it and add features for other languages and frameworks.

Agile Development: The quickstart guide to doing it right

2010-09-06T11:00:00+00:00

Lots of places don't do agile right, in fact most do it wrong. There are a number of factors why, chief among them I believe is that the percentage of people who have worked somewhere that nailed it completely right is so low that there aren't enough to go around and lead by example. So here are a few of the tips I've been given or learnt over the years, credit has to be given to all the wonderful places I've worked over the years. The ones that have done it well, and the ones that have done it poorly… you can learn just as much from reflecting on where it all went wrong.

Back to basics

Are you already using agile, or are you planning on implementing it in your company? If you're reading this chances are something isn't working as well as you'd like and so it's time to reset. First thing to do is throw out your preconceptions on agile, ignore almost everything you've read in practitioner manuals, because at its core it's really quite simple. Most problems stem from people picking only certain features of agile or trying to let a tool guide the process for them.

Refocus on the manifesto

It's worth going back and looking at the agile manifesto and then questioning each aspect of your process to ensure it is giving priority to the values on the left side. If it's not, remove it.

Throw out the tools you're using

Scrumworks? Urgh! VersionOne? Atlassian? They're all cut from a similar cloth. If you're not already doing agile correctly, they will just get in the way. So until you're up and running, get them as far away as possible.

For planning, all you need are some index cards and a pen. If that doesn't offer you enough protection for your DRP or won't work because you've got a distributed team then use a wiki. Again, avoid at all costs anything more complicated until you've got it all working.

Planning

The great thing about stripping it back to basics is that it re-aligns focus on the things that are important: you're not distracted by calculating backlog points and predicting velocity. Instead you are collaborating with your customer to understand what they will consider working software.

You need to become the customer

I mentioned in my last post, nobody cares what tools you use, your customer is trusting you to make the right decisions to do the best job. You can only really do that when you understand the requirements completely, a full and deep appreciation for why the work is being done and not simply what needs to be implemented.

Your customer is unlikely to be a software developer so their scope, as broad as it may be, is limited by their experience. They also can't possibly know upfront all the questions you want answered and all the seemingly irrelevant detail that would feed in to giving the best user experience possible. Expecting them to bridge this gap is completely unrealistic; instead the gap needs to be bridged by having the developers move towards the customer. The developer needs to be completely bought in to the product, understand the problems it's solving, and how they'd best be solved.

This is precisely the reason why tools like cucumber don't work: they are bridging that gap from the wrong direction. And if we go back to the agile manifesto, that approach carries the potential to skew the priority back towards contract negotiation rather than customer collaboration.

Customers don't write the stories

And neither do business analysts, development managers, or project managers. I'd go so far as to say that if you want to be "agile" and you've got a team of BAs then you should fire them all. Anybody getting between the developer(s) and the customer is stopping the developer from becoming the customer. Crucial detail will be missed and misplaced assumptions will go unchallenged. Sure, you'll still get a product out the door but it will either be what was agreed (which is often different to what is actually wanted) or it will not be as awesome as it could have been.

The people that get between developer and customer have the best of intentions: they're trying to let developers focus on writing code. I'll say it again though, the developer's job isn't to write code, it's to deliver a solution. It's a false economy to think you're saving the developer time by taking on the requirements analysis on their behalf; it will take longer for them to appreciate the requirements 3rd hand than the half day it would take to know them first hand and write them up themselves.

Pull up your sleeves, get out a set of index cards, some pens, and some A4 paper. No laptops, no ipads, no technology. Force the customer to explain the issue without computer aids as though you know nothing. Each of you draw up wireframes, compare the differences between them to see what assumptions have been made, write the card together. Make it a tactile experience.

The result should be a concise description of the business problem, and the end-user benefit you're trying to deliver. Never more than a couple of sentences (maybe a few footnotes for pertinent implementation details that can't be neglected).

The most important stories get done first

This part of it will be difficult, if not impossible, for the developer to make a judgement call on. It is often very difficult even for the customer, but it has to be done. That being said, it shouldn't be as difficult as many people make it. If you're working on a new app that's not yet been released I've got a diagram to help you evaluate priorities. It's also worth keeping a focus on getting the smallest usable feature set out the door as soon as possible. Eric Ries said a good way to get to that point is to outline the minimum features you think you need before you can release it publicly, and then you can probably reduce it by 90%. You always need much, much, much less than you ever expect.

If it's an existing app and you're having problems prioritising it might be worth speaking to some users. Just make sure you keep focus on doing the smallest and simplest thing possible to meet the requirement.

A good way to gauge if a story is really the next most important thing to get completed is to double the estimate: if that changes the priority in the customer's eyes, it wasn't as important as they thought.

If you're not doing the work, you don't get to estimate it

It took me a long time to appreciate this as a manager, I was often guilty of sizing up tasks based on how long I thought it would take me to complete it. Then someone else picks it up and I wonder why it took so long. Truth is it would always take me longer too but that is too easily forgotten, do you really have a load factor of 1? I thought not.

How easy or difficult a task is has so many variables: familiarity with the code base, how recently you've solved a similar problem, access to an existing solution, etc. and each of those comes from a fairly personal experience. Sit down, with a pair to rationalise the decision if you like, and come to a conclusion you're happy with. The important thing is that you are consistently optimistic/pessimistic in your approach; it's less important to get the estimate right than it is to be wrong on a relatively consistent basis. Everything averages out very quickly when it comes to the planning.

Don't mention time

Do everything you can to avoid estimating in hours or days. No matter how much you explain things like load factor, when people see a story with "2 days" or "8 hours" written next to it they can't help but think that is how long the task will take. So you can't be entirely shocked when 3 days in they ask how that task that was meant to take 2 days is going. Avoid the awkward situations completely by not setting the expectation in the first place.

Estimate in points, or jelly beans, or anything other than hours or something that can be tracked by a watch or calendar.

Estimates aren't guesses, but they're not accurate either

Don't take a look at a story, close your eyes, and come up with a number. Take the time to analyse in detail what is required, what systems you'll need to talk to, read the 3rd party API/integration docs. Spend a whole day or more if you have to and write up a detailed list of tasks in the order you expect to tackle them, including any decisions on implementation you may come to, and put them at the bottom of your story card.

Time invested here pays for itself two-fold later in the iteration by maintaining focus on the minimum set of tasks that need to be completed and setting your direction each step of the way. Plus it gives you confidence that the task is achievable and unlikely to blow out.

When it comes to putting a number (points, jelly beans, whatever) on an estimate I now always advocate a 3 point system. 1 point roughly equates to a half day of effort (ssshhh, don't tell the customer ;), 2 points a whole day, 3 points 2 days. The reason being that it's too difficult to consistently estimate small tasks: if you estimate an hour for something and you get snagged and it takes you the first half of a day (not unusual), it has taken four times the estimate. At the other end of the spectrum you've got two problems: either a story that isn't really implementing the simplest solution possible, or a scope so broad that it's difficult to sit down and estimate it in the detail required to be confident in it.

If the tasks are obviously much smaller than 1 point, I'll try and group a couple of similar smaller tasks into a single 1 point task. If a task is bigger than 3 points, I'll split it into separate deliverable parts. "But it's really a 5 point task, everything has to be delivered at once!", Bollocks!

Estimates for work that isn't scheduled are worthless

They might be useful for getting a high-level idea of when some story might be delivered, maybe, if it's ever actually scheduled. Remember one of the core tenets of agile is responding to change, and nothing impacts priorities like actually giving software to users. So it's not uncommon for that really-urgent-we've-gotta-have-it next feature to get bumped for a glaring hole a bunch of users pointed out. And then bumped again for something else. It comes back to expectation management again, and putting an estimate next to it adds a misplaced finality to where the story sits in the priority queue.

The other issue is that each iteration of development feeds back into the next. Some previously completed work may make a later story easier to implement, or it may actually highlight that something is more difficult than originally expected. Either way there are enough factors that can completely invalidate whatever estimates you come up with.

Execution

Once the planning is done, it is down to actually starting the work. It's not always a matter of ticking boxes off the work list though, the most effective teams I've worked in have shared some common traits.

Start mid-week

Few people love a Monday morning and getting motivated can sometimes be an issue. If you compound that difficulty by expecting everyone to be in the right mindset to sit down and write stories, well it doesn't always work out for the best. There is no real long-term cost associated with moving the start of the iterations from a Monday to a Wednesday, but allowing me to be productive first thing after a hazy weekend by allowing me to look at a list of work and just crack on with it is a huge win.

Have a fixed iteration length

I prefer to work with 2 week iterations, 1 week feels too short and 4 too long.

There is a comfort that comes from having some order to the world, and the importance of the general psyche and morale of a team is often underestimated. On those times that a story does take longer than expected, it's nice to know that you've still got a week or so to make it up. It's also great to have a regular sense of completion in predictable intervals, the dreaded risk of being stuck on the same task that is going to take months never materialises (partly because you never schedule a story with more than 2 days of effort, right?). There is a lot of subtlety in the impact this predictability has on developers which makes it hard to pinpoint all the benefits, but when combined they all make for a happier environment.

From a management perspective, it's great to have some clarity on when things are going to be delivered. And I'm not just talking about the next fortnight: you know that on a specific day each and every fortnight something is going to be delivered. So does the rest of the team. It makes planning much easier, people know the most convenient windows to take holidays, and which days are particularly inconvenient to stay home because you're planning with the customer.

Developer attention is a premium

Jason Fried from 37signals has a great video explaining why you can't work at work that I'd recommend watching. When you're deep in the coding zone nothing is more disrupting to your flow than someone coming and interrupting you to ask you for opinion on something completely unrelated. That 10 minute interruption can take 30 minutes to recover from fully to get you back into the same state. If they happen 2-3 times a day that is up to 25% of the workday productivity lost to distractions.

And as Jason mentions in the video the very action is basically a way of saying "Hey, my immediate needs are more important than whatever you're focussing on, give me attention". It's selfish and unhealthy for productivity. Remove phones. Email people if you have to, but don't expect a response that day. Inter-team or important but non-urgent questions should be posed via an instant messenger, chat program, or something that doesn't demand immediate interruption (like phones and meetings do). What about things that are urgent? Well that needs to be assessed on a case-by-case basis, but the truth is truly urgent things happen very very rarely. Almost everything can wait at least 4 hours.

It's here that a good development manager can work wonderfully, acting as bouncer and preventing direct access to developers while they're busy coding and filtering the urgency of all requests.

Pair Programming

A fairly contentious part of agile is pair programming. When it works well, it's hugely efficient and carries lots of additional benefits (higher quality code, quicker delivery, lower documentation requirements, less "single developer" business risk, etc.) but when it's implemented poorly it's just a waste of a resource. In most places I've worked it sadly falls into the latter camp, and it's because people are just paying lip-service to the practice and working it as a mentor/tutor type role rather than a pair actively writing code together.

That's not to say it doesn't also offer great potential as an approach to training, the important thing is both participants need to be as involved and active in writing the code.

Owning the workspace

The biggest problem when it doesn't work is normally down to the setup of the workspace. Generally Person B brings their chair over to share Person A's desk, and Person A stays almost exactly where they were to begin with. It's guaranteed to fail. Here's what you need to do:

Share a single monitor, you both need to be looking at the same screen. They're big enough these days that you can see everything you need with appropriate window management.
Each person gets their own keyboard and mouse, no co-piloting on shared inputs and it's not fine to have one person with a keyboard and the other person using the laptop it's connected to.
Divide the screen in half, then mark that virtual line on the desk with some tape or a marker… right down the desk. Each person gets their side and never should they encroach on the other half.

It might all seem a bit contrived and naff, but it makes a difference to the efficiency of the operation. Body position plays on the sub-conscious and determines whether you feel comfortable using the keyboard in front of you. Sit too far to the side and you start to feel that you're a spectator on somebody else's computer, too far in front and you feel like you're a pilot and co-pilot rather than peers.

Everyone is in their own context

You're going to know what programs you need open to get the job done. For me it's an IDE, a terminal or 3, and a web browser or the running application. Agree on a common screen layout where everything can be seen at the same time, and then mandate it across the entire team (I've found Divvy really useful to keep it consistent). It means that when you come to someone else's computer it doesn't feel foreign, it's just like being on yours but at a different desk.

Most importantly though is that even though you're both working on the same task together on the same machine, you're probably both in slightly different head spaces and thinking about a slightly different aspect of the problem. Quickly switching between applications to satisfy your own curiosities can have a terrible impact on whatever it was your pair was thinking about. Being able to see everything at the one time prevents you from unexpectedly switching context on each other.

Make sure you get a monitor large enough to make it work. Given you're going to spend ~8 hours each and every day staring at it, it's not sensible to buy a cheap screen.

Silence is golden

This is difficult to nail until you've developed a good working relationship with your pair, but the best communication between the two of you is the non-verbal. For the same reasons above, you can never be sure what the head space of your pair is and interrupting them to mention a typo could completely break their train of thought. Instead wait for an obvious pause or context switch and simply point it out. Eventually it may get to the point where you know each other well enough that even more subtle cues like a shift in posture give away that you need to re-think the code you just wrote. But at least you can then do it once you've got your current thoughts committed to screen.

80 characters is enough for anyone

Like many of the tips on here, they've either been passed on to me by Graham Ashton or refined further by working with him for 3 years. This one seemed particularly pedantic to me, but I went along with it because it really wasn't that difficult to adhere to and he was adamant that he'd accept nothing less. And it's this, no line of code should ever extend beyond 80 characters.

I now try to enforce it wherever I go.

It took me a long time to fully appreciate how important it is, at least a year, but code readability is drastically improved and that is always a good thing. When pairing though, it is mandatory. You don't have the luxury of scrolling indefinitely to the right to read the full line of code to see the intention because doing so will interrupt the flow of your pair. Worse still, it actually takes the bulk of the code off the screen.

It also prevents a nasty scenario where something important that drastically changes the intention of a line of code is cleanly hidden by the window edge and the real flow of the application is the opposite of what you thought was happening. Say something like the following:

def my_example_method
  raise "Here is an example of something you would think gets raised!!!" unless coder_reads_this_part?
end

Refactor any line that stretches over 80 characters. It only requires a little thought, makes the intention clearer, ensures you can see all the important code at once, and stops you from having to break someone else's focus.
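As a rough sketch of what that refactor could look like for the example above, here is one option that moves the condition into a guard clause. The body of coder_reads_this_part? is a made-up stand-in so the snippet runs on its own:

```ruby
# Hypothetical stand-in for whatever condition the real code checks.
def coder_reads_this_part?
  true
end

def my_example_method
  # The condition now comes first, so it can't hide past the window edge.
  return if coder_reads_this_part?

  raise "Here is an example of something you would think gets raised!!!"
end
```

Every line now fits inside 80 characters, and the part that decides whether anything gets raised is the first thing you read.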

Test Driven Development

Tests first, then the code. It's not just about ensuring you've got stable and working code, it's about keeping focussed on doing the smallest thing possible to make the tests pass.

This approach works great with pair programming, and here's the formula: One person writes a test, the other makes it pass. Once it's passing the person that just wrote the code writes the next test and hands it over. While one of you is trying to write a crafty test to keep the other busy for a while, your pair has already devised a cheeky way to have it always return the value you want and has devised a new test to catch you out.

It's generally a quick back and forth iterative process where you're each trying to outfox the other with edge cases or overly simplistic implementations that pass because the tests aren't robust enough. It's almost game-like, and eventually you both come to a stalemate. At that point you sit back and realise the story is complete, and with excellent test coverage.
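To make the back and forth concrete, here is a tiny illustration using Minitest. The doubled method and both tests are invented for the example, they aren't from a real project:

```ruby
require "minitest/autorun"

# Round 1: the first test only pins down one value, so a cheeky
# implementation like `def doubled(number); 4; end` would pass it.
# Round 2: a second test with a different input forces the real thing.
def doubled(number)
  number * 2
end

class DoubledTest < Minitest::Test
  # Written by the first person in the pair.
  def test_doubles_two
    assert_equal 4, doubled(2)
  end

  # Written by the second person, right after hard-coding 4 made
  # the first test pass.
  def test_doubles_five
    assert_equal 10, doubled(5)
  end
end
```

Each new test is the smallest thing that breaks the other person's shortcut, which is what keeps the implementation honest.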

Don't break the build

A continuous integration server is a must as is some form of reporting of any errors raised in production, and problems on both must be treated with a suitably high priority. Anything that is deemed "flakey" and raises an alert for non-legitimate reasons needs to be fixed immediately. As soon as the team starts to doubt the notifications, legitimate problems start to slip through too. It's a bit like the broken windows theory, if you allow these things to go by without action then it breeds complacency and soon some failing tests become accepted as the norm.

That all means that before any developer commits any code back upstream, they run all the tests… all the time. They are there for a reason, and it's inexcusable to not be running them. The CI box is there as a backstop only.

Deliver at the beginning of the end

If you're starting your iterations every second Wednesday then code needs to be completed at the close of play the Monday immediately prior. That means you've got a full day to deploy the code to production and catch any unexpected errors. The other benefit of the mid-week approach is that on the rare occasions it all goes pear-shaped people are more inclined to stay back a little late and fix it. Trying to keep people back after hours on a Friday isn't just difficult, they've often already mentally checked out and off on their weekend and you run the risk of actually making a bad situation worse.

If it all goes seamlessly and you roll-out and have a day free, fantastic! Time to look at that ever growing list of bugs your users have been sending back that haven't been getting scheduled into an iteration. You can always find small tasks to fill the day that are incredibly useful but not super urgent.

Avoid bringing planning and unscheduled work forward, you'll skew your workload for the next iteration and run the risk of over-stressing developers and/or over committing. The benefit of maybe squeezing in a day of development isn't worth the additional risks.

Review

The code is released, people are using it, and everyone is happy. Now it's time to look back and see how things went, what went really well and what could be improved upon.

Run a retrospective

Before the planning session, get all the team in the room to talk about the good and the bad of the previous iteration. Do it quickly, you should need an absolute max of 30mins but you can probably do it in much less. Get index cards or post-it notes of two different colours, one for "went well" and one for "needs improvement", and everyone has to write at least one thing on a card of each colour. There are no limits to how many things people can list though, allow it as a cathartic forum for all issues to be brought out in the open.

Put all the cards up on a board (grouping similar topics), so everyone can see the balance of opinion on how things went. Ensure everyone understands what all of the cards mean and what they are referring to. Now everyone gets 3 points to spend on the "needs improvement" cards, they can spread the points out across 3 cards or put them all onto a single card (or the obvious option in between). The 3 cards that have the most points the team agrees to make a concerted effort to improve upon during the next iteration.

It's only after a couple of iterations with a consistently re-appearing top-rated problem that I'd consider going back and looking at some of the tools I mentioned throwing out at the top of this article. And even then, only if you're certain they're the best way to fix the most pressing issue facing your team. If I'm honest, I don't think it's going to come up.

Split partially-completed stories

Occasionally a story only gets partly done, and there is much gnashing of teeth when people have to work out what to do with it. Because you've been taking an iterative test driven development approach you've got a bunch of code that is written, tested, and passing even though the story is incomplete. Fantastic!

Look at what is left to complete and write up a suitable story to reflect it, then go back and revise the previous card to reflect what was actually done. Now look at the original estimate and try and work out what percentage of the story you've completed, apply the appropriate number of points to each story (keeping in mind that if you're following along with my 3 point/jelly bean system, 3 points = 2 days. So if you've done half you've got 1 day on each or 2 points on each).

That way you get an accurate reflection of both what was completed in the previous iteration and a fair idea of what remains, based on your original estimates (remember, we wanted continual optimism. No revising estimates retrospectively).

Calculate your velocity

Now you know how much work you got completed, you can with a fair degree of confidence predict how much you can do next iteration. After a few iterations, you can look back and calculate the average for the past two or three to flatten out any particularly fast or slow ones and get a good feel for what is achievable.
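The averaging is simple enough to do on the back of an index card, but as a quick sketch in ruby it might look like this. The point totals are made-up numbers and velocity is just a name I've picked for the helper:

```ruby
# Points completed per iteration, oldest first (made-up numbers).
COMPLETED_POINTS = [18, 25, 14, 21]

# Average of the most recent iterations, 3 by default.
def velocity(points_per_iteration, window = 3)
  recent = points_per_iteration.last(window)
  return 0.0 if recent.empty?

  recent.sum.to_f / recent.size
end

puts velocity(COMPLETED_POINTS) # (25 + 14 + 21) / 3 = 20.0
```

Here the unusually fast iteration (25) and the slow one (14) cancel each other out, so you'd plan the next iteration around 20 points.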

Under-promise, over-deliver

No kid likes waking up Christmas morning to find the present they asked Santa for to be nowhere under the tree. Likewise, no customer likes being told you're going to give them something in two weeks and for it to not materialise. So if you're uncertain, you're better off scheduling too little work and then pulling an extra story in later in the iteration than over committing. It's not just about managing customer expectations though, it's about limiting stress on developers by keeping the targets realistic.

Conclusion

These are the things that have contributed to making it work at places I've enjoyed working at. The most important thing is to measure overall success of your process as its ability to create a happy and productive environment, good things will naturally flow from it. At the very least you should apply the principles of agile to the implementation of the process itself. Nothing is set in stone, pause at regular intervals to review the success, adjust the most important aspects to give people what they want.

Nobody cares what tools you use

2010-09-02T13:45:00+00:00

I've spent almost all of my career with something along the lines of "freelancer", "consultant", or "contractor" on my business cards and the forms HR departments make me sign. On the two occasions that hasn't been the case I've been working as part of a development team that was primarily developing solutions for external customers, or managing a team of contractors working on launching a start-up. So I've been really very fortunate to work across a really broad spectrum of clients and environments and with some brilliant people, which over time makes it easier to appreciate what works and what doesn't.

You're probably doing it wrong

Coming into a new company is always interesting, no matter how wonderful the people or the environment there is always at least one aspect of their job that they don't enjoy. More often than not it's some process that has been enforced by management types who had the best of intentions, but didn't appreciate the impact on those caught in the process. I've been that manager, I've implemented that process, and I apologise to those in the past that were affected by it.

The worst manifestation of this is the implementation of "agile" at most places I've worked. What seems to happen is a progressive manager reads about the productivity gains and other benefits of being "more agile" and either implements it in a piecemeal fashion, picks and chooses the bits that make the most sense, or thinks that somehow magically telling people to read the agile manifesto and crack to it will make it happen. Other times it's a developer that wants to make everyone's lives better and their company more productive but doesn't have the authority to implement the changes alone. It's the same end result.

I'm going to follow this post with another on my tips for "doing agile right", and practices to generally avoid. I have to give credit however to Graham Ashton, working closely with him for almost 3 years refined my understanding of how to do it properly and I hope he does a post of his own to cover anything I may miss. In the interim I'll give you a secret to help you take the first step to salvation. Every time you go to write something, fill in some plan, have a meeting, or do anything mandated by your "process" ask yourself "is this going to make me deliver the product faster?". If the answer is no, don't do it.

You're not there to write [insert language of choice here]

I heard this explained best by Obie Fernandez during a Q&A session when someone asked how you convince customers that you're not going to offer an upfront fixed-cost bid on some work. I'm paraphrasing here, but he essentially said that Hash Rocket customers don't sign up for a project or deliverable but rather they engage Hash Rocket for their expertise and experience. Whether you freelance or you're an employee it's the same thing, whoever has employed you hasn't got you in to just cut code. They've assessed a whole range of your background and experience before moving forward with you, and they ultimately want to use that experience to their advantage and solve at least one problem they have. Whether you're writing code in Java or Ruby or PHP is inconsequential, you make a judgement call on what is best for the business and what will allow you to get the job done most effectively given your experience. In some places, that means you have to fall in line with existing technologies because at the end of the day that is "best for the business". Which brings me back to my original point…

Nobody gives a damn about what tools you use

And by nobody, I mean nobody outside the people who also have to use the tool(s) in question. A senior manager need not care if I'm writing Ruby or Python, using Rails, Django, or Sinatra. They've got a problem they need fixed, get it done as effectively as possible. My post yesterday about the benefits of testing with cucumber is another example. I've worked in places where we've tried this approach, giving the customer the ability to see the output of the test suite (sometimes even letting them write the stories), being able to see as features one-by-one go from being red to green as the continuous integration box output the results of the latest commit to a nice website where everyone can see where we are. And then when everything goes green there is much rejoicing, clapping of hands, and popping of champagne.

Except it never happens. The customer looks at the test output for a week, at best a month, and then stops caring. And why should they care? After you write the story (more on that in the next post) their interest in reading it ever again is almost zero. Whether or not you've ticked some boxes, made some text go green, or moved index cards from one end of a board to the other is of no importance to your customer. All they care about is working software. Delivered, working software. You can proclaim a "story" is finished until you are blue in the face, but they won't truly believe you until they've played with it themselves and made sure it meets all their expectations, including the ones they forgot to tell you about.

A lot of tools are just getting in your way

So if your customer is never going to read your story, why go to the hassle of incorporating it into a core part of your workflow in a fashion that only serves to slow you down? And I'm not pointing the finger just at testing tools here, I'm putting almost any tool I've used that is trying to bridge the gap between developer and customer into the firing line. I've never found a project management tool that made me think "Wow! This is incredible, I'll never work without this tool again" because they're all basically shit, just some less shit than others (Mingle and Pivotal Tracker are reasonable). Even when I've been managing a team I've been less than enthused about using them, and I'm convinced that if you're sharing a workspace with the others in your team that many of these tools offer no real benefit over far more simple and possibly less tech focussed solutions.

Their worst crime though is duping people into thinking that they're actually helping to foster an agile environment and workflow. If you look at the original manifesto and break it down, most tools are actually taking you further from the core principles.

Sit down and speak to your customer and then go do some work. Question the value of anything that happens in between.

You don't win friends with salad

2010-09-01T10:00:00+00:00

The ruby (and in particular the rails) community has grown at a pretty rapid rate since I started using it. The smaller community carried with it many benefits for a newbie, fewer options and fewer opinions on how to do things. Today though new developers have to contend with outdated documentation and tutorials (although http://guides.rubyonrails.org/ and the official docs have gone to great effort of late to keep everything updated), a veritable smorgasbord of options for everything from database access to asset packaging, and an even greater number of vocal opinions on how you should be doing it all. When it comes to testing it would be easy to think that the jury has spoken: you should be using Cucumber, Webrat, and RSpec.

Improved readability

One of the benefits of cucumber is that suddenly tests aren't written in ruby or any other language any more, they're in plain text, and readable English text at that! And who can't read English? Let's take an example from Jonas Nicklas (let me also point out, Jonas has done a great job at pointing out where I think a lot of cuke practitioners get it fundamentally wrong… go read it):

Scenario: Adding a subpage
  Given I am logged in
  Given a microsite with a home page
  When I press "Add subpage"
  And I fill in "Title" with "Gallery"
  And I press "Ok"
  Then I should see a document called "Gallery"

Contrast that to some plain ol' ruby and the standard Webrat syntax (using shoulda):

context "logged in user" do
  should "be able to add a subpage" do
    visit homepage
    click "Add subpage"
    fill_in "Title", :with => "Gallery"
    click "Ok"
    assert_have_selector ".documents" do
      assert_contain("Gallery")
    end
  end
end

Or if you want to refactor that last assertion into a helper method/macro:

context "logged in user" do
  should "be able to add a subpage" do
    visit homepage
    click "Add subpage"
    fill_in "Title", :with => "Gallery"
    click "Ok"
    assert_has_document("Gallery")
  end
end

Now go back and read the first cucumber example, and the final shoulda/webrat example. Do it again. Is there any appreciable difference in readability? I'm confident I could put the latter in front of my non-tech involved fiancée and she'd be able to tell me what it's trying to do, but I'll discuss if that is even important later. The simple fact is that any ruby developer is going to be able to derive intention just as easily from both snippets of code.

Layers of indirection

Like an accomplished magician, cucumber's real trick is to have you looking somewhere other than where the magic is really happening. Open up a feature file and you'll be greeted by something like:

Scenario: Concealing behind a wall of smoke and mirrors
  When I ask cucumber to run a scenario
  And I add 5 to 2 for a contrived step
  Then the scenario should have completed
  And it should add up to 7

Great, you know the intention behind this particular feature. But where do we go to find the steps for this test? Well there is obviously the steps file(s), but in a large project there are probably a few so which one? Say we've found the right file, how do you find the line responsible? You can't simply do a search for "ask cucumber to run a scenario" because the actual step definitions probably look something like:

When /^I ask ([a-z]+) to run a ([a-z]+)$/ do |framework,granularity|
  # do stuff
end
When /^I add ([0-9]+) to ([0-9]+) for a contrived step$/ do |a,b|
  # do stuff
end
Then /^the ([a-z]+) should have completed$/ do |granularity|
  # do stuff
end
Then /^it should add up to ([0-9]+)$/ do |result|
  # do stuff
end

Basically unless you already know the signature of the various methods you've got two options: 1) Read the whole damn file or 2) Run the features with verbose output so you can see which line gets invoked in your step file(s).
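If you do end up hunting by hand, a few lines of ruby can do the regex matching for you. This is only a sketch: the patterns are copied from the step definitions above, and matching_pattern is a helper I've made up for illustration, not something cucumber provides:

```ruby
# The step definition patterns from above, gathered in one place.
STEP_PATTERNS = [
  /^I ask ([a-z]+) to run a ([a-z]+)$/,
  /^I add ([0-9]+) to ([0-9]+) for a contrived step$/,
  /^the ([a-z]+) should have completed$/,
  /^it should add up to ([0-9]+)$/
]

# Returns the first pattern that matches the step text, or nil if none do.
# Pass the step without its leading Given/When/Then/And keyword.
def matching_pattern(step_text)
  STEP_PATTERNS.find { |pattern| pattern.match(step_text) }
end

step = "I ask cucumber to run a scenario"
pattern = matching_pattern(step)
puts pattern.source                       # the regex you should search for
puts pattern.match(step).captures.inspect # the arguments the block receives
```

That you need a helper at all to get from the English sentence to the code that runs is rather the point.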

It's not an insurmountable problem, but it does present an additional barrier to understanding the code. And as I pointed out in the previous section, for a negligible gain given you can write plain ruby with the intention being just as clear. The benefit of that approach is that the plain ol' ruby isn't just your intention, it's actually the code that gets executed. There isn't an additional layer of comprehension required to see where the work is getting done, and if I've written a custom method/matcher/macro then I can actually search for it because it's immediately obvious what the name of that method is.

It's not actually English!

And here's my real issue with all those people drowning themselves in the cucumber kool-aid: they're convinced that this extra layer makes the intention of the application clear, that "the business" has visibility of what the product can do, and everybody is on the same page with regards to scope. If any of those have been a problem for you or your team writing it all down in a prescribed faux-English format isn't going to magically solve your problems. You need to sit down with your client/"the business" and actually engage with them, stop listening to them tell you what they want, and become involved enough to extract what it is they actually need.

I think the worst possible situation you can be in is thinking that at some point clients will be able to write the cucumber features themselves and you can focus on just making them pass. You've now forced upon them a foreign and limited grammar in which they are going to try and define the scope of their problems, and you've taken yourself out of the loop of truly understanding the real nature of what you're building for the client. (I discussed this further in my post: Nobody cares what tools you use)

The end result is you've both compromised and come to a middle ground that is less than optimal for both parties… and it takes a particularly skilled and dedicated developer (and a lot of work) to stop that being reflected in the quality of the delivered product.

How to do it better

If you're a one person dev team, you need to seriously ask yourself if you would derive any benefit from the indirection added by using cucumber? Will it improve the maintainability of your code? Your development velocity? Your ability to engage with your clients? The quality of the product you deliver? If you're part of a larger company and it sounds like cucumber will fix a lot of problems you've been suffering, I'll follow this up with a post arguing that most of it can probably be fixed by refocussing on doing agile properly.

Github-style capistrano deployments

2010-08-26T10:00:00+00:00

If you've deployed a rails app the chances are high that you've done it using capistrano. Capistrano has come a long way, but it was originally designed back when we were all using svn to manage our code. Times have changed, and we can make some significant improvements to the default process.

Credit where credit is due

First, I can't take the credit for this refactor of the capistrano deployment scripts. That honour goes to Chris @ GitHub, and he has a great explanation of what he did and why over on their blog.

I've been using this style of deployment on all my projects since Chris announced it, but there are some things I needed which weren't supported by either capistrano natively or by these new additions.

Setting up your web server configuration

I almost always deploy to an nginx server running passenger, apache is a rarity these days. In any event I've made an assumption that both servers will be set up to dynamically include their virtual host config files from a specific location. As a result, running cap deploy:setup will now also create the appropriate config file for your server and restart apache/nginx. You can set the nginx_conf_dir or apache_conf_dir to specify the appropriate place to put the file. Take a look at config/deploy/apache.rb and config/deploy/nginx.rb for the respective configuration templates.

Moving all application settings to a common place

Common application settings have all been moved to config/deploy/settings.rb, so this will include things like your application name, source control system (this approach only supports git) and repository location. You'll need to change things in this file to be relevant to your own application.

Deploying to different target environments

I'll usually have more than one environment I want to deploy the code to (production, staging, demo, etc.), and cap has a built in means of handling this… you don't need to go install some extra gem or plugin. Each target is defined in config/deploy/targets.rb, and an example definition is as follows:

    task :production do
      role :web, ["192.168.1.10", "192.168.1.11"]
      role :app, ["192.168.1.10", "192.168.1.11"]
      role :db,  "192.168.1.12", :primary => true
      set :web_server, :nginx
      set :web_port, "80"
    end

This file is for any settings that will vary from machine-to-machine or between environments. Anything that is going to be the same on every machine belongs in config/deploy/settings.rb.

How to get started

Go take a look at the Git-based-deploy GitHub repository, grab the code and get deploying. Send me any patches if you find problems or think of something worth including.

Don't put your private key on a public server

2010-08-25T14:30:00+00:00

You're administering a server (Server A), and you need access to something on another server (Server B) but access is denied from Server A because you don't have your private key on there. If this is a scenario you've run into while trying to deploy an application it's almost certain that you don't want to put your private key on the server in the home directory of your deployment user, readable to anyone that can log in as that account (i.e., people other than you!).

SSH Agent Forwarding to the rescue

To get around this SSH has built in support for forwarding on your private credentials. What happens is that when you connect to a remote server with forwarding enabled, sshd on that server creates a unix socket that relays authentication requests back to the ssh-agent running on your own machine. This socket is accessible only by your user account… and root.

Yes, what this in fact means is that the root user on the remote server has access to the unix socket that was created for you. They can't see your private keys, they're still safely held on your client, but while you're connected to the remote server the root user on that machine could potentially use that socket to connect as you to another server. In short, make sure you trust root before you set this up (still, it's better than putting your private key in your home directory where root would now be able to see the key and use it whenever they wanted).

Configure your SSH client to use Agent Forwarding

Getting it to work is as simple as adding an entry to ~/.ssh/config (create the file if it doesn't exist):

Host remote-server.example.com
  ForwardAgent yes

Just replace the Host value with the hostname or IP address of the server you are connecting to, or with * if you're brave and want to do it automatically for all hosts.

Restarting your local SSH Agent

On my Macbook Pro I've added the following to my ~/.bash_profile to make sure that the SSH Agent is started when I open a new terminal and that my identities (private keys) are added to it:

    SSH_ENV=$HOME/.ssh/environment

    function start_agent {
         echo "Initializing new SSH agent..."
         /usr/bin/ssh-agent | sed 's/^echo/#echo/' > ${SSH_ENV}
         echo succeeded
         chmod 600 ${SSH_ENV}
         . ${SSH_ENV} > /dev/null
         /usr/bin/ssh-add;
    }

    if [ -f "${SSH_ENV}" ]; then
         . ${SSH_ENV} > /dev/null
         ps -x | grep "^ *${SSH_AGENT_PID}" | grep ssh-agent$ > /dev/null || {
             start_agent;
         }
    else
         start_agent;
    fi

Getting started with MongoDB

2010-07-03T00:00:00+00:00

MongoDB is a document database, which if you've not come across before is best visualised as a way of persisting a complex Hash object. It varies from a key-value store (like memcache and redis) as the "documents" you store aren't retrieved by their key alone, but can be extracted by querying the values. The objects can be of any structure as you don't define a schema upfront, and can be nested to an infinite depth.

The beauty in a design like this is that you can have a flexible way of storing whatever you want, with all dependent objects nested within each other so you don't need to join across multiple tables to get the information you need. In situations where the most common use case requires more than one query or a table join to retrieve everything you want (a blog post along with a series of comments is a good example) this can be a real performance win. You'd actually store the comments within the blog post itself rather than in a separate table or collection.
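To picture the blog post example, here is roughly what such a document could look like as a plain ruby Hash. The field names and values are invented for illustration, MongoDB doesn't require any particular structure:

```ruby
# A single document: the comments live inside the post itself.
BLOG_POST = {
  :title    => "Getting started with MongoDB",
  :body     => "MongoDB is a document database.",
  :comments => [
    { :author => "alice", :text => "Nice write-up" },
    { :author => "bob",   :text => "What about indexes?" }
  ]
}

# Everything needed to render the page arrives in one document, so
# pulling out the commenters is ordinary Hash and Array access.
def comment_authors(post)
  post[:comments].map { |comment| comment[:author] }
end

puts comment_authors(BLOG_POST).inspect
```

One read fetches the post and all of its comments, where a relational schema would need a second query or a join.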

Where I've found it useful is in storing the data from the Twitter Streaming API. I've had grand plans to build an app on top of this data and API, but life continues to get in the way and in the interim period Twitter keeps adding new features and richer data. Thanks to MongoDB I don't need to update my code to stay in step with their changes, I just dump the data straight into the DB and it will automatically include the new fields that have been included. Even better is that I can write queries against this new data immediately and don't have to worry about migrations.

Installing MongoDB

Installing MongoDB is easy. On OSX I just used Homebrew:

brew install mongodb

On other platforms there are binaries available to download.

By default mongod wants to store data in the /data/db directory so you'll need to create that first:

mkdir -p /data/db

Once that's done you can start the service by just running mongod. Open another terminal session and run mongo to connect to your local server. You'll be taken to a Mongo shell session where you can start issuing Javascript commands to talk to your database. So start by doing the following:

db.mydatabase.save({ name: "Steve" })

The command above will save the document {name: "Steve"} into a collection called mydatabase (inside the default test database, since we haven't switched to another one). But that doesn't exist, does it? Well, it does now. If you try and issue a command to a database or collection that doesn't exist, Mongo will create it for you and then run the command against it. If you take a look in /data/db now you should see a couple of files, one of them is probably in the range of 64MB to 2GB. Hang about, what?! 2GB to store that one small document?

Mongo will pre-allocate disk for storing these objects, that means the next record you insert won't increase the size of these files. It will continue appending documents into these pre-allocated files until they are full, at which point it will pre-allocate another file of the same size and repeat the process.

Storing the Twitter Stream in MongoDB

As I said, I've been using it as a flexible store for persisting the Twitter stream. I've created a Tweet model to store each tweet that looks like the following:

require 'mongo'
require 'twitter-text'

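class Tweet
  class << self
    # Insert a single tweet (a Hash) or an Array of tweets.
    def create!(tweets)
      collection.insert(tweets)
    end

    private

    # Connect to the local mongod and select the "twitter" database.
    def establish_connection
      Mongo::Connection.new.db("twitter")
    end

    def db
      @db ||= establish_connection
    end

    # The collection is the rough equivalent of a SQL table.
    def collection
      @collection ||= db.collection("tweets")
    end
  end
end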

I'm just using the native ruby driver and not using one of the wrapper libraries, and to be honest I have no plans to as the interface is so simple I don't really see the point. You just need to open a connection to a database (in this code above it's called twitter) and then identify the collection this model is writing to (the equivalent of a table for those transitioning from SQL, above it's called tweets). Remember that if the database and collection don't exist, Mongo will just make them for us. Using the ruby code we've been building on from previous posts, we can take the data we've been receiving and store it with the following:

Twitter.stream("mytwittername", "secret") do |status|
  Tweet.create!(status)
end

Simple! Now you're storing the tweets as quickly as they are arriving. That in itself isn't very interesting though, so let's see how we'd go about retrieving some of our saved data.

Querying in MongoDB

To do the equivalent of a select * from where ... in Mongo you need to pass a JSON object to the find command. Something like the following:

collection.find("user.screen_name" => "glenngillen")

What might not be entirely obvious here is that the Twitter data comes back in a format like the following:

{ :user => { :screen_name => "glenngillen", 
             :profile_image_url => "http://rubypond.com/image.png",
             :followers_count => 1000000 },
  :text => "This is the text from my tweet"
}

And you can see from the query above, I'm able to query based on a value nested down within the :user key. Much like SQL, you can write queries to return data that is in a given list of values, is greater than or less than a certain value, you can even match based on a regular expression. For more examples of how to query the data, head over to the MongoDB query documentation.
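As a sketch of what those three kinds of query look like, here they are as plain ruby Hashes using MongoDB's $gt and $in operators and a regular expression, with dot notation to reach the nested fields. The constant names, the follower threshold, and the second screen name are made up for the example:

```ruby
# Tweets from accounts with more than 1000 followers.
POPULAR = { "user.followers_count" => { "$gt" => 1000 } }

# Tweets from any screen name in a given list.
FROM_LIST = {
  "user.screen_name" => { "$in" => ["glenngillen", "rubypond"] }
}

# Tweets whose text matches a regular expression.
MENTIONS_MONGO = { "text" => /mongodb/i }

# Each Hash would be handed straight to find, for example:
#   collection.find(POPULAR)
puts POPULAR.inspect
```

Because the queries are just Hashes you can build them up and combine them like any other ruby data before handing them to the driver.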

Taking it further

In coming posts I'll expand on the previous examples and show you how to easily:

Supply additional options to the stream to filter it down to just the tweets you are interested in
Setup MongoDB to store the data you need
Provision Amazon EC2 instances to help you deal with processing load
Get Chef involved to handle the provision and setup of your EC2 instances automatically
Use RabbitMQ to dispatch work to multiple servers
Load balance your MongoDB instances across EC2

Managing email lists with Mailchimp and MonkeyWrench

2010-07-02T00:00:00+00:00

Most of my projects and clients need to send email to users to inform them of updates to the system, special marketing offers, or other such broadcast messages. Unfortunately, these types of messages can trigger warnings with many mail servers, so your message may never arrive at the recipient. There are numerous providers out there that do what they can to make it easier, and most do a respectable job (I think I've used almost all of them at some stage). Mailchimp (or here via an affiliate link) has proven to be the best for my purposes to date.

They have a really rich and well-documented API, a pretty good admin interface (although the bar is set pretty low here by most competitors), and they're free to use until you get over 500 users. Even then they're pretty good value for what you get.

Introducing MonkeyWrench

Over a year ago now I first started integrating my apps with Mailchimp, and while it was quite easy I did occasionally run into problems. The Ruby APIs were just a thin layer atop the Mailchimp API and didn't feel natural, plus there was a range of edge cases that were non-obvious unless you were hanging out in the dev forums or using the API a lot (like the fact that you can't batch subscribe members to a list and send them a welcome email).

So I wrote MonkeyWrench to overcome these shortcomings. It feels more natural to use within a Ruby app, and it automatically processes large actions in batches: it subscribes members to a list in a batch where possible, and individually where that is what's required. It's all an attempt to make interacting with your users as simple as possible.
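
To make the batching idea concrete, here is a minimal sketch of that decision in plain Ruby. It is not MonkeyWrench's actual implementation: plan_subscriptions, its send_welcome flag, and the batch size of 500 are all hypothetical names and values chosen for illustration.

```ruby
# Hypothetical sketch of the batching decision; not MonkeyWrench's real code.
ASSUMED_BATCH_SIZE = 500 # illustrative value, not a documented Mailchimp limit

# Returns the groups of addresses that would each be sent in one API call.
def plan_subscriptions(emails, send_welcome: false, batch_size: ASSUMED_BATCH_SIZE)
  if send_welcome
    # Welcome emails can't be combined with a batch subscribe, so each
    # address goes in its own call.
    emails.map { |email| [email] }
  else
    # Otherwise split the list into batches of at most batch_size addresses.
    emails.each_slice(batch_size).to_a
  end
end

emails = %w[a@example.com b@example.com c@example.com]
p plan_subscriptions(emails, batch_size: 2)
p plan_subscriptions(emails, send_welcome: true)
```

The point of returning a plan rather than making calls directly is that the caller never has to know which mode applies; it just iterates over the groups.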

It's in production use for a number of clients, and significant portions of the codebase have been contributed/improved upon by David Heath who I've been working with at Wordtracker.

To give an example of how to use it, I'm going to just paste some of the README in here.

Getting Started

To get started, you need to first connect to the appropriate datacenter with your API key:

MonkeyWrench::Config.new(:datacenter => "us1", 
                         :apikey => "your-api-key-goes-here")
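
If you'd rather not hard-code the datacenter, Mailchimp API keys usually end with a datacenter suffix such as -us1, so it can be derived from the key. A small sketch, assuming that key format; datacenter_for is a hypothetical helper, not part of MonkeyWrench, and the key shown is a made-up placeholder.

```ruby
# Hypothetical helper: assumes the key ends in "-<datacenter>", e.g. "...-us1".
def datacenter_for(apikey)
  match = apikey.match(/-([a-z]+\d+)\z/)
  raise ArgumentError, "no datacenter suffix in API key" unless match
  match[1]
end

apikey = "0123456789abcdef-us1" # placeholder, not a real key
settings = { :datacenter => datacenter_for(apikey), :apikey => apikey }
p settings
# The same two values could then be handed to MonkeyWrench::Config.new.
```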

From there you've got a rich API for managing Lists and Members. To subscribe a new user to a list simply do the following:

list = MonkeyWrench::List.find_by_name("My Example List")
list.subscribe("foo@bar.com")

Taking it Further

Code is available at GitHub, and documentation is an evolving affair (all of the List methods are documented). I'm more than happy to receive patches for any functionality that isn't covered; we've only really implemented what we've needed to date. Just fork it, write a test, and send me a pull request with the patch.