<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
  <title>RailsTips by John Nunemaker</title>
  <link type="text/html" href="http://railstips.org/blog/" rel="alternate" />
  
  <id>http://railstips.org/blog/</id>
  <updated>2012-03-05T13:37:12-05:00</updated>
 
  
    <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/railstips" /><feedburner:info uri="railstips" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><geo:lat>41.650672</geo:lat><geo:long>-86.160028</geo:long><feedburner:emailServiceId>railstips</feedburner:emailServiceId><feedburner:feedburnerHostname>http://feedburner.google.com</feedburner:feedburnerHostname><entry>
      <title>Misleading Title About Queueing</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/bsC5LKP7-7A/" />
      <id>4f54f0ecdabe9d1bb400cf11</id>
      <updated>2012-03-05T13:37:12-05:00</updated>
      <published>2012-03-05T11:00:00-05:00</published>
      <category term="gauges" /><category term="kestrel" />
      <summary type="html">&lt;p&gt;In which I discuss queueing Gauges track requests in Kestrel.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;I don&amp;#8217;t know about you, but I find it super frustrating when people blog about cool stuff at the beginning of a project, but then as it grows, &lt;strong&gt;they either don&amp;#8217;t take the time to teach or they get all protective about what they are doing&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;I am going to do my best to &lt;strong&gt;continue to discuss the strategies we are using&lt;/strong&gt; to grow &lt;a href="http://gaug.es"&gt;Gauges&lt;/a&gt;. I hope you find them useful and, by all means, if you have tips or ideas, hit me. Without any further ado&amp;#8230;&lt;/p&gt;
&lt;p&gt;March 1st of last year (2011), we &lt;a href="http://orderedlist.com/blog/articles/gauges/"&gt;launched Gauges&lt;/a&gt;. March 1st of this year (a few days ago), we finally switched to a queue for track requests. Yes, for one full year, we did all report generation in the track request.&lt;/p&gt;
&lt;h2&gt;1. In the Beginning&lt;/h2&gt;
&lt;p&gt;My goal for Gauges in the beginning was realtime. I wanted data to be so freakin&amp;#8217; up-to-date that it blew people&amp;#8217;s minds. What I&amp;#8217;ve realized over the past year of talking to customers is that sometimes Gauges is so realtime, it is too realtime.&lt;/p&gt;
&lt;p&gt;That is definitely not to say that we are going to work on slowing Gauges down. What it means is that my priorities are shifting. As more and more websites use Gauges to track, availability moves more and more to the front of my mind.&lt;/p&gt;
&lt;h3&gt;Gut Detects Issue&lt;/h3&gt;
&lt;p&gt;A few weeks back, with much help from friends (&lt;a href="http://twitter.com/#!/bkeepers"&gt;Brandon Keepers&lt;/a&gt;, &lt;a href="http://twitter.com/#!/jnewland"&gt;Jesse Newland&lt;/a&gt;, &lt;a href="http://twitter.com/#!/hwaet"&gt;Kyle Banker&lt;/a&gt;, &lt;a href="http://twitter.com/#!/lindvall"&gt;Eric Lindvall&lt;/a&gt;, and the top notch dudes at &lt;a href="http://twitter.com/#!/fastestforward"&gt;Fastest Forward&lt;/a&gt;), I started digging into some performance issues that were getting increasingly worse. They weren&amp;#8217;t bad yet, but I had this gut feeling they would be soon.&lt;/p&gt;
&lt;p&gt;My gut was right. Our disk I/O utilization on our primary database doubled from January to February, which was also our biggest growth in terms of number of track requests. If we doubled again from February to March, &lt;strong&gt;it was not going to be pretty&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;Back to the Beginning&lt;/h3&gt;
&lt;p&gt;From the beginning, Gauges built all tracking reports on the fly in the track request. When a track came in, Gauges did a few queries and then performed around 5-10 updates.&lt;/p&gt;
&lt;p&gt;When you are small, this is fine, but as growth happens, updating live during a track request can become an issue. I had no way to throttle traffic to the database. This meant if we had enough large sites start tracking at once, most likely our primary database would say uncle.&lt;/p&gt;
&lt;p&gt;As you can guess, if your primary says uncle, you start losing tracking data. In my mind, &lt;strong&gt;priority number one is now to never lose tracking data&lt;/strong&gt;. In order to do this effectively, I felt we were finally at the point where we needed to separate tracking from reporting.&lt;/p&gt;
&lt;h2&gt;2. Availability Takes Front Seat&lt;/h2&gt;
&lt;p&gt;My goal is for tracking to never be down. If, occasionally, you can&amp;#8217;t get to your reporting data, or if, occasionally, your data gets behind for a few minutes, I will survive. If, however, tracking requests start getting tossed to the wayside while the primary screams for help, I will not.&lt;/p&gt;
&lt;p&gt;I talked with some friends and found Kestrel to be very highly recommended, particularly by Eric (linked above). He swore by it, and was pushing it harder than we needed to, so I decided to give it a try.&lt;/p&gt;
&lt;p&gt;A few hours later, my lacking &lt;span class="caps"&gt;JVM&lt;/span&gt; skills (Kestrel is Scala) were rearing their head big time. I still had not figured out how to build or run the darn thing. I posted to the mailing list, where someone quickly pointed out that Kestrel defaults to /var for logging, data, etc. and, unfortunately, spits out no error on startup about lacking permissions on &lt;span class="caps"&gt;OSX&lt;/span&gt;. One sudo !! later and I was in business.&lt;/p&gt;
&lt;h2&gt;3. Kestrel&lt;/h2&gt;
&lt;p&gt;Before I get too far along with this fairy tale, let&amp;#8217;s talk about Kestrel &amp;#8212; what is it and why did I pick it?&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/robey/kestrel"&gt;Kestrel&lt;/a&gt; is a simple, distributed message queue, based on Blaine Cook&amp;#8217;s starling. Here are a few great paragraphs from the readme:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Each server handles a set of reliable, ordered message queues. When you put a cluster of these servers together, with no cross communication, and pick a server at random whenever you do a set or get, you end up with a reliable, loosely ordered message queue.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;In many situations, loose ordering is sufficient. Dropping the requirement on cross communication makes it horizontally scale to infinity and beyond: no multicast, no clustering, no &amp;#8220;elections&amp;#8221;, no coordination at all. No talking! Shhh!&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It features the memcached protocol, is durable (journaled), has fanout queues, item expiration, and even supports transactional reads.&lt;/p&gt;
&lt;p&gt;My favorite thing about Kestrel? &lt;strong&gt;It is simple, soooo simple&lt;/strong&gt;. Sound too good to be true? Probably is, but the honeymoon has been great so far.&lt;/p&gt;
&lt;p&gt;Now that we&amp;#8217;ve covered what Kestrel is and that it is amazing, let&amp;#8217;s talk about how I rolled it out.&lt;/p&gt;
&lt;h2&gt;4. Architecture&lt;/h2&gt;
&lt;p&gt;Here is the general idea. The app writes track requests to the tracking service. Workers process off those track requests and generate the reports in the primary database.&lt;/p&gt;
&lt;p&gt;After the primary database writes, we send the information through a pusher proxy process, which sends it off to &lt;a href="http://pusher.com"&gt;pusher.com&lt;/a&gt;, the service that provides all the live web socket goodness that is in Gauges. Below is a helpful sketch:&lt;/p&gt;
&lt;p&gt;&lt;img src="/assets/4f54f4e9dabe9d46ae0032f6/article_full/sketch.jpg" class="image full" alt="" /&gt;&lt;/p&gt;
&lt;p&gt;That probably all makes sense, but remember that we weren&amp;#8217;t starting from scratch. We already had servers set up that were tracking requests and I needed to ensure that was uninterrupted.&lt;/p&gt;
&lt;h2&gt;5. Rollout&lt;/h2&gt;
&lt;p&gt;Brandon and I have been on a &lt;a href="http://railstips.org/blog/archives/2012/02/06/more-tiny-classes/"&gt;tiny classes&lt;/a&gt; and services kick of late. &lt;strong&gt;What I am about to say may sound heretical, but we&amp;#8217;ve felt that we need a few more layers in our apps&lt;/strong&gt;. We&amp;#8217;ve started using Gauges as a test bed for this stuff, while also spending a lot of time reading about clean code and design patterns.&lt;/p&gt;
&lt;p&gt;We decided to create a tiny standardization around exposing services and choosing which one gets used in which environment. Brandon took the standardization and &lt;a href="https://github.com/bkeepers/morphine"&gt;moved it into a gem&lt;/a&gt; where we could start trying stuff and share it with others. It isn&amp;#8217;t much now, but we haven&amp;#8217;t needed it to be.&lt;/p&gt;
&lt;h3&gt;Declaring Services&lt;/h3&gt;
&lt;p&gt;We created a Registry class for Gauges, which defined the various pieces we would use for Kestrel. It looked something like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class Registry
  include Morphine

  register :track_service do
    KestrelTrackService.new(kestrel_client, track_config['queue'])
  end

  register :track_processor do
    KestrelTrackProcessor.new(blocking_kestrel_client, track_config['queue'])
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We then store an instance of this registry in Gauges.app. We probably should have named it Gauges.registry, but we can worry about that later.&lt;/p&gt;
&lt;p&gt;At this point, what we did probably seems pointless. The kestrel track service and processor look something like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class KestrelTrackService
  def initialize(client, queue)
    @client = client
    @queue  = queue
  end

  def record(attrs)
    @client.set(@queue, MessagePack.pack(attrs))
  end
end

class KestrelTrackProcessor
  def initialize(client, queue)
    @client = client
    @queue = queue
  end

  def run
    loop { process }
  end

  def process
    record @client.get(@queue)
  end

  def record(data)
    Hit.record(MessagePack.unpack(data))
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The processor uses a blocking kestrel client, which is just a decorator of the vanilla kestrel client. As you can see, all we are doing is wrapping the kestrel-client and making it send the data to the right place.&lt;/p&gt;
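&lt;p&gt;To make the decorator idea concrete, here is a minimal sketch of what a blocking wrapper could look like. This is not the actual kestrel-client code. FakeClient and BlockingClient are hypothetical names, and the sketch assumes the wrapped client returns nil from get when the queue is empty.&lt;/p&gt;

```ruby
# Hypothetical in-memory stand-in for a Kestrel client.
# Assumption: get returns nil when the queue is empty.
class FakeClient
  def initialize
    @queues = Hash.new { |hash, key| hash[key] = [] }
  end

  def set(queue, value)
    @queues[queue].push(value)
  end

  def get(queue)
    @queues[queue].shift
  end
end

# Decorator: same interface as the wrapped client, but get keeps
# polling until a message shows up instead of returning nil.
class BlockingClient
  def initialize(client, options = {})
    @client   = client
    @interval = options.fetch(:interval, 0.1)
  end

  def set(queue, value)
    @client.set(queue, value)
  end

  def get(queue)
    loop do
      message = @client.get(queue)
      return message unless message.nil?
      sleep @interval
    end
  end
end

client = FakeClient.new
client.set('tracks', 'hit-1')

blocking = BlockingClient.new(client, interval: 0.01)
puts blocking.get('tracks')
```

&lt;p&gt;Because the wrapper exposes the same set and get methods, the processor does not need to know whether it was handed a blocking client or a plain one.&lt;/p&gt;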
&lt;h3&gt;Using Services&lt;/h3&gt;
&lt;p&gt;We then used the track_service in our TrackApp like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class TrackApp &amp;lt; Sinatra::Base
  get '/track.gif' do
    # stuff
    Gauges.app.track_service.record(track_attrs)
    # more stuff
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, in our track_processor.rb process, we started the processor like so:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;Gauges.app.track_processor.run&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Like any good programmer, I knew that we couldn&amp;#8217;t just push this to production and cross our fingers. Instead, I wanted to roll it out to work like normal, but also push track requests to kestrel. This would allow me to see kestrel receiving jobs.&lt;/p&gt;
&lt;p&gt;On top of that, I also wanted to deploy the track processors to pop track requests off. At this point, I didn&amp;#8217;t want them to actually process those track requests and write to the database, I just wanted to make sure the whole system was wired up correctly and stuff was flowing through it.&lt;/p&gt;
&lt;p&gt;Another important piece was seeing how many track requests we could store in memory with Kestrel, based on our configuration, and how it performed when it used up all the allocated memory and started going to disk.&lt;/p&gt;
&lt;h3&gt;Service Magic&lt;/h3&gt;
&lt;p&gt;The extra layer around tracking and processing proved to be super helpful. Note that the above examples used the new Kestrel system, but that I wanted to push this out and go through a verification process first. First, to do the verification process, we created a real-time track service:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class RealtimeTrackService
  def record(attrs)
    Hit.record(attrs)
  end
end
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This would allow us to change the track_service in the registry to perform as it currently was in production. Now, we have two services that know how to record track requests in a particular way. What I needed next was to use both of these services at the same time so I created a multi track service:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class MultiTrackService
  include Enumerable

  def initialize(*services)
    @services = services
  end

  def record(attrs)
    each { |service| service.record(attrs) }
  end

  def each
    @services.each do |service|
      yield service
    end
  end
end
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This multi track service allowed me to record to both services for a single track request. The updated registry looked something like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class Registry
  include Morphine

  register :track_service do
    which = track_config.fetch(:service, :realtime)
    send("#{which}_track_service")
  end

  register :multi_track_service do
    MultiTrackService.new(realtime_track_service, kestrel_track_service)
  end

  register :realtime_track_service do
    RealtimeTrackService.new
  end

  register :kestrel_track_service do
    KestrelTrackService.new(kestrel_client, track_config['queue'])
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that now, track_service selects which service to use based on the config. All I had to do was update the config to use &amp;#8220;multi&amp;#8221; as the track service and we were performing realtime track requests while queueing them in Kestrel at the same time.&lt;/p&gt;
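&lt;p&gt;If you want to play with the config-driven selection without Morphine, Kestrel, or a database, the following plain Ruby sketch approximates it. SketchRegistry, RecordingService, and FanOutService are hypothetical stand-ins, and memoized methods take the place of the register blocks.&lt;/p&gt;

```ruby
# Hypothetical stand-in service that just remembers what it was given.
class RecordingService
  attr_reader :name, :recorded

  def initialize(name)
    @name     = name
    @recorded = []
  end

  def record(attrs)
    @recorded.push(attrs)
  end
end

# Hypothetical stand-in for the multi track service.
class FanOutService
  def initialize(*services)
    @services = services
  end

  def record(attrs)
    @services.each { |service| service.record(attrs) }
  end
end

# Plain Ruby approximation of the registry: each service is a memoized
# method and track_service dispatches on the configured name.
class SketchRegistry
  def initialize(track_config)
    @track_config = track_config
  end

  def track_service
    which = @track_config.fetch(:service, :realtime)
    public_send("#{which}_track_service")
  end

  def realtime_track_service
    @realtime_track_service ||= RecordingService.new(:realtime)
  end

  def kestrel_track_service
    @kestrel_track_service ||= RecordingService.new(:kestrel)
  end

  def multi_track_service
    @multi_track_service ||= FanOutService.new(realtime_track_service, kestrel_track_service)
  end
end

registry = SketchRegistry.new(service: :multi)
registry.track_service.record(path: '/pricing')

puts registry.realtime_track_service.recorded.length
puts registry.kestrel_track_service.recorded.length
```

&lt;p&gt;Flipping the :service value between :realtime, :multi, and :kestrel is the only change needed to move through the rollout stages in this sketch.&lt;/p&gt;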
&lt;p&gt;The only thing left was to beef up failure handling around the Kestrel service so that it was limited in how it could affect production. For this, I chose to catch failures, log them, and move on as if they didn&amp;#8217;t happen.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class KestrelTrackService

  def initialize(client, queue, options={})
    @client = client
    @queue  = queue
    @logger = options.fetch(:logger, Logger.new(STDOUT))
  end

  def record(attrs)
    begin
      @client.set(@queue, MessagePack.pack(attrs))
    rescue =&amp;gt; e
      log_failure(attrs, e)
      :error
    end
  end

  private

  def log_failure(attrs, exception)
    @logger.info "attrs: #{attrs.inspect}  exception: #{exception.inspect}"
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I also had a lot of instrumentation in the various track services, so that I could verify counts at a later point. These verification counts would prove whether or not things were working. I left that out as it doesn&amp;#8217;t help the article, but you definitely want to verify things when you roll them out.&lt;/p&gt;
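&lt;p&gt;As a starting point for that kind of verification, counting can be as small as a decorator around any track service. This is only a sketch and not the instrumentation Gauges actually uses. CountingTrackService, AlwaysOk, and AlwaysError are hypothetical names, and the sketch relies on the convention above that a suppressed failure returns :error.&lt;/p&gt;

```ruby
# Hypothetical decorator that counts outcomes so totals from
# different services can be compared after a rollout.
class CountingTrackService
  attr_reader :counts

  def initialize(service)
    @service = service
    @counts  = Hash.new(0)
  end

  def record(attrs)
    result  = @service.record(attrs)
    outcome = result == :error ? :failed : :recorded
    @counts[outcome] += 1
    result
  end
end

# Stand-in services for the demonstration.
class AlwaysOk
  def record(attrs)
    :ok
  end
end

class AlwaysError
  def record(attrs)
    :error
  end
end

realtime = CountingTrackService.new(AlwaysOk.new)
kestrel  = CountingTrackService.new(AlwaysError.new)

3.times do |i|
  realtime.record(id: i)
  kestrel.record(id: i)
end

puts realtime.counts[:recorded]
puts kestrel.counts[:failed]
```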
&lt;p&gt;Now that the track service was ready to go, I needed a way to ensure that messages would flow through the track processors without actually modifying data. I used a similar technique as above. I created a new processor, aptly titled NoopTrackProcessor.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class NoopTrackProcessor &amp;lt; KestrelTrackProcessor
  def record(data)
    # don't actually record
    # instead  just run verification
  end
end
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The noop track processor just inherits from the kestrel track processor and overrides the record method to run verification instead of generating reports.&lt;/p&gt;
&lt;p&gt;Next, I adjusted the registry to allow flipping the processor that is used based on the config.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class Registry
  include Morphine

  register :track_processor do
    which = track_config.fetch(:processor, :noop)
    send("#{which}_track_processor")
  end

  register :kestrel_track_processor do
    KestrelTrackProcessor.new(blocking_kestrel_client, track_config['queue'])
  end

  register :noop_track_processor do
    NoopTrackProcessor.new(blocking_kestrel_client, track_config['queue'])
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With those changes in place, I could now set the track service to multi, the track processor to noop, and I was good to deploy. So I did. And it was wonderful.&lt;/p&gt;
&lt;h2&gt;6. Verification&lt;/h2&gt;
&lt;p&gt;For the first few hours, I ran the multi track service and turned off the track processors. This created the effect of queueing and never dequeueing. The point was to see how many messages kestrel could hold in memory and how it performed once messages started going to disk.&lt;/p&gt;
&lt;p&gt;I used scout realtime to watch things during the evening while enjoying some of my favorite TV shows. A few hours and almost 530k track requests later, Kestrel hit disk and hummed along like nothing happened.&lt;/p&gt;
&lt;p&gt;&lt;img src="/assets/4f54ffa2dabe9d2bc4010ff0/article_full/kestrel_hits_disk.jpg" class="image full" alt="" /&gt;&lt;/p&gt;
&lt;p&gt;Now that I had a better handle on Kestrel, I turned the track processors back on. Within a few minutes they had popped all the messages off. Remember, at this point, I was still just noop&amp;#8217;ing in the track processors. All reports were still being built in the track request.&lt;/p&gt;
&lt;p&gt;I let the multi track service and noop track processors run through the night and by morning, when I checked my graphs, I felt pretty confident. I removed the error suppression from the kestrel service and flipped both track service and track processor to kestrel in the config.&lt;/p&gt;
&lt;p&gt;One more deploy and we were queueing all track requests in Kestrel and popping them off in the track processors, after which the reports were updated in the primary database. This meant our track request now performed a single Kestrel set, instead of several queries and updates. As you would expect, response times dropped like a rock.&lt;/p&gt;
&lt;p&gt;&lt;img src="/assets/4f5500eadabe9d5b6e0035e7/article_full/response_times.jpg" class="image full" alt="" /&gt;&lt;/p&gt;
&lt;p&gt;It is pretty obvious when Kestrel was rolled out as the graph went perfectly flat and dropped to ~4ms response times. &lt;span class="caps"&gt;BOOM&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;You might say, yeah, your track requests are now fast, but your track processors are doing the same work that the app was doing before. You would be correct. Sometimes growing is just about moving slowness into a more manageable place, until you have time to fix it.&lt;/p&gt;
&lt;p&gt;This change did not just move slowness to a different place though. It separated tracking and reporting. We can now turn the track processors off, make adjustments to the database, turn them back on, and instantly, they start working through the backlog of track requests queued up while the database was down. No tracking data lost.&lt;/p&gt;
&lt;p&gt;I only showed you a handful of things that we instrumented to verify things were working. Another key metric for us, since we aim to be as close to realtime as possible, is the amount of time that it takes to go from queued to processing.&lt;/p&gt;
&lt;p&gt;Based on the numbers, it takes us around 500ms right now. I believe as long as we keep that number under a second, most people will have no clue that we aren&amp;#8217;t doing everything live.&lt;/p&gt;
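&lt;p&gt;One simple way to measure queued-to-processing time is to stamp each payload when it is queued and subtract when it is popped. The sketch below assumes a queued_at key on the payload. That key and both helper methods are hypothetical and are not taken from the Gauges code.&lt;/p&gt;

```ruby
# Assumption: the track service adds a queued_at timestamp (seconds as a
# float) to each payload before it goes onto the queue.
def stamp(attrs, now = Time.now)
  attrs.merge(queued_at: now.to_f)
end

# The processor computes the lag when it pops the payload off.
def lag_in_ms(attrs, now = Time.now)
  ((now.to_f - attrs.fetch(:queued_at)) * 1000).round
end

# Fixed times are used here so the arithmetic is easy to follow.
queued = stamp({ path: '/pricing' }, Time.at(100.0))
lag    = lag_in_ms(queued, Time.at(100.5))

puts "queued to processing: #{lag}ms"
```

&lt;p&gt;Feeding that number into whatever stats system you already have makes it easy to alert when the lag creeps toward a second.&lt;/p&gt;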
&lt;h2&gt;7. Conclusion&lt;/h2&gt;
&lt;p&gt;By no means are we where I want us to be availability-wise, but at least we are one more step in the right direction. Hopefully this article gives you a better idea how to roll things out into production safely. Layers are good. Whether you are using Rails, Sinatra, or some other framework or language entirely, layer services so that you can easily change them.&lt;/p&gt;
&lt;p&gt;Also, we are now a few days in and Kestrel is a beast. Much thanks to &lt;a href="https://github.com/robey"&gt;Robey&lt;/a&gt; for writing it and Twitter for open sourcing it!&lt;/p&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2012/03/05/misleading-title-about-queueing/</feedburner:origLink></entry>
  
    <entry>
      <title>More Tiny Classes</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/ASwJ8eLYCi0/" />
      <id>4f2d9117dabe9d290b0076af</id>
      <updated>2012-02-06T07:00:10-05:00</updated>
      <published>2012-02-06T07:00:00-05:00</published>
      <category term="gauges" /><category term="refactoring" />
      <summary type="html">&lt;p&gt;In which I share how we are using more tiny classes to make Gauges more maintainable.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;My last post, &lt;a href="http://railstips.org/blog/archives/2012/02/04/keep-em-separated/"&gt;Keep &amp;#8217;Em Separated&lt;/a&gt;, made me realize I should start sharing more about what we are doing to make Gauges maintainable. This post is another in the same vein.&lt;/p&gt;
&lt;p&gt;&lt;a href="http://gaug.es"&gt;Gauges&lt;/a&gt; allows you to share a gauge with someone else by email. That email does not have to exist prior to your adding it, because nothing is more annoying than wanting to share something with a friend or co-worker, but first having to get them to sign up for the service.&lt;/p&gt;
&lt;p&gt;If the email address is found, we add the user to the gauge and notify them that they have been added.&lt;/p&gt;
&lt;p&gt;If the email address is not found, we create an invite and then send an email to notify them they should sign up, so they can see the data.&lt;/p&gt;
&lt;h2&gt;The Problem: McUggo Route&lt;/h2&gt;
&lt;p&gt;The aforementioned sharing logic isn&amp;#8217;t difficult, but it was just enough that our share route was getting uggo. It started off looking something like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;post('/gauges/:id/shares') do
  gauge = Gauge.get(params['id'])

  if user = User.first_by_email(params[:email])
    Stats.increment('shares.existing')
    gauge.add_user(user)
    ShareWithExistingUserMailer.new(gauge, user).deliver
    {:share =&amp;gt; SharePresenter.new(gauge, user)}.to_json
  else
    invite = gauge.invite(params['email'])
    Stats.increment('shares.new')
    ShareWithNewUserMailer.new(gauge, invite).deliver
    {:share =&amp;gt; SharePresenter.new(gauge, invite)}.to_json
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let&amp;#8217;s be honest. We&amp;#8217;ve all seen Rails controller actions and Sinatra routes that are fantastically worse, but this was really burning my eyes, so I charged our &lt;a href="http://theprogrammingbutler.com"&gt;programming butler&lt;/a&gt; to refactor it.&lt;/p&gt;
&lt;h2&gt;The Solution: Move Logic to Separate Class&lt;/h2&gt;
&lt;p&gt;We talked some ideas through, and once he had finished, the route looked more like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;post('/gauges/:id/shares') do
  gauge    = Gauge.get(params['id'])
  sharer   = GaugeSharer.new(gauge, params['email'])
  receiver = sharer.perform
  {:share =&amp;gt; SharePresenter.new(gauge, receiver)}.to_json
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Perfect? Who cares. Waaaaaaaaay better? &lt;strong&gt;Yes&lt;/strong&gt;. The concern of a user existing or not is &lt;strong&gt;moved away to a place where the route couldn&amp;#8217;t care less&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Also, the bonus is that &lt;strong&gt;sharing a gauge can now be used without invoking a route&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;So what does GaugeSharer look like?&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class GaugeSharer
  def initialize(gauge, email)
    @gauge = gauge
    @email = email
  end

  def user
    @user ||= … # user from database
  end

  def existing?
    user.present?
  end

  def perform
    if existing?
      share_with_existing_user
    else
      share_with_invitee
    end
  end

  def share_with_existing_user
    # add user to gauge
    ShareWithExistingUserMailer.new(@gauge, user).deliver
    user
  end

  def share_with_invitee
    invite = ... # invite to db
    ShareWithNewUserMailer.new(@gauge, invite).deliver
    invite
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, instead of having several higher-level tests to check each piece of logic, we can just ensure that GaugeSharer is invoked correctly in the route test and then test the crap out of GaugeSharer with unit tests. We can also use GaugeSharer anywhere else in the application that we want to.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;This isn&amp;#8217;t a dramatic change in code, but it has a dramatic effect on the coder&lt;/strong&gt;. Moving all these bits into separate classes and tiny methods improves &lt;strong&gt;ease of testing&lt;/strong&gt; and, probably more importantly, &lt;strong&gt;ease of grokking&lt;/strong&gt; for another developer, including yourself at a later point in time.&lt;/p&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2012/02/06/more-tiny-classes/</feedburner:origLink></entry>
  
    <entry>
      <title>Keep 'Em Separated</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/aRF1TynWxyA/" />
      <id>4f2d73dddabe9d5bed005b52</id>
      <updated>2012-02-04T16:14:59-05:00</updated>
      <published>2012-02-04T13:00:00-05:00</published>
      <category term="gauges" /><category term="pusher" />
      <summary type="html">&lt;p&gt;In which I share a quick tale of refactoring.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: If you end up enjoying this post, you should do two things: &lt;a href="http://pusher.com/"&gt;sign up for Pusher&lt;/a&gt; and then &lt;a href="https://www.destroyallsoftware.com/"&gt;subscribe to destroy all software screencasts&lt;/a&gt;. I&amp;#8217;m not telling you do this because I get referrals, I just really like both services.&lt;/p&gt;
&lt;p&gt;For those that do not know, &lt;a href="http://get.gaug.es"&gt;Gauges&lt;/a&gt; currently uses &lt;a href="http://pusher.com"&gt;Pusher.com&lt;/a&gt; for flinging around all the traffic live.&lt;/p&gt;
&lt;p&gt;Every track request to Gauges sends a request to Pusher. We do this using EventMachine in a thread, as I have &lt;a href="http://railstips.org/blog/archives/2011/05/04/eventmachine-and-passenger/"&gt;previously written about&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;The Problem&lt;/h2&gt;
&lt;p&gt;The downside of this is that when you get to the point we were at (thousands of requests a minute), there are so many Pusher notifications to send (thousands a minute) that &lt;strong&gt;the EM thread starts stealing a lot of time&lt;/strong&gt; from the main request thread. You end up with random slow requests that have one to five seconds of &amp;#8220;uninstrumented&amp;#8221; time. &lt;strong&gt;Definitely not a happy scaler does this make&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;In the past, we had talked about keeping track of which gauges were actually being watched and only sending a notification for those, but never actually did anything about it.&lt;/p&gt;
&lt;h2&gt;The Solution&lt;/h2&gt;
&lt;p&gt;Recently, Pusher added &lt;a href="http://pusher.com/docs/webhooks"&gt;web hooks&lt;/a&gt; on channel occupy and channel vacate. This, combined with a growing number of slow requests, was just the motivation I needed to come up with a solution.&lt;/p&gt;
&lt;p&gt;We (&lt;a href="http://opensoul.org/"&gt;@bkeepers&lt;/a&gt; and I) started by mapping a simple route to a class.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class PusherApp &amp;lt; BaseApp
  post '/pusher/ping' do
    webhook = Pusher::WebHook.new(request)
    if webhook.valid?
      PusherPing.receive(webhook)
      'ok'
    else
      status 401
      'invalid'
    end
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Using a simple class method like this moves all logic out of the route and into a place that is easier to test. The receive method iterates the events and runs each ping individually.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class PusherPing
  def self.receive(webhook)
    webhook.events.each do |event|
      new(event, webhook.time).run
    end
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;At first, we had something like this for each PusherPing instance.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class PusherPing
  def initialize(event, time)
    @event         = event || {}
    @time          = time
    @event_name    = @event['name']
    @event_channel = @event['channel']
  end

  def run
    case @event_name
    when 'channel_occupied'
      occupied
    when 'channel_vacated'
      vacated
    end
  end

  def occupied
    update(@time)
  end

  def vacated
    update(nil)
  end

  def update(value)
    # update the gauge in the
    # db with the value
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We pushed out the change so we could start marking gauges as occupied. We then forced a browser refresh, which effectively vacated and re-occupied all gauges people were watching.&lt;/p&gt;
&lt;p&gt;Once we knew the occupied state of each gauge was correct, we added the code to only send the request to Pusher on track if a gauge was occupied.&lt;/p&gt;
&lt;p&gt;Deploy. Celebrate. Booyeah.&lt;/p&gt;
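&lt;p&gt;The occupied check on track can be sketched in a few lines. Everything here is hypothetical and only illustrates the guard: SketchGauge is a Struct instead of the real model, and SketchNotifier stands in for whatever sends to Pusher.&lt;/p&gt;

```ruby
# Hypothetical gauge with the occupied timestamp described above.
SketchGauge = Struct.new(:id, :occupied_at) do
  def occupied?
    !occupied_at.nil?
  end
end

# Hypothetical notifier that records what it would have sent.
class SketchNotifier
  attr_reader :sent

  def initialize
    @sent = []
  end

  def call(gauge_id, attrs)
    @sent.push([gauge_id, attrs])
  end
end

# Only notify when someone is actually watching the gauge.
def notify_if_occupied(gauge, attrs, notifier)
  return :skipped unless gauge.occupied?
  notifier.call(gauge.id, attrs)
  :sent
end

notifier  = SketchNotifier.new
watched   = SketchGauge.new('abc', Time.now.utc)
unwatched = SketchGauge.new('def', nil)

notify_if_occupied(watched, { path: '/' }, notifier)
notify_if_occupied(unwatched, { path: '/' }, notifier)

puts notifier.sent.length
```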
&lt;h2&gt;The New Problem&lt;/h2&gt;
&lt;p&gt;Then, less than a day later, we realized that Pusher doesn&amp;#8217;t guarantee the order of events. Imagine someone vacating and then occupying a gauge, but our receiving the occupy first and then the vacate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;This situation would mean that live tracking would never turn on&lt;/strong&gt; for the gauge. Indeed, it started happening to a few people, who quickly let us know.&lt;/p&gt;
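&lt;p&gt;A tiny simulation makes the ordering problem easier to see. GaugeState is a hypothetical in-memory stand-in for the occupied timestamp on a gauge. It applies events in the same way as the run method above.&lt;/p&gt;

```ruby
# Hypothetical in-memory stand-in for the occupied timestamp on a gauge.
class GaugeState
  attr_reader :occupied_at

  def apply(event_name, time)
    case event_name
    when 'channel_occupied' then @occupied_at = time
    when 'channel_vacated'  then @occupied_at = nil
    end
  end

  def occupied?
    !@occupied_at.nil?
  end
end

# What actually happened: the viewer left and then came back.
in_order = [['channel_vacated', 1], ['channel_occupied', 2]]
# What the webhooks delivered: the same two events, swapped.
delivered = in_order.reverse

expected = GaugeState.new
in_order.each { |name, time| expected.apply(name, time) }

actual = GaugeState.new
delivered.each { |name, time| actual.apply(name, time) }

puts "in order:     occupied? #{expected.occupied?}"
puts "out of order: occupied? #{actual.occupied?}"
```

&lt;p&gt;Applied in order, the gauge ends up occupied. Applied as delivered, the late vacate wins and the gauge looks empty even though someone is watching it.&lt;/p&gt;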
&lt;h2&gt;The New Solution&lt;/h2&gt;
&lt;p&gt;We figured it was better to send a few extra notifications than never send any, so we decided to &amp;#8220;occupy&amp;#8221; gauges on our own when people loaded up the Gauges dashboard.&lt;/p&gt;
&lt;p&gt;We started in and quickly realized the error of our ways in the pusher ping. Having the database calls directly tied to the PusherPing class meant that we had two options:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;Use the PusherPing class to occupy a gauge when the dashboard loads, which just felt wrong.&lt;/li&gt;
	&lt;li&gt;Re-write it to separate the occupying and vacating of a gauge from the PusherPing class.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Since we are good little developers, we went with 2. We created a GaugeOccupier class that looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class GaugeOccupier
  attr_reader :ids

  def initialize(*ids)
    @ids = ids.flatten.compact.uniq
  end

  def occupy(time=Time.now.utc)
    update(time)
  end

  def vacate
    update(nil)
  end

private

  def update(value)
    return if @ids.blank?
    # do the db updates
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We tested that class on its own quite quickly and refactored the PusherPing to use it.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class PusherPing
  def run
    case @event_name
    when 'channel_occupied'
      GaugeOccupier.new(gauge_id).occupy(@time)
    when 'channel_vacated'
      GaugeOccupier.new(gauge_id).vacate
    end
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Boom. PusherPing now worked the same and we had a way to &amp;#8220;occupy&amp;#8221; gauges separate from the PusherPing. We added the occupy logic to the correct point in our app like so:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;ids = gauges.map { |gauge| gauge.id }
GaugeOccupier.new(ids).occupy&lt;/code&gt;&lt;/pre&gt;
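&lt;p&gt;For anyone who wants to poke at the idea outside the app, the following is a rough, self-contained sketch. InMemoryGaugeOccupier is a hypothetical variant that takes a plain Hash in place of the database and uses empty? instead of ActiveSupport&amp;#8217;s blank?; the real class talks to the database instead.&lt;/p&gt;

```ruby
# Hypothetical in-memory version of GaugeOccupier; a Hash stands in
# for the database so the example runs without ActiveSupport or Mongo.
class InMemoryGaugeOccupier
  attr_reader :ids

  def initialize(store, *ids)
    @store = store
    # Accept single ids, arrays of ids, nils and duplicates alike.
    @ids = ids.flatten.compact.uniq
  end

  def occupy(time = Time.now.utc)
    update(time)
  end

  def vacate
    update(nil)
  end

  private

  def update(value)
    return if @ids.empty?
    @ids.each { |id| @store[id] = value }
  end
end

store = {}
occupier = InMemoryGaugeOccupier.new(store, ['a', 'b'], nil, 'b')
occupier.occupy(Time.utc(2012, 2, 4))
p occupier.ids # ["a", "b"]
occupier.vacate
p store        # both ids map to nil again
```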
&lt;p&gt;At this point, we were now &amp;#8220;occupied&amp;#8221; more than &amp;#8220;vacated&amp;#8221;, which is good. However, as you may have noticed, we still had the issue where someone loads the dashboard, we occupy the gauge, but then receive a delayed, or what I will now refer to as &amp;#8220;stale&amp;#8221;, hook.&lt;/p&gt;
&lt;p&gt;To fix the stale hook issue, we simply added a bit of logic to the PusherPing class to detect staleness and ignore the ping if it is stale.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class PusherPing
  def run
    return if stale?
    # do occupy/vacate
  end

  def stale?
    return false if gauge.occupied_at.blank?
    gauge.occupied_at &amp;gt; @time
  end
end&lt;/code&gt;&lt;/pre&gt;
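&lt;p&gt;The same check can be exercised on its own. The sketch below assumes a hypothetical StubGauge struct and a free-standing stale_hook? method in place of the real gauge lookup, and walks through the dashboard-load scenario.&lt;/p&gt;

```ruby
# Hypothetical stand-in for a gauge; only occupied_at matters here.
StubGauge = Struct.new(:occupied_at)

# Same rule as PusherPing#stale?: a hook is stale when the gauge was
# occupied more recently than the hook's own timestamp.
def stale_hook?(gauge, hook_time)
  return false if gauge.occupied_at.nil?
  gauge.occupied_at > hook_time
end

loaded_at = Time.utc(2012, 2, 4, 12, 0, 5)
gauge = StubGauge.new(loaded_at) # occupied when the dashboard loaded

delayed_vacate = Time.utc(2012, 2, 4, 12, 0, 0)  # sent before the load
fresh_vacate   = Time.utc(2012, 2, 4, 12, 0, 30) # sent after the load

puts stale_hook?(gauge, delayed_vacate) # true: ignore it
puts stale_hook?(gauge, fresh_vacate)   # false: process it
```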
&lt;h2&gt;Closing Thoughts&lt;/h2&gt;
&lt;p&gt;This is by no means a perfect solution. There are still other holes. For example, a gauge could be occupied by us after we receive a vacate hook from pusher and stay in an &amp;#8220;occupied&amp;#8221; state, sending notifications that no one is looking for.&lt;/p&gt;
&lt;p&gt;To fix that issue, we can add a cleanup cron or something that occasionally gets all occupied channels from pusher and vacates gauges that are not in the list.&lt;/p&gt;
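&lt;p&gt;As a rough idea of what such a cleanup pass might involve, here is a sketch that only computes which gauges to vacate. Both inputs are plain arrays so it runs on its own; in practice one would come from pusher and the other from the database. The &amp;#8220;gauge-&amp;#8221; channel prefix and the gauge_ids_to_vacate name are assumptions made for the example.&lt;/p&gt;

```ruby
# Hypothetical cleanup helper: given the gauge ids we believe are
# occupied and the channel names reported as occupied, return the ids
# that should be vacated. The "gauge-" prefix is an assumed naming
# scheme, not necessarily the real one.
def gauge_ids_to_vacate(occupied_gauge_ids, occupied_channels)
  live_ids = occupied_channels.map { |name| name.sub(/\Agauge-/, '') }
  occupied_gauge_ids - live_ids
end

stale_ids = gauge_ids_to_vacate(
  ['4ea97a', '4ea97b', '4ea97c'],
  ['gauge-4ea97a', 'gauge-4ea97c']
)
p stale_ids # ["4ea97b"]
```

&lt;p&gt;The resulting ids could then be handed to GaugeOccupier.new(stale_ids).vacate.&lt;/p&gt;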
&lt;p&gt;We decided it wasn&amp;#8217;t worth the time. We pushed out the occupy fix and are now reaping the benefits of sending about 1/6th of the pusher requests we were before. This means our EventMachine thread is doing less work, which gives our main thread more time to process requests.&lt;/p&gt;
&lt;p&gt;You might think us crazy for sending hundreds of http requests in a thread that shares time with the main request thread, but it is actually working quite well.&lt;/p&gt;
&lt;p&gt;We know that some day we will have to move this to a queue and an external process that processes the queue, but &lt;strong&gt;that day is not today&lt;/strong&gt;. Instead, we can &lt;strong&gt;focus on the next round of features that will blow people&amp;#8217;s socks off&lt;/strong&gt;.&lt;/p&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2012/02/04/keep-em-separated/</feedburner:origLink></entry>
  
    <entry>
      <title>What a Year</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/K2rlkdwLFOw/" />
      <id>4f00aa8adabe9d3fb000d149</id>
      <updated>2012-01-01T14:13:58-05:00</updated>
      <published>2012-01-01T13:00:00-05:00</published>
      <category term="thoughts" />
      <summary type="html">&lt;p&gt;In which I share about a crazy year.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;The last 12 months have been nuts. My health and professional/personal life were completely at odds.&lt;/p&gt;
&lt;p&gt;Between January and August, I had three hernia surgeries. As if that wasn&amp;#8217;t enough for one year, the last few months of the year I&amp;#8217;ve been plagued by a few other ailments (which are still giving me a hard time). Definitely a rough stretch. I will never take health for granted again and really look forward to getting back to &amp;#8220;normal&amp;#8221;.&lt;/p&gt;
&lt;p&gt;Quite the contrary to my health, Ordered List grew from 2 to 5 people, helped Zynga launch Words with Friends on Facebook, launched &lt;a href="http://gaug.es"&gt;Gauges&lt;/a&gt; and &lt;a href="http://speakerdeck.com"&gt;Speaker Deck&lt;/a&gt; while improving &lt;a href="http://harmonyapp.com"&gt;Harmony&lt;/a&gt;, and, finally, was acquired by the only other company in the world I wanted to be a part of, &lt;a href="http://github.com"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here is to a healthy 2012.&lt;/p&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2012/01/01/what-a-year/</feedburner:origLink></entry>
  
    <entry>
      <title>Acquired</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/PmtZwu5tH0c/" />
      <id>4edd006ddabe9d380e005b02</id>
      <updated>2011-12-05T13:03:56-05:00</updated>
      <published>2011-12-05T12:00:00-05:00</published>
      
      <summary type="html">&lt;p&gt;In which I announce my first day at GitHub.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;Several times over the past few years, I have stated that GitHub is probably the only other place I could see myself working. Today, it is official. All of Ordered List has joined GitHub.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/blog/993-ordered-list-is-a-githubber"&gt;&lt;img src="/assets/4edcff3edabe9d1f7800e4bc/article_full/oloctocat.png" alt="" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Maybe someday I&amp;#8217;ll write about what Ordered List has meant to me, but today I am going to fully enjoy the present, instead of rambling about the past. I have no doubt great things will come of this.&lt;/p&gt;
&lt;p&gt;You can read more at &lt;a href="https://github.com/blog/993-ordered-list-is-a-githubber"&gt;GitHub&lt;/a&gt; and &lt;a href="http://orderedlist.com/blog/articles/ordered-list-acquired-by-github/"&gt;Ordered List&lt;/a&gt;.&lt;/p&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2011/12/05/acquired/</feedburner:origLink></entry>
  
    <entry>
      <title>Creating an API</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/ow3vUvRMELY/" />
      <id>4ed78dd8dabe9d5ded01f52c</id>
      <updated>2011-12-01T12:43:33-05:00</updated>
      <published>2011-12-01T09:00:00-05:00</published>
      <category term="api" /><category term="gauges" />
      <summary type="html">&lt;p&gt;In which I share a few things we used while building the Gaug.es &lt;span class="caps"&gt;API&lt;/span&gt;.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;A few weeks back, we publicly released the &lt;a href="http://get.gaug.es/documentation/api/"&gt;Gauges &lt;span class="caps"&gt;API&lt;/span&gt;&lt;/a&gt;. Despite building &lt;a href="http://gaug.es"&gt;Gauges&lt;/a&gt; from the ground up as an &lt;span class="caps"&gt;API&lt;/span&gt;, it was a lot of work. You really have to cross your t&amp;#8217;s and dot your i&amp;#8217;s when releasing an &lt;span class="caps"&gt;API&lt;/span&gt;.&lt;/p&gt;
&lt;h2&gt;1. Document as You Build&lt;/h2&gt;
&lt;p&gt;We made the mistake of documenting after most of the build was done. The problem is documenting sucks. Leaving that pain until the end, when you are excited to release it, makes doing the work twice as hard. Thankfully, we have a &lt;a href="http://theprogrammingbutler.com"&gt;closer&lt;/a&gt; on our team who powered through it.&lt;/p&gt;
&lt;h2&gt;2. Be Consistent&lt;/h2&gt;
&lt;p&gt;As we documented the &lt;span class="caps"&gt;API&lt;/span&gt;, we noticed a lot of inconsistencies. For example, in some places we returned a hash and in others we returned an array. Upon realizing these issues, we started making some rules.&lt;/p&gt;
&lt;p&gt;To solve the array/hash issue, we elected that every response should return a hash. This is the most flexible solution going forward. It allows us to inject new keys without having to convert the response or release a whole new version of the &lt;span class="caps"&gt;API&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;Changing from an array to a hash meant that we needed to namespace the array with a key. We then noticed that some places were name-spaced and others weren&amp;#8217;t. Again, we decided on a rule. In this case, all top level objects should be name-spaced, but objects referenced from a top level object or a collection of several objects did not require name-spacing.&lt;/p&gt;
&lt;pre&gt;&lt;code class="javascript"&gt;{users:[{user:{...}}, {user:{...}}]} // nope
{users:[{...}, {...}]} // yep
{username: 'jnunemaker'} // nope
{user: {username:'jnunemaker'}} // yep &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You get the idea. Consistency is important. It is not so much how you do it as that you always do it the same.&lt;/p&gt;
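&lt;p&gt;Here is a small Ruby sketch of the two rules, using made-up sample data. It also shows why the hash wrapper is flexible: a new top-level key can be merged in later without changing the shape clients already rely on. The total key is purely hypothetical.&lt;/p&gt;

```ruby
require 'json'

# Sample data for illustration only.
users = [{ username: 'jnunemaker' }, { username: 'someone_else' }]

# Top-level collection: namespaced once, items are bare hashes.
collection = { users: users }

# Top-level single object: namespaced with its singular name.
single = { user: users.first }

# Because the response is a hash, adding a key later is not a
# breaking change for clients that read "users".
extended = collection.merge(total: users.size)

puts JSON.generate(collection)
puts JSON.generate(single)
puts JSON.generate(extended)
```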
&lt;h2&gt;3. Provide the URLs&lt;/h2&gt;
&lt;p&gt;Most of my initial open source work was wrapping APIs. The one thing that always annoyed me was having to generate urls. Each resource should know the URLs that matter. For example, a user resource in Gauges has a few URLs that can be called to get various data:&lt;/p&gt;
&lt;pre&gt;&lt;code class="javascript"&gt;{
  "user": {
    "name": "John Doe",
    "urls": {
      "self": "https://secure.gaug.es/me",
      "gauges": "https://secure.gaug.es/gauges",
      "clients": "https://secure.gaug.es/clients"
    },
    "id": "4e206261e5947c1d38000001",
    "last_name": "Doe",
    "email": "john@doe.com",
    "first_name": "John"
  }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The previous &lt;span class="caps"&gt;JSON&lt;/span&gt; is the response of the resource /me. /me returns data about the authenticated user and the URLs to update itself (self), get all gauges (/gauges), and get all &lt;span class="caps"&gt;API&lt;/span&gt; clients (/clients). Let&amp;#8217;s say next you request /gauges. Each gauge returned has the URLs to get more data about the gauge.&lt;/p&gt;
&lt;pre&gt;&lt;code class="javascript"&gt;{
  "gauges": [
    {
      // various attributes
      "urls": {
        "self":"https://secure.gaug.es/gauges/4ea97a8be5947ccda1000001",
        "referrers":"https://secure.gaug.es/gauges/4ea97a8be5947ccda1000001/referrers",
        "technology":"https://secure.gaug.es/gauges/4ea97a8be5947ccda1000001/technology",
        // ... etc
      },
    }
  ]
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We thought this would prove helpful. We&amp;#8217;ll see in the long run if it turns out to work well.&lt;/p&gt;
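&lt;p&gt;From the client side, the payoff is that a wrapper can look URLs up instead of assembling them. The following is a minimal sketch under that assumption; url_for is a hypothetical helper, and the body is a trimmed copy of the /me response shown earlier, built in Ruby so the example is self-contained.&lt;/p&gt;

```ruby
require 'json'

# Trimmed copy of the /me response from above.
body = JSON.generate(
  user: {
    name: 'John Doe',
    urls: {
      self: 'https://secure.gaug.es/me',
      gauges: 'https://secure.gaug.es/gauges',
      clients: 'https://secure.gaug.es/clients'
    }
  }
)

# Hypothetical client helper: read the URL from the resource rather
# than building it by hand. fetch raises KeyError on a missing key,
# which surfaces API changes early.
def url_for(response_body, resource, rel)
  JSON.parse(response_body).fetch(resource).fetch('urls').fetch(rel)
end

puts url_for(body, 'user', 'gauges') # https://secure.gaug.es/gauges
```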
&lt;h2&gt;4. Present the Data&lt;/h2&gt;
&lt;p&gt;Finally, never ever use to_json and friends from a controller or Sinatra get/post/put block. At least as a bare minimum rule, the second you start calling to_json with :methods, :except, :only, or any of the other options, you probably want to move it to a separate class.&lt;/p&gt;
&lt;p&gt;For Gauges, we call these classes presenters. For example, here is a simplified version of the UserPresenter.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class UserPresenter
  def initialize(user)
    @user = user
  end

  def as_json(*)
    {
      'id'          =&amp;gt; @user.id,
      'email'       =&amp;gt; @user.email,
      'name'        =&amp;gt; @user.name,
      'first_name'  =&amp;gt; @user.first_name,
      'last_name'   =&amp;gt; @user.last_name,
      'urls'        =&amp;gt; {
        'self'    =&amp;gt; "#{Gauges.api_url}/me",
        'gauges'  =&amp;gt; "#{Gauges.api_url}/gauges",
        'clients' =&amp;gt; "#{Gauges.api_url}/clients",
      }
    }
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Nothing fancy. Just a simple Ruby class that sits in app/presenters. Here is what the /me route looks like in our Sinatra app.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;get('/me') do
  content_type(:json)
  sign_in_required
  {:user =&amp;gt; UserPresenter.new(current_user)}.to_json
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This simple presentation layer makes it really easy to test the responses in detail using unit tests and then just have a single integration test that makes sure overall things look good. I&amp;#8217;ve found this tiny layer a breath of fresh air.&lt;/p&gt;
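&lt;p&gt;To give a feel for how little is needed to test a presenter, here is a stripped-down sketch that runs without the app. SketchUser, SketchUserPresenter and API_URL are stand-ins invented for the example. Note that it defines to_json explicitly, because the as_json hook comes from ActiveSupport and the plain json library does not call it.&lt;/p&gt;

```ruby
require 'json'

# Stand-ins so the sketch runs without the app: a Struct plays the
# user model and API_URL replaces Gauges.api_url.
API_URL = 'https://secure.gaug.es'
SketchUser = Struct.new(:id, :email, :name)

class SketchUserPresenter
  def initialize(user)
    @user = user
  end

  def as_json(*)
    {
      'id'    => @user.id,
      'email' => @user.email,
      'name'  => @user.name,
      'urls'  => { 'self' => "#{API_URL}/me" }
    }
  end

  # The json stdlib does not know about as_json, so delegate to it.
  def to_json(*args)
    as_json.to_json(*args)
  end
end

user = SketchUser.new('4e206261e5947c1d38000001', 'john@doe.com', 'John Doe')
presenter = SketchUserPresenter.new(user)

# Unit-level check: inspect the hash directly, no HTTP involved.
raise 'unexpected keys' unless presenter.as_json.keys == %w[id email name urls]

# Closer to the route: serialize and parse what a client would see.
parsed = JSON.parse({ user: presenter }.to_json)
puts parsed['user']['urls']['self'] # https://secure.gaug.es/me
```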
&lt;p&gt;I am sure that nothing above was shocking or awe-inspiring, but I hope that it saves you some time on your next public &lt;span class="caps"&gt;API&lt;/span&gt;.&lt;/p&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2011/12/01/creating-an-api/</feedburner:origLink></entry>
  
    <entry>
      <title>Stupid Simple Debugging</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/-Ug8Jox3-HA/" />
      <id>4e5fa602dabe9d2e8d007bf0</id>
      <updated>2011-09-01T11:49:52-04:00</updated>
      <published>2011-08-31T23:00:00-04:00</published>
      <category term="gems" /><category term="logging" />
      <summary type="html">&lt;p&gt;In which I talk about old school debugging.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;There are all kinds of fancy debugging tools out there, but personally, I get the most mileage out of good old puts statements.&lt;/p&gt;
&lt;p&gt;When I started with Ruby, several years ago, I used puts like this to debug:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;puts account.inspect&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The problem with this is twofold. First, if you have a few puts statements, you don&amp;#8217;t know which one is actually which object. This always led me to doing something like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;puts "account: #{account.inspect}"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Second, depending on whether you are just in Ruby or running an app through a web server, puts is sometimes swallowed. This led me to often times do something like this when using Rails:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;Rails.logger.debug "account: #{account.inspect}"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, not only do I have to think about which method to use to debug something, I also have to think about where the output will be sent so I can watch for it.&lt;/p&gt;
&lt;h2&gt;Enter Log Buddy&lt;/h2&gt;
&lt;p&gt;Then, one fateful afternoon, I stumbled across log buddy (gem install log_buddy). In every project, whether it be a library, Rails app, or Sinatra app, &lt;strong&gt;one of the first gems I throw in my Gemfile is log_buddy&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Once you have the gem installed, you can tell log buddy where your log file is and whether or not to actually log like so:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;LogBuddy.init({
  :logger   =&amp;gt; Gauges.logger,
  :disabled =&amp;gt; Gauges.production?,
})&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Simply provide log buddy with a logger and tell it if you want it to be silenced in a given situation or environment and you get some nice bang for your buck.&lt;/p&gt;
&lt;h2&gt;One Method, One Character&lt;/h2&gt;
&lt;p&gt;First, log buddy adds a nice and short method named &lt;code&gt;d&lt;/code&gt;. &lt;code&gt;d&lt;/code&gt; is 4X shorter than &lt;code&gt;puts&lt;/code&gt;, so right off the bat you get some productivity gains. The &lt;code&gt;d&lt;/code&gt; method takes any argument and calls inspect on it. Short and sweet.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;d account # will puts account.inspect
d 'Some message' # will puts "Some message"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The cool part is that on top of printing the inspected object to stdout, it also logs it to the logger provided in LogBuddy.init. No more thinking about which method to use or where output will be. One method, output is sent to multiple places.&lt;/p&gt;
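&lt;p&gt;The idea of one call feeding two destinations is easy to sketch with the standard library alone. To be clear, this is not how log buddy is implemented; debug_both is a hypothetical helper, and a StringIO stands in for the log file.&lt;/p&gt;

```ruby
require 'logger'
require 'stringio'

# Not LogBuddy's implementation; a hypothetical helper showing the
# "one call, two destinations" idea with only the standard library.
def debug_both(object, logger, out = $stdout)
  message = object.inspect
  out.puts(message)     # destination one: stdout (or any IO)
  logger.debug(message) # destination two: the logger
  message
end

log_io = StringIO.new # stands in for a log file
logger = Logger.new(log_io)

debug_both('Some message', logger)
debug_both([1, 2, 3], logger)

puts log_io.string # the same messages, as the logger recorded them
```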
&lt;p&gt;This is nice, but it won&amp;#8217;t win you any new friends. Where log buddy gets really cool is when you pass it a block.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;d { account } # puts and logs account = &amp;lt;Account ...&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Again, one method, output to stdout and your log file, but when you use a block, it does magic to print out the variable name and its inspected value. You can also pass in several objects, separating them with semicolons.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;d { account; account.creator; current_user }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This gives you each variable on its own line with the name and inspected value. Nothing fancy, but log buddy has saved me a lot of time over the past year. I figured it was time I send it some love.&lt;/p&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2011/08/31/stupid-simple-debugging/</feedburner:origLink></entry>
  
    <entry>
      <title>Counters Everywhere, Part 2</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/RN3cfQBmCMk/" />
      <id>4e09f91fdabe9d77d0003d5c</id>
      <updated>2011-08-01T10:44:26-04:00</updated>
      <published>2011-07-31T21:00:00-04:00</published>
      <category term="gauges" /><category term="mongodb" />
      <summary type="html">&lt;p&gt;In which I talk more about how data is stored in &lt;a href="http://gaug.es"&gt;Gaug.es&lt;/a&gt;.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;In &lt;a href="http://railstips.org/blog/archives/2011/06/28/counters-everywhere/"&gt;Counters Everywhere&lt;/a&gt;, I talked about how to handle counting lots of things using single documents in Mongo. In this post, I am going to cover the flip side&amp;#8212;counting things when there are an unlimited number of variations.&lt;/p&gt;
&lt;h2&gt;Force the Data into a Document Using Ranges&lt;/h2&gt;
&lt;p&gt;Recently, we added window and browser dimensions to &lt;a href="http://get.gaug.es"&gt;Gaug.es&lt;/a&gt;. Screen width has far fewer variations as there are only so many screens out there. However, browser width and height can vary wildly, as everyone out there has their browser open just a wee bit differently.&lt;/p&gt;
&lt;p&gt;I knew that storing all widths or heights in a single document wouldn&amp;#8217;t work because the number of variations was too high. That said, we pride ourselves at Ordered List on thinking through things so our users don&amp;#8217;t have to.&lt;/p&gt;
&lt;p&gt;Does anyone really care if someone visited their site with a browser open exactly 746 pixels wide? No. Instead, what matters is what ranges of widths are visiting their site. Knowing this, we plotted out what we considered were the most important ranges of widths (320, 480, 800, 1024, 1280, 1440, 1600, &amp;gt; 2000) and heights (480, 600, 768, 900, &amp;gt; 1024).&lt;/p&gt;
&lt;p&gt;Instead of storing each exact pixel width, we figure out which range the width is in and do an increment on that. This allows us to receive a lot of varying widths and heights, but keep them all in one single document.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;{
  "sx" =&amp;gt; {
    "320"  =&amp;gt; 237,
    "480"  =&amp;gt; 367,
    "800"  =&amp;gt; 258,
    "1024" =&amp;gt; 2273,
    "1280" =&amp;gt; 10885,
    "1440" =&amp;gt; 6144
    "1600" =&amp;gt; 13607,
    "2000" =&amp;gt; 2154,
  },
  "bx" =&amp;gt; {
    "320"  =&amp;gt; 121,
    "480"  =&amp;gt; 390,
    "800"  =&amp;gt; 3424,
    "1024" =&amp;gt; 9790,
    "1280" =&amp;gt; 11125,
    "1440" =&amp;gt; 3989
    "1600" =&amp;gt; 6757,
    "2000" =&amp;gt; 301,
  },
  "by" =&amp;gt; {
    "480"  =&amp;gt; 3940,
    "600"  =&amp;gt; 13496,
    "768"  =&amp;gt; 8184,
    "900"  =&amp;gt; 6718,
    "1024" =&amp;gt; 3516
  },
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I would call this first method for storing a large number of variations cheating, but in this instance, cheating works great.&lt;/p&gt;
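&lt;p&gt;The bucketing step itself is only a few lines. The sketch below is one way it could work, not necessarily how Gaug.es does it: it assumes a value belongs to the first breakpoint it fits under and that anything larger lands in the last bucket. bucket_for and increment_for are hypothetical names; $inc with a dotted key is the standard Mongo way to bump a nested counter.&lt;/p&gt;

```ruby
# Breakpoints taken from the lists above.
WIDTHS  = [320, 480, 800, 1024, 1280, 1440, 1600, 2000].freeze
HEIGHTS = [480, 600, 768, 900, 1024].freeze

# Assumed rule: pick the first breakpoint the value fits under;
# anything larger falls into the last bucket.
def bucket_for(value, breakpoints)
  breakpoints.find { |limit| limit >= value } || breakpoints.last
end

# Build the kind of $inc payload that bumps one counter per hit.
def increment_for(field, value, breakpoints)
  { '$inc' => { "#{field}.#{bucket_for(value, breakpoints)}" => 1 } }
end

p increment_for('bx', 746, WIDTHS)  # bumps "bx.800"
p increment_for('bx', 2560, WIDTHS) # bumps "bx.2000"
p increment_for('by', 700, HEIGHTS) # bumps "by.768"
```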
&lt;h2&gt;When You Can&amp;#8217;t Cheat&lt;/h2&gt;
&lt;p&gt;Where the single document model falls down is when you do not know the number of variations, or at least know that it &lt;strong&gt;could&lt;/strong&gt; grow past 500-1000. Seeing how efficient the single document model was, I tried to store content and referrers in the same way, initially.&lt;/p&gt;
&lt;p&gt;I created one document per day per site and it had a key for each unique piece of content or referring url with a value that was an incrementing number of how many times it was hit.&lt;/p&gt;
&lt;p&gt;It worked great. Insanely small storage and no secondary indexes were needed, so really light on &lt;span class="caps"&gt;RAM&lt;/span&gt;. Then, a few larger sites signed up that were getting 100k views a day and had 5-10k unique pieces of content a day. This hurt for a few reasons.&lt;/p&gt;
&lt;p&gt;First, wildly varying document sizes. Mongo pads documents a bit, so they can be modified without moving on disk. If a document grows larger than the padding, it has to be moved. Obviously, the more you hit the disk the slower things are, just as the more you go across the network the slower things are. Having some documents with 100 keys and others with 10k made it hard for Mongo to learn the correct padding size, because there was no correct size.&lt;/p&gt;
&lt;p&gt;Second, when you have all the content for a day in one doc and have to send 10k urls plus page titles across the wire just to show the top fifteen, you end up with some slowness. One site consistently had documents that were over a MB in size. I quickly realized this was not going to work long term.&lt;/p&gt;
&lt;p&gt;In our case, we always write data in one way and always read data in one way. This meant I needed an index I could use for writes and one that I could use for reads. I&amp;#8217;ll get this out of the way right now. If I had it to do over again, I would definitely do it differently. I&amp;#8217;m doing some stupid stuff, but we&amp;#8217;ll talk more about that later.&lt;/p&gt;
&lt;p&gt;The keys for each piece of content are the site_id (sid), path (p), views (v), date (d), title (t), and hash (h). Most of those should be obvious, save hash. Hash is a crc32 of the path. Paths are quite varying in length, so indexing something of consistent size is nice.&lt;/p&gt;
&lt;p&gt;For writes, the index is [[&amp;#8216;sid&amp;#8217;, 1], [&amp;#8216;d&amp;#8217;, -1], [&amp;#8216;h&amp;#8217;,  1]] and for reads the index is [[&amp;#8216;sid&amp;#8217;, 1], [&amp;#8216;d&amp;#8217;, -1], [&amp;#8216;v&amp;#8217;, -1]]. This allows me to upsert based on site, date and hash for writes and then read the data by site, date and views descending, which is exactly what it looks like when we show content to the user.&lt;/p&gt;
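&lt;p&gt;As an illustration of the write path, here is a sketch that builds the selector and update documents for a single page view. It stops short of calling a driver, and content_upsert is a hypothetical name. Zlib.crc32 ships with Ruby; using $set for the path and title alongside $inc for views is an assumption about how the other fields get filled in.&lt;/p&gt;

```ruby
require 'zlib'

# Hypothetical builder for the upsert described above. It only builds
# the selector and update hashes; passing them to a driver is left out.
def content_upsert(site_id, date, path, title)
  # Matches the write index: site, date, fixed-size hash of the path.
  selector = { 'sid' => site_id, 'd' => date, 'h' => Zlib.crc32(path) }
  # Assumed update: store path and title, bump the view counter.
  update = {
    '$set' => { 'p' => path, 't' => title },
    '$inc' => { 'v' => 1 }
  }
  [selector, update]
end

selector, update = content_upsert(
  '4e206261e5947c1d38000001', '2011-07-31', '/blog/', 'RailsTips'
)
p selector
p update
```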
&lt;p&gt;&lt;a href="/assets/4e35f935dabe9d6a6d00134c/top_content.jpeg"&gt;&lt;img src="/assets/4e35f935dabe9d6a6d00134c/article_full/top_content.jpeg" class="image full" alt="" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;As mentioned in the previous post, I do a bit of range based partitioning as well, keeping a collection per month. Overall, this is working great for content, referrers and search terms on &lt;a href="http://get.gaug.es"&gt;Gaug.es&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Learning from Mistakes&lt;/h2&gt;
&lt;p&gt;So what would I do differently if given a clean slate? Each piece of content and referring url has an _id key that I did not mention. It is never used in any way, but _id is automatically indexed. Having millions of documents each month, each with an _id that is never used starts to add up. Obviously, it isn&amp;#8217;t really hurting us now, but I see it as wasteful.&lt;/p&gt;
&lt;p&gt;Also, each document has a date. Remember that the collection is already partitioned by month (i.e.: c.2011.7 for July), yet hilariously, I store the full date with each document like so: yyyy-mm-dd. 90% of that string is completely useless. I could more easily store the day as an integer and ignore the year and month.&lt;/p&gt;
&lt;p&gt;Having learned my lesson on content and referrers, I switched things up a bit for search terms. Search terms are stored per month, which means we don&amp;#8217;t need the day. Instead of having a shorter but meaningless _id, I opted to use something that I knew would be unique, even though it was a bit longer.&lt;/p&gt;
&lt;p&gt;The _id I chose was &amp;#8220;site_id:hash&amp;#8221; where hash is a crc32 of the search term. This is conveniently the same as the fields that are upserted on, which combined with the fact that _id is always indexed means that we no longer need a secondary index for writes.&lt;/p&gt;
&lt;p&gt;I still store the site_id in the document so that I can have a compound secondary index on site_id (sid) and views (v) for reads. Remember that the collection is scoped by month, and that we always show the user search terms for a given month, so all we really need is which terms were viewed the most for the given site, thus the index is [[&amp;#8216;sid&amp;#8217;, 1], [&amp;#8216;v&amp;#8217;, -1]].&lt;/p&gt;
&lt;p&gt;Hope that all makes sense. The gist is rather than have an _id that is never used, I moved the write index to _id, since it will always be unique anyway, which means one less secondary index and no wasted &lt;span class="caps"&gt;RAM&lt;/span&gt;.&lt;/p&gt;
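&lt;p&gt;In code, the composite _id might be built along these lines. search_term_id is a hypothetical helper and the site id is sample data; the point is only that the same site and term always produce the same _id, so the upsert can find its document through the index Mongo already maintains.&lt;/p&gt;

```ruby
require 'zlib'

# Hypothetical helper: the _id doubles as the write key, so no extra
# index is needed to find the document to increment.
def search_term_id(site_id, term)
  "#{site_id}:#{Zlib.crc32(term)}"
end

site_id = '4e206261e5947c1d38000001' # sample id for illustration

first  = search_term_id(site_id, 'rails tips')
second = search_term_id(site_id, 'rails tips')
other  = search_term_id(site_id, 'mongo counters')

puts first
puts first == second # true: same term, same document
puts first == other  # false: different term, different document
```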
&lt;h2&gt;Interesting Finding&lt;/h2&gt;
&lt;p&gt;The only other interesting thing about all this is our memory usage. Our index size is now ~1.6GB, but the server is only using around 120MB of &lt;span class="caps"&gt;RAM&lt;/span&gt;. How can that be, you ask? We&amp;#8217;ve all heard that you need to have at least as much &lt;span class="caps"&gt;RAM&lt;/span&gt; as your index size, right?&lt;/p&gt;
&lt;p&gt;The cool thing is you don&amp;#8217;t. You only need as much &lt;span class="caps"&gt;RAM&lt;/span&gt; as your active set of data. Gaug.es is very write heavy, but people pretty much only care about recent data. Very rarely do they page back in time.&lt;/p&gt;
&lt;p&gt;What this means is that our active set is what is currently being written and read, which in our case is almost the exact same thing. The really fun part is that I can actually get this number to go up and down just by adjusting the number of results we show per page for content, referrers and search terms.&lt;/p&gt;
&lt;p&gt;If we show 100 per page, we use more memory than 50 per page. The reason is that people click on top content often to see what is doing well, which continually loads in the top 100 or 50, but they rarely click back in time. This means that the active set is the first 100 or 50, depending on what the per page is. Those documents stay in &lt;span class="caps"&gt;RAM&lt;/span&gt;, but older pages get pushed out for new writes and are never really requested again.&lt;/p&gt;
&lt;p&gt;I literally have a graph that shows our memory usage drop in half when we moved pagination from the client-side to the server-side. I thought it was interesting, so figured I would mention it.&lt;/p&gt;
&lt;p&gt;As always, if you aren&amp;#8217;t using &lt;a href="http://get.gaug.es"&gt;Gaug.es&lt;/a&gt; yet, be sure to give the free trial a spin!&lt;/p&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2011/07/31/counters-everywhere-part-2/</feedburner:origLink></entry>
  
    <entry>
      <title>Counters Everywhere</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/2YoUsWP7fHA/" />
      <id>4e09c978dabe9d4e0d002458</id>
      <updated>2011-06-28T11:57:32-04:00</updated>
      <published>2011-06-28T08:30:00-04:00</published>
      <category term="gauges" /><category term="mongodb" />
      <summary type="html">&lt;p&gt;In which I share a few things I have learned about storing stats while working on Gaug.es.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;Last week, coming off hernia surgery number two of the year (and hopefully the last for a while) I eased back into development by working on &lt;a href="http://gaug.es"&gt;Gaug.es&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In three days, I cranked out tracking of three new features. The only reason this was possible is because I have tried, failed, and succeeded on repeat at storing various stats efficiently in Mongo.&lt;/p&gt;
&lt;p&gt;While I will be using Mongo as the examples for this article, most of it could very easily be applied to any data store that supports incrementing numbers.&lt;/p&gt;
&lt;h2&gt;How are you going to use the data?&lt;/h2&gt;
&lt;p&gt;The great thing about the boom of new data stores is the flexibility that most provide regarding storage models. Whereas &lt;span class="caps"&gt;SQL&lt;/span&gt; is about normalizing the storage of data and then flexibly querying it, NoSQL is about thinking how you will query data and then flexibly storing it.&lt;/p&gt;
&lt;p&gt;This flexibility is great, but it means if you do not fully understand how you will be accessing data, you can really muck things up. If, on the other hand, you do understand your data and how it is accessed, you can do some really fun stuff.&lt;/p&gt;
&lt;p&gt;So how do we access data on Gaug.es? Depends on the feature (views, browsers, platforms, screen resolutions, content, referrers, etc.), but it can mostly be broken down into these points:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;strong&gt;Time frame resolution&lt;/strong&gt;. What resolution is needed? To the month? Day? Hour? Which piece of content was viewed the most matters on a per day basis, but which browser is winning the war only matters per month, or maybe even over several months.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Number of variations&lt;/strong&gt;. Browsers have a finite number of variations (Chrome, Firefox, Safari, IE, Opera, Other). Content is completely the opposite, as it varies drastically from website to website.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Knowing that resolution and variation drive how we need to present data is really important.&lt;/p&gt;
&lt;h2&gt;One document to rule them all&lt;/h2&gt;
&lt;p&gt;Due to the amount of data a hosted stats service has to deal with, most store each hit and then process them into reports on intervals. This leads to delays between something happening on your site and you finding out, as reports can be hours or even a day behind. This always bothered me and is why I am working really hard at making Gaug.es completely live.&lt;/p&gt;
&lt;p&gt;Ideally, you should be able to check stats anytime and know exactly what just happened. Email newsletter? Watch the traffic pour in a few minutes after you hit send.  Post to your blog? See how quickly people pick it up on Twitter and in feed readers.&lt;/p&gt;
&lt;p&gt;In order to provide access to data in real-time, we have to store and retrieve our data differently. Instead of storing every hit and all the details and then processing those hits, we make decisions and &lt;strong&gt;build reports as each hit comes in&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;Resolution and Variations&lt;/h3&gt;
&lt;p&gt;What kind of decisions? Exactly what I mentioned above.&lt;/p&gt;
&lt;p&gt;First, we &lt;strong&gt;determine what resolution&lt;/strong&gt; a feature needs. Top content and referrers need to be stored per day for at least a month. After that, per month is probably a good enough resolution.&lt;/p&gt;
&lt;p&gt;Browsers and screen sizes are far less interesting on a per day basis. Typically, these are only used a few times a year to make decisions such as dropping IE 6 support or deciding to target 1024&amp;#215;768 instead of 800&amp;#215;600 (remember that back in the day?).&lt;/p&gt;
&lt;p&gt;Second, we &lt;strong&gt;determine the variations&lt;/strong&gt;. Content and referrers vary greatly on a per site basis, but we can choose the browsers and screen dimensions to track. For example, with browsers, we picked Chrome, Safari, Firefox, Opera, IE and then we lump the rest of the browsers into Other. Do I really care how many people visit RailsTips in Konqueror? Nope, so why even show it?&lt;/p&gt;
&lt;p&gt;The same goes for platforms. We track Mac, Windows, Linux, iPhone, iPad, iPod, Android, Blackberry, and Other.&lt;/p&gt;
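&lt;p&gt;A minimal sketch of that lumping, assuming the user agent parser hands back short lowercase-able names like the ones above (the &lt;code&gt;browser_bucket&lt;/code&gt; helper and the constant are made up for illustration, not the actual Gaug.es code):&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;# Hypothetical list of the browsers we chose to track by name.
TRACKED_BROWSERS = %w[chrome safari firefox opera ie].freeze

# Anything not in the list is counted under a single 'other' bucket,
# which keeps the number of variations small and predictable.
def browser_bucket(name)
  key = name.to_s.downcase
  TRACKED_BROWSERS.include?(key) ? key : 'other'
end

browser_bucket('Chrome')    # "chrome"
browser_bucket('Konqueror') # "other"&lt;/code&gt;&lt;/pre&gt;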
&lt;h3&gt;Document Model&lt;/h3&gt;
&lt;p&gt;Knowing that we only have 6 variations of browsers and 9 variations of platforms to track, and that the list is not likely to grow much, I store all of them in one document per month per site. This means showing someone browser and/or platform data for an entire month is one query for a very tiny document that looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;{
  '_id' =&amp;gt; 'site_id:month',
  'browsers' =&amp;gt; {
    'safari' =&amp;gt; {
      '5-0' =&amp;gt; 5,
      '4-1' =&amp;gt; 2,
    },
    'ie' =&amp;gt; {
      '9-0' =&amp;gt; 5,
      '8-0' =&amp;gt; 2,
      '7-0' =&amp;gt; 1,
      '6-0' =&amp;gt; 1,
    }
  },
  'platforms' =&amp;gt; {
    'macintosh' =&amp;gt; 10,
    'windows'   =&amp;gt; 5,
    'linux'     =&amp;gt; 2,
  },
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When a track request comes in, I parse the user agent to get the browser, version, and platform. We only store the major and minor parts of the version. Who cares about 12.0.1.2? What matters is 12.0. This means we end up with 5-10 versions per month per browser instead of 50 or 100. Also, note that Mongo does not allow dots in key names, so I store the dot as a hyphen, thus 12-0.&lt;/p&gt;
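&lt;p&gt;As a rough sketch of that normalization (the &lt;code&gt;version_key&lt;/code&gt; helper name is hypothetical and this is not the actual Gaug.es code), trimming a version to major and minor and swapping the dot for a hyphen could look like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;# Hypothetical helper: keep only major and minor, and join them with a
# hyphen because Mongo does not allow dots in key names.
def version_key(version)
  major, minor = version.to_s.split('.')
  [major || '0', minor || '0'].join('-')
end

version_key('12.0.1.2') # "12-0"
version_key('5')        # "5-0"&lt;/code&gt;&lt;/pre&gt;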
&lt;p&gt;I then issue a single update against that document to increment the platform and browser/version.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;query  = {'_id' =&amp;gt; "#{hit.site_id}:#{hit.month}"}
update = {'$inc' =&amp;gt; {
  "b.#{browser_name}.#{browser_version}" =&amp;gt; 1,
  "p.#{platform}" =&amp;gt; 1,
}}
collection(hit.created_on).update(query, update, :upsert =&amp;gt; true)&lt;/code&gt;&lt;/pre&gt;
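&lt;p&gt;b and p are short for browsers and platforms (the example document above spells the keys out for readability). No need to waste space, since Mongo stores the key names in every document. The dot syntax in the strings in the update hash tells Mongo to reach into the document and increment a value for a key inside of a hash.&lt;/p&gt;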
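&lt;p&gt;To make the dot syntax a bit more concrete, here is a pure Ruby imitation of what an increment with dotted keys does to a document. It only illustrates the semantics as I understand them; &lt;code&gt;apply_inc&lt;/code&gt; is a made-up name and none of this is driver or server code.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;# Pure Ruby imitation of what $inc does with dotted keys. It only
# illustrates the semantics; it is not driver or server code.
def apply_inc(doc, increments)
  increments.each do |path, amount|
    *parents, leaf = path.split('.')
    # Walk down the document, creating missing hashes along the way.
    target = parents.inject(doc) { |hash, key| hash[key] ||= {} }
    target[leaf] = target.fetch(leaf, 0) + amount
  end
  doc
end

doc = {}
increments = [['b.safari.5-0', 1], ['p.macintosh', 1]]
2.times { apply_inc(doc, increments) }
doc['b']['safari']['5-0'] # 2
doc['p']['macintosh']     # 2&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Starting from an empty hash mirrors the upsert case: the first hit of the month creates the nested keys, and every later hit just bumps the numbers.&lt;/p&gt;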
&lt;p&gt;Also, the _id (or primary key) of the document is the site id and the month since the two together are always unique. There is no need to store a &lt;span class="caps"&gt;BSON&lt;/span&gt; ObjectId or incrementing number, as the data is &lt;strong&gt;always&lt;/strong&gt; accessed for a given site and month. _id is automatically indexed in Mongo and it is the only thing that we query on, so there is no need for secondary indexes.&lt;/p&gt;
&lt;h3&gt;Range based partitioning&lt;/h3&gt;
&lt;p&gt;I also do a bit of range-based partitioning at the collection level (e.g., technology.2011, technology.2012). That is why I pass the date of the hit to the collection method. The collection that stores the browser and platform information is split by year. Maybe unnecessary looking back at it, but it hurts nothing. It means that a given collection stores number of sites * 12 documents at a maximum.&lt;/p&gt;
&lt;p&gt;Mongo creates collections on the fly, so when a new year comes along, the new collection will be created automatically. As years go by, we can create smaller summary documents and drop the old collections or move them to another physical server (which is often easier and more performant than removing old data from an active collection).&lt;/p&gt;
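&lt;p&gt;For illustration, the _id and the per-year collection name could be derived from the date of a hit along these lines. The helper names and the exact month format are assumptions on my part for this sketch:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;require 'date'

# Hypothetical helpers; the names and the month format are assumptions.
def technology_id(site_id, date)
  "#{site_id}:#{date.strftime('%Y-%m')}"
end

# One collection per year, so each holds at most sites * 12 documents.
def technology_collection_name(date)
  "technology.#{date.year}"
end

date = Date.new(2011, 6, 28)
technology_id('4e09c978', date)  # "4e09c978:2011-06"
technology_collection_name(date) # "technology.2011"&lt;/code&gt;&lt;/pre&gt;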
&lt;p&gt;Because I know that the number of variations is small (&amp;lt; 100-ish), I know that the overall document size is not really going to grow and that it will always efficiently fly across the wire. When you have relatively controllable data like browsers/platforms, &lt;strong&gt;storing it all in one document works great&lt;/strong&gt;.&lt;/p&gt;
&lt;h2&gt;Closing Thoughts&lt;/h2&gt;
&lt;p&gt;As I said before, this article is using Mongo as an example. If you wanted to use Redis, Membase or something else with atomic incrementing, you could just have one key per month per site per browser.&lt;/p&gt;
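&lt;p&gt;A sketch of what one key per month per site per browser might look like. A plain Hash stands in for the store so the example runs without a server, and the key layout is just one I made up for illustration:&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;# Stand-in for a store with atomic increments (Redis INCR, for example).
# A plain Hash is used here so the sketch runs without a server.
class CounterStore
  def initialize
    @counts = Hash.new(0)
  end

  def incr(key)
    @counts[key] += 1
  end

  def get(key)
    @counts[key]
  end
end

# Hypothetical key layout: one counter per site, month and browser version.
def browser_key(site_id, month, browser, version)
  ['browsers', site_id, month, browser, version].join(':')
end

store = CounterStore.new
key = browser_key('site1', '2011-06', 'safari', '5-0')
3.times { store.incr(key) }
store.get(key) # 3&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The trade-off compared to the single Mongo document is that showing a month of browser data means fetching several keys instead of one document.&lt;/p&gt;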
&lt;p&gt;Building reports on the fly through incrementing counters means:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;less storage, as you do not need the raw data&lt;/li&gt;
	&lt;li&gt;less &lt;span class="caps"&gt;RAM&lt;/span&gt;, as there are fewer secondary indexes&lt;/li&gt;
	&lt;li&gt;real-time querying is no problem, as you do not need to generate reports, the data is the report&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It definitely involves more thought up front, but several areas of Gaug.es use this pattern and it is working great. I should also note that it increases the number of writes. Creating the reports on the fly means 7 or 8 writes for each &amp;#8220;view&amp;#8221; instead of 1.&lt;/p&gt;
&lt;p&gt;The trade-off is that reading the data is faster and avoids the lag caused by having to post-process it. I can see a day in the future where having all these writes will force me to find a different solution, but that is a ways off.&lt;/p&gt;
&lt;p&gt;What do you do when you cannot limit the number of variations? I&amp;#8217;ll leave that for next time.&lt;/p&gt;
&lt;p&gt;Oh, and if you have not signed up for &lt;a href="http://gaug.es"&gt;Gaug.es&lt;/a&gt; yet, what are you waiting on? Do it!&lt;/p&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2011/06/28/counters-everywhere/</feedburner:origLink></entry>
  
    <entry>
      <title>EventMachine and Passenger</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/L1JuNHUbK4Q/" />
      <id>4dc16344dabe9d38ab0001c5</id>
      <updated>2011-05-04T11:20:00-04:00</updated>
      <published>2011-05-04T10:31:00-04:00</published>
      <category term="eventmachine" /><category term="passenger" /><category term="pusher" />
      <summary type="html">&lt;p&gt;In which I share how to use EventMachine along side Passenger.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;In order to fully explain this post, we first need to cover some back story. Originally, &lt;a href="http://gaug.es"&gt;Gaug.es&lt;/a&gt; was hosted on &lt;a href="http://heroku.com"&gt;Heroku&lt;/a&gt;. Recently, we moved Gaug.es to &lt;a href="http://railsmachine.com"&gt;RailsMachine&lt;/a&gt; (before the great &lt;span class="caps"&gt;AWS&lt;/span&gt; outage luckily), where we are already happily hosting &lt;a href="http://get.harmonyapp.com"&gt;Harmony&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;At Heroku, we were running on 1.9.2 and &lt;a href="http://code.macournoyer.com/thin/"&gt;thin&lt;/a&gt;. The most common RailsMachine stack is &lt;span class="caps"&gt;REE&lt;/span&gt; 1.8 and Passenger. Sticking with the common stack meant it would be a far easier and faster transition to RailsMachine, so we tweaked a few things and switched.&lt;/p&gt;
&lt;h2&gt;Heroku, Thin, and EventMachine&lt;/h2&gt;
&lt;p&gt;While at Heroku, we had been testing using &lt;a href="http://pusher.com/"&gt;PusherApp&lt;/a&gt; for live updating of analytics as they occurred. The pusher gem has two ways to trigger notifications, trigger (net/http) and trigger_async (em-http-request).&lt;/p&gt;
&lt;p&gt;Since Heroku runs on thin, we used trigger_async. This meant that sending the PusherApp notifications in the request cycle was fine, as they did not block.&lt;/p&gt;
&lt;p&gt;One of the changes when moving to RailsMachine was switching from trigger_async to trigger. Obviously, &lt;strong&gt;having an external &lt;span class="caps"&gt;HTTP&lt;/span&gt; request in your request path is less than ideal&lt;/strong&gt;, but backgrounding it seemed to go against the whole idea of &amp;#8220;live&amp;#8221;.&lt;/p&gt;
&lt;p&gt;Our response times for Gaug.es average around 5ms, so even with 75-100ms for each PusherApp request, we were still in a normally acceptable response time range (not ok with me, but ok for now).&lt;/p&gt;
&lt;h2&gt;Pusher Conversation&lt;/h2&gt;
&lt;p&gt;I contacted the fine folks at Pusher and asked if they had any suggestions. One suggestion Martyn mentioned was &lt;code&gt;Thread.new { EM.run }&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Given my lack of experience with threads and EventMachine, this at first struck me as dirty/scary and I was not sure if he was serious.&lt;/p&gt;
&lt;p&gt;I did a bit of research and discovered he was not only serious, but people were doing it. The &lt;span class="caps"&gt;AMQP&lt;/span&gt; gem even recommends it in the Readme.&lt;/p&gt;
&lt;h2&gt;Hmmm, This Might Actually Work&lt;/h2&gt;
&lt;p&gt;After a bit of googling and scouring code on GitHub, I found a few different solutions. I started hacking and got something that was &amp;#8220;working&amp;#8221; pretty quickly. Quite intrigued, &lt;strong&gt;I decided to hit up someone smarter than I&lt;/strong&gt;, Aman Gupta, who maintains the EventMachine and &lt;span class="caps"&gt;AMQP&lt;/span&gt; gems.&lt;/p&gt;
&lt;p&gt;He confirmed that it would work and recommended a few tweaks. Yesterday, I pushed it to production and thus far it is working great. Below is the code needed to make the magic happen.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;module GaugesEM
  def self.start
    if defined?(PhusionPassenger)
      PhusionPassenger.on_event(:starting_worker_process) do |forked|
        if forked &amp;amp;&amp;amp; EM.reactor_running?
          EM.stop
        end
        Thread.new { EM.run }
        die_gracefully_on_signal
      end
    end
  end

  def self.die_gracefully_on_signal
    Signal.trap("INT")  { EM.stop }
    Signal.trap("TERM") { EM.stop }
  end
end

GaugesEM.start&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Gaug.es is 100% &lt;a href="http://www.sinatrarb.com/"&gt;Sinatra&lt;/a&gt;, so I just put this in the file in Gaug.es that works similarly to how environment.rb or an initializer would in Rails.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;There are two key parts&lt;/strong&gt;. First, if we are running on Passenger and using smart spawning, we need to stop the event machine if it is started. Second, we create a new thread and start the event machine loop.&lt;/p&gt;
&lt;p&gt;Now, in the Notification class that we have in Gaug.es, I can do the following to make the Pusher request not block the main request.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;EM.next_tick {
  Pusher[channel].trigger_async('hit', doc)
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The main request carries on as usual and does not wait for the Pusher request to finish. In the background, EventMachine is sending all these notifications. Once again, even on Passenger, &lt;strong&gt;we now have non-blocking pusher notifications&lt;/strong&gt;.&lt;/p&gt;
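&lt;p&gt;If the idea of handing work to a loop running in another thread is new to you, here is a standard library analogy. To be clear, this is not EventMachine and not what runs in Gaug.es; &lt;code&gt;BackgroundRunner&lt;/code&gt; is a made-up class that only mimics the hand-off: the caller queues a block and moves on, and a background thread runs it later.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;require 'thread'

# Stdlib analogy only: a background thread drains a queue of jobs,
# roughly the way the reactor thread runs blocks handed to EM.next_tick.
class BackgroundRunner
  def initialize
    @queue = Queue.new
    @thread = Thread.new do
      # A nil job is the signal to stop.
      while (job = @queue.pop)
        job.call
      end
    end
  end

  # Returns immediately; the job runs later on the background thread.
  def schedule(job)
    @queue.push(job)
    nil
  end

  # Lets queued jobs finish, then shuts the thread down.
  def stop
    @queue.push(nil)
    @thread.join
  end
end

runner = BackgroundRunner.new
sent = []
runner.schedule(proc { sent.push('hit') })
runner.stop
sent # ["hit"]&lt;/code&gt;&lt;/pre&gt;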
&lt;h2&gt;Hmm, This Does Work&lt;/h2&gt;
&lt;p&gt;Since it took me a bit to figure it out, I thought I would post it here for everyone to benefit from and maybe to start some discussion. If you have suggestions or see glaring issues, please let me know.&lt;/p&gt;
&lt;p&gt;I have no assumptions that I am wise or that this is perfect, but thus far it is getting the job done with no adverse effects.&lt;/p&gt;
&lt;h2&gt;Misleading Graph of Proof&lt;/h2&gt;
&lt;p&gt;Below is a graph of response times for Gaug.es thanks to New Relic. Seriously, where would we be without New Relic! Green is the time spent in external requests. I am sure you can tell at which point I pushed out the event machine integration.&lt;/p&gt;
&lt;p&gt;That said, don&amp;#8217;t think that all that time is instantly gone. It is still happening, just in a thread in the background without much effect, if any, on our normal response times.&lt;/p&gt;
&lt;p&gt;&lt;a href="/assets/4dc16ab2dabe9d435b00008b/gauges_event_machine.png"&gt;&lt;img src="/assets/4dc16ab2dabe9d435b00008b/article_full/gauges_event_machine.png" class="image full" alt="" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Demo, Plz!&lt;/h2&gt;
&lt;p&gt;If you are curious about what the live updating looks like currently in Gaug.es, I &lt;a href="http://www.screenr.com/xqo"&gt;posted a short video&lt;/a&gt; a few weeks back.&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.screenr.com/xqo"&gt;&lt;img src="/assets/4dc169c0dabe9d435b000061/article_full/screenr_gauges_live_video.png" class="image full" alt="" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Also, if you too are &lt;a href="http://railstips.org/blog/archives/2011/03/21/hi-my-name-is-john/"&gt;addicted to analytics&lt;/a&gt;, you should definitely &lt;a href="http://gaug.es/signup"&gt;sign up and try it out&lt;/a&gt;. Lots of good stuff coming down the pipe!&lt;/p&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2011/05/04/eventmachine-and-passenger/</feedburner:origLink></entry>
  
    <entry>
      <title>SSH Tunneling in Ruby</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/sHuYo-ioosc/" />
      <id>4dad9e05dabe9d43ed00028a</id>
      <updated>2011-04-19T10:42:44-04:00</updated>
      <published>2011-04-19T10:36:00-04:00</published>
      <category term="ruby" /><category term="ssh" />
      <summary type="html">&lt;p&gt;In which I show how to create and use an ssh tunnel with only Ruby.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;The other day I wanted to do some queries in production, but our servers are pretty locked down to the outside world. I was well aware that I could just make an ssh tunnel to connect to the database server, but I decided I wanted to do it in Ruby.&lt;/p&gt;
&lt;p&gt;I am not the brightest of crayons in the box, so it took me a bit. Since I struggled with it for a few, I figured others probably will someday as well and decided to post my solution here.&lt;/p&gt;
&lt;p&gt;Obviously, replace the strings wrapped in &amp;lt;&amp;#8230;&amp;gt; with your own information and change the host and port information in the gateway.open call.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;require 'net/ssh/gateway'
require 'mongo'

gateway = Net::SSH::Gateway.new('&amp;lt;myremotehostorip.com&amp;gt;', '&amp;lt;remote_user&amp;gt;')

# Open port 27018 to forward to 127.0.0.1:27017
# on the remote host provided above
gateway.open('127.0.0.1', 27017, 27018)

# Connect to local port set in previous statement
conn = Mongo::Connection.new('127.0.0.1', 27018)

# Just printing out stats to show that it works
puts conn.db('&amp;lt;database_name&amp;gt;').stats.inspect

gateway.shutdown!&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With just a few lines of Ruby, I can make scripts that use my local ssh key to talk to production. Thanks go to &lt;a href="http://weblog.jamisbuck.org/"&gt;Jamis Buck&lt;/a&gt; for all the heavy lifting of writing net-ssh and company.&lt;/p&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2011/04/19/ssh-tunneling-in-ruby/</feedburner:origLink></entry>
  
    <entry>
      <title>Hi My Name is John...</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/PZROJ_dVp2U/" />
      <id>4d87bab1dabe9d207c000011</id>
      <updated>2011-03-22T09:16:04-04:00</updated>
      <published>2011-03-21T17:36:00-04:00</published>
      <category term="analytics" /><category term="gauges" /><category term="statsd" />
      <summary type="html">&lt;p&gt;In which I share my addition to analytics and show you how to get your fix.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;&amp;#8230;and I am addicted to analytics. It all started when I was a wee lad. I quite enjoyed playing Tecmo &lt;span class="caps"&gt;NBA&lt;/span&gt; Basketball, among other games. One day, while rocking the house with Shawn Kemp and the Seattle Supersonics, I noticed that Tecmo &lt;span class="caps"&gt;NBA&lt;/span&gt; basketball did not seem to be correctly recording rebounds.&lt;/p&gt;
&lt;p&gt;Obviously, this kind of egregious error was unacceptable. With pad and paper, I began to keep track of rebounds on my own. After each rebound, I would record the stat for the player grabbing it. &lt;strong&gt;Yes, I actually paused game play&lt;/strong&gt; so that I could have correct analytics on rebounds.&lt;/p&gt;
&lt;h2&gt;The Joys of Blogging&lt;/h2&gt;
&lt;p&gt;Anyway, fast forward to 2011 where I now operate as a programmer. I could tell you that I grew out of that phase in my life, but alas I have not. From &lt;a href="http://www.shauninman.com/archive/shortstat"&gt;Shortstat&lt;/a&gt;, to &lt;a href="http://haveamint.com/"&gt;Mint&lt;/a&gt;, and now on to &lt;a href="http://gaug.es"&gt;Gaug.es&lt;/a&gt;, I have maintained quite a fascination with analytics.&lt;/p&gt;
&lt;p&gt;If I am being completely honest, one of the main reasons I blog is to see the views come in after a new post. And oh the joys when it lands on Reddit or HN and brings me people in excess (and lame comments covering how stupid I am).&lt;/p&gt;
&lt;h2&gt;Graphite and Statsd&lt;/h2&gt;
&lt;p&gt;The great thing is that on top of websites, I now help maintain several applications. Applications are a fun and tricky beast full of opportunities to record metrics. Most of the time though, these metrics go unrecorded because it is too much work to store and maintain them.&lt;/p&gt;
&lt;p&gt;After reading &lt;a href="http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/"&gt;measuring anything and everything&lt;/a&gt;  by the fine folks at Etsy, I decided it was time to get dirty. I spent a few hours this weekend setting up &lt;a href="http://graphite.wikidot.com/"&gt;Graphite&lt;/a&gt; and &lt;a href="https://github.com/etsy/statsd"&gt;statsd&lt;/a&gt; on a small &lt;span class="caps"&gt;VPS&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;Graphite is &amp;#8220;enterprise scalable realtime graphing&amp;#8221; and statsd, built by Etsy, is a &amp;#8220;network daemon for aggregating statistics, rolling them up, then sending them to Graphite&amp;#8221;.&lt;/p&gt;
&lt;p&gt;Stealing &lt;a href="https://gist.github.com/862471"&gt;pieces of a gist&lt;/a&gt;, I fumbled my way through, and with a little help from &lt;a href="http://metaatem.net/"&gt;Kastner&lt;/a&gt;, I was good to go.&lt;/p&gt;
&lt;h2&gt;&lt;span class="caps"&gt;UDP&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;Once I was past the &amp;#8220;I feel stupid because I have never really set up Python or node.js apps before&amp;#8221; stage, it was time to start sending my setup some data. statsd speaks &lt;a href="http://en.wikipedia.org/wiki/User_Datagram_Protocol"&gt;&lt;span class="caps"&gt;UDP&lt;/span&gt;&lt;/a&gt;, which I have certainly heard about, but never before actually looked into.&lt;/p&gt;
&lt;p&gt;&lt;span class="caps"&gt;UDP&lt;/span&gt; is an unreliable, unordered, lightweight protocol for slinging messages around the interwebs. The best way to think of it for those that are unfamiliar is fire and forget. The huge upside of &lt;span class="caps"&gt;UDP&lt;/span&gt; for analytics is that the effect of sprinkling it all over your app is minimal.&lt;/p&gt;
&lt;p&gt;You lose a millisecond constructing and sending the message, but if statsd ever goes down, your app does not. You simply lose statistics until it comes back up. Let&amp;#8217;s look at a simple example.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;require 'socket'
socket = UDPSocket.new
socket.send('some message', 0, '127.0.0.1', 33333)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Go ahead and run that. Notice how it doesn&amp;#8217;t error? No, it does not magically spin up something in the background. It is fire and forget. The message is sent, but whether or not it makes it to its destination does not matter. Most of the time it will, sometimes it won&amp;#8217;t.&lt;/p&gt;
&lt;p&gt;I read somewhere that &lt;span class="caps"&gt;TCP&lt;/span&gt; is like a phone call and &lt;span class="caps"&gt;UDP&lt;/span&gt; is like a letter in the mail. Good analogy.&lt;/p&gt;
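&lt;p&gt;To see the other half of that, here is a small sketch where something actually is listening. It binds a listener to a port picked by the OS, sends one datagram to it over loopback, and returns whatever arrived. The &lt;code&gt;udp_round_trip&lt;/code&gt; name is made up for this example.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;require 'socket'

# Sends one datagram to a listener on localhost and returns what the
# listener received, or nil if nothing shows up within a second.
def udp_round_trip(text)
  receiver = UDPSocket.new
  receiver.bind('127.0.0.1', 0) # port 0 lets the OS pick a free port
  port = receiver.addr[1]

  sender = UDPSocket.new
  sender.send(text, 0, '127.0.0.1', port)

  # UDP makes no delivery promise, so wait with a timeout, not forever.
  ready = IO.select([receiver], nil, nil, 1)
  ready ? receiver.recvfrom(1024).first : nil
ensure
  receiver.close if receiver
  sender.close if sender
end

puts udp_round_trip('some message').inspect&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that the sending side is identical to the earlier example. The sender never finds out whether anyone was listening; only the receiver knows.&lt;/p&gt;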
&lt;h2&gt;Statsd from Ruby&lt;/h2&gt;
&lt;p&gt;I started to work on a &lt;span class="caps"&gt;UDP&lt;/span&gt; client for statsd and then realized I should probably check GitHub before getting too far in. Thankfully, Rein already had a nice little &lt;a href="https://github.com/reinh/statsd"&gt;statsd library&lt;/a&gt; created.&lt;/p&gt;
&lt;p&gt;I felt like it was missing a few things, so I forked it and added a &lt;a href="https://github.com/reinh/statsd/pull/2"&gt;time method&lt;/a&gt; that works with blocks and &lt;a href="https://github.com/reinh/statsd/pull/3"&gt;namespacing&lt;/a&gt; (so I could track multiple apps from the same graphite/statsd install). I have already talked with him and he plans on pulling both. Until then, you can check out the &lt;a href="https://github.com/jnunemaker/statsd/tree/mine"&gt;mine branch&lt;/a&gt; on my fork.&lt;/p&gt;
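&lt;p&gt;To give a feel for those two additions, here is a stripped-down sketch. It is not the code from the pull requests or the &lt;span class="caps"&gt;API&lt;/span&gt; of Rein&amp;#8217;s library; &lt;code&gt;TinyStats&lt;/code&gt; is a made-up class that collects messages in an array instead of sending them over &lt;span class="caps"&gt;UDP&lt;/span&gt;, so it can be run anywhere.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;# Sketch of a block timer and namespacing; not the API of Rein's library.
class TinyStats
  attr_reader :sent

  def initialize(namespace)
    @namespace = namespace
    @sent = []
  end

  # Runs the block, records how long it took in milliseconds, and
  # returns whatever the block returned.
  def time(stat)
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
    timing(stat, (elapsed * 1000).round)
    result
  end

  def timing(stat, ms)
    # statsd timers use the "name:value|ms" line format. The namespace
    # prefix keeps stats from different apps apart.
    @sent.push("#{@namespace}.#{stat}:#{ms}|ms")
  end
end

stats = TinyStats.new('gauges')
stats.time('routes.track') { sleep 0.01 }
puts stats.sent.first&lt;/code&gt;&lt;/pre&gt;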
&lt;p&gt;Now that I had the server side set up and was armed with a client library, I started to think about what kind of stats I would like to add to &lt;a href="http://gaug.es"&gt;Gaug.es&lt;/a&gt;. The first thing I could think of was recording each track. I already store an all time number in Mongo, but minute/hour/day data could not hurt.&lt;/p&gt;
&lt;p&gt;I created a tiny wrapper around Rein&amp;#8217;s library so things would only be tracked in staging and production. I certainly could do this other ways, and probably will, but it worked well enough to get things out the door.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;class Stats
  cattr_accessor :client

  def self.record_stats?
    Gauges.environment == 'staging' || Gauges.environment == 'production'
  end

  def self.increment(*args)
    client.increment(*args) if record_stats?
  end

  def self.decrement(*args)
    client.decrement(*args) if record_stats?
  end

  def self.timing(*args)
    client.timing(*args) if record_stats?
  end
end

Stats.client = Statsd.new(ipaddr, port)
Stats.client.namespace = 'gauges'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Using this, I added an increment to the track route &lt;code&gt;Stats.increment('routes.track')&lt;/code&gt;, deployed, and instantly had graphs to play with. Below is tracks per second since last night when I first added the tracking.&lt;/p&gt;
&lt;p&gt;&lt;img src="/assets/4d87c9f3dabe9d2dd6000038/article_full/gauges_tracks.png" class="full image" alt="" /&gt;&lt;/p&gt;
&lt;h2&gt;Fun Use Case&lt;/h2&gt;
&lt;p&gt;In Gaug.es, about 75% of the storage is in the contents collection. This collection tracks the views, titles and paths for each site. I was curious what was taking up more space, titles or paths.&lt;/p&gt;
&lt;p&gt;Abusing the timing method in statsd, I was able to send the length of the path and title for each piece of content as it was tracked and then get a nice graph of the lower, upper, mean, and upper_90 (90th percentile) values.&lt;/p&gt;
&lt;p&gt;&lt;img src="/assets/4d87cbdcdabe9d313d00000e/article_full/content_paths_before.png" class="full image" alt="" /&gt;&lt;/p&gt;
&lt;p&gt;I noticed right away that some pieces of content were over 600 characters long. This seemed odd, so I started logging the offending pieces of content. I tailed the log for a while and saw that it was Facebook&amp;#8217;s fault. :)&lt;/p&gt;
&lt;p&gt;For some reason sites using Facebook&amp;#8217;s &amp;#8220;like&amp;#8221; tools end up getting a query string parameter named fbc_channel, which has a value that is hundreds of characters of &lt;span class="caps"&gt;JSON&lt;/span&gt;. Awesome.&lt;/p&gt;
&lt;p&gt;I created a test case out of the misbehaving content, stripping the fbc_channel param, and deployed a fix. Based on the graph below it is obvious when I pushed out the change.&lt;/p&gt;
&lt;p&gt;&lt;img src="/assets/4d87cd67dabe9d316000001c/article_full/content_paths_after.png" class="full image" alt="" /&gt;&lt;/p&gt;
&lt;p&gt;From adding the analytics, to detection, to deploying a fix, only a few minutes flew by. Note that previously I would not have even tracked content path length. I would have never discovered the issue and the sites that had this going on would have continued to have jacked up stats, probably never mentioning it to me.&lt;/p&gt;
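&lt;p&gt;For the curious, stripping a parameter like fbc_channel from a path can be done with nothing but the standard library. This is a sketch and not the fix that was deployed; the &lt;code&gt;strip_fbc_channel&lt;/code&gt; name is hypothetical.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;require 'uri'

# Hypothetical cleanup: drop the fbc_channel query parameter from a path.
def strip_fbc_channel(path)
  base, query = path.split('?', 2)
  return path unless query

  kept = URI.decode_www_form(query).reject { |name, _value| name == 'fbc_channel' }
  kept.empty? ? base : "#{base}?#{URI.encode_www_form(kept)}"
end

query = URI.encode_www_form([['page', '2'], ['fbc_channel', '{"a":1}']])
strip_fbc_channel("/posts?#{query}") # "/posts?page=2"
strip_fbc_channel('/about')          # "/about"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;One thing to keep in mind is that the remaining parameters get decoded and re-encoded, so their escaping may come out normalized compared to the original path.&lt;/p&gt;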
&lt;h2&gt;You have no excuse&lt;/h2&gt;
&lt;p&gt;I spent a few hours getting things running, but oh the joy I have now. Set up a small &lt;span class="caps"&gt;VPS&lt;/span&gt; or an EC2 micro instance. Install graphite and statsd. Never again wonder. Graph all your theories and improve your apps. That is all for now, I have more metrics to track!&lt;/p&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2011/03/21/hi-my-name-is-john/</feedburner:origLink></entry>
  
    <entry>
      <title>Give Yourself Constraints</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/UM7Z2exncy8/" />
      <id>4d5b0d78dabe9d1d76000031</id>
      <updated>2011-02-21T10:11:51-05:00</updated>
      <published>2011-02-20T23:02:00-05:00</published>
      <category term="applications" /><category term="gauges" /><category term="thoughts" />
      <summary type="html">&lt;p&gt;In which I ramble for a bit about a recent project and how the constraints I put on myself helped make it happen.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;Recently, I had a hernia and surgery to fix it. This knocked me out of the game and onto the couch for a couple weeks. During my recovery, I had a lot of time to think. I also had a lot of time &lt;strong&gt;to miss what I do every day&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;This was the longest period in several years for me without creating. Once I felt good enough to get back at it, even if only for a few hours, I decided to focus all this pent up energy on something new.&lt;/p&gt;
&lt;p&gt;What I wanted to do was &lt;strong&gt;to think through a problem differently than I ever had&lt;/strong&gt;. I have been creating applications pretty much the same way for quite some time. Sure, MongoDB changed my methods a bit, but I knew I had not used it to its full potential, as I typically start all new Mongo projects with MongoMapper.&lt;/p&gt;
&lt;h2&gt;What to Build&lt;/h2&gt;
&lt;p&gt;First, I thought about what to build. I have a plethora of &amp;#8220;someday&amp;#8221; ideas that have never made it out of that stage. One of those ideas was to build a &lt;strong&gt;simple analytics program&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Sure, there are a crap ton of analytics apps out there, including Google Analytics, but not a one thus far has hit the sweet spot I am looking for.&lt;/p&gt;
&lt;h2&gt;The Old Constraint&lt;/h2&gt;
&lt;p&gt;Back in the day, I would entertain every whim I had. &lt;strong&gt;This is great for learning&lt;/strong&gt; a lot of new things, but I never really focused and finished anything. What I had was a project directory full of half (or less) finished ideas.&lt;/p&gt;
&lt;p&gt;When I actually forced myself to work on only a project or two (I chose MongoMapper and &lt;a href="http://get.harmonyapp.com"&gt;Harmony&lt;/a&gt;), I noticed that &lt;strong&gt;I actually finished things and had something to show for myself&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: The constraint of what I could work on made me more productive than working on whatever I was inspired to work on.&lt;/p&gt;
&lt;p&gt;That said, rules are made to be broken, right? Plus, this new project was not entirely misguided. &lt;a href="http://get.harmonyapp.com"&gt;Harmony&lt;/a&gt; manages websites. Websites need analytics. There is definitely a benefit to Harmony in building an analytics system that can be &lt;strong&gt;deeply integrated&lt;/strong&gt;, so I began work.&lt;/p&gt;
&lt;h2&gt;The New Constraints&lt;/h2&gt;
&lt;p&gt;Since I was bending the rules a bit, I decided to give myself different constraints this time. Rather than what I would work on, I focused on what tools I could use to do the work. The thought was that &lt;strong&gt;forcing myself to avoid my comfort tools would lead to thinking outside of my usual box&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The first constraint I made was that I could only use the Mongo Ruby driver. No ActiveRecord, MongoMapper or any other object mapper.&lt;/p&gt;
&lt;p&gt;Second, no aggregate querying for reports. All of the analytics had to be built on the fly as each hit came in. I told myself I could not even store the individual hit data that was generated each time a page was tracked.&lt;/p&gt;
&lt;p&gt;Third, no signup or authentication. I wanted to focus on the core functionality (tracking views) instead of spending my time authenticating users and all that crap.&lt;/p&gt;
&lt;p&gt;Fourth, I cannot remember, so let&amp;#8217;s move on.&lt;/p&gt;
&lt;h2&gt;The Prototype&lt;/h2&gt;
&lt;p&gt;Within a few hours, using Sinatra and the MongoDB Ruby driver, I had a little prototype working. Each hit was a single MongoDB operation, an &lt;a href="http://www.mongodb.org/display/DOCS/Updating#Updating-UpsertswithModifiers"&gt;upsert&lt;/a&gt; based on the host, with year, month, day, and hour information stored in nested hashes. The nested hashes were updated in the operation using &lt;a href="http://www.mongodb.org/display/DOCS/Updating#Updating-%24inc"&gt;$inc&lt;/a&gt;. It did not do much, but it was pretty cool.&lt;/p&gt;
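&lt;p&gt;To make the shape of that operation concrete, here is a sketch that only builds the selector and the &lt;code&gt;$inc&lt;/code&gt; modifier for one hit. The &lt;code&gt;views&lt;/code&gt; field, the dotted key layout and the &lt;code&gt;hit_update&lt;/code&gt; helper are assumptions for illustration, not the real Gauges schema. The two hashes would then be handed to the driver&amp;#8217;s update call with the upsert option turned on.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;# Build the selector and modifier for a single tracked hit.
# The "views" field and the dotted key layout are guesses for this sketch.
def hit_update(host, time)
  selector = { host: host }

  # Dot notation lets $inc reach into nested hashes: views.2011.2.20.23
  nested_key = [time.year, time.month, time.day, time.hour].join('.')
  increments = {}
  increments["views.#{nested_key}"] = 1

  modifier = {}
  modifier['$inc'] = increments
  [selector, modifier]
end

selector, modifier = hit_update('railstips.org', Time.utc(2011, 2, 20, 23))
p selector
p modifier&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because the document is matched on host and created if missing, the first hit for a site and the millionth hit are the same single operation.&lt;/p&gt;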
&lt;p&gt;I am aware, especially now, how simple that first prototype was, but it felt great to create again. Next, I threw the app up on &lt;a href="http://heroku.com"&gt;Heroku&lt;/a&gt; and &lt;a href="http://mongohq.com"&gt;MongoHQ&lt;/a&gt;, so I could try it out tracking this site you are reading.&lt;/p&gt;
&lt;p&gt;Both are free to try, so it cost me nothing to get something up that I and others could react to. I showed it to the rest of the &lt;a href="http://orderedlist.com"&gt;Ordered List&lt;/a&gt; team and everyone shared the excitement.&lt;/p&gt;
&lt;h2&gt;The Result&lt;/h2&gt;
&lt;p&gt;I am amazed at what we have come up with in such a short amount of time (4 weeks using occasional evenings/weekends). In fact, the constraint of time is why Gauges is where it is today. We were really productive in the hours we snatched to work on it, because they were few and far between.&lt;/p&gt;
&lt;p&gt;Steve came up with a great name, found a matching domain (which is a &lt;strong&gt;miracle&lt;/strong&gt; these days) and &lt;a href="http://gaug.es"&gt;Gaug.es&lt;/a&gt; was born.&lt;/p&gt;
&lt;h3&gt;Home Page&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://gaug.es"&gt;&lt;img src="/assets/4d61d9acdabe9d65b6000250/article_full/gauges_home.jpg" class="image full" alt="" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;Dashboard&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://gaug.es"&gt;&lt;img src="/assets/4d61dadbdabe9d5e73000647/article_full/gauges_dashboard.jpg" class="image full" alt="" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Holy crap did Steve (&lt;a href="http://twitter.com/orderedlist"&gt;@orderedlist&lt;/a&gt; on Twitter) knock the design out of the park. Dude has skills!&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The last 4 weeks working on Gauges have made me feel, at times, &lt;strong&gt;like a kid again&lt;/strong&gt;. It is almost like we were racing to the next fence post (sue me, I grew up on a farm). &lt;strong&gt;Sure, I broke the original constraints I set, but without them, I would never have started&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;To break constraint numero uno, I used &lt;a href="http://railstips.org/blog/archives/2011/01/27/data-modeling-in-performant-systems/"&gt;ToyStore&lt;/a&gt;, with my &lt;a href="https://github.com/jnunemaker/adapter-mongo"&gt;mongo adapter&lt;/a&gt;. Any service that involves user input needs validations and the like, and I was not in the mood to build them from scratch.&lt;/p&gt;
&lt;p&gt;I actually stuck with constraint number two. All of the tracking code uses upserts with MongoDB modifiers to create and update reports on the fly.&lt;/p&gt;
&lt;p&gt;Constraint number three also fell by the wayside. You cannot very well make money on a service that requires no signup (or at least it is increasingly difficult), so you do in fact have to sign up and in to use Gauges. That said, right now we are just testing things out, so we are controlling things with a code.&lt;/p&gt;
&lt;p&gt;Whether it be the tools that you use, the projects that you work on, or the people you work with, &lt;strong&gt;give yourself constraints&lt;/strong&gt;. Create something that you have always wanted. Get other people excited about it. Work cannot always be &amp;#8220;work&amp;#8221;.&lt;/p&gt;
&lt;p&gt;P.S. If Gauges is something you are interested in and you would like to be an early tester, &lt;a href="mailto:john@orderedlist.com"&gt;let me know&lt;/a&gt;. The main goals right now are that it is hosted (no setup or servers), that it is easy to share with other people, and that it shows how much traffic you get, where it came from and where it went.&lt;/p&gt;
&lt;p&gt;Plus, we have a few great ideas for down the road, once we get the basics ironed out. Thus far, Gauges is really scratching an itch I have had for a while and I am stoked.&lt;/p&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2011/02/20/give-yourself-constraints/</feedburner:origLink></entry>
  
    <entry>
      <title>Data Modeling in Performant Systems</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/Otb2BDlt_7A/" />
      <id>4d40d9b8dabe9d733b00001b</id>
      <updated>2011-02-21T22:16:07-05:00</updated>
      <published>2011-01-27T11:15:00-05:00</published>
      <category term="adapter" /><category term="gems" /><category term="toystore" />
      <summary type="html">&lt;p&gt;In which I talk about how to make things fast and a new project that lets you model that way.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;I have been working on Words With Friends, a high traffic app, for over six months. Talk about trial by fire. I never knew what scale was. Suffice it to say that I have learned a lot.&lt;/p&gt;
&lt;p&gt;Keeping an application performant is all about finding bottlenecks and fixing them. &lt;strong&gt;The problem is each bottleneck you fix leads to more usage and a new bottleneck&lt;/strong&gt;. It is a constant game of cat and mouse. Sometimes you are the cat and sometimes, well, you are not.&lt;/p&gt;
&lt;p&gt;Most of the time, the &lt;strong&gt;removal of those bottlenecks is about moving hot data to places that can serve it faster&lt;/strong&gt;. Disks are slow, memory is fast, enter more memcached.&lt;/p&gt;
&lt;p&gt;Over time, you work and work to move hot data into memory and simplify your data access to fit into memory. Key here, value there. Eventually, you get to a place where &lt;strong&gt;you have simplified how you access your data&lt;/strong&gt; into simple key/value lookups.&lt;/p&gt;
&lt;p&gt;Games get marshaled into a key named &lt;code&gt;"Game:#{id}"&lt;/code&gt;. Joins are simplified to selecting ids and caching the array of ids into a key such as &lt;code&gt;"User:#{id}:active_game_ids"&lt;/code&gt; or &lt;code&gt;"User:#{id}:over_game_ids"&lt;/code&gt;. In turn, those arrays are turned into objects by un-marshaling the contents of &lt;code&gt;"Game:#{id}"&lt;/code&gt;, etc.&lt;/p&gt;
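&lt;p&gt;A stripped down sketch of that access pattern is below. A plain Ruby hash stands in for memcached, and the &lt;code&gt;Game&lt;/code&gt; struct plus the &lt;code&gt;write_game&lt;/code&gt; and &lt;code&gt;active_games_for&lt;/code&gt; helpers are invented for illustration, so treat it as the idea rather than the production code.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;# A plain hash stands in for memcached in this sketch.
CACHE = {}
Game = Struct.new(:id, :name)

# Marshal the whole game into a single key.
def write_game(game)
  CACHE["Game:#{game.id}"] = Marshal.dump(game)
end

# The "join" is just an array of ids that gets turned back into objects.
def active_games_for(user_id)
  ids = CACHE.fetch("User:#{user_id}:active_game_ids", [])
  raws = ids.map { |id| CACHE["Game:#{id}"] }.compact
  raws.map { |raw| Marshal.load(raw) }
end

write_game(Game.new(1, 'first'))
write_game(Game.new(2, 'second'))
CACHE['User:7:active_game_ids'] = [1, 2]
p active_games_for(7).map { |game| game.name }&lt;/code&gt;&lt;/pre&gt;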
&lt;p&gt;&lt;strong&gt;Your data model morphs from highly relational to key/value&lt;/strong&gt; because key/value is fast and memcached can withstand a bruising.&lt;/p&gt;
&lt;p&gt;Do it once, and you know how to do it in the future. The problem is by the time you get to this data model, it is kind of bolted on/in to your app.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What if you could design it this way from the beginning?&lt;/strong&gt; What if you had no option but to think through your data model in keys and values? Need your data in two different ways? Put it in two different places, etc, etc.&lt;/p&gt;
&lt;p&gt;I have good news. &lt;strong&gt;Now you can&lt;/strong&gt;.&lt;/p&gt;
&lt;h2&gt;A Little History&lt;/h2&gt;
&lt;p&gt;Not long into my tenure with &lt;span class="caps"&gt;WWF&lt;/span&gt;, we were hitting a lot of walls and there was a lot of talk about NoSQL. Mongo? Membase? Cassandra? Riak?&lt;/p&gt;
&lt;p&gt;Which one will work best for the problem at hand? What if we could try them all really easily by just changing which place the data went to? What if we could try out more than one at once?&lt;/p&gt;
&lt;p&gt;I sat down one weekend and started thinking about the app and realized what I just talked about above. Along the way, our data access changed from relational to key lookups. This made me think about a hash.&lt;/p&gt;
&lt;p&gt;Hashes are so versatile, and yet, so constrained. &lt;strong&gt;Hashes are for reading, writing and deleting keys, just like key/value stores&lt;/strong&gt;. I did a bit of GitHub searching and stumbled across &lt;a href="https://github.com/wycats/moneta"&gt;moneta&lt;/a&gt;, by Yehuda Katz.&lt;/p&gt;
&lt;p&gt;Moneta immediately struck me as &lt;strong&gt;brilliant&lt;/strong&gt;. I was shocked there was no activity around it. If you only allow yourself to read, write and delete with the same &lt;span class="caps"&gt;API&lt;/span&gt;, you can make nearly any data store talk the correct language.&lt;/p&gt;
&lt;p&gt;I fiddled with it and forked it, but in the end, &lt;strong&gt;it was not quite what I was looking for&lt;/strong&gt;. I liken it to my first house. I like the house, but having lived in it for six years, I know exactly what I want out of my next house.&lt;/p&gt;
&lt;p&gt;The folks at Newtoy (now Zynga with Friends) had mentioned that they wanted to build their own object mapper and name it ToyStore&amp;#8212;such a great name.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;In a fit of inspiration&lt;/strong&gt; over the 4th of July weekend, I cranked out attributes and initialization, relying heavily on ActiveModel. It was really fun. I emailed the crew when the next work day came around and they were stoked.&lt;/p&gt;
&lt;p&gt;It began to occupy some of my work-related time and &lt;a href="http://twitter.com/#!/gdagley/"&gt;Geoffrey Dagley&lt;/a&gt; started helping me with it. Over the next few weeks, Geof and I hammered out validations, serialization, callbacks, dirty tracking, and much more.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Everything was built on the premise&lt;/strong&gt; that the only acceptable methods that could be used to read, write and delete data were read, write and delete.&lt;/p&gt;
&lt;h2&gt;Adapter: The Common Interface&lt;/h2&gt;
&lt;p&gt;Over time &lt;a href="http://opensoul.org"&gt;Brandon Keepers&lt;/a&gt; got involved and ToyStore started looking pretty legit. We switched from using Moneta as the base to something I whipped together in a few hours, &lt;a href="https://github.com/newtoy/adapter"&gt;Adapter&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Defining an adapter is as simple as telling it how the client reads, writes and deletes data. You also have to define a clear method for convenience and to stick close to the Ruby hash &lt;span class="caps"&gt;API&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The client can be anything that you want to have a unified interface&lt;/strong&gt;. For example, this is how you would create an adapter to store things in a ruby hash.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;Adapter.define(:memory) do
  def read(key)
    decode(client[key_for(key)])
  end

  def write(key, value)
    client[key_for(key)] = encode(value)
  end

  def delete(key)
    client.delete(key_for(key))
  end

  def clear
    client.clear
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;key_for&lt;/code&gt; ensures that most things can work as a key. &lt;code&gt;encode&lt;/code&gt; and &lt;code&gt;decode&lt;/code&gt; allow one to hook some kind of serialization in, whatever you fancy, be it Marshal, &lt;span class="caps"&gt;JSON&lt;/span&gt;, or whatever you can imagine.&lt;/p&gt;
&lt;p&gt;By defining those methods, we can now get an instance of this adapter and connect it to a client. In the example above, the client is just a plain ruby hash, but in other adapters, it could be an instance of Redis (&lt;a href="https://github.com/jnunemaker/adapter-redis"&gt;adapter&lt;/a&gt;), Memcached (&lt;a href="https://github.com/jnunemaker/adapter-memcached"&gt;adapter&lt;/a&gt;), or maybe a Riak bucket (&lt;a href="https://github.com/jnunemaker/adapter-riak"&gt;adapter&lt;/a&gt;).&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;adapter = Adapter[:memory].new({}) # sets {} to client
adapter.write('foo', 'bar')
adapter.read('foo') # 'bar'
adapter.delete('foo')
adapter.fetch('foo', 'bar') # returns bar and sets foo to bar

# [] and []= are aliased to read and write
adapter['foo'] = 'bar'
adapter['foo'] # 'bar'&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Adapters can also be defined using a block (like above), a module, or both (module included first, then block so you can override module with block).&lt;/p&gt;
&lt;p&gt;Adapters can also define atomic locking mechanisms; see the &lt;a href="https://github.com/jnunemaker/adapter-memcached/blob/master/lib/adapter/memcached.rb"&gt;memcached&lt;/a&gt; and &lt;a href="https://github.com/jnunemaker/adapter-redis/blob/master/lib/adapter/redis.rb"&gt;redis&lt;/a&gt; adapters for their locking implementations. The more opaque the object, the more you need to lock. Or, in the case of Riak, the adapter can handle &lt;a href="https://github.com/jnunemaker/adapter-riak/blob/master/lib/adapter/riak.rb"&gt;read conflicts&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;ToyStore: The Mapper Fixings on top of Adapter&lt;/h2&gt;
&lt;p&gt;Once you have secured how your data layer speaks the adapter interface you can use the real power, ToyStore.&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s say you want to store your users in redis. Create your class, include Toy::Store, and set it to store in redis.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;require 'toystore'
require 'adapter/redis'

class User
  include Toy::Store
  store :redis, Redis.new

  attribute :email, String
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;From there, you can go to town, defining attributes, validations, callbacks and more.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;require 'pp'
require 'toystore'
require 'adapter/redis'

class User
  include Toy::Store
  store :redis, Redis.new

  attribute :email, String
  validates_presence_of :email
  before_save :lower_case_email

private
  def lower_case_email
    self.email = email.downcase if email
  end
end

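user = User.new
pp user.valid? # false, because email is blank

user.email = 'John'
pp user.save # before_save downcases the email to 'john'

pp user
pp User.get(user.id) # reads the user back from the store

user.destroy
pp User.get(user.id) # the user is no longer in the store&lt;/code&gt;&lt;/pre&gt;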
&lt;p&gt;Change your mind? Decide that you do not want to use Redis? Fancy Riak? Simply change the store to use the riak adapter and you are rolling.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;require 'toystore'
require 'adapter/riak'

class User
  include Toy::Store
  store :riak, Riak::Client.new['users']

  attribute :email, String
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Boom. &lt;strong&gt;You just completely changed your data store in a couple lines of code&lt;/strong&gt;. Practical? Yes and no. Cool? Heck yeah.&lt;/p&gt;
&lt;p&gt;What all does Toy::Store come with out of the box? So glad you asked.&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;strong&gt;Attributes&lt;/strong&gt; &amp;#8211; attribute :name, String (or some other type). Attributes can be virtual, which works just like attr_accessor but with all the power of dirty tracking, serialization, etc. They can also be abbreviated, which means :first_name could be the method you use while the attribute is stored as :fn in the data store. Save those bytes! Default values are supported, and defaults can be procs.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Typecasting&lt;/strong&gt; &amp;#8211; Same type system as MongoMapper. One day they will share the exact same type system in its own gem, for now duplicated.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Callbacks&lt;/strong&gt; &amp;#8211; all the usual suspects.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Dirty Tracking&lt;/strong&gt; &amp;#8211; save, create, update, destroy&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Mass assignment security&lt;/strong&gt; &amp;#8211; attr_accessible and attr_protected&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Proper cloning&lt;/strong&gt;&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Lists&lt;/strong&gt; &amp;#8211; arrays of ids. If user has many games, user would have list :games which stores in game_ids key on user and works just like an association.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Embedded Lists&lt;/strong&gt; &amp;#8211; array of hashes. More consistent than MongoMapper, which will soon reap the benefits of the work on Toy Store embedded lists.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;References&lt;/strong&gt; &amp;#8211; think belongs_to by a different (better?) name. Post model could reference :creator, User to add creator_id key and relate creator to post.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Identity Map&lt;/strong&gt; &amp;#8211; On by default. Should be thread-safe.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Read/write through caching&lt;/strong&gt; &amp;#8211; If you specify a cache adapter (say memcached), ToyStore will write to memcached first and read from memcached first, populating the cache if it was not present.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Indexing&lt;/strong&gt; &amp;#8211; Need to do lookups by email? index :email and whenever a user is saved the user data is written to one key and the email is written as another key with a value of the user id.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Logging&lt;/strong&gt;&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Serialization&lt;/strong&gt; (&lt;span class="caps"&gt;XML&lt;/span&gt; and &lt;span class="caps"&gt;JSON&lt;/span&gt;)&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Validations&lt;/strong&gt;&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;Primary key factories&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It pretty much has you covered. Adapters for &lt;a href="https://github.com/jnunemaker/adapter-redis"&gt;redis&lt;/a&gt;, &lt;a href="https://github.com/jnunemaker/adapter-memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://github.com/jnunemaker/adapter-riak"&gt;riak&lt;/a&gt;, and &lt;a href="https://github.com/therealadam/adapter-cassandra"&gt;cassandra&lt;/a&gt; already exist. Expect a Mongo one soon. Have to make a few tweaks to adapter. Yep, even Mongo.&lt;/p&gt;
&lt;p&gt;What are other adapters that could be created? Membase? Just start with the memcached adapter and override &lt;code&gt;key_for&lt;/code&gt;. Git? File system? &lt;span class="caps"&gt;REST&lt;/span&gt;? MySQL?! I love it!&lt;/p&gt;
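&lt;p&gt;The indexing feature in the list above is a good example of thinking in keys and values. The sketch below shows the general idea with a plain hash as the store; it is not how ToyStore implements it, and the key names and the &lt;code&gt;save_user&lt;/code&gt; and &lt;code&gt;find_user_by_email&lt;/code&gt; helpers are hypothetical.&lt;/p&gt;
&lt;pre&gt;&lt;code class="ruby"&gt;require 'json'

# A plain hash stands in for any key/value store.
STORE = {}

# Write the user under one key and the email under another,
# where the email key simply points at the user id.
def save_user(id, attrs)
  STORE["User:#{id}"] = JSON.generate(attrs)
  STORE["User:email:#{attrs[:email]}"] = id
end

# Two key lookups: email to id, then id to user.
def find_user_by_email(email)
  id = STORE["User:email:#{email}"]
  return nil if id.nil?

  JSON.parse(STORE["User:#{id}"], symbolize_names: true)
end

save_user(1, { email: 'john@example.com', name: 'John' })
p find_user_by_email('john@example.com')&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Need the data a second way? Write a second key. A real version would also have to clean up the old email key when the email changes.&lt;/p&gt;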
&lt;h2&gt;The Future&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;The future is not picking a database and forcing all your data into it&lt;/strong&gt;. The future (heck, now even) is the right database for the job and your application may need several of them.&lt;/p&gt;
&lt;p&gt;All this said, in no way do I think ToyStore is going to take the world by storm. &lt;strong&gt;It is a different way to build applications&lt;/strong&gt;. This way comes with great power, but great confusion as well.&lt;/p&gt;
&lt;p&gt;Currently, each model is serialized &lt;strong&gt;into one key in the store&lt;/strong&gt;, based on how the adapter does encode/decode. Eventually, I would like to add the ability &lt;strong&gt;to store different attributes in different keys&lt;/strong&gt;. For example, maybe you want active_game_ids to be stored in a key by itself so you don&amp;#8217;t have to constantly save the entire user object.&lt;/p&gt;
&lt;p&gt;I can also see a use for &lt;strong&gt;being able to store an attribute not just in a different key, but in a different store entirely&lt;/strong&gt;. Store your user objects in Riak, but active_game_ids in a Redis set. This is where it would get &lt;strong&gt;really&lt;/strong&gt; powerful.&lt;/p&gt;
&lt;p&gt;At any rate, I am very excited about this project and I think it has a lot of potential. I would also like to add that &lt;strong&gt;MongoMapper is here to stay&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;In fact, &lt;strong&gt;I learned from my mistakes on MongoMapper when building ToyStore&lt;/strong&gt; and will be back-porting those learned experiences very soon. Expect a flurry of activity over the next little while.&lt;/p&gt;
&lt;h2&gt;Closing Thanks&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Huge thanks&lt;/strong&gt; to &lt;a href="http://newtoyinc.com/"&gt;Newtoy&lt;/a&gt; (now &lt;a href="http://newtoyinc.com/wp/blog/today-is-a-big-day-for-newtoy/"&gt;Zynga with Friends&lt;/a&gt;) for allowing Geof and me to open source this. Several pieces of ToyStore were built on their dime and I really appreciate their contribution to the Ruby and Rails community!&lt;/p&gt;
&lt;p&gt;As is typical with new projects, there are probably rough spots and good luck finding documentation. I have included a bevy of examples and the tests do a superb job at explaining the functionality of each method/feature.&lt;/p&gt;
&lt;p&gt;Let me know what your thoughts are and be sure to kick the tires!&lt;/p&gt;
&lt;h2&gt;Roundup of Links&lt;/h2&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;a href="https://github.com/newtoy/adapter"&gt;Adapter&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="https://github.com/newtoy/toystore"&gt;ToyStore&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="https://github.com/jnunemaker/adapter-redis"&gt;Redis Adapter&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="https://github.com/jnunemaker/adapter-memcached"&gt;Memcached Adapter&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="https://github.com/jnunemaker/adapter-riak"&gt;Riak Adapter&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href="https://github.com/therealadam/adapter-cassandra"&gt;Cassandra Adapter&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2011/01/27/data-modeling-in-performant-systems/</feedburner:origLink></entry>
  
    <entry>
      <title>Year In Review</title>
      <link href="http://feedproxy.google.com/~r/railstips/~3/peGrSOv-b5k/" />
      <id>4d1e189bdabe9d6407000005</id>
      <updated>2010-12-31T13:56:57-05:00</updated>
      <published>2010-12-31T13:05:00-05:00</published>
      <category term="gems" /><category term="harmony" /><category term="mongodb" />
      <summary type="html">&lt;p&gt;In which I recap a very good year.&lt;/p&gt;</summary>
      <content type="html">&lt;p&gt;One of the main reasons that I write is for reflection. Blogging gives me a history of what I was interested in and when. In 2008, I &lt;a href="http://railstips.org/blog/archives/2008/12/22/the-2008-smrgsbord/"&gt;posted the Smörgåsbord&lt;/a&gt;. I skipped 2009, for whatever reason, but 2010 will not suffer the same fate.&lt;/p&gt;
&lt;h2&gt;Bang!&lt;/h2&gt;
&lt;p&gt;The first post of the year was pretty huge for me. It was the first post using &lt;a href="http://get.harmonyapp.com"&gt;Harmony&lt;/a&gt;, the website management system that we are building at &lt;a href="http://orderedlist.com"&gt;Ordered List&lt;/a&gt;. We even released it for public consumption on &lt;a href="http://railstips.org/blog/archives/2010/07/23/august-3rd/"&gt;August 3rd&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Post number two was my most popular post of all time. It took off so fast that I could not even enjoy it. I immediately had to &lt;a href="http://railstips.org/blog/archives/2010/01/22/multiple-domain-page-caching/"&gt;add page caching&lt;/a&gt; to Harmony so it could handle the load.&lt;/p&gt;
&lt;p&gt;I Have No Talent had 25k views in the first two days, but that did not prepare me for day three. Day three ended with over 60k views and by the time the dust settled on day four, &lt;strong&gt;it was over 100k views&lt;/strong&gt;. Crazy!&lt;/p&gt;
&lt;p&gt;Below is a screenshot of views for the entire year. Note how most of the year looks like a flat line next to the no talent post and the other hit of the year, &lt;a href="http://railstips.org/blog/archives/2010/10/14/stop-googling/"&gt;Stop Googling&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="/assets/4d1e1b2bdabe9d6562000001/article_full/rt_traffic.jpg" alt="" /&gt;&lt;/p&gt;
&lt;h2&gt;Thoughts&lt;/h2&gt;
&lt;p&gt;Every year I try to do a few posts that are just thoughts I have been having or that someone else had and I respect.&lt;/p&gt;
&lt;p&gt;This year I posted on &lt;a href="http://railstips.org/blog/archives/2010/01/24/just-in-time-not-just-in-case/"&gt;Just in Time, Not Just in Case&lt;/a&gt;, &lt;a href="http://railstips.org/blog/archives/2010/01/26/correct-beautiful-fast-in-that-order/"&gt;Correct, Beautiful, Fast, In That Order&lt;/a&gt;, and &lt;a href="http://railstips.org/blog/archives/2010/12/24/improving-your-methods/"&gt;Improving Your Methods&lt;/a&gt;. Each of these posts is something I believe in really strongly.&lt;/p&gt;
&lt;h2&gt;Presentations&lt;/h2&gt;
&lt;p&gt;This was the year of speaking for me. There were a few stretches where I was prepping for a presentation every other week. That is definitely a bit hectic when each presentation is on a new and different topic.&lt;/p&gt;
&lt;p&gt;A few of the highlights were &lt;a href="http://railstips.org/blog/archives/2010/04/18/i-have-no-talent-redux/"&gt;keynoting the Great Lakes Ruby Bash&lt;/a&gt; on my lack of talent, &lt;a href="http://railstips.org/blog/archives/2010/05/10/mongosf-mongomapper-video/"&gt;talking about MongoMapper at MongoSF&lt;/a&gt;, and &lt;a href="http://railstips.org/blog/archives/2010/06/16/railsconf-2010/"&gt;teaching how to steal&lt;/a&gt; at RailsConf 2010.&lt;/p&gt;
&lt;p&gt;I even posted on how to &lt;a href="http://railstips.org/blog/archives/2010/05/05/improve-your-presentations-in-under-50/"&gt;Improve Your Presentations in Under $50&lt;/a&gt;. If you are presenting any time soon, definitely check that post out.&lt;/p&gt;
&lt;h2&gt;Projects&lt;/h2&gt;
&lt;p&gt;This was the year of the tiny project for me. I have worked on so many applications now that I am beginning to see a lot of patterns. Each time I repeated myself, I did my best to move the abstraction into its own gem and share it for all or none to use.&lt;/p&gt;
&lt;p&gt;This year, I released &lt;a href="http://railstips.org/blog/archives/2010/02/27/canable-the-flesh-eating-permission-system/"&gt;Canable&lt;/a&gt; to help with permissions, &lt;a href="http://railstips.org/blog/archives/2010/03/26/a-nunemaker-joint/"&gt;Joint&lt;/a&gt; to make file uploads with GridFS easy, &lt;a href="http://railstips.org/blog/archives/2010/03/30/because-gem-names-are-like-domains-in-the-90s/"&gt;Whois&lt;/a&gt; to aid in naming your project, &lt;a href="http://railstips.org/blog/archives/2010/07/15/caching-with-mongo/"&gt;Bin&lt;/a&gt; for caching in Mongo, &lt;a href="http://railstips.org/blog/archives/2010/06/16/mongomapper-08-goodies-galore/"&gt;Plucky&lt;/a&gt; for sexy Mongo querying, &lt;a href="http://railstips.org/blog/archives/2010/12/20/hunt-an-experiment-in-search/"&gt;Hunt&lt;/a&gt; for easy search with Mongo, and &lt;a href="http://railstips.org/blog/archives/2010/12/28/a-scam-i-say/"&gt;Scam&lt;/a&gt;, a simple enum/fake model helper.&lt;/p&gt;
&lt;p&gt;In addition to the aforementioned new projects, I released several updates to MongoMapper, including &lt;a href="http://railstips.org/blog/archives/2010/02/21/mongomapper-07-plugins/"&gt;plugins&lt;/a&gt;, &lt;a href="http://railstips.org/blog/archives/2010/02/21/mongomapper-07-identity-map/"&gt;identity map&lt;/a&gt;, and &lt;a href="http://railstips.org/blog/archives/2010/06/16/mongomapper-08-goodies-galore/"&gt;scopes&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I also tried to share some of what I learned building MongoMapper in &lt;a href="http://railstips.org/blog/archives/2010/07/19/creating-duplicable-objects/"&gt;Creating Duplicable Objects&lt;/a&gt;, &lt;a href="http://railstips.org/blog/archives/2010/08/29/building-an-object-mapper-override-able-accessors/"&gt;Over-ridable Accessors&lt;/a&gt;, and &lt;a href="http://railstips.org/blog/archives/2010/10/24/the-chain-gang/"&gt;The Chain Gang&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Numbers&lt;/h2&gt;
&lt;p&gt;Though I posted about half as many times as in 2009, I feel the quality went way up. In total, &lt;strong&gt;RailsTips garnered over 700,000 views in 2010&lt;/strong&gt;. It blows my mind that it has grown to that in three and a half years.&lt;/p&gt;
&lt;p&gt;Even more mind-blowing is how much I have grown in that time. From barely knowing Rails, to creating one of the more popular object mappers. From working at Notre Dame, to owning my own business.&lt;/p&gt;
&lt;p&gt;I feel that all of this really is &lt;strong&gt;a testament to what a lot of hard work can result in&lt;/strong&gt;. I cannot wait to see what 2011 brings.&lt;/p&gt;&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/railstips?a=peGrSOv-b5k:vVcZL9pivF0:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/railstips?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/railstips?a=peGrSOv-b5k:vVcZL9pivF0:dnMXMwOfBR0"&gt;&lt;img src="http://feeds.feedburner.com/~ff/railstips?d=dnMXMwOfBR0" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/railstips?a=peGrSOv-b5k:vVcZL9pivF0:7Q72WNTAKBA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/railstips?d=7Q72WNTAKBA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/railstips/~4/peGrSOv-b5k" height="1" width="1"/&gt;</content>
      <author>
        <name>John Nunemaker</name>
      </author>
    <feedburner:origLink>http://railstips.org/blog/archives/2010/12/31/year-in-review/</feedburner:origLink></entry>
  
</feed>

