<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" version="2.0"> <channel><title>SKORKS</title> <link>http://www.skorks.com</link> <description>For the betterment of the software craft...</description> <lastBuildDate>Wed, 23 Nov 2011 00:18:05 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3.1</generator> <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/softwaretechandmore" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="softwaretechandmore" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><feedburner:emailServiceId xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">softwaretechandmore</feedburner:emailServiceId><feedburner:feedburnerHostname xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">http://feedburner.google.com</feedburner:feedburnerHostname><item><title>What Is A Startup (Or A Startup Idea)?</title><link>http://www.skorks.com/2011/11/what-is-a-startup-or-a-startup-idea/</link> <comments>http://www.skorks.com/2011/11/what-is-a-startup-or-a-startup-idea/#comments</comments> <pubDate>Wed, 23 Nov 2011 00:18:05 +0000</pubDate> <dc:creator>Alan Skorkin</dc:creator> <category><![CDATA[startup]]></category> <category><![CDATA[startup assumptions]]></category> <category><![CDATA[startup idea]]></category> <guid isPermaLink="false">http://www.skorks.com/?p=1997</guid> <description><![CDATA[Is a startup all about the idea? Is it a bunch of people hacking together? It is not quite a company in the traditional sense of the word (at least not yet). I like to think of a startup as a series of assumptions. Your startup idea is just one assumption or alternatively there are [...]
No related posts.]]></description> <content:encoded><![CDATA[<p></p><p><img
align="left" alt="Pirate" class="alignleft size-medium wp-image-2001" height="199" hspace="25" src="http://www.skorks.com/wp-content/uploads/2011/11/pirate-300x199.jpg" title="Pirate" vspace="5" width="300" />Is a startup all about the idea? Is it a bunch of people hacking together? It is not quite a company in the traditional sense of the word (<em>at least not yet</em>). I like to think of a startup as a series of assumptions. Your startup idea is just one assumption or alternatively there are many assumptions inherent in your idea. Let&#39;s take a simple one we&#39;re all familiar with:</p><div><u><strong>&quot;People want to connect with their friends online&quot;</strong></u></div><div>&nbsp;</div><div>The main assumption is that people actually DO want to connect with their friends online (<em>of course we know now that they do, but a few years ago it wasn&#39;t so obvious</em>). Some of the inherent assumptions might be:</div><ul><li>People want to know what their friends are doing right now</li><li>People want to see interesting links from their friends</li><li>We have a way to acquire users in the initial stages</li><li>People will want to let their friends know about our service</li><li>People will want to pay for our service</li><li>etc.</li></ul><div>Your startup is really just a means to validate all those assumptions. If you can do it, then you may pronounce your startup a success (<em>it&#39;s not necessarily a success yet, but the rest is really just optimisation</em>), if you can&#39;t then it is a failure. Of course things are a bit more complex than that.&nbsp;</div><div>&nbsp;</div><div>If you follow the <a
href="http://theleanstartup.com/book" target="_blank">Lean Startup movement</a>&nbsp;at all, you&#39;re probably aware of <a
href="http://market-by-numbers.com/tag/pirate-metrics/" target="_blank">pirate metrics</a>&nbsp;(<em>the pirate says &#39;AARRR&#39;</em>). We have Acquisition, Activation, Retention, Referral, Revenue. <strong>The thing about those metrics is that they are a funnel </strong>and they are all about users. We start with lots of users in the Acquisition stage and we end up with less users in the Revenue stage.&nbsp;</div><div>&nbsp;</div><div
style="text-align: center; "><a
href="http://www.skorks.com/wp-content/uploads/2011/11/aarrr.png"><img
align="middle" alt="AARRR" class="aligncenter size-medium wp-image-2003" height="250" src="http://www.skorks.com/wp-content/uploads/2011/11/aarrr-265x300.png" title="AARRR" vspace="5" width="221" /></a></div><div>&nbsp;</div><div><strong>All the assumptions inherent in your idea are a funnel too</strong> (<em>think of them as a more specific version of the pirate funnel</em>) &#8211; you start with a bunch of potential users at the top and validating each of the assumptions will whittle away some of those users. Let&#39;s take this simplified example:</div><div>&nbsp;</div><div
style="text-align: center; "><img
align="middle" alt="Assumption funnel" class="aligncenter size-medium wp-image-2004" height="250" src="http://www.skorks.com/wp-content/uploads/2011/11/funnel-300x229.png" title="Assumption funnel" vspace="5" width="328" /></div><div>&nbsp;</div><div>The more people you can get through that funnel successfully the more sure you can be that you&#39;re onto a good thing, but if you end up with zero users before you get to the end of your funnel then you have a problem. If no-one wants to click the red button, then people are unlikely to ever tell their friends or upgrade their accounts. Your assumption is flawed and you need to adjust. Perhaps it&#39;s as simple as making the button green, or you may have to think of a way to achieve the same result without a button, or there might be a fundamental flaw in the model, in which case you might need to go back to square one, or pivot, or abandon the idea all together.&nbsp;</div><div>&nbsp;</div><div>Of course your assumption funnel is not strict, <strong>you can work on any assumption at any time, but you need to be fully aware of what you&#39;re doing</strong>. Firstly, it is easier to test the assumptions at the top of the funnel. You can figure out whether or not you can drive some initial traffic with nothing more than a dummy landing page, but to test retention you need to have built something. Secondly, there is &#39;<em>Danger, Will Robinson!</em>&#39; every time you work on one of the assumptions lower down in the funnel first. You may optimise your big red button with 3 of your mates (<em>the only users you have</em>), but if you can never get anyone to come to the site at all, what&#39;s the use? It can still all work out, especially if you&#39;re flush with money and have all the time in the world, but it&#39;s probably faster and easier to prove those assumptions in order.&nbsp;</div><div>&nbsp;</div><div>Incidentally this is very similar to the problem we ran into with <a
href="https://www.crowdhired.com/" target="_blank">CrowdHired</a>&nbsp;forcing us to essentially rebuild the site (<em>in a much slimmed down fashion</em>) and to try and bootstrap (<em>userbase-wise</em>) from the <a
href="http://yowaustralia.com.au/" target="_blank">YOW! developer conference</a>. Since we&#39;re not flush with time and money it was a costly lesson, but one well learned. &nbsp;</div><div>&nbsp;</div><div>The relevant concept is this, even without building anything you can easily figure out the key actions you and your potential users will have to take in order for your idea to work. You can then come up with a set of assumptions relating to each of those actions and arrange them into a funnel. <strong>You can then build the minimal amount of stuff to try and prove each assumption in turn</strong>. As you follow this process it can potentially give you some guidance regarding a number of things, such as user behaviour, where to take your idea next, when to pivot, when to get some more funding (<em>e.g. the only way to prove this next assumption is with 10 people working for a year :)</em>) etc.&nbsp;</div><div>&nbsp;</div><div>It goes without saying that depending on the type of startup your trying to build (<em>network effect, B2B, B2C, etc. &#8211; see what I did there, heh</em>), the mechanics will need adjustment. It also goes without saying that you need to validate all assumptions properly, asking 2 of your friends if they would click a button you just drew on a napkin is not an indicator of anything. In-fact this is one of the things I&#39;d like to explore further, but I guess I&#39;ll <a
href="http://feeds.feedburner.com/softwaretechandmore">leave it until next time</a>.</div><p><span
style="font-size: 10px; font-family: trebuchet ms">Image by <a
href="http://www.flickr.com/photos/spaceninja/1407558031/" target="_blank">spaceninja</a> </span></p><p>No related posts.</p><div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=EL-DlXYETEs:7ZiA7lj8xRw:I9og5sOYxJI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=I9og5sOYxJI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=EL-DlXYETEs:7ZiA7lj8xRw:cGdyc7Q-1BI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=cGdyc7Q-1BI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=EL-DlXYETEs:7ZiA7lj8xRw:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=EL-DlXYETEs:7ZiA7lj8xRw:gIN9vFwOqvQ" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=EL-DlXYETEs:7ZiA7lj8xRw:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=EL-DlXYETEs:7ZiA7lj8xRw:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=EL-DlXYETEs:7ZiA7lj8xRw:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=EL-DlXYETEs:7ZiA7lj8xRw:F7zBnMyn0Lo" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/softwaretechandmore/~4/EL-DlXYETEs" height="1" width="1"/>]]></content:encoded> <wfw:commentRss>http://www.skorks.com/2011/11/what-is-a-startup-or-a-startup-idea/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item><title>Why Developers Never Use State Machines</title><link>http://www.skorks.com/2011/09/why-developers-never-use-state-machines/</link> <comments>http://www.skorks.com/2011/09/why-developers-never-use-state-machines/#comments</comments> <pubDate>Thu, 01 Sep 2011 00:38:47 +0000</pubDate> <dc:creator>Alan Skorkin</dc:creator> <category><![CDATA[software development]]></category> <guid isPermaLink="false">http://www.skorks.com/?p=1990</guid> <description><![CDATA[A few months ago I saw a great little blog post about state machines on the Shopify blog. The message was that state machines are great and developers should use them more &#8211; given my recent experiences with state machines at CrowdHired, I could certainly agree with that. But it got me thinking, how many [...]
<strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2010/03/why-your-developers-dont-want-to-swap-pairs-practical-agility/' rel='bookmark' title='Why Your Developers Don&#8217;t Want To Swap Pairs &#8211; Practical Agility'>Why Your Developers Don&#8217;t Want To Swap Pairs &#8211; Practical Agility</a></li><li><a
href='http://www.skorks.com/2009/03/the-current-state-of-the-agile-nation-agile-process-adoption/' rel='bookmark' title='The Current State Of The Agile Nation &#8211; Agile Process Adoption'>The Current State Of The Agile Nation &#8211; Agile Process Adoption</a></li><li><a
href='http://www.skorks.com/2008/08/fitness-for-software-developers-and-other-it-professionals/' rel='bookmark' title='Fitness for Software Developers (and Other IT Professionals)'>Fitness for Software Developers (and Other IT Professionals)</a></li></ol>]]></description> <content:encoded><![CDATA[<p></p><p>A few months ago I saw a great little blog post about <a
href="http://www.shopify.com/technology/3383012-why-developers-should-be-force-fed-state-machines" target="_blank">state machines on the Shopify blog</a>. The message was that state machines are great and developers should use them more &#8211; given my recent experiences with state machines at <a
href="https://www.crowdhired.com/" target="_blank">CrowdHired</a>, I could certainly agree with that. But it got me thinking, how many times in my developer career have I actually used a state machine (<em>either separate library or even hand-rolled abstraction</em>)? The answer is zero times &#8211; which surprised the hell out of me since state machines really are very useful. So I decided to engage in a bit of introspection and figure out why we tend to manage our &quot;<em>state</em>&quot; and &quot;<em>status</em>&quot; fields in an ad-hoc fashion rather than doing what is clearly called for.</p><h2>We Don&#39;t Need One Until We Do</h2><p>The problem is that <strong>you almost never create an object fully formed with all the behaviour it is ever going to need</strong>, rather you build it up over time. The same is true for the &quot;<em>states</em>&quot; that a state machine candidate object can be in. So, early on you don&#39;t feel like your objects&#39; state machine behaviour is complex enough to warrant a &quot;<em>full-blown</em>&quot; state machine (<em><a
href="http://www.skorks.com/2009/08/does-yagni-mean-you-ignore-the-obvious/" target="_blank">YAGNI</a>&nbsp;and all that jazz</em>), but later on &#8211; when it IS complex enough &#8211; you feel like you&#39;ve invested too much time/effort to replace it with something that has equivalent functionality. It&#39;s a bit of a <a
href="http://en.wikipedia.org/wiki/Catch-22_(logic)" target="_blank">catch-22</a>. <strong>It&#39;s overkill and by the time it&#39;s not, it&#39;s too late</strong>.</p><h2>A State Machine Is A Fluffy Bunny (<em>Not Particularly Threatening</em>)</h2><p
style="text-align: center; "><img
align="middle" alt="Bunny" class="aligncenter size-medium wp-image-1992" height="240" src="http://www.skorks.com/wp-content/uploads/2011/09/bunny-300x240.jpg" title="Bunny" vspace="5" width="300" /></p><p>Those of us who went through computer science degrees remember state machines from our computing theory subjects and the memories are often not fond ones. There are complex diagrams and math notation, determinism and non-determinism, <a
href="http://en.wikipedia.org/wiki/Finite-state_machine#Transducers" target="_blank">Moore and Mealy</a>, as well as acronyms galore (<em>DFA, NFA, GNFA etc.</em>). We come to believe that state machines are more complex than they actually are and it is therefore nothing but pragmatism that makes us consider a &quot;<em>full-blown</em>&quot; state machine overkill.</p><p>But <strong>most state machines you&#39;re likely to need in your day-to-day development have nothing in common with their computing theory counterparts</strong> (<em>except the &#8230; errr &#8230; theory</em>). You have states which are strings, and events which are methods that cause transitions from one state to another &#8211; that&#39;s pretty much it (<em>at least for the <code>state_machine</code> gem in Ruby</em>). The point is, even if you have two states, a state machine is not overkill, it might be easier that rolling an ad-hoc solution, as long as you have a good library to lean on.</p><h2>Even A Good Tool Is Not A Good Tool</h2><p>I would hazard a guess that there are decent state machine libraries for most languages that you can use (<em>the aforementioned <code>state_machine</code> for Ruby is just one example</em>). But even a fluffy bunny has a learning curve (<em>I am stretching the metaphor well past breaking point here</em>). That wouldn&#39;t be such an issue if you were solving a problem, but all you&#39;re likely doing is replacing an existing solution. Since we tend to turn to a state machine library after the fact (<em>our ad-hoc solution is working right now</em>). Just <strong>like with everything that has &quot;<em>potential future benefits</em>&quot; the immediate value is very hard to justify even to yourself </strong>(<em>unless you&#39;ve had experience with it before</em>). The slight learning curve only tips the scale further towards the &quot;<em>we can live without it</em>&quot; side. It doesn&#39;t matter how good a tool is if you never give it a chance.</p><p>It is really difficult to appreciate (<em>until you&#39;ve gone through it</em>) &#8211; how much better life can be if you do give a good state machine library a chance. When we finally &quot;<em>bit the bullet</em>&quot; at <a
href="https://www.crowdhired.com/" target="_blank">CrowdHired</a>&nbsp;and rejigged some of our core objects to use the <code>state_machine</code> gem, the difference was immediately apparent.</p><ul><li>Firstly the learning curve was minor, I did spend a few hours of going through the source and documentation, but after that I had a good idea what could and couldn&#39;t be done (<em>I might do <a
href="http://feeds.feedburner.com/softwaretechandmore">an in-depth look at the <code>state_machine</code> gem at some point</a></em>).&nbsp;</li><li>The integration itself was almost painless, but moving all the code around to be inline with the new state machine was a big pain. In hindsight <strong>had we done this when our objects only had a couple of states it would have been a breeze</strong>.&nbsp;</li><li>We&#39;re now able to easily introduce more states to give our users extra information as well as allow us to track things to a finer grain. Before it was YAGNI cause it was a pain, now we find that we &quot;<em>ai gonna need</em>&quot; after all, cause it&#39;s so easy.</li><li>Our return values from state transitions are now 100% consistent (<code>true/false</code>). Before we were returning objects, arrays of objects, nil, true/false depending on who was writing it and when.&nbsp;</li><li>We&#39;re now able to keep an audit trail of our state transitions simply by dropping in <a
href="https://github.com/wvanbergen/state_machine-audit_trail" target="_blank">state_machine-audit_trail</a>&nbsp;(<em>see that Shopify post</em>), before it was too hard to hook it in everywhere so we had nothing.</li><li>We removed a bunch of code and improved our codebase &#8211; always worthy goals as far as I am concerned.</li></ul><p>My gut-feel is that most people who read <a
href="http://www.shopify.com/technology/3383012-why-developers-should-be-force-fed-state-machines" target="_blank">that Shopify post</a> agreed with it in spirit, but did nothing about it (t<em>hat&#39;s kinda how it was with me</em>). <strong>We seem to shy away from state machines due to misunderstanding of their complexity and/or an inability to quantify the benefits.</strong> But, there is less complexity than you would think and more benefits than you would expect as long you don&#39;t try to retrofit a state machine after the fact. So next time you have an object that even hints at having a &quot;<em>status</em>&quot; field, just chuck a state machine in there, you&#39;ll be glad you did. I guarantee it or your money back :).</p><p><span
style="font-size: 10px; font-family: trebuchet ms">Image by <a
href="http://www.flickr.com/photos/tfangel/3478405761/" target="_blank">tfangel</a> </span></p><p><strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2010/03/why-your-developers-dont-want-to-swap-pairs-practical-agility/' rel='bookmark' title='Why Your Developers Don&#8217;t Want To Swap Pairs &#8211; Practical Agility'>Why Your Developers Don&#8217;t Want To Swap Pairs &#8211; Practical Agility</a></li><li><a
href='http://www.skorks.com/2009/03/the-current-state-of-the-agile-nation-agile-process-adoption/' rel='bookmark' title='The Current State Of The Agile Nation &#8211; Agile Process Adoption'>The Current State Of The Agile Nation &#8211; Agile Process Adoption</a></li><li><a
href='http://www.skorks.com/2008/08/fitness-for-software-developers-and-other-it-professionals/' rel='bookmark' title='Fitness for Software Developers (and Other IT Professionals)'>Fitness for Software Developers (and Other IT Professionals)</a></li></ol></p><div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=9nAQBGCK3Ks:VLFA-3mNm-0:I9og5sOYxJI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=I9og5sOYxJI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=9nAQBGCK3Ks:VLFA-3mNm-0:cGdyc7Q-1BI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=cGdyc7Q-1BI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=9nAQBGCK3Ks:VLFA-3mNm-0:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=9nAQBGCK3Ks:VLFA-3mNm-0:gIN9vFwOqvQ" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=9nAQBGCK3Ks:VLFA-3mNm-0:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=9nAQBGCK3Ks:VLFA-3mNm-0:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=9nAQBGCK3Ks:VLFA-3mNm-0:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=9nAQBGCK3Ks:VLFA-3mNm-0:F7zBnMyn0Lo" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/softwaretechandmore/~4/9nAQBGCK3Ks" height="1" width="1"/>]]></content:encoded> <wfw:commentRss>http://www.skorks.com/2011/09/why-developers-never-use-state-machines/feed/</wfw:commentRss> <slash:comments>56</slash:comments> </item> <item><title>When Developers Go To Great Length To Save Typing 4 Letters</title><link>http://www.skorks.com/2011/08/when-developers-go-to-great-length-to-save-typing-4-letters/</link> <comments>http://www.skorks.com/2011/08/when-developers-go-to-great-length-to-save-typing-4-letters/#comments</comments> <pubDate>Wed, 24 Aug 2011 07:43:19 +0000</pubDate> <dc:creator>Alan Skorkin</dc:creator> <category><![CDATA[coding]]></category> <category><![CDATA[metaprogramming]]></category> <category><![CDATA[rails]]></category> <category><![CDATA[ruby]]></category> <category><![CDATA[thin]]></category> <guid isPermaLink="false">http://www.skorks.com/?p=1976</guid> <description><![CDATA[Heroku&#160;is a great platform. Long before I joined and when I say long, I mean in startup terms (i.e. a few weeks before I joined :)) &#8211; the decision was made that CrowdHired would be hosted on Heroku. Shortly after I came on board, Heroku released their new Cedar stack&#160;and we quickly migrated across to [...]
<strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2008/10/how-to-save-money-as-a-student/' rel='bookmark' title='How To Save Money As A Student'>How To Save Money As A Student</a></li><li><a
href='http://www.skorks.com/2010/02/fetching-rss-feeds-with-ruby-from-behind-a-proxy/' rel='bookmark' title='Fetching RSS Feeds With Ruby From Behind A Proxy'>Fetching RSS Feeds With Ruby From Behind A Proxy</a></li><li><a
href='http://www.skorks.com/2009/07/how-to-write-a-web-crawler-in-ruby/' rel='bookmark' title='How To Write A Simple Web Crawler In Ruby'>How To Write A Simple Web Crawler In Ruby</a></li></ol>]]></description> <content:encoded><![CDATA[<p></p><p><img
align="left" alt="Great Lengths" class="alignleft size-medium wp-image-1987" height="200" hspace="20" src="http://www.skorks.com/wp-content/uploads/2011/08/great_lengths-300x225.jpg" title="Great Lengths" vspace="5" width="267" /><a
href="http://www.heroku.com/" target="_blank">Heroku</a>&nbsp;is a great platform. Long before I joined and when I say long, I mean in startup terms (<em>i.e. a few weeks before I joined :)</em>) &#8211; the decision was made that CrowdHired would be hosted on Heroku. Shortly after I came on board, Heroku released their new <a
href="http://blog.heroku.com/archives/2011/5/31/celadon_cedar/" target="_blank">Cedar stack</a>&nbsp;and we quickly migrated across to that. I find it kinda amusing that <a
href="https://www.crowdhired.com/" target="_blank">we&#39;re currently in alpha</a>, deploying to a platform that&#39;s in beta. Latest and greatest FTW. While migrating to the new stack we also settled on <a
href="http://code.macournoyer.com/thin/" target="_blank">Thin</a>&nbsp;as our web server. The Cedar stack allows you to use <a
href="http://devcenter.heroku.com/articles/cedar" target="_blank">whatever web server you want in production</a>&nbsp;and will run on Webrick by default &#8211; not ideal. Since we were going to use Thin in production it made sense that we&#39;d also use it in development instead of Webrick.</p><p><strong>When you&#39;re using Rails, swapping a new web server in is pretty painless</strong>. Just include the gem and then use the <code>rails s</code> command to launch your new server e.g.:</p><pre>rails s thin
</pre><p>Pretty simple right? Well, it is, but I am very used to typing <code>rails s</code> to launch my server and no matter what gem you include in your project, <code>rails s</code> still starts Webrick (<em>this is not entirely accurate but bare with me</em>). Muscle memory being what it is, after typing <code>rails s</code> instead of <code>rails s thin</code> a couple of times (<em>and only realising this a few hours later</em>) I decided to see if I could make Thin the default for the <code>rails s </code>command.</p><h2>Digging Into The Rails Command Line</h2><p>The key thing here was to figure out how <code>rails s</code> actually works &#8211; only way to do that is to <a
href="http://www.skorks.com/2010/05/why-i-love-reading-other-peoples-code-and-you-should-too/" target="_blank">read some code</a>. We know that there is a <code>script/rails</code> executable that lives in all our Rails project so it makes sense that this would be the entry point into figuring out <code>rails s</code>, but in reality it&#39;s not (<em>or at least it&#8217;s not that simple</em>). We don&#8217;t actually type <code>script/rails s</code>, we do <code>rails s</code>, so <strong>there must be an executable called <code>rails</code> within the Rails gem itself</strong> which is declared as such in rails&#39; gemspec. This is indeed the case, it looks like this:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#008000; font-style:italic;">#!/usr/bin/env ruby</span>
&nbsp;
<span style="color:#9966CC; font-weight:bold;">if</span> <span style="color:#CC00FF; font-weight:bold;">File</span>.<span style="color:#9900CC;">exists</span>?<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#CC00FF; font-weight:bold;">File</span>.<span style="color:#9900CC;">join</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#CC00FF; font-weight:bold;">File</span>.<span style="color:#9900CC;">expand_path</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'../..'</span>, <span style="color:#0000FF; font-weight:bold;">__FILE__</span><span style="color:#006600; font-weight:bold;">&#41;</span>, <span style="color:#996600;">'.git'</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
  railties_path = <span style="color:#CC00FF; font-weight:bold;">File</span>.<span style="color:#9900CC;">expand_path</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'../../railties/lib'</span>, <span style="color:#0000FF; font-weight:bold;">__FILE__</span><span style="color:#006600; font-weight:bold;">&#41;</span>
  $:.<span style="color:#9900CC;">unshift</span><span style="color:#006600; font-weight:bold;">&#40;</span>railties_path<span style="color:#006600; font-weight:bold;">&#41;</span>
<span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">&quot;rails/cli&quot;</span></pre></div></div><p>But even that is not the start of the story. Apparently,<strong> when you have an executable in a gem, rubygems will not use it as is, but will generate another executable which wraps your one.</strong> For the rails command it looks like this:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#008000; font-style:italic;">#!/usr/bin/env ruby</span>
<span style="color:#008000; font-style:italic;">#</span>
<span style="color:#008000; font-style:italic;"># This file was generated by RubyGems.</span>
<span style="color:#008000; font-style:italic;">#</span>
<span style="color:#008000; font-style:italic;"># The application 'rails' is installed as part of a gem, and</span>
<span style="color:#008000; font-style:italic;"># this file is here to facilitate running it.</span>
<span style="color:#008000; font-style:italic;">#</span>
&nbsp;
<span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'rubygems'</span>
&nbsp;
version = <span style="color:#996600;">&quot;&gt;= 0&quot;</span>
&nbsp;
<span style="color:#9966CC; font-weight:bold;">if</span> ARGV.<span style="color:#9900CC;">first</span> =~ <span style="color:#006600; font-weight:bold;">/</span>^_<span style="color:#006600; font-weight:bold;">&#40;</span>.<span style="color:#006600; font-weight:bold;">*</span><span style="color:#006600; font-weight:bold;">&#41;</span>_$<span style="color:#006600; font-weight:bold;">/</span> <span style="color:#9966CC; font-weight:bold;">and</span> <span style="color:#6666ff; font-weight:bold;">Gem::Version</span>.<span style="color:#9900CC;">correct</span>? $1 <span style="color:#9966CC; font-weight:bold;">then</span>
  version = $1
  ARGV.<span style="color:#9900CC;">shift</span>
<span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
gem <span style="color:#996600;">'rails'</span>, version
<span style="color:#CC0066; font-weight:bold;">load</span> Gem.<span style="color:#9900CC;">bin_path</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'rails'</span>, <span style="color:#996600;">'rails'</span>, version<span style="color:#006600; font-weight:bold;">&#41;</span></pre></div></div><p>This is the true entry point you hit when you type <code>rails s</code>. Of course, all this does is load/call the original executable from the Rails gem.</p><p>The Rails source is broken up into several projects such as <em><strong>activerecord</strong></em>, <em><strong>activesupport</strong></em> etc. Probably the most important one of these is <em><strong>railties</strong></em>. This is where the <code>rails</code> executable takes us. Of course, before it does that it needs to put the <code>lib/</code> folder of the <em><strong>railties</strong></em> project on the load path, but eventually we end up in <code>railties/lib/rails/cli.rb</code>. Here we almost immediately execute the following:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#6666ff; font-weight:bold;">Rails::ScriptRailsLoader</span>.<span style="color:#9900CC;">exec_script_rails</span>!</pre></div></div><p>All this does is essentially figure out if we&#39;re inside a Rails app and if we are it executes <code>script/rails</code> passing through the command line arguments that you supply. So, we&#39;re now back in our Rails app; <code>script/rails</code> is the real entry point after all (<em>although we&#39;re about to be taken straight back to <strong>railties</strong></em>). The file looks like this:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#008000; font-style:italic;">#!/usr/bin/env ruby</span>
<span style="color:#008000; font-style:italic;"># This command will automatically be run when you run &quot;rails&quot; with Rails 3 gems installed from the root of your application.</span>
&nbsp;
APP_PATH = <span style="color:#CC00FF; font-weight:bold;">File</span>.<span style="color:#9900CC;">expand_path</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'../../config/application'</span>,  <span style="color:#0000FF; font-weight:bold;">__FILE__</span><span style="color:#006600; font-weight:bold;">&#41;</span>
<span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#CC00FF; font-weight:bold;">File</span>.<span style="color:#9900CC;">expand_path</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'../../config/boot'</span>,  <span style="color:#0000FF; font-weight:bold;">__FILE__</span><span style="color:#006600; font-weight:bold;">&#41;</span>
<span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'rails/commands'</span></pre></div></div><p>We <code>require boot.rb</code> so that we can hook into Bundler and make sure the relevant gems are on the load path (<em>such as rails for example</em>), we then jump into <code>railties/rails/lib/commands.rb</code>. Here everything is pretty straight forward, we have a big case statement which has a clause for &quot;<em>server</em>&quot;. We instantiate a new <code>Rails::Server</code> and start it, which tells us very little by itself, but if we jump into <code>railties/rails/lib/commands/server.rb</code> we can see that <code>Rails::Server</code> simply extends <code>Rack::Server</code> (<strong><em>and delegates to <code>Rack::Server&#39;s</code> start method from its start method</em></strong>) all it adds is those familiar lines we&#39;re all used to seeing e.g.:</p><pre>=&gt; Booting Thin
=&gt; Rails 3.0.7 application starting in development on http://0.0.0.0:3000
=&gt; Call with -d to detach
=&gt; Ctrl-C to shutdown server
</pre><p>So, if we want to change which server is picked up by default when we type <code>rails s</code> we need to go look in Rack.</p><h2>A Quick Glance At Rack</h2><p>Luckily we can easily grab it from <a
href="https://github.com/rack/rack" target="_blank">Github</a>&nbsp;and crack it open (<strong><em>you have to admire the Ruby open source ecosystem, it is truly awesome, largely due to Github, so big props to those guys</em></strong>). We of course need to check out <code>lib/rack/server.rb</code> where we find the following method:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">def</span> server
  <span style="color:#0066ff; font-weight:bold;">@_server</span> <span style="color:#006600; font-weight:bold;">||</span>= <span style="color:#6666ff; font-weight:bold;">Rack::Handler</span>.<span style="color:#9900CC;">get</span><span style="color:#006600; font-weight:bold;">&#40;</span>options<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#ff3333; font-weight:bold;">:server</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">||</span> <span style="color:#6666ff; font-weight:bold;">Rack::Handler</span>.<span style="color:#9900CC;">default</span><span style="color:#006600; font-weight:bold;">&#40;</span>options<span style="color:#006600; font-weight:bold;">&#41;</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>So, if we don&#39;t pass in the name of the server we want, <code>Rack::Handler.default</code> will try to determine it for us.</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">def</span> <span style="color:#0000FF; font-weight:bold;">self</span>.<span style="color:#9900CC;">default</span><span style="color:#006600; font-weight:bold;">&#40;</span>options = <span style="color:#006600; font-weight:bold;">&#123;</span><span style="color:#006600; font-weight:bold;">&#125;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
  <span style="color:#008000; font-style:italic;"># Guess.</span>
  <span style="color:#9966CC; font-weight:bold;">if</span> ENV.<span style="color:#9966CC; font-weight:bold;">include</span>?<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;PHP_FCGI_CHILDREN&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#008000; font-style:italic;"># We already speak FastCGI</span>
    options.<span style="color:#9900CC;">delete</span> :<span style="color:#CC00FF; font-weight:bold;">File</span>
    options.<span style="color:#9900CC;">delete</span> <span style="color:#ff3333; font-weight:bold;">:Port</span>
&nbsp;
    <span style="color:#6666ff; font-weight:bold;">Rack::Handler::FastCGI</span>
  <span style="color:#9966CC; font-weight:bold;">elsif</span> ENV.<span style="color:#9966CC; font-weight:bold;">include</span>?<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;REQUEST_METHOD&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#6666ff; font-weight:bold;">Rack::Handler</span>::<span style="color:#CC00FF; font-weight:bold;">CGI</span>
  <span style="color:#9966CC; font-weight:bold;">else</span>
    <span style="color:#9966CC; font-weight:bold;">begin</span>
      <span style="color:#6666ff; font-weight:bold;">Rack::Handler::Mongrel</span>
    <span style="color:#9966CC; font-weight:bold;">rescue</span> <span style="color:#CC00FF; font-weight:bold;">LoadError</span>
      <span style="color:#6666ff; font-weight:bold;">Rack::Handler::WEBrick</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>As you can see, it turns out that the real default server is actually Mongrel. So if you included Mongrel in your Rails project, typing <code>rails s</code> would automatically pick that up without you having to do anything. Only if Mongrel fails, do we fall back to Webrick which is part of Ruby and therefore is always present. So what do we do if we want Thin to be one of the defaults? Well, first thing first, we need to check if Rack includes a handler for Thin. If we look in <code>lib/rack/handlers/</code> we can see that Rack itself includes the following:</p><pre>cgi.rb
evented_mongrel.rb
fastcgi.rb
lsws.rb
mongrel.rb
scgi.rb
swiftiplied_mongrel.rb
thin.rb
webrick.rb
</pre><p>Luckily Thin is there, so what we really want is that default method to look something like this:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">def</span> <span style="color:#0000FF; font-weight:bold;">self</span>.<span style="color:#9900CC;">default</span><span style="color:#006600; font-weight:bold;">&#40;</span>options = <span style="color:#006600; font-weight:bold;">&#123;</span><span style="color:#006600; font-weight:bold;">&#125;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
  <span style="color:#008000; font-style:italic;"># Guess.</span>
  <span style="color:#9966CC; font-weight:bold;">if</span> ENV.<span style="color:#9966CC; font-weight:bold;">include</span>?<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;PHP_FCGI_CHILDREN&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#008000; font-style:italic;"># We already speak FastCGI</span>
    options.<span style="color:#9900CC;">delete</span> :<span style="color:#CC00FF; font-weight:bold;">File</span>
    options.<span style="color:#9900CC;">delete</span> <span style="color:#ff3333; font-weight:bold;">:Port</span>
&nbsp;
    <span style="color:#6666ff; font-weight:bold;">Rack::Handler::FastCGI</span>
  <span style="color:#9966CC; font-weight:bold;">elsif</span> ENV.<span style="color:#9966CC; font-weight:bold;">include</span>?<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;REQUEST_METHOD&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#6666ff; font-weight:bold;">Rack::Handler</span>::<span style="color:#CC00FF; font-weight:bold;">CGI</span>
  <span style="color:#9966CC; font-weight:bold;">else</span>
    <span style="color:#9966CC; font-weight:bold;">begin</span>
      <span style="color:#6666ff; font-weight:bold;">Rack::Handler::Thin</span>
    <span style="color:#9966CC; font-weight:bold;">rescue</span> <span style="color:#CC00FF; font-weight:bold;">LoadError</span>
      <span style="color:#9966CC; font-weight:bold;">begin</span>
        <span style="color:#6666ff; font-weight:bold;">Rack::Handler::Mongrel</span>
      <span style="color:#9966CC; font-weight:bold;">rescue</span> <span style="color:#CC00FF; font-weight:bold;">LoadError</span>
        <span style="color:#6666ff; font-weight:bold;">Rack::Handler::WEBrick</span>
      <span style="color:#9966CC; font-weight:bold;">end</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>This way Thin will be the first default, followed by Mongrel and only then falling through to Webrick. Luckily, since we&#39;re using Ruby, we can reopen the class and replace the method with our version. I think <strong>this is a perfect example where the ability to reopen classes comes in extremely handy, without any need to worry about &quot;<em>scary consequences</em>&quot;</strong>.</p><h2>Getting It Working</h2><p>All we really need to figure out is where to put the code that reopens the class so that it gets picked up before Rails tries to launch the server. The only logical place is the <code>script/rails</code> executable itself, which will now look like this:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#008000; font-style:italic;">#!/usr/bin/env ruby</span>
<span style="color:#008000; font-style:italic;"># This command will automatically be run when you run &quot;rails&quot; with Rails 3 gems installed from the root of your application.</span>
&nbsp;
APP_PATH = <span style="color:#CC00FF; font-weight:bold;">File</span>.<span style="color:#9900CC;">expand_path</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'../../config/application'</span>,  <span style="color:#0000FF; font-weight:bold;">__FILE__</span><span style="color:#006600; font-weight:bold;">&#41;</span>
<span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#CC00FF; font-weight:bold;">File</span>.<span style="color:#9900CC;">expand_path</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'../../config/boot'</span>,  <span style="color:#0000FF; font-weight:bold;">__FILE__</span><span style="color:#006600; font-weight:bold;">&#41;</span>
<span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'rack/handler'</span>
<span style="color:#6666ff; font-weight:bold;">Rack::Handler</span>.<span style="color:#9900CC;">class_eval</span> <span style="color:#9966CC; font-weight:bold;">do</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> <span style="color:#0000FF; font-weight:bold;">self</span>.<span style="color:#9900CC;">default</span><span style="color:#006600; font-weight:bold;">&#40;</span>options = <span style="color:#006600; font-weight:bold;">&#123;</span><span style="color:#006600; font-weight:bold;">&#125;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#008000; font-style:italic;"># Guess.</span>
    <span style="color:#9966CC; font-weight:bold;">if</span> ENV.<span style="color:#9966CC; font-weight:bold;">include</span>?<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;PHP_FCGI_CHILDREN&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
      <span style="color:#008000; font-style:italic;"># We already speak FastCGI</span>
      options.<span style="color:#9900CC;">delete</span> :<span style="color:#CC00FF; font-weight:bold;">File</span>
      options.<span style="color:#9900CC;">delete</span> <span style="color:#ff3333; font-weight:bold;">:Port</span>
&nbsp;
      <span style="color:#6666ff; font-weight:bold;">Rack::Handler::FastCGI</span>
    <span style="color:#9966CC; font-weight:bold;">elsif</span> ENV.<span style="color:#9966CC; font-weight:bold;">include</span>?<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;REQUEST_METHOD&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
      <span style="color:#6666ff; font-weight:bold;">Rack::Handler</span>::<span style="color:#CC00FF; font-weight:bold;">CGI</span>
    <span style="color:#9966CC; font-weight:bold;">else</span>
      <span style="color:#9966CC; font-weight:bold;">begin</span>
        <span style="color:#6666ff; font-weight:bold;">Rack::Handler::Mongrel</span>
      <span style="color:#9966CC; font-weight:bold;">rescue</span> <span style="color:#CC00FF; font-weight:bold;">LoadError</span>
        <span style="color:#9966CC; font-weight:bold;">begin</span>
          <span style="color:#6666ff; font-weight:bold;">Rack::Handler::Thin</span>
        <span style="color:#9966CC; font-weight:bold;">rescue</span> <span style="color:#CC00FF; font-weight:bold;">LoadError</span>
          <span style="color:#6666ff; font-weight:bold;">Rack::Handler::WEBrick</span>
        <span style="color:#9966CC; font-weight:bold;">end</span>
      <span style="color:#9966CC; font-weight:bold;">end</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'rails/commands'</span></pre></div></div><p>This works without any problems, we type <code>rails s</code> and as long as Thin is in our Gemfile it starts up by default. As an aside, notice that I used <code>class_eval</code> to reopen <code>Rack::Handler</code>. Metaprogramming tricks like this should be part of every Ruby developer&#39;s toolbox, <a
href="http://feeds.feedburner.com/softwaretechandmore">I&#39;ll talk more about this some other time</a>&nbsp;(<em>seeing as I am well into TL;DR land here</em>).</p><p>Going through this exercise didn&#39;t take long (<em>under 30 minutes</em>) and taught me a bit more about Rails and Rack. Shortly after doing this &#8211; in a curious case of the universe working in interesting ways &#8211; I came across a <a
href="http://stackoverflow.com/q/4853393/639386" target="_blank">Stackoverflow question asking about this exact scenario</a>&nbsp;and got an inordinate amount of satisfaction from being able to easily answer it :). In-fact, even the fact that shortly after <a
href="https://www.crowdhired.com/about_us?show_tab=jason_nah" target="_blank">Jason</a>&nbsp;found <a
href="http://pow.cx/" target="_blank">Pow</a>&nbsp;and we switched over to that, doesn&#39;t really diminish the satisfaction of quickly solving a seemingly difficult problem in a neat way. The lesson here is this, <strong>no matter what problems you come across don&#39;t automatically look for a library to handle it</strong>. Do spend a few minutes investigating &#8211; it might be enough to solve it (<em>especially if you&#39;re using Ruby</em>) and you&#39;ll certainly learn something.</p><p><span
style="font-size: 10px; font-family: trebuchet ms;">Image by <a
href="http://www.flickr.com/photos/gerardstolk/2327590838/" target="_blank">Gerard Stolk presque 64</a></span></p><p><strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2008/10/how-to-save-money-as-a-student/' rel='bookmark' title='How To Save Money As A Student'>How To Save Money As A Student</a></li><li><a
href='http://www.skorks.com/2010/02/fetching-rss-feeds-with-ruby-from-behind-a-proxy/' rel='bookmark' title='Fetching RSS Feeds With Ruby From Behind A Proxy'>Fetching RSS Feeds With Ruby From Behind A Proxy</a></li><li><a
href='http://www.skorks.com/2009/07/how-to-write-a-web-crawler-in-ruby/' rel='bookmark' title='How To Write A Simple Web Crawler In Ruby'>How To Write A Simple Web Crawler In Ruby</a></li></ol></p><div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=4RmPhA5yTjM:kDpTrxRJ2Jo:I9og5sOYxJI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=I9og5sOYxJI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=4RmPhA5yTjM:kDpTrxRJ2Jo:cGdyc7Q-1BI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=cGdyc7Q-1BI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=4RmPhA5yTjM:kDpTrxRJ2Jo:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=4RmPhA5yTjM:kDpTrxRJ2Jo:gIN9vFwOqvQ" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=4RmPhA5yTjM:kDpTrxRJ2Jo:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=4RmPhA5yTjM:kDpTrxRJ2Jo:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=4RmPhA5yTjM:kDpTrxRJ2Jo:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=4RmPhA5yTjM:kDpTrxRJ2Jo:F7zBnMyn0Lo" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/softwaretechandmore/~4/4RmPhA5yTjM" height="1" width="1"/>]]></content:encoded> <wfw:commentRss>http://www.skorks.com/2011/08/when-developers-go-to-great-length-to-save-typing-4-letters/feed/</wfw:commentRss> <slash:comments>19</slash:comments> </item> <item><title>Even Boring Form Data Can Be Interesting (For A Developer)</title><link>http://www.skorks.com/2011/08/even-boring-form-data-can-be-interesting-for-a-developer/</link> <comments>http://www.skorks.com/2011/08/even-boring-form-data-can-be-interesting-for-a-developer/#comments</comments> <pubDate>Wed, 17 Aug 2011 13:41:29 +0000</pubDate> <dc:creator>Alan Skorkin</dc:creator> <category><![CDATA[coding]]></category> <category><![CDATA[algorithms]]></category> <category><![CDATA[australian business number]]></category> <category><![CDATA[credit card numbers]]></category> <category><![CDATA[developers]]></category> <category><![CDATA[learning]]></category> <guid isPermaLink="false">http://www.skorks.com/?p=1964</guid> <description><![CDATA[What could be more boring than capturing credit card data on a form? Well, it&#39;s actually not that boring since you may want to encrypt this particular data, which presents it&#39;s own set of challenges. Nevertheless, it&#39;s still a textbox which takes digits that you store in a database &#8211; whoopty doo &#8211; not exactly [...]
<strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2010/03/the-most-common-things-you-do-to-a-large-data-file-with-bash/' rel='bookmark' title='The Most Common Things You Do To A Large Data File With Bash'>The Most Common Things You Do To A Large Data File With Bash</a></li><li><a
href='http://www.skorks.com/2009/08/are-you-the-best-developer-in-the-world/' rel='bookmark' title='Are You The Best Developer In The World?'>Are You The Best Developer In The World?</a></li><li><a
href='http://www.skorks.com/2009/09/become-a-better-developer-by-indexing-your-brain/' rel='bookmark' title='Become A Better Developer By Indexing Your Brain'>Become A Better Developer By Indexing Your Brain</a></li></ol>]]></description> <content:encoded><![CDATA[<p></p><p><img
align="left" alt="Bored" class="alignleft size-medium wp-image-1969" hspace="20" src="http://www.skorks.com/wp-content/uploads/2011/08/bored-300x199.jpg" style="width: 321px; height: 212px;" title="Bored" vspace="5" />What could be more boring than capturing credit card data on a form? Well, it&#39;s actually not that boring since you may want to encrypt this particular data, which presents it&#39;s own set of challenges. Nevertheless, it&#39;s still a textbox which takes digits that you store in a database &#8211; whoopty doo &#8211; not exactly rocket surgery. Well, I&#39;ve got a piece of data that&#39;s got the credit card beat for sheer mundanity &#8211; the <a
href="http://help.abr.gov.au/content.asp?sid=42&amp;doc=/content/16974.htm&amp;usertype=BC" target="_blank">ABN</a>. If you&#39;re an Australian you know all about this. For everybody else, it stands for Australian Business Number which is an 11 digit number, provided by the government to every company. It&#39;s not secret (<em>you can <a
href="http://www.abr.business.gov.au/" target="_blank">look them up online</a></em>), so you don&#39;t even need to encrypt it &#8211; difficult to get excited about that. Of course if that was the end of the story, this wouldn&#39;t be much of a blog post, so &#8211; as you might imagine &#8211; <strong>things are not as bland as they appear</strong>.</p><h2>The Interesting Thing About A Credit Card Number&#8230;</h2><p>At <a
href="https://www.crowdhired.com" target="_blank">CrowdHired</a>, we don&#39;t tend to deal much with credit card numbers, but ABNs are another matter entirely &#8211; since companies are one of the two types of users we have in the system (<em>by the way &#8211; as you may have deduced &#8211; I&#39;ve been working for a startup for the last few months, I should really <a
href="http://feeds.feedburner.com/softwaretechandmore">talk about how that happened</a>, it&#39;s an interesting story</em>). Just like any piece of data, you want to validate the user input if you possibly can. When I started looking into this for ABNs I discovered that they had an interesting trait, it is a trait which credit card numbers share. <strong>You see, both credit cards and ABNs are self-verifying numbers</strong>.</p><p>I&#39;ve been doing web development for many years now, but had no idea this was the case. So naturally &#8211; being the curious developer that I am &#8211; I had to dig a little further. It turns out that these kinds of numbers are quite common, with other well-known examples being <a
href="http://en.wikipedia.org/wiki/International_Standard_Book_Number" target="_blank">ISBNs</a>, <a
href="http://en.wikipedia.org/wiki/Universal_Product_Code" target="_blank">UPCs</a>&nbsp;and <a
href="http://en.wikipedia.org/wiki/Vehicle_identification_number" target="_blank">VINs</a>. <strong>Most of these use a variation of a <a
href="http://en.wikipedia.org/wiki/Check_digit" target="_blank">check digit</a>-based algorithm for both validation and generation</strong>. Probably the most well-known of these algorithms is the <a
href="http://en.wikipedia.org/wiki/Luhn_algorithm" target="_blank">Luhn algorithm</a>&nbsp;which is what credit cards use. So, we&#39;ll use a credit card as an example.</p><h2>Validating And Generating Credit Cards (And Every Other Check-Digit Based Number)</h2><p>Let us say we have the following credit card number:</p><pre>4870696871788604</pre><p>It is 16 digits (<em>Visa and MasterCard are usually 16, but Amex is 15</em>). This number is broken down in the following way:</p><pre>Issuer Number | Account Number | Check Digit
487069        | 687178860      | 4
</pre><p>You can read lots <a
href="http://www.merriampark.com/anatomycc.htm" target="_blank">more about the anatomy of a credit card</a>, but all we want to do is apply the Luhn algorithm to check if this credit card is valid. It goes something like this:</p><h4>1. Starting from the back, double every second digit</h4><pre>  4 | 8 | 7 | 0 | 6 | 9 | 6 | 8 | 7 | 1 | 7 | 8 | 8 | 6 | 0 | 4
  8 | 8 |14 | 0 |12 | 9 |12 | 8 |14 | 1 |14 | 8 |16 | 6 |00 | 4
</pre><h4>2. If the doubled numbers form a double digit number, add the two digits</h4><pre>  4 | 8 | 7 | 0 | 6 | 9 | 6 | 8 | 7 | 1 | 7 | 8 | 8 | 6 | 0 | 4
  8 | 8 |14 | 0 |12 | 9 |12 | 8 |14 | 1 |14 | 8 |16 | 6 |00 | 4
  8 | 8 | 5 | 0 | 3 | 9 | 3 | 8 | 5 | 1 | 5 | 8 | 7 | 6 | 0 | 4
</pre><h4>3. Sum up all the digits of this new number</h4><pre>  8+8+5+0+3+9+3+8+5+1+5+8+7+6+0+4 = 80
</pre><h4>4. If the number is perfectly divisible by 10 it is a valid credit card number. Which in our case it is.</h4><p>&nbsp;</p><p>You can see how we can use the same algorithm to generate a valid credit card number. All we have to do is set the check digit value to <strong>X</strong> and then perform all the same steps. During the final step we simply pick our check digit in such a way as to make sure the sum of all the digits is divisible by 10. Let&#39;s do this for a slightly altered version of our previous credit card number (<em>we simply set the digit before the check digit to 1 making the credit card number invalid</em>).</p><pre>  4 | 8 | 7 | 0 | 6 | 9 | 6 | 8 | 7 | 1 | 7 | 8 | 8 | 6 | 1 | X
  8 | 8 |14 | 0 |12 | 9 |12 | 8 |14 | 1 |14 | 8 |16 | 6 | 2 | X
  8 | 8 | 5 | 0 | 3 | 9 | 3 | 8 | 5 | 1 | 5 | 8 | 7 | 6 | 2 | X
  8+8+5+0+3+9+3+8+5+1+5+8+7+6+2+X = 78+X
  X = (78%10 == 0) ? 0 : 10 - 78%10
  X=2
</pre><p>As you can see <strong>no matter what the other 15 digits are, we&#39;ll always be able to pick a check digit between 0 and 9 that will make the credit card number valid</strong>.</p><p>Of course not every self-verifying number uses the Luhn algorithm, most don&#39;t use <code>mod(10)</code> to work out what the check digit should be, and for some numbers like the <a
href="http://en.wikipedia.org/wiki/International_Bank_Account_Number#Algorithms" target="_blank">IBAN</a>, the check digit actually consists of 2 digits. And yet, the most curious self-verifying number of the lot is the first one I learned about &#8211; the ABN. This is because, for the life of me, <strong>I couldn&#39;t work out what the check digit of the ABN could be</strong>.</p><h2>The Curious Case Of The ABN</h2><p
style="text-align: center; "><img
align="middle" alt="Curious" class="aligncenter size-medium wp-image-1970" src="http://www.skorks.com/wp-content/uploads/2011/08/curious-199x300.jpg" style="width: 214px; height: 322px;" title="Curious" vspace="5" /></p><p>Australia is certainly is not averse to using check digit-based algorithms. The Australian Tax File Number (<em>TFN</em>) and the Australian Company Number (<em>ACN</em>) are just two examples, but the ABN seems to be different. At first glance the <a
href="http://www.ato.gov.au/businesses/content.aspx?doc=/content/13187.htm&amp;pc=001/003/021/002/001&amp;mnu=610&amp;mfp=001/003&amp;st=&amp;cy=1" target="_blank">ABN validation algorithm</a>&nbsp;is just more of the same, it just has a larger than normal &quot;<em>mod</em>&quot; step at the end (<code>mod(89)</code>).</p><ul><li>Subtract 1 from the first (left) digit to give a new eleven digit number</li><li>Multiply each of the digits in this new number by its weighting factor</li><li>Sum the resulting 11 products</li><li>Divide the total by 89, noting the remainder</li><li>If the remainder is zero the number is valid</li></ul><p>In-fact, here is some ruby code to validate an ABN which I appropriated from the <a
href="https://rubygems.org/gems/abn" target="_blank">Ruby ABN gem</a>&nbsp;(<em>and then rolled it into a nice Rails 3, ActiveRecord validator so we could do </em><code>validates_abn_format_of</code><em> in all out models :)</em>) :</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;">  <span style="color:#9966CC; font-weight:bold;">def</span> is_integer?<span style="color:#006600; font-weight:bold;">&#40;</span>number<span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#CC0066; font-weight:bold;">Integer</span><span style="color:#006600; font-weight:bold;">&#40;</span>number<span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#0000FF; font-weight:bold;">true</span>
  <span style="color:#9966CC; font-weight:bold;">rescue</span>
    <span style="color:#0000FF; font-weight:bold;">false</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
  <span style="color:#9966CC; font-weight:bold;">def</span> abn_valid?<span style="color:#006600; font-weight:bold;">&#40;</span>number<span style="color:#006600; font-weight:bold;">&#41;</span>
    raw_number = number
    number = number.<span style="color:#9900CC;">to_s</span>.<span style="color:#9900CC;">tr</span> <span style="color:#006600; font-weight:bold;">&amp;</span><span style="color:#008000; font-style:italic;">#39; &amp;#39;,&amp;#39;&amp;#39;</span>
    <span style="color:#0000FF; font-weight:bold;">return</span> <span style="color:#0000FF; font-weight:bold;">false</span> <span style="color:#9966CC; font-weight:bold;">unless</span> is_integer?<span style="color:#006600; font-weight:bold;">&#40;</span>number<span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&amp;&amp;</span> number.<span style="color:#9900CC;">length</span> == <span style="color:#006666;">11</span>
&nbsp;
    weights = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">10</span>, <span style="color:#006666;">1</span>, <span style="color:#006666;">3</span>, <span style="color:#006666;">5</span>, <span style="color:#006666;">7</span>, <span style="color:#006666;">9</span>, <span style="color:#006666;">11</span>, <span style="color:#006666;">13</span>, <span style="color:#006666;">15</span>, <span style="color:#006666;">17</span>, <span style="color:#006666;">19</span><span style="color:#006600; font-weight:bold;">&#93;</span>
    sum = <span style="color:#006666;">0</span>
    <span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006666;">0</span>..<span style="color:#006666;">10</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">each</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>i<span style="color:#006600; font-weight:bold;">|</span>
      c = number<span style="color:#006600; font-weight:bold;">&#91;</span>i,<span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span>
      digit = c.<span style="color:#9900CC;">to_i</span> <span style="color:#006600; font-weight:bold;">-</span> <span style="color:#006600; font-weight:bold;">&#40;</span>i.<span style="color:#9900CC;">zero</span>? ? <span style="color:#006666;">1</span> : <span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#41;</span>
      sum <span style="color:#006600; font-weight:bold;">+</span>= weights<span style="color:#006600; font-weight:bold;">&#91;</span>i<span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#006600; font-weight:bold;">*</span> digit
    <span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
    sum <span style="color:#006600; font-weight:bold;">%</span> <span style="color:#006666;">89</span> == <span style="color:#006666;">0</span> ? <span style="color:#0000FF; font-weight:bold;">true</span> : <span style="color:#0000FF; font-weight:bold;">false</span>
  <span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>But, <strong>while validating ABNs is easy, generating them is a whole other matter</strong>. As we&#39;ve seen, with a check digit-based algorithm, generating the number is the same as validating the number, except we pick the digit in such a way as to make our &#39;<code>mod</code>&#39; step evaluate to zero. But with a number such as the ABN, where there is no apparent check digit (<em>perhaps I am just having a bout of stupid, so if you can see an obvious check digit with ABNs do let me know</em>), how do you easily generate a valid number? In-fact, <em><strong>why would you want to generate these numbers in the first place, isn&#39;t being able to validate them enough</strong></em>?</p><p>Well, in the case of <a
href="https://www.crowdhired.com" target="_blank">CrowdHired</a>, we tend to create object trees that are quite deep, so we build an maintain some infrastructure code to allow us to create fake data for use during development (<em>another interesting thing to talk about at a later date</em>). Before we started using the self-validating properties of ABNs we simply generated any old 11 digit number as fake data for ABN fields, but once the validations started kicking in this was no longer an option. Being the pragmatic developers that we are (<em>even if we do say so ourselves</em>), we took some real ABNs (<em>like our own</em>) chucked them into an array and randomly picked from there. But, this offended the developer gods, or my developer pride &#8211; whichever, so one Saturday I decided to take a couple of hours to generate some truly random ABNs that were still valid. Here is the code I came up with (<em>it is now a proud part of our fake data generation script</em>):</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;">  <span style="color:#9966CC; font-weight:bold;">def</span> random_abn
    weights = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">10</span>,<span style="color:#006666;">1</span>,<span style="color:#006666;">3</span>,<span style="color:#006666;">5</span>,<span style="color:#006666;">7</span>,<span style="color:#006666;">9</span>,<span style="color:#006666;">11</span>,<span style="color:#006666;">13</span>,<span style="color:#006666;">15</span>,<span style="color:#006666;">17</span>,<span style="color:#006666;">19</span><span style="color:#006600; font-weight:bold;">&#93;</span>
    reversed_weights = weights.<span style="color:#9900CC;">reverse</span>
    initial_numbers = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006600; font-weight:bold;">&#93;</span>
    final_numbers = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006600; font-weight:bold;">&#93;</span>
    <span style="color:#006666;">9</span>.<span style="color:#9900CC;">times</span> <span style="color:#006600; font-weight:bold;">&#123;</span>initial_numbers <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> <span style="color:#CC0066; font-weight:bold;">rand</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006666;">9</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">+</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#125;</span>
    initial_numbers = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#CC0066; font-weight:bold;">rand</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006666;">8</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">+</span><span style="color:#006666;">1</span>, <span style="color:#CC0066; font-weight:bold;">rand</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006666;">7</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">+</span><span style="color:#006666;">2</span><span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#006600; font-weight:bold;">+</span> initial_numbers
    products = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006600; font-weight:bold;">&#93;</span>
    weights.<span style="color:#9900CC;">each_with_index</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>weight, index<span style="color:#006600; font-weight:bold;">|</span>
      products <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> weight <span style="color:#006600; font-weight:bold;">*</span> initial_numbers<span style="color:#006600; font-weight:bold;">&#91;</span>index<span style="color:#006600; font-weight:bold;">&#93;</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
    product_sum = products.<span style="color:#9900CC;">inject</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&#123;</span><span style="color:#006600; font-weight:bold;">|</span>sum, value<span style="color:#006600; font-weight:bold;">|</span> sum <span style="color:#006600; font-weight:bold;">+</span> value<span style="color:#006600; font-weight:bold;">&#125;</span>
    remainder = product_sum <span style="color:#006600; font-weight:bold;">%</span> <span style="color:#006666;">89</span>
    <span style="color:#9966CC; font-weight:bold;">if</span> remainder == <span style="color:#006666;">0</span>
      final_numbers = initial_numbers
    <span style="color:#9966CC; font-weight:bold;">else</span>
      current_remainder = remainder
      reversed_numbers = initial_numbers.<span style="color:#9900CC;">reverse</span>
      reversed_weights.<span style="color:#9900CC;">each_with_index</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>weight, index<span style="color:#006600; font-weight:bold;">|</span>
        <span style="color:#9966CC; font-weight:bold;">next</span> <span style="color:#9966CC; font-weight:bold;">if</span> weight <span style="color:#006600; font-weight:bold;">&gt;</span> current_remainder
        <span style="color:#9966CC; font-weight:bold;">if</span> reversed_numbers<span style="color:#006600; font-weight:bold;">&#91;</span>index<span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#006600; font-weight:bold;">&gt;</span> <span style="color:#006666;">0</span>
          reversed_numbers<span style="color:#006600; font-weight:bold;">&#91;</span>index<span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#006600; font-weight:bold;">-</span>= <span style="color:#006666;">1</span>
          current_remainder <span style="color:#006600; font-weight:bold;">-</span>= weight
          <span style="color:#9966CC; font-weight:bold;">if</span> current_remainder <span style="color:#006600; font-weight:bold;">&lt;</span> reversed_weights<span style="color:#006600; font-weight:bold;">&#91;</span>index<span style="color:#006600; font-weight:bold;">+</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span>
            <span style="color:#9966CC; font-weight:bold;">redo</span>
          <span style="color:#9966CC; font-weight:bold;">end</span>
        <span style="color:#9966CC; font-weight:bold;">end</span>
      <span style="color:#9966CC; font-weight:bold;">end</span>
      final_numbers = reversed_numbers.<span style="color:#9900CC;">reverse</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
    final_numbers<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#006666;">1</span>
    final_numbers.<span style="color:#9900CC;">join</span>
  <span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>The idea is pretty simple. Let&#39;s go through an example to demonstrate:</p><h4>1. Firstly we randomly generate 11 digits between 0 and 9 to make up our probably ABN (<em>they are actually not all between 0 and 9 but more on that shortly</em>)</h4><pre>  7 5 8 9 8 7 3 4 1 5 3</pre><h4>2. We then perform the validation steps on that number</h4><ul><li><p>multiply the digits by their weights to get weight-digit products&nbsp;</p><pre>  7x10=70
  5x1=5
  8x3=24
  9x5=45
  8x7=56
  7x9=63
  3x11=33
  4x13=52
  1x15=15
  5x17=85
  3x19=57
</pre></li><li>sum the products<pre>70+5+24+45+56+63+33+52+15+85+57 = 505</pre></li><li>divide by 89 to get the remainder<pre>505 mod 89 = 60</pre></li></ul><h4>3. Since we do <code>mod(89)</code> at worst we&#39;ll be off by 88 (<em>although if we get 0 as the remainder we lucked out with a valid ABN straight away</em>), we now use the weight-digit products to &quot;<em>give change</em>&quot;, subtracting from the remainder as we go until we hit zero.</h4><p>We start with the last digit where the weight is 19. We subtract 1 from this digit, which means we can subtract 19 from our remainder. We then move on to the next digit until the remainder hits zero<span
class="Apple-style-span" style="font-family: monospace; white-space: pre; "> </span></p><pre>  Initial | Change  | Remainder
  -------------------------------
  7x10=70 | 7x10=70 | 0
  5x1=5   | 5x1=5   | 0
  8x3=24  | 8x3=24  | 0
  9x5=45  | 9x5=45  | 0
  8x7=56  | 8x7=56  | 0
  7x9=63  | <strong>6</strong>x9=63  | 0
  3x11=33 | 3x11=33 | 9
  4x13=52 | 4x13=52 | 9
  1x15=15 | 0x15=0  | 9
  5x17=85 | <strong>4</strong>x17=68 | 24
  3x19=57 | <strong>2</strong>x19=38 | 41
</pre><h4>4. This gives us our new number</h4><pre>7 5 8 9 8 6 3 4 0 4 2</pre><h4>5. Now we just need to add 1 to the very first number (<em>as per the ABN validation steps</em>) and we have our valid ABN</h4><pre>85898634042</pre><p>There are a couple of nuances to those steps.</p><ul><li><strong>None of our initial generated digits can be zero</strong>. Since we &quot;<em>give change</em>&quot; by subtracting 1 from each digit we need to ensure we can actually subtract 1 (otherwise things become much more complex). So we ensure our digits are randomly generated between 1 and 9 rather tahn between 0 and 9.</li><li><strong>Even when all our initial digits are at least 1 we may still fail to &quot;<em>give change</em>&quot; for some of the remainders</strong>, the easiest example of this is when our remainder is 2. The only digit when can use to give change is the one with the weight of 1 (<em>i.e. the second digit of the ABN</em>). If this digit is initially generated to be 1, we can only give change once at which point we end up with a remainder of 1 and there is nothing more we can do about it. In-fact this exact scenario can happen for a few other remainders namely 86, 77, 66, 53, 38, 21. The easies way to overcome this problem is to ensure that the digit that has a weight of 1, is always randomly generated to have a value of at least 2. This way we can use it to give change twice and our problematic remainders are covered.</li><li>Lastly, <strong>since we have to add 1 to the very first digit as our final step, we need to make sure that this digit is not already equal to 9</strong>, so we generate that one to be between 1 and 8.</li></ul><p>Given these nuances, this algorithm won&#39;t generate every possible ABN, but it will give you a large percentage of possible ABNs which is good enough for our needs. It took about an hour to get that working (<em>we won&#39;t mention the little bug where I forgot the remainder could be zero from the start, which caused much grief to our random data generator :)</em>), but it was a fun little exercise &#8211; time well spent as far as I am concerned. And to think, all this learning about self-validating numbers and algorithmic coding fun was triggered by trying to capture the most mundane piece of data on a form. It just goes to show that <strong>you can learn and grow no matter where you are and what you&#39;re doing, you just need to see the opportunities</strong> for what they are.</p><p><span
style="font-size: 10px; font-family: trebuchet ms;">Images by <a
href="http://www.flickr.com/photos/56223083@N06/5514161603/" target="_blank">Sn.Ho</a> and <a
href="http://www.flickr.com/photos/sangudo/5530904231/" target="_blank">Sangudo</a></span></p><p><strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2010/03/the-most-common-things-you-do-to-a-large-data-file-with-bash/' rel='bookmark' title='The Most Common Things You Do To A Large Data File With Bash'>The Most Common Things You Do To A Large Data File With Bash</a></li><li><a
href='http://www.skorks.com/2009/08/are-you-the-best-developer-in-the-world/' rel='bookmark' title='Are You The Best Developer In The World?'>Are You The Best Developer In The World?</a></li><li><a
href='http://www.skorks.com/2009/09/become-a-better-developer-by-indexing-your-brain/' rel='bookmark' title='Become A Better Developer By Indexing Your Brain'>Become A Better Developer By Indexing Your Brain</a></li></ol></p><div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=AHnDeMgnYHA:xnaLAeukawg:I9og5sOYxJI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=I9og5sOYxJI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=AHnDeMgnYHA:xnaLAeukawg:cGdyc7Q-1BI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=cGdyc7Q-1BI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=AHnDeMgnYHA:xnaLAeukawg:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=AHnDeMgnYHA:xnaLAeukawg:gIN9vFwOqvQ" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=AHnDeMgnYHA:xnaLAeukawg:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=AHnDeMgnYHA:xnaLAeukawg:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=AHnDeMgnYHA:xnaLAeukawg:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=AHnDeMgnYHA:xnaLAeukawg:F7zBnMyn0Lo" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/softwaretechandmore/~4/AHnDeMgnYHA" height="1" width="1"/>]]></content:encoded> <wfw:commentRss>http://www.skorks.com/2011/08/even-boring-form-data-can-be-interesting-for-a-developer/feed/</wfw:commentRss> <slash:comments>14</slash:comments> </item> <item><title>Algorithms, A Dropbox Challenge And Dynamic Programming</title><link>http://www.skorks.com/2011/02/algorithms-a-dropbox-challenge-and-dynamic-programming/</link> <comments>http://www.skorks.com/2011/02/algorithms-a-dropbox-challenge-and-dynamic-programming/#comments</comments> <pubDate>Sun, 27 Feb 2011 04:08:39 +0000</pubDate> <dc:creator>Alan Skorkin</dc:creator> <category><![CDATA[coding]]></category> <category><![CDATA[algorithms]]></category> <category><![CDATA[dynamic programming]]></category> <category><![CDATA[subset sum problem]]></category> <guid isPermaLink="false">http://www.skorks.com/?p=1950</guid> <description><![CDATA[Lately I&#39;ve slowly been trying to grok the fullness of dynamic programming. It is an algorithmic technique that the vast majority of developers never master, which is unfortunate since it can help you come up with viable solutions for seemingly intractable problems. The issue with dynamic programming (besides the totally misleading name), is that it [...]
<strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2009/07/effective-vs-ineffective-pair-programming/' rel='bookmark' title='Effective vs Ineffective Pair Programming'>Effective vs Ineffective Pair Programming</a></li><li><a
href='http://www.skorks.com/2010/05/sort-files-like-a-master-with-the-linux-sort-command-bash/' rel='bookmark' title='Sort Files Like A Master With The Linux Sort Command (Bash)'>Sort Files Like A Master With The Linux Sort Command (Bash)</a></li><li><a
href='http://www.skorks.com/2010/03/how-to-answer-a-programming-interview-question-and-look-good-doing-it/' rel='bookmark' title='How To Answer A Programming Interview Question And Look Good Doing It'>How To Answer A Programming Interview Question And Look Good Doing It</a></li></ol>]]></description> <content:encoded><![CDATA[<p></p><p><img
align="left" alt="Algorithm" class="alignleft size-full wp-image-1954" hspace="20" src="http://www.skorks.com/wp-content/uploads/2011/02/algorithm.jpg" style="width: 260px; height: 159px;" title="Algorithm" vspace="5" />Lately I&#39;ve slowly been trying to <a
href="http://www.grokinfullness.com/" target="_blank">grok the fullness</a> of <a
href="http://en.wikipedia.org/wiki/Dynamic_programming" target="_blank">dynamic programming</a>. It is an algorithmic technique that the vast majority of developers never master, which is unfortunate since <strong>it can help you come up with viable solutions for seemingly intractable problems</strong>. The issue with dynamic programming (<em>besides the totally misleading name</em>), is that it can be very difficult to see how to apply it to a particular problem and even when you do, it is a real pain to get it right. Anyway, I don&#39;t want to expound on this, I have something more interesting in mind.</p><h2>The Dropbox Challenges</h2><p>I was surfing the web the other day and in the course of my random wanderings I ended up at the <a
href="http://www.dropbox.com/jobs/challenges" target="_blank">Dropbox programming challenges</a> page. Apparently, the <a
href="http://www.dropbox.com/" target="_blank">Dropbox</a> guys have posted up some coding challenges for people who want to apply to work there (<em>and everyone else, I guess, since it&#39;s on the internet and all :)</em>). Challenge 3 (<em><strong>The Dropbox Diet</strong></em>) immediately caught my eye since it looked like one of those problems that should have a dynamic programming solution, so I decided to use it as an opportunity to practice. The full description of the problem is on the <a
href="http://www.dropbox.com/jobs/challenges" target="_blank">challenge page</a>, but here is the gist of it.</p><blockquote><p>We get a list of up to 50 activities and an associated calorie value for each (either positive or negative), we need to find a subset of activities where the sum of all the calorie values is zero.</p></blockquote><p>It sounded easy enough until<strong> </strong>I thought about it and realised it was more complex than it first appeared. So, I went for a walk :) and when I came back I settled in to analyse it for real. <strong>The first part of solving any problem is to really understand what problem you&#39;re trying to solve</strong> (<em>that one sentence really <a
href="http://feeds.feedburner.com/softwaretechandmore">deserves its own article</a></em>). In this case the activities list is just extraneous information, what we really have is a list of numbers and we need to find a subset of these numbers where the sum of the subset is equal to a particular value. It took me quite a while to come up with that definition, but once you have something like that, you can do some research and see if it is a known problem.</p><p>Of course, I did nothing of the kind, I had already decided that there must be a dynamic programming solution so I went ahead and tried to come up with it myself. This wasted about an hour at the end of which I had absolutely nothing; I guess my dynamic programming chops are still lamb-chops as opposed to nice meaty beef-chops :). Having failed I decided to do what I should have done in the first place &#8211; some research. Since I had taken the time to come up with a decent understanding of the problem, it only took 5 minutes of Googling to realise that I was dealing with the <a
href="http://en.wikipedia.org/wiki/Subset_sum_problem" target="_blank">subset sum problem</a>.</p><h2>The Subset Sum Problem</h2><p
style="text-align: center;"><img
align="middle" alt="Subset" class="aligncenter size-full wp-image-1955" height="231" src="http://www.skorks.com/wp-content/uploads/2011/02/subset.jpg" title="Subset" vspace="5" width="240" /></p><p>The unfortunate thing about the subset sum problem is the fact that it&#39;s <a
href="http://en.wikipedia.org/wiki/NP-complete" target="_blank">NP-complete</a>. This means that if our input is big enough we may be in trouble. Wikipedia does give some algorithmic approaches to the problem (<em>no code though</em>), but just to cross our <strong>t&#39;s</strong> I also cracked open <a
href="http://www.amazon.com/Introduction-Algorithms-Third-Thomas-Cormen/dp/0262033844" target="_blank">Cormen et al</a> (<em>have you ever noticed how that book has everything when it comes to algorithms :)</em>). In this case the book agreed with Wikipedia, but once again, no code (<em>there are only two things I don&#39;t like about <strong>Intro To Algorithms</strong>, the lack of real code and the lack of examples</em>). I browsed the web some more, in case it would give me further insight into the problem, but there wasn&#39;t much more to know &#8211; it was time to get my code on.</p><h2>The Exponential Time Algorithm</h2><p>The problem with the exponential time algorithm is its runtime complexity (<em>obviously</em>), but our maximum input size was only 50 and even if that turned out to be too big, perhaps there were some easy optimizations to be made. Regardless I decided to tackle this one first, if nothing else it would immerse me in the problem. I&#39;ll demonstrate how it works via example. Let&#39;s say our input looks like this:</p><pre>[1, -3, 2, 4]
</pre><p>We need to iterate through the values and on every iteration produce all the possible subsets that can be made with all the numbers we&#39;ve looked at up until now. Here is how it looks:</p><blockquote><p><em>Iteration 1:</em></p><pre>[[1]]
</pre><p><em>Iteration 2:</em></p><pre>[[1], [-3], [1, -3]]
</pre><p><em>Iteration 3:</em></p><pre>[[1], [-3], [1, -3], [2], [1, 2], [-3, 2], [1, -3, 2]]
</pre><p><em>Iteration 4:</em></p><pre>[[1], [-3], [1, -3], [2], [1, 2], [-3, 2], [1, -3, 2], [4], [1, 4], [-3, 4], [1, -3, 4], [2, 4], [1, 2, 4], [-3, 2, 4], [1, -3, 2, 4]]
</pre></blockquote><p>On every iteration we simply take the number we&#39;re currently looking at as well as a clone of the list of all the subsets we have seen so far, we append the new number to all the subsets (<em>we also add the number itself to the list since it can also be a subset</em>) and then we concatenate this new list to the list of subsets that we generated on the previous iteration. Here is the previous example again, but demonstrating this approach:</p><blockquote><p><em>Iteration 1:</em></p><pre>[] + [1]</pre><p><em>Iteration 2:</em></p><pre>[1] + [-3], [1, -3]
</pre><p><em>Iteration 3:</em></p><pre>[1], [-3], [1, -3] + [2], [1, 2], [-3, 2], [1, -3, 2]
</pre><p><em>Iteration 4:</em></p><pre>[1], [-3], [1, -3], [2], [1, 2], [-3, 2], [1, -3, 2] + [4], [1, 4], [-3, 4], [1, -3, 4], [2, 4], [1, 2, 4], [-3, 2, 4], [1, -3, 2, 4]
</pre></blockquote><p><strong>This allows us to generate all the possible subsets of our input</strong>, all we have to do then is pick out the subsets that sum up to the value we&#39;re looking for (<em>e.g. 0</em>).</p><p>The list of subsets grows exponentially (<em>it being an exponential time algorithm and all :)</em>), but since we know what sum we&#39;re looking for, there is one small optimization we can make. We can sort our input list before trying to generate the subsets, this way all the negative values will be first in the list. The implication here is this, once the sum of any subset exceeds the value we&#39;re looking for, we can instantly discard it since all subsequent values we can append to it will only make it bigger. Here is some code:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">def</span> subsets_with_sum_less_than_or_equal<span style="color:#006600; font-weight:bold;">&#40;</span>reference_value, <span style="color:#CC0066; font-weight:bold;">array</span><span style="color:#006600; font-weight:bold;">&#41;</span>
  <span style="color:#CC0066; font-weight:bold;">array</span> = <span style="color:#CC0066; font-weight:bold;">array</span>.<span style="color:#9900CC;">sort</span> <span style="color:#006600; font-weight:bold;">&#123;</span><span style="color:#006600; font-weight:bold;">|</span>a,b<span style="color:#006600; font-weight:bold;">|</span> a <span style="color:#006600; font-weight:bold;">&lt;=&gt;</span> b<span style="color:#006600; font-weight:bold;">&#125;</span>
  previous_sums = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006600; font-weight:bold;">&#93;</span>
  <span style="color:#CC0066; font-weight:bold;">array</span>.<span style="color:#9900CC;">each</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>element<span style="color:#006600; font-weight:bold;">|</span>
    new_sums = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006600; font-weight:bold;">&#93;</span>
    new_sums <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> <span style="color:#006600; font-weight:bold;">&#91;</span>element<span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#9966CC; font-weight:bold;">if</span> element <span style="color:#006600; font-weight:bold;">&lt;</span>= reference_value
    previous_sums.<span style="color:#9900CC;">each</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>previous_sum<span style="color:#006600; font-weight:bold;">|</span>
      current_sum = previous_sum <span style="color:#006600; font-weight:bold;">+</span> <span style="color:#006600; font-weight:bold;">&#91;</span> element <span style="color:#006600; font-weight:bold;">&#93;</span>
      new_sums <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> current_sum <span style="color:#9966CC; font-weight:bold;">if</span> current_sum.<span style="color:#9900CC;">inject</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&#123;</span><span style="color:#006600; font-weight:bold;">|</span>accumulator,value<span style="color:#006600; font-weight:bold;">|</span>accumulator<span style="color:#006600; font-weight:bold;">+</span>value<span style="color:#006600; font-weight:bold;">&#125;</span> <span style="color:#006600; font-weight:bold;">&lt;</span>= reference_value
    <span style="color:#9966CC; font-weight:bold;">end</span>
    previous_sums = previous_sums <span style="color:#006600; font-weight:bold;">+</span> new_sums
  <span style="color:#9966CC; font-weight:bold;">end</span>
  previous_sums
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>If we execute that (<em>with our reference value being 0 and our array being [1, -3, 2, 4]</em>), we get the following output:</p><pre>[[-3], [-3, 1], [-3, 2], [-3, 1, 2]]
</pre><p>All the subsets in that list sum up to less than or equal to our reference value (<em>0</em>). All we need to do now is pick out the ones that we&#39;re after.</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">def</span> subsets_with_sums_equal<span style="color:#006600; font-weight:bold;">&#40;</span>reference_value, <span style="color:#CC0066; font-weight:bold;">array</span><span style="color:#006600; font-weight:bold;">&#41;</span>
  subsets_with_sums_less_than_or_equal = subsets_with_sum_less_than_or_equal<span style="color:#006600; font-weight:bold;">&#40;</span>reference_value, <span style="color:#CC0066; font-weight:bold;">array</span><span style="color:#006600; font-weight:bold;">&#41;</span>
  subsets_adding_up_to_reference_value = subsets_with_sums_less_than_or_equal.<span style="color:#9900CC;">inject</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>accumulator, subset<span style="color:#006600; font-weight:bold;">|</span>
    accumulator <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> subset <span style="color:#9966CC; font-weight:bold;">if</span> subset.<span style="color:#9900CC;">inject</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&#123;</span><span style="color:#006600; font-weight:bold;">|</span>sum, value<span style="color:#006600; font-weight:bold;">|</span> sum<span style="color:#006600; font-weight:bold;">+</span>value<span style="color:#006600; font-weight:bold;">&#125;</span> == reference_value
    accumulator
  <span style="color:#9966CC; font-weight:bold;">end</span>
  subsets_adding_up_to_reference_value
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>This function calls the previous one and then picks out the subset we&#39;re after:</p><pre>[[-3, 1, 2]]
</pre><p><strong>It&#39;s simple and works very well for any input array with less than 20 values or so, and if you try it with more than 25 &#8211; good luck waiting for it to finish :)</strong>. Exponential time is no good if we want it to work with an input size of 50 (<em>or more</em>) numbers.</p><h2>The Dynamic Programming Algorithm</h2><p
style="text-align: center;"><img
align="middle" alt="Matrix" class="aligncenter size-full wp-image-1956" height="172" src="http://www.skorks.com/wp-content/uploads/2011/02/matrix.jpg" style="width: 259px; height: 172px;" title="Matrix" vspace="5" width="259" /></p><p>Both Wikipedia and Cormen tell us that there is a polynomial time approximate algorithm, but that&#39;s no good for us since we want the subsets that add up to exactly zero, not approximately zero. Fortunately, <strong><em>just like I suspected, there is a dynamic programming solution, Wikipedia even explains how it works, which is only marginally helpful when it comes to implementing it</em></strong>. I know because that was the solution I tackled next. Here is how it works, using the same input as before:</p><pre>[1, -3, 2, 4]
</pre><p>Just like with any dynamic programming problem, we need to produce a matrix, the key is to figure out what it&#39;s a matrix of (<em>how do we label the rows and how do we label the columns</em>). In this case the rows are simply the indexes of our input array; the columns are labelled with every possible sum that can be made out of the input numbers. In our case, the smallest sum we can make from our input is -3 since that&#39;s the only negative number we have, the biggest sum is seven (<em>1 + 2 + 4</em>). So, our uninitialized matrix looks like this:</p><pre>+---+----+----+----+---+---+---+---+---+---+---+---+
|   | -3 | -2 | -1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+----+----+----+---+---+---+---+---+---+---+---+
| 0 |    |    |    |   |   |   |   |   |   |   |   |
| 1 |    |    |    |   |   |   |   |   |   |   |   |
| 2 |    |    |    |   |   |   |   |   |   |   |   |
| 3 |    |    |    |   |   |   |   |   |   |   |   |
+---+----+----+----+---+---+---+---+---+---+---+---+
</pre><p>So far so good, but what should we put in every cell of our matrix. In this case every cell will contain either T (<em>true</em>) or F (<em>false</em>).</p><p><u><strong>A T value in a cell means that the sum that the column is labelled with can be constructed using the input array numbers that are indexed by the current row label and the labels of all the previous rows we have already looked at</strong></u>. An F in a cell means the sum of the column label cannot be constructed. Let&#39;s try to fill in our matrix to see how this works.</p><p>We start with the first row, the number indexed by the row label is 1, there is only one sum that can be made using that number &#8211; 1. So only one cell gets a T in it, all the rest get an F.</p><pre>+---+----+----+----+---+---+---+---+---+---+---+---+
|   | -3 | -2 | -1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+----+----+----+---+---+---+---+---+---+---+---+
| 0 | F  | F  | F  | F | T | F | F | F | F | F | F |
| 1 |    |    |    |   |   |   |   |   |   |   |   |
| 2 |    |    |    |   |   |   |   |   |   |   |   |
| 3 |    |    |    |   |   |   |   |   |   |   |   |
+---+----+----+----+---+---+---+---+---+---+---+---+
</pre><p>The number indexed by the second row label is -3, so in the second row, the column labelled by -3 will get a T in it. However, <strong>we&#39;re considering the numbers indexed by the current row and all previous rows, which means any sum that can be made using the numbers 1 and -3 will get a T in its column</strong>. This means that the column labelled with 1 gets a T and the column labelled with -2 gets a T since</p><pre>1 + -3 = -2
</pre><pre>+---+----+----+----+---+---+---+---+---+---+---+---+
|   | -3 | -2 | -1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+----+----+----+---+---+---+---+---+---+---+---+
| 0 | F  | F  | F  | F | T | F | F | F | F | F | F |
| 1 | T  | T  | F  | F | T | F | F | F | F | F | F |
| 2 |    |    |    |   |   |   |   |   |   |   |   |
| 3 |    |    |    |   |   |   |   |   |   |   |   |
+---+----+----+----+---+---+---+---+---+---+---+---+
</pre><p>We continue in the same vein for the next row, we&#39;re now looking at number 2 since it&#39;s indexed by the third row in our matrix. So, the column labelled by 2 will get a T, all the columns labelled by T in the previous row propagate their T value down, since all those sums are still valid. But we can produce a few other sums given the numbers at our disposal:</p><pre>2 + -3 = -1
1 + 2 + -3 = 0
1 + 2 = 3
</pre><p>All those sums get a T in their column for this row.</p><pre>+---+----+----+----+---+---+---+---+---+---+---+---+
|   | -3 | -2 | -1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+----+----+----+---+---+---+---+---+---+---+---+
| 0 | F  | F  | F  | F | T | F | F | F | F | F | F |
| 1 | T  | T  | F  | F | T | F | F | F | F | F | F |
| 2 | T  | T  | T  | T | T | T | T | F | F | F | F |
| 3 |    |    |    |   |   |   |   |   |   |   |   |
+---+----+----+----+---+---+---+---+---+---+---+---+
</pre><p>There are three patterns that are starting to emerge.</p><ol><li>For every row, the column which is equivalent to the number indexed by the row get a T in it (<em>e.g. row zero represents the number 1, so the column labelled by 1 gets a T in row 0, row one represents the number -3 so the column labelled by -3 get a T in row 1 etc.</em>), every row will get one T in one of the columns via this pattern. <strong>This is because a single number by itself is a valid subset sum</strong>.</li><li>If a column already has a T in the previous row, this T propagates down to the current row (<em>e.g. when looking at the second row, the column labelled by 1 has a T in the first row and will therefore have a T in the second row also, when looking at the third row columns labelled by -3, -2 and 1 all had a T in the second row and will therefore contain a T in the third row</em>). <strong>This is due to the fact that once it is possible to construct a certain sum using a subset of our input numbers, looking at more of the input numbers does not invalidate the existing subsets</strong>.</li><li>Looking at any column label X in the current row which still has a value of F, if we subtract the number indexed by the current row from this column label we get a new number Y, we then check the row above the current row in the column labelled by Y, if we see a T, this T is propagated into the column X in the current row (<em>e.g. if we&#39;re looking at the second row, column labelled with -2, we subtract the number of the current row -3 from the column label, -2 &#8211; -3 = -2 + 3 = 1, this new number is the column label in the first row, we can see that in the first row in the column labelled with 1 there is a T, therefore this T gets propagated to the second row into the column labelled with -2</em>). <strong>This is due to the fact that if we take a sum that is already possible and add another number to it, this obviously creates a new sum which is now possible</strong>.</li></ol><p>Those three patterns are the algorithm that we use to fill in our matrix one row at a time. We can now use them to fill in the last row. The number indexed by the last row is 4. Therefore in the last row, the column labelled by 4 will get a T (<em>via the first pattern</em>). All the columns that already have a T will have that T propagate to the last row (<em>via the second pattern</em>). This means the only columns with an F will be those labelled by 5, 6 and 7. However using pattern 3, if we subtract 4 from 5, 6 and 7 we get:</p><pre>5 - 4 = 1
6 - 4 = 2
7 - 4 = 3
</pre><p>If we now look at the previous row in the columns labelled by those numbers we can see a T for all three cases, therefore, <em><strong>even the columns labelled with 5, 6 and 7 in the last row will pick up a T via the third pattern</strong></em>. Our final matrix is:</p><pre>+---+----+----+----+---+---+---+---+---+---+---+---+
|   | -3 | -2 | -1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+----+----+----+---+---+---+---+---+---+---+---+
| 0 | F  | F  | F  | F | T | F | F | F | F | F | F |
| 1 | T  | T  | F  | F | T | F | F | F | F | F | F |
| 2 | T  | T  | T  | T | T | T | T | F | F | F | F |
| 3 | T  | T  | T  | T | T | T | T | T | T | T | T |
+---+----+----+----+---+---+---+---+---+---+---+---+
</pre><p>One final problem remains, how can we use this matrix to get the subset that adds up to the value we want (<em>i.e. 0</em>). This is also reasonably simple. We start in the column labelled by the sum we&#39;re after, in our case we start in the column labelled by zero. If this column does not contain a T then our sum is not possible and the input does not have a solution. In our case, the column does have a T so we&#39;re in business.</p><ul><li>We start at the last row in this column; if it has a T and the row above has a T we go to the row above.</li><li>If the row above has an F then we take the number which is indexed by the current row and write it into our final output.</li><li>We then subtract this number from the column label to get the next column label. We jump to the new column label and go up a row.</li><li>Once again if there is a T there and there is an F above, then we write the number indexed by the row into our output and subtract it from the current column label to get the next column label.</li><li>We then jump to that column and go up a row again.</li><li>We keep doing this until we get to the top of the matrix, at this point the numbers we have written to the output will be our solution.</li></ul><p>Let&#39;s do this for our matrix. We start at the column labelled by 0 since that&#39;s the sum we&#39;re looking for. We look at the last row and see a T, but there is also a T in the row above so we go up to that row. Now there is an F in the row above, so we write the number indexed by this row into our output:</p><pre>output = [2]
</pre><p>We now subtract this number from our column label to get the new column label:</p><pre>0 - 2 = -2
</pre><p>We jump to the column labelled by -2 and go up a row, there is another T there with an F in the row above, so we write the number indexed by the row to our output:</p><pre>output = [2, -3]
</pre><p>We perform our subtraction step again:</p><pre>-2 - -3 = -2 + 3 = 1
</pre><p>We now jump to the column labelled by 1 in the first row in the matrix. There is also a T there, so we need to write one last number to our output:</p><pre>output = [2, -3, 1]
</pre><p>Since we&#39;re at the top of the matrix, we&#39;re done. As you can see the procedure we perform to reconstruct the output subset is actually a variant of the third pattern we used to construct the matrix. And that&#39;s all there is to it.</p><p>Oh yeah, I almost forgot the code :), since it is not tiny, I put it in a gist, <a
href="https://gist.github.com/845251" target="_blank">you can find it here</a>. But, here are the guts of it:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;">  <span style="color:#9966CC; font-weight:bold;">def</span> initialize_first_row
    <span style="color:#0066ff; font-weight:bold;">@matrix</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span>.<span style="color:#9900CC;">each_with_index</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>element,i<span style="color:#006600; font-weight:bold;">|</span>
      <span style="color:#9966CC; font-weight:bold;">next</span> <span style="color:#9966CC; font-weight:bold;">if</span> i == <span style="color:#006666;">0</span> <span style="color:#008000; font-style:italic;"># skipping the first one since it is the index into the array</span>
      <span style="color:#9966CC; font-weight:bold;">if</span> @<span style="color:#CC0066; font-weight:bold;">array</span><span style="color:#006600; font-weight:bold;">&#91;</span>@matrix<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#93;</span> == <span style="color:#0066ff; font-weight:bold;">@matrix</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#91;</span>i<span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#008000; font-style:italic;"># the only sum we can have is the first number itself</span>
        <span style="color:#0066ff; font-weight:bold;">@matrix</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#91;</span>i<span style="color:#006600; font-weight:bold;">&#93;</span> = <span style="color:#996600;">&quot;T&quot;</span>;
      <span style="color:#9966CC; font-weight:bold;">end</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
    <span style="color:#0066ff; font-weight:bold;">@matrix</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
  <span style="color:#9966CC; font-weight:bold;">def</span> populate
    <span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006666;">2</span>...@matrix.<span style="color:#9900CC;">size</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">each</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>row<span style="color:#006600; font-weight:bold;">|</span>
      <span style="color:#0066ff; font-weight:bold;">@matrix</span><span style="color:#006600; font-weight:bold;">&#91;</span>row<span style="color:#006600; font-weight:bold;">&#93;</span>.<span style="color:#9900CC;">each_with_index</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>element,i<span style="color:#006600; font-weight:bold;">|</span>
        <span style="color:#9966CC; font-weight:bold;">next</span> <span style="color:#9966CC; font-weight:bold;">if</span> i == <span style="color:#006666;">0</span>
        <span style="color:#9966CC; font-weight:bold;">if</span> @<span style="color:#CC0066; font-weight:bold;">array</span><span style="color:#006600; font-weight:bold;">&#91;</span>@matrix<span style="color:#006600; font-weight:bold;">&#91;</span>row<span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#93;</span> == <span style="color:#0066ff; font-weight:bold;">@matrix</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#91;</span>i<span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#006600; font-weight:bold;">||</span> <span style="color:#0066ff; font-weight:bold;">@matrix</span><span style="color:#006600; font-weight:bold;">&#91;</span>row<span style="color:#006600; font-weight:bold;">-</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#91;</span>i<span style="color:#006600; font-weight:bold;">&#93;</span> == <span style="color:#006600; font-weight:bold;">&amp;</span><span style="color:#008000; font-style:italic;">#39;T&amp;#39; || current_sum_possible(row, i) </span>
          <span style="color:#0066ff; font-weight:bold;">@matrix</span><span style="color:#006600; font-weight:bold;">&#91;</span>row<span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#91;</span>i<span style="color:#006600; font-weight:bold;">&#93;</span> = <span style="color:#996600;">&quot;T&quot;</span>;
        <span style="color:#9966CC; font-weight:bold;">end</span>
      <span style="color:#9966CC; font-weight:bold;">end</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
    <span style="color:#0066ff; font-weight:bold;">@matrix</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
  <span style="color:#9966CC; font-weight:bold;">def</span> current_sum_possible<span style="color:#006600; font-weight:bold;">&#40;</span>row, column<span style="color:#006600; font-weight:bold;">&#41;</span>
    column_sum = <span style="color:#0066ff; font-weight:bold;">@matrix</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#91;</span>column<span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#006600; font-weight:bold;">-</span> @<span style="color:#CC0066; font-weight:bold;">array</span><span style="color:#006600; font-weight:bold;">&#91;</span>@matrix<span style="color:#006600; font-weight:bold;">&#91;</span>row<span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#93;</span>
    column_index = <span style="color:#0066ff; font-weight:bold;">@column_value_to_index</span><span style="color:#006600; font-weight:bold;">&#91;</span>column_sum<span style="color:#006600; font-weight:bold;">&#93;</span>
    <span style="color:#0000FF; font-weight:bold;">return</span> <span style="color:#0000FF; font-weight:bold;">false</span> <span style="color:#9966CC; font-weight:bold;">unless</span> column_index
    <span style="color:#0066ff; font-weight:bold;">@matrix</span><span style="color:#006600; font-weight:bold;">&#91;</span>row<span style="color:#006600; font-weight:bold;">-</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#91;</span>column_index<span style="color:#006600; font-weight:bold;">&#93;</span> == <span style="color:#996600;">&quot;T&quot;</span>;
  <span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
  <span style="color:#9966CC; font-weight:bold;">def</span> derive_subset_for<span style="color:#006600; font-weight:bold;">&#40;</span>reference_value<span style="color:#006600; font-weight:bold;">&#41;</span>
    subset = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006600; font-weight:bold;">&#93;</span>
    column_index = <span style="color:#0066ff; font-weight:bold;">@column_value_to_index</span><span style="color:#006600; font-weight:bold;">&#91;</span>reference_value<span style="color:#006600; font-weight:bold;">&#93;</span>
    <span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006666;">1</span>...@matrix.<span style="color:#9900CC;">size</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">to_a</span>.<span style="color:#9900CC;">reverse</span>.<span style="color:#9900CC;">each</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>row<span style="color:#006600; font-weight:bold;">|</span>
      <span style="color:#9966CC; font-weight:bold;">if</span> <span style="color:#0066ff; font-weight:bold;">@matrix</span><span style="color:#006600; font-weight:bold;">&#91;</span>row<span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#91;</span>column_index<span style="color:#006600; font-weight:bold;">&#93;</span> == <span style="color:#996600;">&quot;F&quot;</span>;
        <span style="color:#0000FF; font-weight:bold;">return</span> subset
      <span style="color:#9966CC; font-weight:bold;">elsif</span> <span style="color:#0066ff; font-weight:bold;">@matrix</span><span style="color:#006600; font-weight:bold;">&#91;</span>row<span style="color:#006600; font-weight:bold;">-</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#91;</span>column_index<span style="color:#006600; font-weight:bold;">&#93;</span> == <span style="color:#996600;">&quot;T&quot;</span>;
        <span style="color:#9966CC; font-weight:bold;">next</span>
      <span style="color:#9966CC; font-weight:bold;">else</span>
        array_value = @<span style="color:#CC0066; font-weight:bold;">array</span><span style="color:#006600; font-weight:bold;">&#91;</span>row <span style="color:#006600; font-weight:bold;">-</span> <span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#008000; font-style:italic;"># the -1 is to account for the fact that our rows are 1 larger than indexes of input array due to row 0 in matrix being header</span>
        subset.<span style="color:#9900CC;">insert</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006666;">0</span>, array_value<span style="color:#006600; font-weight:bold;">&#41;</span>
        column_index = <span style="color:#0066ff; font-weight:bold;">@column_value_to_index</span><span style="color:#006600; font-weight:bold;">&#91;</span>@matrix<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#91;</span>column_index<span style="color:#006600; font-weight:bold;">&#93;</span> <span style="color:#006600; font-weight:bold;">-</span> array_value<span style="color:#006600; font-weight:bold;">&#93;</span>
      <span style="color:#9966CC; font-weight:bold;">end</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
    subset
  <span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>You can recognise the 3 patterns being applied in the &#39;<em>populate</em>&#39; method. We&#39;re, of course, missing the code for instantiating the matrix in the first place. Grab the whole thing <a
href="https://gist.github.com/845251" target="_blank">from the gist</a> and give it a run, it generates random inputs of size 50 with values between -1000 and 1000. And if you think that would produce quite a large matrix, you would be right :) (<em>50 rows and about 25000 columns give or take a few thousand</em>). But <strong>even with input size 100 it only takes a couple of seconds to get an answer, which is MUCH better than the exponential time algorithm</strong>; in my book that equals success. Dropbox Challenge 3 &#8211; solved (<em>more or less :)</em>)!</p><p>By the way if you want to print out a few more matrices, grab the code and uncomment the relevant line (102) and you&#39;ll get a matrix similar to those above along with the rest of the output. Obviously, if you&#39;re doing that, make sure your input size is small enough for the matrix to actually fit on the screen. I used the great <a
href="https://github.com/visionmedia/terminal-table" target="_blank">terminal-table gem</a> to produce the nice ASCII tables.</p><p>Lastly, if you&#39;re wondering what framework this is:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">if</span> ENV<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#996600;">&quot;attest&quot;</span><span style="color:#006600; font-weight:bold;">&#93;</span>
  this_tests <span style="color:#996600;">&quot;generating subset sums using dynamic programming&quot;</span> <span style="color:#9966CC; font-weight:bold;">do</span>
    test<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;subset should be [1,-3,2]&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#9966CC; font-weight:bold;">do</span>
      actual_subset_sum = subset_sum_dynamic<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">1</span>, <span style="color:#006600; font-weight:bold;">-</span><span style="color:#006666;">3</span>, <span style="color:#006666;">2</span>, <span style="color:#006666;">4</span><span style="color:#006600; font-weight:bold;">&#93;</span>, <span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#41;</span>
      should_equal<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">1</span>,<span style="color:#006600; font-weight:bold;">-</span><span style="color:#006666;">3</span>,<span style="color:#006666;">2</span><span style="color:#006600; font-weight:bold;">&#93;</span>, actual_subset_sum<span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
    ...
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>That would be me <a
href="https://github.com/skorks/attest" target="_blank">eating my own dog food</a>, I took the time to write it, might as well use it :).</p><p>By the way, it took me hours (<em>pretty much the better part of a day</em>) to get all of this stuff working properly, dynamic programming algorithms really are fiddly little beasts. But, I had some fun, and got some good practice and learning out of it &#8211; time well spent (<em>and now there is some decent subset sum code on the internet :P</em>). Of course once I finished with this I had to look at the other challenges, number 2 didn&#39;t really catch my attention, but I couldn&#39;t walk away from number 1 with its ASCII boxes and bin packing goodness &#8211; I&#39;ll <a
href="http://feeds.feedburner.com/softwaretechandmore">write that one up some other time</a>.</p><p><span
style="font-size: 10px; font-family: trebuchet ms;">Images by <a
href="http://www.flickr.com/photos/trainor/451799414/" target="_blank">johntrainor</a>, <a
href="http://www.flickr.com/photos/infinitewhite/3982659473/" target="_blank">infinitewhite</a> and <a
href="http://www.flickr.com/photos/familymwr/4928456758/" target="_blank">familymwr</a></span></p><p><strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2009/07/effective-vs-ineffective-pair-programming/' rel='bookmark' title='Effective vs Ineffective Pair Programming'>Effective vs Ineffective Pair Programming</a></li><li><a
href='http://www.skorks.com/2010/05/sort-files-like-a-master-with-the-linux-sort-command-bash/' rel='bookmark' title='Sort Files Like A Master With The Linux Sort Command (Bash)'>Sort Files Like A Master With The Linux Sort Command (Bash)</a></li><li><a
href='http://www.skorks.com/2010/03/how-to-answer-a-programming-interview-question-and-look-good-doing-it/' rel='bookmark' title='How To Answer A Programming Interview Question And Look Good Doing It'>How To Answer A Programming Interview Question And Look Good Doing It</a></li></ol></p><div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=v8-eNLi7Y3Q:ysbTzT7gM8o:I9og5sOYxJI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=I9og5sOYxJI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=v8-eNLi7Y3Q:ysbTzT7gM8o:cGdyc7Q-1BI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=cGdyc7Q-1BI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=v8-eNLi7Y3Q:ysbTzT7gM8o:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=v8-eNLi7Y3Q:ysbTzT7gM8o:gIN9vFwOqvQ" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=v8-eNLi7Y3Q:ysbTzT7gM8o:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=v8-eNLi7Y3Q:ysbTzT7gM8o:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=v8-eNLi7Y3Q:ysbTzT7gM8o:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=v8-eNLi7Y3Q:ysbTzT7gM8o:F7zBnMyn0Lo" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/softwaretechandmore/~4/v8-eNLi7Y3Q" height="1" width="1"/>]]></content:encoded> <wfw:commentRss>http://www.skorks.com/2011/02/algorithms-a-dropbox-challenge-and-dynamic-programming/feed/</wfw:commentRss> <slash:comments>38</slash:comments> </item> <item><title>A Unit Testing Framework In 44 Lines Of Ruby</title><link>http://www.skorks.com/2011/02/a-unit-testing-framework-in-44-lines-of-ruby/</link> <comments>http://www.skorks.com/2011/02/a-unit-testing-framework-in-44-lines-of-ruby/#comments</comments> <pubDate>Wed, 16 Feb 2011 11:58:07 +0000</pubDate> <dc:creator>Alan Skorkin</dc:creator> <category><![CDATA[coding]]></category> <category><![CDATA[ruby]]></category> <category><![CDATA[ruby unit testing]]></category> <guid isPermaLink="false">http://www.skorks.com/?p=1942</guid> <description><![CDATA[Late last year I attended some workshops which were being run as part of the YOW Melbourne developer conference. Since the workshops were run by @coreyhaines and @jbrains, TDD was a prominent part. Normally this would not be an issue, but in a spectacular display of fail (considering it was a developer conference in 2010), [...]
<strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2009/09/true-false-and-nil-objects-in-ruby/' rel='bookmark' title='True, False And Nil Objects In Ruby'>True, False And Nil Objects In Ruby</a></li><li><a
href='http://www.skorks.com/2009/09/ruby-equality-and-object-comparison/' rel='bookmark' title='Ruby Equality And Object Comparison'>Ruby Equality And Object Comparison</a></li><li><a
href='http://www.skorks.com/2009/08/more-advanced-ruby-method-arguments-hashes-and-blocks/' rel='bookmark' title='More Advanced Ruby Method Arguments &#8211; Hashes And Block Basics'>More Advanced Ruby Method Arguments &#8211; Hashes And Block Basics</a></li></ol>]]></description> <content:encoded><![CDATA[<p></p><p><img
align="left" alt="Testing" class="alignleft size-medium wp-image-1946" height="250" hspace="20" src="http://www.skorks.com/wp-content/uploads/2011/02/testing-300x225.jpg" style="width: 334px; height: 250px;" title="Testing" vspace="5" width="334" />Late last year I attended some workshops which were being run as part of the <a
href="http://yowaustralia.com.au/melbourne/index.html" target="_blank">YOW Melbourne developer conference</a>. Since the workshops were run by <a
href="http://twitter.com/coreyhaines" target="_blank">@coreyhaines</a> and <a
href="http://twitter.com/jbrains" target="_blank">@jbrains</a>, TDD was a prominent part. Normally this would not be an issue, but in a spectacular display of fail (<em>considering it was a developer conference in 2010</em>), the internet was hard to come by, which left me and my freshly installed Linux laptop without the ability to acquire <a
href="http://relishapp.com/rspec" target="_blank">Rspec</a>. Luckily a few weeks before, I decided to write <a
href="https://github.com/skorks/attest" target="_blank">a unit testing framework of my very own</a> (<em>just because I could :)</em>) and so I had a reasonably fresh copy of that code lying around &#8211; problem solved. But, it got me thinking, <strong>what is the minimum amount of code needed to make a viable unit testing framework</strong>?</p><h2>A Minimum Viable Unit Test</h2><p>I had some minimal code when I first toyed with writing a unit testing framework, but then I went and ruined it by adding a bunch of features :), fortunately it is pretty easy to recreate. What we really want is the ability to execute the following code:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;">describe <span style="color:#996600;">&quot;some test&quot;</span> <span style="color:#9966CC; font-weight:bold;">do</span>
  it <span style="color:#996600;">&quot;should be true&quot;</span> <span style="color:#9966CC; font-weight:bold;">do</span>
    <span style="color:#0000FF; font-weight:bold;">true</span>.<span style="color:#9900CC;">should</span> == <span style="color:#0000FF; font-weight:bold;">true</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
  it <span style="color:#996600;">&quot;should show that an expression can be true&quot;</span> <span style="color:#9966CC; font-weight:bold;">do</span>
    <span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006666;">5</span> == <span style="color:#006666;">5</span><span style="color:#006600; font-weight:bold;">&#41;</span>.<span style="color:#9900CC;">should</span> == <span style="color:#0000FF; font-weight:bold;">true</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
&nbsp;
  it <span style="color:#996600;">&quot;should be failing deliberately&quot;</span> <span style="color:#9966CC; font-weight:bold;">do</span>
    <span style="color:#006666;">5</span>.<span style="color:#9900CC;">should</span> == <span style="color:#006666;">6</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>As you can see it looks very much like a basic Rspec test, let&#39;s try and write some code to run this.</p><h2>Building A Basic Framework</h2><p>The first thing we&#39;re going to need is the ability to define a new test using &quot;<em>describe</em>&quot;. Since we want to be able to put a &quot;<em>describe</em>&quot; block anywhere (<em>e.g. its own file</em>), we&#39;re going to have to augment Ruby a little bit. <strong>The &quot;<em>puts</em>&quot; method lives in the Kernel module and is therefore available anywhere (<em>since the Object class includes Kernel and every object in Ruby inherits from Object</em>), we will put describe inside Kernel to give it the same ability</strong>:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">module</span> <span style="color:#CC00FF; font-weight:bold;">Kernel</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> describe<span style="color:#006600; font-weight:bold;">&#40;</span>description, <span style="color:#006600; font-weight:bold;">&amp;</span>block<span style="color:#006600; font-weight:bold;">&#41;</span>
    tests = Dsl.<span style="color:#9900CC;">new</span>.<span style="color:#9900CC;">parse</span><span style="color:#006600; font-weight:bold;">&#40;</span>description, block<span style="color:#006600; font-weight:bold;">&#41;</span>
    tests.<span style="color:#9900CC;">execute</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>As you can see, &quot;<em>describe</em>&quot; takes a description string and a block that will contain the rest of our test code. At this point, we want to pull apart the block that we&#39;re passing to &quot;<em>describe</em>&quot; to separate it into individual examples (<em>i.e. &quot;it&quot; blocks</em>). For this we create a class called <strong>Dsl </strong>and pass our block to its parse method, this will produce an object that will allow us to execute all our test, but let&#39;s not get ahead of ourselves. Our <strong>Dsl </strong>class looks like this:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">class</span> Dsl
  <span style="color:#9966CC; font-weight:bold;">def</span> initialize
    <span style="color:#0066ff; font-weight:bold;">@tests</span> = <span style="color:#006600; font-weight:bold;">&#123;</span><span style="color:#006600; font-weight:bold;">&#125;</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> parse<span style="color:#006600; font-weight:bold;">&#40;</span>description, block<span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#0000FF; font-weight:bold;">self</span>.<span style="color:#9900CC;">instance_eval</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&amp;</span>block<span style="color:#006600; font-weight:bold;">&#41;</span>
    Executor.<span style="color:#9900CC;">new</span><span style="color:#006600; font-weight:bold;">&#40;</span>description, <span style="color:#0066ff; font-weight:bold;">@tests</span><span style="color:#006600; font-weight:bold;">&#41;</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> it<span style="color:#006600; font-weight:bold;">&#40;</span>description, <span style="color:#006600; font-weight:bold;">&amp;</span>block<span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#0066ff; font-weight:bold;">@tests</span><span style="color:#006600; font-weight:bold;">&#91;</span>description<span style="color:#006600; font-weight:bold;">&#93;</span> = block
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>What we do here is evaluate our block in the context of the current Dsl object:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#0000FF; font-weight:bold;">self</span>.<span style="color:#9900CC;">instance_eval</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&amp;</span>block<span style="color:#006600; font-weight:bold;">&#41;</span></pre></div></div><p>Our <strong>Dsl </strong>object has an &quot;<em>it</em>&quot; method which also takes a description and a block and since that is exactly what our describe block contains everything works well (<em>i.e. we&#39;re essentially making several method calls to the &quot;it&quot; method each time passing in a description and a block</em>). <strong>We could define other methods on our Dsl object and those would become part of the &quot;<em>language</em>&quot; which will be available to us in the &quot;<em>describe</em>&quot; block</strong>.</p><p>Our &quot;<em>it</em>&quot; method will be called once for every &quot;<em>it</em>&quot; block in the describe block, every time that happens we simply take the block that was passed in and store it in hash keyed on the description. When we&#39;re done, we simply create a new <strong>Executor </strong>object which we will use to iterate over our test blocks, call them and produce some results. The executor looks like this:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">class</span> Executor
  <span style="color:#9966CC; font-weight:bold;">def</span> initialize<span style="color:#006600; font-weight:bold;">&#40;</span>description, tests<span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#0066ff; font-weight:bold;">@description</span> = description
    <span style="color:#0066ff; font-weight:bold;">@tests</span> = tests
    <span style="color:#0066ff; font-weight:bold;">@success_count</span> = <span style="color:#006666;">0</span>
    <span style="color:#0066ff; font-weight:bold;">@failure_count</span> = <span style="color:#006666;">0</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> execute
    <span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;#{@description}&quot;</span>
    <span style="color:#0066ff; font-weight:bold;">@tests</span>.<span style="color:#9900CC;">each_pair</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>name, block<span style="color:#006600; font-weight:bold;">|</span>
      <span style="color:#CC0066; font-weight:bold;">print</span> <span style="color:#996600;">&quot; - #{name}&quot;</span>
      result = <span style="color:#0000FF; font-weight:bold;">self</span>.<span style="color:#9900CC;">instance_eval</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&amp;</span>block<span style="color:#006600; font-weight:bold;">&#41;</span>
      result ? <span style="color:#0066ff; font-weight:bold;">@success_count</span> <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#006666;">1</span> : <span style="color:#0066ff; font-weight:bold;">@failure_count</span> <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#006666;">1</span>
      <span style="color:#CC0066; font-weight:bold;">puts</span> result ? <span style="color:#996600;">&quot; SUCCESS&quot;</span> : <span style="color:#996600;">&quot; FAILURE&quot;</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
    summary
  <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> summary
    <span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;<span style="color:#000099;">\n</span>#{@tests.keys.size} tests, #{@success_count} success, #{@failure_count} failure&quot;</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>Our executor code is reasonably simple. We print out the description of our &quot;<em>describe</em>&quot; block we then go through all the &quot;<em>it</em>&quot; blocks we have stored and evaluate them in the context of the executor object. In our case there is no special reason for this, but it does mean that the executor object can also be a container for some methods that can be used as a &quot;<em>language</em>&quot; inside &quot;<em>it</em>&quot; blocks (<em>i.e. part of our dsl can be defined as method on the executor</em>). For example, we could define the following method on our executor:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">def</span> should_be_five<span style="color:#006600; font-weight:bold;">&#40;</span>x<span style="color:#006600; font-weight:bold;">&#41;</span>
  <span style="color:#006666;">5</span> == x
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>This method would then be available for us to use inside our &quot;<em>it</em>&quot; blocks, but for our simple test it is not necessary.</p><p>So, we evaluate our &quot;<em>it</em>&quot; blocks and store the result, which is simply going to be the return value of the last statement in the &quot;<em>it</em>&quot; block (<em>as per regular Ruby</em>). <strong>In our case we want to make sure that the last statement always returns a boolean value (<em>to indicate success or failure of the test</em>), we can then use it to produce some meaningful output</strong>.</p><p>We are missing one piece though, the &quot;<em>should</em>&quot; method, we had code like:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#0000FF; font-weight:bold;">true</span>.<span style="color:#9900CC;">should</span> == <span style="color:#0000FF; font-weight:bold;">true</span>
<span style="color:#006666;">5</span>.<span style="color:#9900CC;">should</span> == <span style="color:#006666;">5</span></pre></div></div><p>It seems that every object has the &quot;<em>should</em>&quot; method available to it and that&#39;s exactly how it works:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">class</span> <span style="color:#CC00FF; font-weight:bold;">Object</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> should
    <span style="color:#0000FF; font-weight:bold;">self</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>The method doesn&#39;t really DO anything (<em>just returns the object</em>); it simply acts as syntactic sugar to make our tests read a bit better.</p><p>At this stage, we just take the result of evaluating the test, turn it into a string to indicate success or failure and print that out. <strong>Along the way we keep track of the number of successes or failures so that we can produce a summary report in the end</strong>. That&#39;s all the code we need, if we put it all together, we get the following 44 lines:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">module</span> <span style="color:#CC00FF; font-weight:bold;">Kernel</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> describe<span style="color:#006600; font-weight:bold;">&#40;</span>description, <span style="color:#006600; font-weight:bold;">&amp;</span>block<span style="color:#006600; font-weight:bold;">&#41;</span>
    tests = Dsl.<span style="color:#9900CC;">new</span>.<span style="color:#9900CC;">parse</span><span style="color:#006600; font-weight:bold;">&#40;</span>description, block<span style="color:#006600; font-weight:bold;">&#41;</span>
    tests.<span style="color:#9900CC;">execute</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">class</span> <span style="color:#CC00FF; font-weight:bold;">Object</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> should
    <span style="color:#0000FF; font-weight:bold;">self</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">class</span> Dsl
  <span style="color:#9966CC; font-weight:bold;">def</span> initialize
    <span style="color:#0066ff; font-weight:bold;">@tests</span> = <span style="color:#006600; font-weight:bold;">&#123;</span><span style="color:#006600; font-weight:bold;">&#125;</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> parse<span style="color:#006600; font-weight:bold;">&#40;</span>description, block<span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#0000FF; font-weight:bold;">self</span>.<span style="color:#9900CC;">instance_eval</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&amp;</span>block<span style="color:#006600; font-weight:bold;">&#41;</span>
    Executor.<span style="color:#9900CC;">new</span><span style="color:#006600; font-weight:bold;">&#40;</span>description, <span style="color:#0066ff; font-weight:bold;">@tests</span><span style="color:#006600; font-weight:bold;">&#41;</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> it<span style="color:#006600; font-weight:bold;">&#40;</span>description, <span style="color:#006600; font-weight:bold;">&amp;</span>block<span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#0066ff; font-weight:bold;">@tests</span><span style="color:#006600; font-weight:bold;">&#91;</span>description<span style="color:#006600; font-weight:bold;">&#93;</span> = block
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">class</span> Executor
  <span style="color:#9966CC; font-weight:bold;">def</span> initialize<span style="color:#006600; font-weight:bold;">&#40;</span>description, tests<span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#0066ff; font-weight:bold;">@description</span> = description
    <span style="color:#0066ff; font-weight:bold;">@tests</span> = tests
    <span style="color:#0066ff; font-weight:bold;">@success_count</span> = <span style="color:#006666;">0</span>
    <span style="color:#0066ff; font-weight:bold;">@failure_count</span> = <span style="color:#006666;">0</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> execute
    <span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;#{@description}&quot;</span>
    <span style="color:#0066ff; font-weight:bold;">@tests</span>.<span style="color:#9900CC;">each_pair</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>name, block<span style="color:#006600; font-weight:bold;">|</span>
      <span style="color:#CC0066; font-weight:bold;">print</span> <span style="color:#996600;">&quot; - #{name}&quot;</span>
      result = <span style="color:#0000FF; font-weight:bold;">self</span>.<span style="color:#9900CC;">instance_eval</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&amp;</span>block<span style="color:#006600; font-weight:bold;">&#41;</span>
      result ? <span style="color:#0066ff; font-weight:bold;">@success_count</span> <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#006666;">1</span> : <span style="color:#0066ff; font-weight:bold;">@failure_count</span> <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#006666;">1</span>
      <span style="color:#CC0066; font-weight:bold;">puts</span> result ? <span style="color:#996600;">&quot; SUCCESS&quot;</span> : <span style="color:#996600;">&quot; FAILURE&quot;</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
    summary
  <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">def</span> summary
    <span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;<span style="color:#000099;">\n</span>#{@tests.keys.size} tests, #{@success_count} success, #{@failure_count} failure&quot;</span>
  <span style="color:#9966CC; font-weight:bold;">end</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>If we &quot;<em>require</em>&quot; this code and execute our original sample test, we get the following output:</p><pre>some test
 - should be true SUCCESS
 - should show that an expression can be true SUCCESS
 - should be failing deliberately FAILURE
3 tests, 2 success, 1 failure
</pre><p>Nifty! Now, if you&#39;re ever stuck without a unit testing framework and you don&#39;t want to go cowboy, just spend 5 minutes and you can whip up something reasonably viable to tide you over. I am, of course, exaggerating slightly; you will quickly start to miss all the extra expectations, better output, mocking and stubbing etc. However <strong>we can easily enhance our little framework with some of those features (<em>e.g. adding extra DSL elements</em>) &#8211; the development effort is surprisingly small</strong>. If you don&#39;t believe me, have a look at <a
href="https://github.com/chneukirchen/bacon" target="_blank">bacon</a> it is only a couple of hundred lines and is a reasonably complete little Rspec clone. <a
href="https://github.com/skorks/attest" target="_blank">Attest &#8211; the framework that I wrote</a> is another decent example (<em>even if I do say so myself :P</em>). Both of them do still miss any sort of built-in <a
href="http://xunitpatterns.com/Test%20Double.html" target="_blank">test double</a> support, but adding that is a <a
href="http://feeds.feedburner.com/softwaretechandmore">story for another time</a>.</p><p><span
style="font-size: 10px; font-family: trebuchet ms;">Image by <a
href="http://www.flickr.com/photos/jepoirrier/310155894/" target="_blank">jepoirrier</a></span></p><p><strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2009/09/true-false-and-nil-objects-in-ruby/' rel='bookmark' title='True, False And Nil Objects In Ruby'>True, False And Nil Objects In Ruby</a></li><li><a
href='http://www.skorks.com/2009/09/ruby-equality-and-object-comparison/' rel='bookmark' title='Ruby Equality And Object Comparison'>Ruby Equality And Object Comparison</a></li><li><a
href='http://www.skorks.com/2009/08/more-advanced-ruby-method-arguments-hashes-and-blocks/' rel='bookmark' title='More Advanced Ruby Method Arguments &#8211; Hashes And Block Basics'>More Advanced Ruby Method Arguments &#8211; Hashes And Block Basics</a></li></ol></p><div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=RGLEKjSPHz8:Rb0ksGME4vg:I9og5sOYxJI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=I9og5sOYxJI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=RGLEKjSPHz8:Rb0ksGME4vg:cGdyc7Q-1BI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=cGdyc7Q-1BI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=RGLEKjSPHz8:Rb0ksGME4vg:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=RGLEKjSPHz8:Rb0ksGME4vg:gIN9vFwOqvQ" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=RGLEKjSPHz8:Rb0ksGME4vg:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=RGLEKjSPHz8:Rb0ksGME4vg:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=RGLEKjSPHz8:Rb0ksGME4vg:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=RGLEKjSPHz8:Rb0ksGME4vg:F7zBnMyn0Lo" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/softwaretechandmore/~4/RGLEKjSPHz8" height="1" width="1"/>]]></content:encoded> <wfw:commentRss>http://www.skorks.com/2011/02/a-unit-testing-framework-in-44-lines-of-ruby/feed/</wfw:commentRss> <slash:comments>20</slash:comments> </item> <item><title>The Greatest Developer Fallacy Or The Wisest Words You’ll Ever Hear?</title><link>http://www.skorks.com/2011/02/the-greatest-developer-fallacy-or-the-wisest-words-youll-ever-hear/</link> <comments>http://www.skorks.com/2011/02/the-greatest-developer-fallacy-or-the-wisest-words-youll-ever-hear/#comments</comments> <pubDate>Mon, 07 Feb 2011 11:42:37 +0000</pubDate> <dc:creator>Alan Skorkin</dc:creator> <category><![CDATA[Developers]]></category> <category><![CDATA[developer wisdom]]></category> <category><![CDATA[expertise]]></category> <guid isPermaLink="false">http://www.skorks.com/?p=1931</guid> <description><![CDATA[&#34;I will learn it when I need it&#34;! I&#39;ve heard that phrase a lot over the years; it seems like a highly pragmatic attitude to foster when you&#39;re in an industry as fast-paced as software development. On some level it actually IS quite pragmatic, but on another level I am annoyed by the phrase. It [...]
<strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2010/04/software-development-and-the-sunk-cost-fallacy/' rel='bookmark' title='Software Development And The Sunk Cost Fallacy'>Software Development And The Sunk Cost Fallacy</a></li><li><a
href='http://www.skorks.com/2008/10/here-are-some-words-that-rhyme-with-orange/' rel='bookmark' title='Here Are Some Words That Rhyme With Orange!'>Here Are Some Words That Rhyme With Orange!</a></li><li><a
href='http://www.skorks.com/2010/03/you-dont-need-math-skills-to-be-a-good-developer-but-you-do-need-them-to-be-a-great-one/' rel='bookmark' title='You Don&#8217;t Need Math Skills To Be A Good Developer But You Do Need Them To Be A Great One'>You Don&#8217;t Need Math Skills To Be A Good Developer But You Do Need Them To Be A Great One</a></li></ol>]]></description> <content:encoded><![CDATA[<p></p><p><img
align="left" alt="Wisdom" class="alignleft size-medium wp-image-1936" height="349" hspace="20" src="http://www.skorks.com/wp-content/uploads/2011/02/wisdom-225x300.jpg" style="width: 262px; height: 349px;" title="Wisdom" vspace="5" width="262" />&quot;<em>I will learn it when I need it</em>&quot;! I&#39;ve heard that phrase a lot over the years; it seems like a highly pragmatic attitude to foster when you&#39;re in an industry as fast-paced as software development. On some level it actually IS quite pragmatic, but on another level I am annoyed by the phrase. It has become a mantra for our whole industry which hasn&#39;t changed said industry for the better. The problem is this, <strong>in the guise of sounding like a wise and practical developer, people use it as an excuse to coast</strong>. There is too much stuff to know, it is necessary to be able to pick certain things up as you go along &#8211; part of the job. But, there is a difference between having to &quot;pick up&quot; some knowledge as you go along and doing absolutely everything just-in-time.</p><p>The whole industry has become a bunch of generalists, maybe it has always been this way, I just wasn&#39;t around to see it, either way I don&#39;t like it. Noone wants to invest the time to learn anything really deeply, not <a
href="http://www.skorks.com/2010/04/on-the-value-of-fundamentals-in-software-development/" target="_blank">computer science fundamentals</a>, not the latest tech you&#39;re working with, not even the <a
href="http://blog.tmorris.net/java-is-pass-by-value/" target="_blank">language you&#39;ve been coding in every day, for the last few years</a>. Why bother, it will be replaced, superseded, marginalised and out of fashion before you&#39;re half way done. I&#39;ve discussed this with various people many times, but noone seems to really see it as a problem. &quot;<em>Just being pragmatic dude</em>&quot;. In the meantime we&#39;ve all become clones of each other. You want a Java developer, I am a Java developer, you&#39;re a Java developer, my neighbour is a Java developer. What differentiates us from each other &#8211; not much! Well, I&#39;ve got some jQuery experience. That&#39;s great, so you know how to build accordion menu then? Sure, I Google it and steal the best code I find :). In the meantime, if you need to hire a REAL expert (<em>in anything, maybe you&#39;re writing a fancy parser or need to visualise some big data</em>), I hope you&#39;ve stocked up on beer and sandwiches cause you&#39;re gonna be here a while.</p><p>Ok, there are ways to differentiate yourself, I have better communication skills, which is why I do better. That&#39;s important too, but, <strong>developers differentiating themselves based on soft skills rather than developer skills &#8211; seems a bit twisted</strong>. We all communicate really well but the code is a mess :). Hell, I shouldn&#39;t really talk, I am a bit of a generalist too. Of course I&#39;d like to think of myself as a <a
href="http://darrennegraeff.com/the-importance-of-t-shaped-individuals/" target="_blank">T-shaped individual</a>, but if we&#39;re completely honest, it&#39;s more of a dash-shaped or underscore-shaped with maybe a few bumps :). To the uninitiated those bumps might look like big giant stalactites &#8211; T-shaped indeed. <strong>You seem like an expert without ever being an expert</strong>, just one advantage of being in a sea of generalists.</p><h2>Investing In Your Future</h2><p>I don&#39;t want to preach about how we should all be investing in our professional future, everybody knows we should be. Most people probably think they are infact investing, they rock up to work, write a lot of code maybe even do some reading on the side, surely that must make them an <a
href="http://norvig.com/21-days.html" target="_blank">expert in about 10 years</a>, and a senior expert in 20 (<em>I keep meaning to write more about this, one day <a
href="http://feeds.feedburner.com/softwaretechandmore">I&#39;ll get around to it</a> :)</em>)? But, if that was the way, every old person would be an expert in a whole bunch of stuff and that is emphatically not the case. Maybe it is just that people don&#39;t know how to build expertise (<em>there is an element of truth to this</em>), but I have a sneaking suspicion that <strong>it&#39;s more about lack of desire rather than lack of knowledge</strong>. What was that saying about the will and the way &#8211; totally applicable in this case?</p><p>I&#39;ve gone completely off-track. &quot;<em>Investing in professional future</em>&quot; is just one of those buzzword things, the mantra is &quot;<em>I will learn it when I need it</em>&quot;. It was good enough for my daddy and it has served me well so far. Let&#39;s apply this thinking to finance, &quot;<em>I will invest my money when I think I need the money</em>&quot;. Somehow it doesn&#39;t quite have the same kind of pragmatic ring to it.</p><h2>You Don&#39;t Know What You Don&#39;t Know</h2><p>We&#39;ve all had those moments where you&#39;re going through major pain trying to solve a problem until someone comes along and tells you about algorithm X or technology Y and it makes everything fast and simple. It was lucky that person just happened to be there to show you the &quot;<em>easy</em>&quot; way, otherwise you would have spent days/weeks trying to figure it out and it would have been a mess. You can&#39;t be blamed for this though, you don&#39;t know what you don&#39;t know. For me, this is where the &quot;<em>I will learn it when I need it</em>&quot; mentality falls over. <strong>You can&#39;t learn something if you don&#39;t know it exists</strong>. Google goes a long way towards mitigating this problem, but not all the way. There are plenty of problems you will encounter in the wild where you can beat your head against the wall ad infinitum unless you know what class of problem you&#39;re looking at (<em>e.g. if you know a bit about searching and constraint propagation, <a
href="http://norvig.com/sudoku.html" target="_blank">solving sudoku is easy</a>, otherwise <a
href="http://xprogramming.com/xpmag/OkSudoku" target="_blank">it&#39;s</a> <a
href="http://xprogramming.com/xpmag/Sudoku2" target="_blank">really</a> <a
href="http://xprogramming.com/xpmag/SudokuMusings" target="_blank">quite</a> <a
href="http://xprogramming.com/xpmag/Sudoku4" target="_blank">hard</a></em>). You can&#39;t learn about an algorithm if you&#39;re not aware of it or its applicability. You can&#39;t utilise a technology to solve a problem if you don&#39;t even realise it has that capability. You&#39;re not going to always have someone there to point you in the right direction. I am willing to bet <strong>there is a billion lines of code out there right now which can be replaced with a million lines of faster, cleaner, better code simply because whoever wrote it didn&#39;t know what they didn&#39;t know</strong>.</p><p>I seem to be making a case for the opposite side here, if knowing what you don&#39;t know is the ticket then surely we should be focusing on breadth of knowledge. Superficial awareness of as much stuff as possible should see us through, we&#39;ll be able to recognise the problems when we see them and then learn what we need more deeply. Except it doesn&#39;t work like that, <strong>skimming subjects doesn&#39;t allow you to retain anything</strong>, our brain doesn&#39;t work that way. If we don&#39;t reinforce and dig deeper into the concepts we quickly <a
href="http://www.skorks.com/2009/09/become-a-better-developer-by-indexing-your-brain/" target="_blank">page that information out as unimportant</a>, it is a waste of time (<em>think back to cramming for exams, how much do you remember the next day?</em>). However if you focus on building deeper understanding of a subject &#8211; in an interesting twist &#8211; you will gain broad knowledge as well (<em>which you will actually be able to retain</em>). My grandad is a nuclear physicist, several decades of working to gain deeper knowledge of the subject has made him an expert, but it has also made him an excellent mathematician, a decent chemist, a pretty good geologist, a fair biologist etc. Just some <a
href="http://en.wikipedia.org/wiki/Empirical" target="_blank">empirical evidence</a> that seeking depth leads to breadth as a side-effect.</p><h2>Can You Learn It Fast Enough</h2><p
style="text-align: center;"><img
align="middle" alt="Learn fast" class="aligncenter size-medium wp-image-1938" height="217" src="http://www.skorks.com/wp-content/uploads/2011/02/learn-fast-300x199.jpg" style="width: 328px; height: 217px;" title="Learn fast" vspace="5" width="328" /></p><p>Some stuff just takes a long time to learn. I am confident I can pick up an ORM framework I haven&#39;t seen before without even breaking stride, I&#39;ve used them before, the concepts are the same. But what if you need to do some speech to text conversion, not quite as simple, not enough background. Hopefully Google will have something for us to copy/paste. That was a bad example, only research boffins at universities need to do that crap. How about building a website then, we all know how to do that, but what if you need to do it for 10 million users a day. We just need to learn everything about scaling, <strong>I am sure the users will wait a month or two for us to get up to speed :)</strong>. Yeah, I am just being stupid, all we need to do is hire an expert and &#8230; errr &#8230; oh wait, we&#39;re all out of beer and sandwiches.</p><h2>Why Should I Care</h2><p><strong>Working with experts is freaking awesome</strong>. You may have experienced it before, everything they say is something new and interesting, you learn new tricks with every line of code, you can almost feel your brain expanding :). You want to learn from the experts, so it&#39;s really sad when you can&#39;t find any. Since everyone is only learning when they &quot;<em>need it</em>&quot;, noone can teach anything to anyone. The chunk of wisdom here is this, you want to work with experts, but the experts also want to work with experts, so <strong>what are you doing to make sure the experts want to work with you</strong>? Being able to learn something when you need it is a good skill to have, but you can not let it be your philosophy as a developer. Yes it is a big industry you can&#39;t learn everything, so pick something and make sure you know it backwards, if you&#39;re curious enough to follow up on the interesting bits, you&#39;ll find you have a decent grasp of a lot of other stuff at the end. And if you do a good enough job, other super-awesome-smart people are going to want to come and hang around you cause they&#39;ll be able to learn something from you and you&#39;ll be able to learn much from them. Everybody will be a winner.</p><p><span
style="font-size: 10px; font-family: trebuchet ms;">Image by <a
href="http://www.flickr.com/photos/samueleghilardi/2971657900/" target="_blank">SamueleGhilardi</a> and <a
href="http://www.flickr.com/photos/specialkrb/3250756763/">SpecialKRB</a></span></p><p><strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2010/04/software-development-and-the-sunk-cost-fallacy/' rel='bookmark' title='Software Development And The Sunk Cost Fallacy'>Software Development And The Sunk Cost Fallacy</a></li><li><a
href='http://www.skorks.com/2008/10/here-are-some-words-that-rhyme-with-orange/' rel='bookmark' title='Here Are Some Words That Rhyme With Orange!'>Here Are Some Words That Rhyme With Orange!</a></li><li><a
href='http://www.skorks.com/2010/03/you-dont-need-math-skills-to-be-a-good-developer-but-you-do-need-them-to-be-a-great-one/' rel='bookmark' title='You Don&#8217;t Need Math Skills To Be A Good Developer But You Do Need Them To Be A Great One'>You Don&#8217;t Need Math Skills To Be A Good Developer But You Do Need Them To Be A Great One</a></li></ol></p><div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=mEqSj_DD-H4:lKX8QL5rY9o:I9og5sOYxJI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=I9og5sOYxJI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=mEqSj_DD-H4:lKX8QL5rY9o:cGdyc7Q-1BI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=cGdyc7Q-1BI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=mEqSj_DD-H4:lKX8QL5rY9o:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=mEqSj_DD-H4:lKX8QL5rY9o:gIN9vFwOqvQ" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=mEqSj_DD-H4:lKX8QL5rY9o:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=mEqSj_DD-H4:lKX8QL5rY9o:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=mEqSj_DD-H4:lKX8QL5rY9o:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=mEqSj_DD-H4:lKX8QL5rY9o:F7zBnMyn0Lo" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/softwaretechandmore/~4/mEqSj_DD-H4" height="1" width="1"/>]]></content:encoded> <wfw:commentRss>http://www.skorks.com/2011/02/the-greatest-developer-fallacy-or-the-wisest-words-youll-ever-hear/feed/</wfw:commentRss> <slash:comments>49</slash:comments> </item> <item><title>Here Is The Main Reason Why You Suck At Interviews</title><link>http://www.skorks.com/2010/12/here-is-the-main-reason-why-you-suck-at-interviews/</link> <comments>http://www.skorks.com/2010/12/here-is-the-main-reason-why-you-suck-at-interviews/#comments</comments> <pubDate>Wed, 08 Dec 2010 12:11:26 +0000</pubDate> <dc:creator>Alan Skorkin</dc:creator> <category><![CDATA[People]]></category> <category><![CDATA[coding interview]]></category> <category><![CDATA[interview]]></category> <category><![CDATA[interview process]]></category> <guid isPermaLink="false">http://www.skorks.com/?p=1901</guid> <description><![CDATA[I&#39;ve talked about interviews from one perspective or another on several occasions, you might even say it is a pet subject of mine. It&#39;s fascinating because most people are no good at interviews and when it comes to developer interviews &#8211; well; let&#39;s just say there is a whole new dimension for us to suck [...]
<strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2009/09/the-best-way-to-interview-a-developer/' rel='bookmark' title='The Best Way To Interview A Developer'>The Best Way To Interview A Developer</a></li><li><a
href='http://www.skorks.com/2010/03/how-to-answer-a-programming-interview-question-and-look-good-doing-it/' rel='bookmark' title='How To Answer A Programming Interview Question And Look Good Doing It'>How To Answer A Programming Interview Question And Look Good Doing It</a></li><li><a
href='http://www.skorks.com/2010/06/when-a-mount-fuji-question-is-not-really-a-mount-fuji-question-are-you-as-smart-as-you-think-you-are/' rel='bookmark' title='When A &#8216;Mount Fuji&#8217; Question Is Not Really A &#8216;Mount Fuji&#8217; Question (Are You As Smart As You Think You Are)'>When A &#8216;Mount Fuji&#8217; Question Is Not Really A &#8216;Mount Fuji&#8217; Question (Are You As Smart As You Think You Are)</a></li></ol>]]></description> <content:encoded><![CDATA[<p></p><p><img
align="left" alt="Interview fail" class="alignleft size-medium wp-image-1905" height="201" hspace="25" src="http://www.skorks.com/wp-content/uploads/2010/12/interview-fail-300x201.jpg" title="Interview fail" vspace="5" width="300" />I&#39;ve talked about interviews from <a
href="http://www.skorks.com/2010/03/how-to-answer-a-programming-interview-question-and-look-good-doing-it/" target="_blank">one perspective</a> or <a
href="http://www.skorks.com/2010/04/a-fizzbuzz-faux-pas/" target="_blank">another</a> on several occasions, you might even say it is a pet subject of mine. It&#39;s fascinating because<strong> most people are no good at interviews</strong> and when it comes to developer interviews &#8211; well; let&#39;s just say there is a whole new dimension for us to suck at with coding questions, whiteboards and whatnot. Of course, the other side of the equation is not pristine here, the interviewer can be just as much to blame for a terrible interview, either through lack of training, empathy, preparation or a host of other reasons, but that&#39;s a <a
href="http://feeds.feedburner.com/softwaretechandmore">whole separate discussion</a>. So, why are we so bad at interviews? You can probably think of quite a few reasons straight away:</p><ul><li>it is a high pressure situation, you were nervous</li><li>you just didn&#39;t &quot;<em>click</em>&quot; with the interviewer</li><li>you were asked all the &quot;<em>wrong</em>&quot; questions</li><li>sometimes you just have a bad day</li></ul><p>Infact, you can often work yourself into a self-righteous frenzy after a bad interview, thinking how every circumstance seemed to conspire against you, it was beyond your control, there was nothing you could do &#8211; hell, you didn&#39;t even want to work for that stupid company anyway! But, deep down, we all know that those excuses are just so much bullshit. The truth is there were many things we could have done, but by the time the interview started it was much too late. I&#39;ll use a story to demonstrate.</p><p>The least stressful exam I&rsquo;ve ever had was a computing theory exam in the second year of my computer science degree. I never really got &ldquo;<em>into it</em>&rdquo; during the semester, but a few weeks before exam time &#8211; for some inexplicable reason &#8211; I suddenly found the subject fascinating. I spent hours reading and hours more playing with <a
href="http://en.wikipedia.org/wiki/Context-free_grammar" target="_blank">grammars</a> and <a
href="http://en.wikipedia.org/wiki/Automata_theory" target="_blank">automata</a>. Long story short, when exam time rolled around I knew the material backwards &#8211; I groked it. There was some anxiety (<em>you can&rsquo;t eliminate it fully</em>), but I went into the exam reasonably confident I&rsquo;d blitz it (<em>which I did</em>). <strong>Practice and preparation made all the difference</strong>. Of course, this is hardly a revelation, everyone knows that if you study and practice you&rsquo;ll do well (<em>your parents wouldn&rsquo;t shut up about it for years :)</em>). Interviews are no different from any other skill/subject in this respect, preparation and practice are key.</p><h2>Can You Wing It?</h2><p>The one difference with interviews is that they are an occasional skill, almost meaningless most of the time, but of paramount importance once in a long while. It just doesn&#39;t seem like it&#39;s worth the effort to get good at them, especially if you happen to already have a job at the time (<em>who knows, you may not need those skills for years</em>). There are plenty of other subjects clamouring for your attention and anyway every interview is different, you can never predict what&#39;s gonna happen so it would be stupid to waste your time trying. No &#8211; good communication skills and decent software development knowledge will see you through, right? Except, they don&#39;t and it won&#39;t. Sure, you might be able to stave off total disaster, but <strong>without preparation and practice, you&#39;re mostly relying on luck</strong>. Things &quot;<em>click</em>&quot;; you get asked the &quot;<em>right</em>&quot; questions and are able to think of decent answers just in time. This is how most people get hired. As soon as the process gets a little more rigorous/scientific, so many candidates get weeded out that companies like Google, Facebook, Twitter etc. find themselves trying to steal people from each other since they know that those that have passed the rigorous interview processes of their competitors must be alright. The interesting thing is that the rejected candidates are not necessarily worse; they are often simply a lot less prepared and a little less lucky.</p><p>Over the last few years presentation skills have seen quite a lot of press. Many a blog post and much literature has come out (<em>e.g. <a
href="http://www.amazon.com/Presentation-Zen-Simple-Design-Delivery/dp/0321525655" target="_blank">Presentation Zen</a> and <a
href="http://www.amazon.com/Confessions-Public-Speaker-Scott-Berkun/dp/0596801998" target="_blank">Confessions of a Public Speaker</a> are both great books</em>). Many people have decent knowledge of their subject area and have good communication skills, they think they are excellent presenters &#8211; they are wrong. They put together some slides in a few hours and think their innate abilities will carry them through, but inevitably their presentations end up disjointed, mistargeted, boring or amateurish. Sometimes they sail through on luck, circumstances conspire and the presentation works, but these situations are not common. <a
href="http://www.gladwell.com/" target="_blank">Malcolm Gladwell</a> is a master presenter, he is one of the most highly sought after and highly paid speakers in the world (<em>and has written <a
href="http://www.amazon.com/gp/product/0316017922?ie=UTF8&amp;tag=gladwellcom&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0316017922" target="_blank">a bunch</a> of <a
href="http://www.amazon.com/exec/obidos/ASIN/0316172324/gladwellcom" target="_blank">awesome</a> <a
href="http://www.amazon.com/exec/obidos/ASIN/0316316962/gladwellcom" target="_blank">books</a> to boot</em>) &#8211; this is not by chance. Without doubt he knows his stuff and has better communication skills than the majority of speakers out there and yet all his talks are <a
href="http://blogs.ft.com/rachmanblog/2010/02/the-secrets-of-malcolm-gladwell/" target="_blank">rigorously prepared for and practiced</a>. To my mind,<strong> the situation with interviews is similar to that of presentations</strong>, except the deluge of literature about interviews goes almost unnoticed since they are old-hat. The digital world hasn&#39;t changed the interview process too significantly (<em>unlike the public speaking process</em>), except the internet age brings all the old advice together in one place for us and all of that advice is still surprisingly relevant.</p><h2>The Old-School Advice</h2><p>Everyone (<em>and I mean everyone</em>) always says that you should research the company you&#39;ll be interviewing with beforehand. You would think people would have this one down by now, especially developers cause we&#39;re smart, right? Nope, no such luck, just about everyone who rocks up for an interview knows next to nothing about the company they are trying to get a job at, unless the company is famous, in which case people are just full of hearsay. But hearsay is no substitute for a bit of research and it is so easy, I am reminded of an article <a
href="http://www.shoemoney.com/" target="_blank">Shoemoney</a> wrote about <a
href="http://www.shoemoney.com/2010/09/14/getting-press-for-your-website-application-or-service/" target="_blank">getting press</a> (<em>well worth a read by the way, if you&#39;re trying to promote a product/service</em>) &#8211; there is a tremendous amount of info you can find out about a person by trawling the web for a bit and it is just as easy to learn something about a company. I mean, we do work in software, so any company you may want to work for should have a web presence and a half. And even if web info is sparse there is your social network or picking up the phone and seeing if someone will trade a coffee for some info. Yeah, you go to a bit of trouble but the fact that you did will be apparent in an interview, I mean, <strong>there is a reason why I work where I work and I&#39;d prefer to work with other people who give a shit</strong> (<em>everyone would</em>) &#8211; you savvy? Of course if you do go to the trouble to find out what skills/tech/knowledge/processes a company is looking for/values you may be able to anticipate where an interview might head, the value there should be self-evident.</p><p>Which leads me to interview questions. When it comes to developers, there are three types of questions we struggle with/despise:</p><ul><li>the behavioural/culture fit</li><li>the coding question</li><li>the <a
href="http://www.skorks.com/2010/06/when-a-mount-fuji-question-is-not-really-a-mount-fuji-question-are-you-as-smart-as-you-think-you-are/" target="_blank">Mount Fuji question</a></li></ul><p>With a bit of practice you can blitz all of these. The Fujis are the hardest, but even they can be prepared for, but I&#39;ll get back to those shortly.</p><h3>Behavioural Questions</h3><p>The behavioural questions seem annoyingly difficult but are actually the easiest. You know the ones &quot;<em>Give me an example of a time when you demonstrated leadership/communication skills/problem solving/conflict resolution</em>&quot;. It is always the same question, with a different attribute of yours that you have to <a
href="http://encarta.msn.com/dictionary_561508115/spruik.html" target="_blank">spruik</a>. Being in essence the same question you can address all of them with what amounts to the same answer, once again substituting a different attribute. These are actually difficult to handle on the spot, but if you have something prepared it is a breeze. Example:</p><blockquote><p>&quot;There was a time, when Bill the developer was being an obstinate bastard and wouldn&#39;t buy in to the awesome that the rest of us were peddling, but I took charge of the situation and was able to convince Bill blah, blah &#8230;.&quot; &#8211; <strong>leadership</strong></p></blockquote><blockquote><p>&quot;Bill the contrary developer just wouldn&#39;t agree with the way we were dishing out the awesome, but I took Bill out for coffee and we hashed it out one on one blah, blah &#8230;&quot; &#8211; <strong>communication</strong></p></blockquote><blockquote><p>&quot;There was insufficient awesome to go around and Bill and Joe just couldn&#39;t share it and were coming to blows, but I stepped in and took them both to a whiteboard, we had a long chat blah, blah &#8230;&quot; &#8211; <strong>conflict resolution</strong></p></blockquote><p>As long as you think up a situation beforehand you can adapt it to answer any behavioural question, the situation can even be a fictitious, but you do need to think through it carefully for 10-15 minutes to be able to come up with something coherent. You will never have time to frame a coherent reply in the interview itself. Of course, it is best to have a few situations prepared just so you can change it up a bit, variety never hurt anyone. If the company you&#39;re interviewing with is enlightened they won&#39;t ask you these questions, but will instead focus on your dev skills, but there are many companies and few are enlightened, <strong>might as well prepare for the worst case and be pleasantly surprised if the best case happens</strong>.</p><h3>Coding Questions</h3><p
style="text-align: center;"><img
alt="Coding question" class="aligncenter size-medium wp-image-1906" height="202" src="http://www.skorks.com/wp-content/uploads/2010/12/coding-question-300x189.jpg" style="width: 321px; height: 202px;" title="Coding question" vspace="5" width="321" /></p><p>Talking about dev skills, the one thing that just about every company that hires developers will do, would be to ask a coding question at some stage. These usually take the form of a sandbox question or what I call a Uni question. You know the ones, &quot;<em>Reverse a list</em>&quot;, &quot;<em>Implement a linked list</em>&quot; it&#39;s as if they are under the impression you studied computer science at some point, go figure :). <strong>People struggle here, because you just don&#39;t come across this kind of question in your day-to-day work</strong>. If they asked you to implement a Twitter or Facebook clone, you could really show them your chops, but balancing a binary tree &#8211; who the hell remembers how to do that? And that&#39;s the rub, you probably could dredge the information from the depths of your grey matter, but by the time you do the interview will have been long over. Because you don&#39;t do this kind of question daily, you brain has dumped the info you need to tape and sent it to Switzerland (<em>cause backups should be kept off-premises</em>). The answer is simple; you gotta practice these questions well in advance of the interview. Preparation baby, that&#39;s the ticket. Preferably you should be doing them regularly regardless of your employment status cause those questions are fun and bite-sized and give your brain a bit of a workout &#8211; you&#39;ll be a better developer for it. The most interesting thing though is this, you do enough of them and you won&#39;t really encounter anything new in an interview. There are really not so many themes when it comes to these questions, it will be the same formulae you&#39;ll just need to plug in different numbers (<em>a simplification but not too far from reality</em>). I really gotta follow my own advice here. If you seriously want a leg up, there are books specifically about this, I prefer <a
href="http://www.amazon.com/Cracking-Coding-Interview-Fourth-Programming/dp/145157827X" target="_blank">Cracking the Coding Interview</a> but there is also <a
href="http://www.amazon.com/Programming-Interviews-Exposed-Secrets-Programmer/dp/047012167X/ref=dp_ob_title_bk" target="_blank">Programming Interviews Exposed</a> &#8211; read them, do the questions.</p><h3>Puzzles</h3><p>Mount Fuji questions are the most controversial, but regardless of whether you hate them or not, even they can be practiced. Yeah, alright, maybe you&#39;re willing to walk out if any company ever dares to ask you about manhole covers, but <strong>I&#39;d much rather answer the questions and then walk out in self-righteous satisfaction rejecting the offers they&#39;ll be throwing at my feet, than storm out in self-righteous anger knowing deep down that I was a wuss for doing so</strong>. And anyway, I&#39;d like to think that I&#39;ve already demonstrated that <a
href="http://www.skorks.com/2010/06/when-a-mount-fuji-question-is-not-really-a-mount-fuji-question-are-you-as-smart-as-you-think-you-are/" target="_blank">not all Mount Fuji questions are created equal</a>, some are coding problems, others deal with concurrency, others still might be <a
href="http://en.wikipedia.org/wiki/Back-of-the-envelope_calculation" target="_blank">back-of-the-envelope questions</a> (<em>the story of how these originated is actually pretty interesting, I am <a
href="http://feeds.feedburner.com/softwaretechandmore">writing it up as we speak</a></em>). The subset of questions that are simply a useless puzzle are a lot smaller than you might think. Doing these kinds of questions in your spare time is just another way to build your skills with a side benefit that you become an interview-proof individual. Of course, many people also enjoy an occasional puzzle &#8211; I&#39;m just saying.</p><p>There is still lots more to say, I haven&#39;t even begun to talk about attitude, but this is already a <strong>tl;dr</strong> candidate, so I&#39;ll leave that discussion for another time. Let&#39;s sum up, <strong>if you feel that you suck at interviews, it&#39;s because you didn&#39;t prepare well enough and haven&#39;t had enough practice</strong> &#8211; it is that simple. As an interviewer it is most disheartening to see an unprepared candidate, there are just so many of them, on the other hand a person who is clearly on the ball is just awesome. And I am well aware that practicing interviews is hard, there is no <a
href="http://www.toastmasters.org.au/" target="_blank">Toastmasters</a> equivalent for that particular skill, but thorough preparation can significantly mitigate that issue. Even a few minutes of preparation will put you head and shoulders above most other people since the vast majority don&#39;t bother at all. So, take the time to practice/prepare if you have an interview on the horizon, be smart about it and it is bound to pay off, not to mention a better experience for the interviewer as well, more fun and less stress for everyone.</p><p><span
style="font-size: 10px; font-family: trebuchet ms;">Images by <a
href="http://www.flickr.com/photos/carowallis1/4463478302/" target="_blank">Caro Wallis</a> and <a
href="http://www.flickr.com/photos/27048731@N03/4105630566/" target="_blank">louisvolant</a></span></p><p><strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2009/09/the-best-way-to-interview-a-developer/' rel='bookmark' title='The Best Way To Interview A Developer'>The Best Way To Interview A Developer</a></li><li><a
href='http://www.skorks.com/2010/03/how-to-answer-a-programming-interview-question-and-look-good-doing-it/' rel='bookmark' title='How To Answer A Programming Interview Question And Look Good Doing It'>How To Answer A Programming Interview Question And Look Good Doing It</a></li><li><a
href='http://www.skorks.com/2010/06/when-a-mount-fuji-question-is-not-really-a-mount-fuji-question-are-you-as-smart-as-you-think-you-are/' rel='bookmark' title='When A &#8216;Mount Fuji&#8217; Question Is Not Really A &#8216;Mount Fuji&#8217; Question (Are You As Smart As You Think You Are)'>When A &#8216;Mount Fuji&#8217; Question Is Not Really A &#8216;Mount Fuji&#8217; Question (Are You As Smart As You Think You Are)</a></li></ol></p><div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=P4mdrN2Diu8:faaWdknqnts:I9og5sOYxJI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=I9og5sOYxJI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=P4mdrN2Diu8:faaWdknqnts:cGdyc7Q-1BI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=cGdyc7Q-1BI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=P4mdrN2Diu8:faaWdknqnts:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=P4mdrN2Diu8:faaWdknqnts:gIN9vFwOqvQ" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=P4mdrN2Diu8:faaWdknqnts:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=P4mdrN2Diu8:faaWdknqnts:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=P4mdrN2Diu8:faaWdknqnts:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=P4mdrN2Diu8:faaWdknqnts:F7zBnMyn0Lo" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/softwaretechandmore/~4/P4mdrN2Diu8" height="1" width="1"/>]]></content:encoded> <wfw:commentRss>http://www.skorks.com/2010/12/here-is-the-main-reason-why-you-suck-at-interviews/feed/</wfw:commentRss> <slash:comments>14</slash:comments> </item> <item><title>Converting Integers To Words – Bringing Order To English Through Code</title><link>http://www.skorks.com/2010/11/converting-integers-to-words-bringing-order-to-english-through-code/</link> <comments>http://www.skorks.com/2010/11/converting-integers-to-words-bringing-order-to-english-through-code/#comments</comments> <pubDate>Fri, 05 Nov 2010 14:58:35 +0000</pubDate> <dc:creator>Alan Skorkin</dc:creator> <category><![CDATA[coding]]></category> <category><![CDATA[english]]></category> <category><![CDATA[lisp]]></category> <category><![CDATA[practice]]></category> <category><![CDATA[programming]]></category> <category><![CDATA[ruby]]></category> <guid isPermaLink="false">http://www.skorks.com/?p=1891</guid> <description><![CDATA[For a few years now, I&#39;ve been meaning to read Peter Norvig&#39;s old book, &#34;Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp&#34;, but every time I started I&#39;d inevitably get lost in the code due to my poor Lisp skills. For some reason I just couldn&#39;t absorb the introductory Lisp chapters at the [...]
<strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2008/09/tweaking-english-for-fun-and-profit-facilitating-poetry/' rel='bookmark' title='Tweaking English For Fun And Profit &#8230; Facilitating Poetry'>Tweaking English For Fun And Profit &#8230; Facilitating Poetry</a></li><li><a
href='http://www.skorks.com/2008/10/here-are-some-words-that-rhyme-with-orange/' rel='bookmark' title='Here Are Some Words That Rhyme With Orange!'>Here Are Some Words That Rhyme With Orange!</a></li><li><a
href='http://www.skorks.com/2010/03/timing-ruby-code-it-is-easy-with-benchmark/' rel='bookmark' title='Timing Ruby Code &#8211; It Is Easy With Benchmark'>Timing Ruby Code &#8211; It Is Easy With Benchmark</a></li></ol>]]></description> <content:encoded><![CDATA[<p></p><p><img
align="left" alt="Integer" class="alignleft size-medium wp-image-1898" height="225" hspace="20" src="http://www.skorks.com/wp-content/uploads/2010/11/integer-300x225.jpg" title="Integer" vspace="5" width="300" />For a few years now, I&#39;ve been meaning to read Peter Norvig&#39;s old book, &quot;<a
href="http://www.amazon.com/Paradigms-Artificial-Intelligence-Programming-Studies/dp/1558601910" target="_blank"><em>Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp</em></a>&quot;, but every time I started I&#39;d inevitably get lost in the code due to my poor Lisp skills. For some reason I just couldn&#39;t absorb the introductory Lisp chapters at the beginning of the book. So a little while ago I decided to &quot;<em>begin at the beginning</em>&quot; and slowly started making my way through Peter Seibel&#39;s &quot;<a
href="http://www.gigamonkeys.com/book/" target="_blank"><em>Practical Common Lisp</em></a>&quot;.</p><p>Inevitably, one of the first things I learned about was the &quot;<em>format</em>&quot; function. <em>Format</em>, is essentially the Lisp equivalent of <em>printf</em>, so you kinda need it to produce the old &quot;<em>Hello World</em>&quot;. Just like <em>printf </em>and its ilk, <em>format </em>supports string interpolation via embedded directives. For example if you wanted to print two strings side by side but separate them by a specific number of spaces you could do something like this:</p><div
class="wp_syntax"><div
class="code"><pre class="lisp" style="font-family:monospace;">CL-USER<span style="color: #66cc66;">&gt;</span> <span style="color: #66cc66;">&#40;</span>format t <span style="color: #ff0000;">&quot;~a~40t~a~%&quot;</span> <span style="color: #ff0000;">&quot;abc&quot;</span> <span style="color: #ff0000;">&quot;123&quot;</span><span style="color: #66cc66;">&#41;</span>
abc                                     <span style="color: #cc66cc;">123</span>
<span style="color: #b1b100;">NIL</span></pre></div></div><p><strong>One of the most interesting format directives is ~r which will take an integer and produce its English representation</strong> e.g.:</p><div
class="wp_syntax"><div
class="code"><pre class="lisp" style="font-family:monospace;">CL-USER<span style="color: #66cc66;">&gt;</span> <span style="color: #66cc66;">&#40;</span>format <span style="color: #b1b100;">nil</span> <span style="color: #ff0000;">&quot;~r&quot;</span> <span style="color: #cc66cc;">123456</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #ff0000;">&quot;one hundred twenty-three thousand four hundred fifty-six&quot;</span></pre></div></div><p>I found this little gem in one of the footnotes, the lesson here being &#8211; <strong>always read the footnotes</strong>. Anyway, I don&#39;t know about you, but I thought that was pretty awesome, which immediately got me thinking about how hard it would be to implement a stand-alone integer to English translation function. I would have loved to try my hand at doing this in Lisp, but I don&#39;t think I am sufficiently Jedi for that as yet (<em>I will however give it a go as soon as I am ready, <a
href="http://feeds.feedburner.com/softwaretechandmore">you will not want to miss <strong>that </strong></a>:)</em>), so instead I decided to have a go at implementing it in Ruby.</p><blockquote><h3>A Totally Unrelated Musical Trivia Interlude</h3><p>Speaking of Jedi, I&#39;ve recently been a little obsessed with the &quot;<em>Good Morning</em>&quot; song from &quot;<a
href="http://www.imdb.com/title/tt0045152/" target="_blank"><em>Singing In The Rain</em></a>&quot;, you know <a
href="http://www.youtube.com/watch?v=bU2zoQ8gV-s&amp;feature=related" target="_blank">the one I mean</a>. I blame &quot;<em>Family Guy</em>&quot; for this, it all started after I watched the episode where they <a
href="http://www.youtube.com/watch?v=4KyvoUmzMAc" target="_blank">perform this song</a> in one of their postmodern flashes of randomness. Anyway, it&#39;s just such a nice and happy song, it&#39;s kinda quaint how 1:30 used to be considered &quot;<em>late</em>&quot; in those days and I dig the whole synchronized tap dancing thing that they do (<em>just imagine how much they had to practice to get it right</em>).</p><p>Now, here is the trivia, the girl in the song/movie is a 20 year old <a
href="http://en.wikipedia.org/wiki/Debbie_Reynolds" target="_blank">Debbie Reynolds</a> (<em>did you know that before &quot;Singing In The Rain&quot; she didn&#39;t even know how to dance</em>) who went on to marry singer <a
href="http://en.wikipedia.org/wiki/Eddie_Fisher_%28singer%29" target="_blank">Eddie Fisher</a>. A few years later she gave birth to a daughter named Carrie (i.e. <a
href="http://en.wikipedia.org/wiki/Carrie_Fisher" target="_blank">Carrie Fisher</a>) who, in turn, went on to play Princess Leia in &quot;<em>Star Wars</em>&quot;. That&#39;s how all of this relates to Jedi, and you thought I was going off on a random tangent &#8211; now you know better :). Time to get back to some Ruby code.</p></blockquote><h2>A Ruby Based Integer To English Converter</h2><p>It turns out writing it was somewhat harder than I expected because <strong>the English language is a little insane when it comes to naming numbers</strong>. Everything was going well at the start; it was obvious that we need some kind of lookup table to convert digits to words. The lookup table must contain all the uniquely named numbers that we are likely to encounter.</p><ul><li>all number between 1 and 10 (<em>e.g. one, two, five etc.</em>)</li><li>all number between 11 and 20 (<em>e.g. eleven, twelve, sixteen etc.</em>)</li><li>all multiples of 10 between 20 and 100 (<em>e.g. thirty, forty etc.</em>)</li><li>all uniquely named powers of 2 over 100 (<em>e.g. hundred, thousand, million, billion etc.</em>)</li></ul><p>Here is a cut-down version:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#006600; font-weight:bold;">&#123;</span>
<span style="color:#006666;">1</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;one&quot;</span>,
<span style="color:#006666;">2</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;two&quot;</span>,
<span style="color:#006666;">3</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;three&quot;</span>,
...
<span style="color:#006666;">19</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;nineteen&quot;</span>,
<span style="color:#006666;">20</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;twenty&quot;</span>,
...
<span style="color:#006666;">90</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;ninety&quot;</span>,
<span style="color:#006666;">100</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;hundred&quot;</span>,
<span style="color:#006666;">1000</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;thousand&quot;</span>,
<span style="color:#006666;">1000000</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;million&quot;</span>,
...
<span style="color:#006600; font-weight:bold;">&#125;</span></pre></div></div><p>We don&#39;t really think about it, since we are so used to it, but that is actually a lot more unique numbers than strictly necessary. <strong>You can easily identify all possible numbers using just 1 through 9 as well as all the named powers of 10</strong>, such as ten, hundred, thousand etc (<em>we&#39;ll explore this thought further shortly</em>). Doing it this way would be pretty logical, instead the English speaking world decided that we needed nine special words to represent numbers between 11 and 19 as well as eight more similar, but still unique, words for the multiples of 10 between 20 and 100. As much as we take it for granted this causes all sorts of pain when we try to implement out integer to words converter.</p><p>Consider this, we&#39;re given an integer several digits long, let&#39;s say 12.</p><ul><li>to get a word representation we start looking at the digits of the integer in reverse order (<em>we have to start with the least significant digit first</em>):<ul><li>digits = 21</li></ul></li><li>we use the look-up table to find the word that represents the first digit and add it to an accumulator (<em>which is a list that will contain all the words in the English representation of our integer</em>):<ul><li>digits = <strong>2</strong>1, accumulator = [<strong>two</strong>]</li></ul></li></ul><p>But as soon as we move on to the second digit, we&#39;re already in trouble. If this digit is a one, we know we have a number between 11 and 19, so we have to do the following:</p><ul><li>backtrack the previous digit from the accumulator:<ul><li>digits = 2<strong>1</strong>, accumulator = []</li></ul></li><li>get the current digit along with the previous digit and reverse them to get our actual number:<ul><li>digits = <strong>12</strong>, accumulator = []</li></ul></li><li>do a lookup to get the new word representation and put the new representation in the accumulator:<ul><li>digits = <strong>12</strong>, accumulator = [<strong>twelve</strong>]</li></ul></li></ul><p>The story is similar if the second digit is 2 through 9, we still have to backtrack, then we need to do a lookup on the second digit multiplied by 10 (i.e. 20, 30 etc), combine the result with the word representation of the previous digit and put that back in the accumulator. So if the number is 20, we need to look up 20, combine that with 5 and get twenty five which ends up in the accumulator.</p><p>When the number is big enough, this patters comes up again and again every few digits. For example, numbers such as 11000, 23000, 134000 need to be represented as <strong>eleven</strong> thousand, <strong>twenty three</strong> thousand and one hundred and <strong>thirty four</strong> thousand (<em>see what I mean</em>). Luckily it is a pattern so we can handle all occurrences of it as a special case, but it&#39;s a pattern that need not be. And talking about patterns that need not be, what is our obsession with the word hundred:</p><ul><li>three <strong>hundred </strong>and fifty</li><li>one <strong>hundred </strong>and twenty thousand three <strong>hundred </strong>and fifty</li><li>one <strong>hundred </strong>and twenty five million three <strong>hundred </strong>and twenty thousand five <strong>hundred </strong>and six</li></ul><p>We have the word hundred appearing 3 times in that last one, we&#39;re used to it and it feels comfortable, but is it strictly necessary? We have to handle that one as a special case as well. Needless to say, the code ends up being long and ugly, you can have a look at the <a
href="https://gist.github.com/661002" target="_blank">full source here</a>, but the main chunk of it is something along the following lines:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;">  <span style="color:#9966CC; font-weight:bold;">def</span> convert<span style="color:#006600; font-weight:bold;">&#40;</span>number<span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#0000FF; font-weight:bold;">return</span> <span style="color:#996600;">&quot;zero&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> number == <span style="color:#006666;">0</span>
    word_representation_accumulator = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006600; font-weight:bold;">&#93;</span>
    number_digits_reversed = number.<span style="color:#9900CC;">to_s</span>.<span style="color:#9900CC;">reverse</span>
    digit_count = <span style="color:#006666;">0</span>
    number_digits_reversed.<span style="color:#9900CC;">chars</span>.<span style="color:#9900CC;">each_with_index</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>digit, index<span style="color:#006600; font-weight:bold;">|</span>
      digit_as_number = <span style="color:#CC0066; font-weight:bold;">Integer</span><span style="color:#006600; font-weight:bold;">&#40;</span>digit<span style="color:#006600; font-weight:bold;">&#41;</span>
      skip_zero<span style="color:#006600; font-weight:bold;">&#40;</span>digit_as_number<span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#9966CC; font-weight:bold;">do</span>
        <span style="color:#9966CC; font-weight:bold;">if</span> digit_count == <span style="color:#006666;">0</span>
          word_representation_accumulator <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> <span style="color:#996600;">&quot;#{NTW[digit_as_number]}&quot;</span>
        <span style="color:#9966CC; font-weight:bold;">elsif</span> ten_to_twenty?<span style="color:#006600; font-weight:bold;">&#40;</span>digit_as_number, digit_count<span style="color:#006600; font-weight:bold;">&#41;</span>
          backtrack word_representation_accumulator
          actual_number = <span style="color:#CC0066; font-weight:bold;">Integer</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;#{digit}#{number_digits_reversed[index - 1]}&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
          multiplier = <span style="color:#006600; font-weight:bold;">&#40;</span>digit_count <span style="color:#006600; font-weight:bold;">&gt;</span> <span style="color:#006666;">1</span> ? <span style="color:#006666;">10</span><span style="color:#006600; font-weight:bold;">**</span><span style="color:#006600; font-weight:bold;">&#40;</span>digit_count <span style="color:#006600; font-weight:bold;">-</span> <span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#41;</span> : <span style="color:#0000FF; font-weight:bold;">nil</span><span style="color:#006600; font-weight:bold;">&#41;</span>
          word_representation = <span style="color:#996600;">&quot;#{NTW[actual_number]}&quot;</span>
          word_representation <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#996600;">&quot; #{NTW[multiplier]}&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> multiplier
          word_representation <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#996600;">&quot; and&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> word_representation_accumulator.<span style="color:#9900CC;">size</span> == <span style="color:#006666;">1</span>
          word_representation_accumulator <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> word_representation
        <span style="color:#9966CC; font-weight:bold;">elsif</span> twenty_to_one_hundred?<span style="color:#006600; font-weight:bold;">&#40;</span>digit_count<span style="color:#006600; font-weight:bold;">&#41;</span>
          backtrack word_representation_accumulator
          multiplier = <span style="color:#006600; font-weight:bold;">&#40;</span>digit_count <span style="color:#006600; font-weight:bold;">&gt;</span> <span style="color:#006666;">1</span> ? <span style="color:#006666;">10</span><span style="color:#006600; font-weight:bold;">**</span><span style="color:#006600; font-weight:bold;">&#40;</span>digit_count <span style="color:#006600; font-weight:bold;">-</span> <span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#41;</span> : <span style="color:#0000FF; font-weight:bold;">nil</span><span style="color:#006600; font-weight:bold;">&#41;</span>
          lookup_number = digit_as_number <span style="color:#006600; font-weight:bold;">*</span> <span style="color:#006666;">10</span>
          word_representation = <span style="color:#996600;">&quot;#{NTW[lookup_number]}&quot;</span>
          word_representation <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#996600;">&quot; #{NTW[Integer(number_digits_reversed[index - 1])]}&quot;</span>
          word_representation <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#996600;">&quot; #{NTW[multiplier]}&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> multiplier
          word_representation <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#996600;">&quot; and&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> word_representation_accumulator.<span style="color:#9900CC;">size</span> == <span style="color:#006666;">1</span>
          word_representation_accumulator <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> word_representation
        <span style="color:#9966CC; font-weight:bold;">elsif</span> digit_count == <span style="color:#006666;">2</span> <span style="color:#006600; font-weight:bold;">||</span> digit_count <span style="color:#006600; font-weight:bold;">%</span> <span style="color:#006666;">3</span> == <span style="color:#006666;">2</span>
          multiplier = <span style="color:#006666;">10</span><span style="color:#006600; font-weight:bold;">**</span><span style="color:#006666;">2</span>
          word_representation = <span style="color:#996600;">&quot;#{NTW[digit_as_number]} #{NTW[multiplier]}&quot;</span>
          word_representation <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#996600;">&quot; and&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> word_representation_accumulator.<span style="color:#9900CC;">size</span> != <span style="color:#006666;">0</span>
          word_representation_accumulator <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> word_representation
        <span style="color:#9966CC; font-weight:bold;">else</span>
          multiplier = <span style="color:#006666;">10</span><span style="color:#006600; font-weight:bold;">**</span>digit_count
          word_representation = <span style="color:#996600;">&quot;#{NTW[digit_as_number]} #{NTW[multiplier]}&quot;</span>
          word_representation <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#996600;">&quot; and&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> word_representation_accumulator.<span style="color:#9900CC;">size</span> == <span style="color:#006666;">1</span>
          word_representation_accumulator <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> word_representation
        <span style="color:#9966CC; font-weight:bold;">end</span>
      <span style="color:#9966CC; font-weight:bold;">end</span>
      digit_count <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#006666;">1</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
    word_representation_accumulator.<span style="color:#9900CC;">reverse</span>.<span style="color:#9900CC;">join</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot; &quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
  <span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>You can have a go at running it, but don&#39;t forget, I only took the lookup table up to trillions, so if you want to try numbers bigger than that you will need to add more <a
href="http://en.wikipedia.org/wiki/Names_of_large_numbers" target="_blank">named powers of ten</a> (<em>such as quadrillion, quintillion etc.</em>) to the table. Here is some sample output:</p><pre>315119 - three hundred and fifteen thousand one hundred and nineteen
1000001 - one million and one
1315119 - one million three hundred and fifteen thousand one hundred and nineteen
11315119 - eleven million three hundred and fifteen thousand one hundred and nineteen</pre><h2>Using Code To Derive The English That Should Have Been</h2><p>Even though my code was producing reasonable output, I wasn&#39;t entirely happy, the arbitrary nature of the English language made my code uglier than it could have been. Since I&#39;ve <a
href="http://www.skorks.com/2008/09/tweaking-english-for-fun-and-profit-facilitating-poetry/" target="_blank">taken liberties with the English language</a> before I decided this was a good time to do it again. What if we were to remove all the extraneous words that English had for numbers. Surely, the numbers 1 through 9 are more than sufficient? This would be the simplest possible case e.g.:</p><pre>135 - one three five
3619 - three six one nine</pre><p>Unfortunately it just doesn&#39;t cut it. Firstly we would have to introduce 0 as a non-silent number e.g.:</p><pre>609 - six zero nine</pre><p>Currently, zero is completely silent in all numbers unless it is by itself e.g.:</p><pre>609 - six hundred and nine (see, zero is silent)</pre><p>But even if this wasn&#39;t an issue, it turns out <strong>we actually need the named powers of ten to make English representations of integers meaningful</strong>. <em>One five six two three</em>, tells us almost nothing, but <em>fifteen thousand six hundred and twenty three</em> gives us a lot of information instantly. Even when we haven&#39;t processed the number fully, we know straight away that it is approximately <em>15 thousand and a few hundred</em>. The bigger the number gets, the more pronounced this &quot;<em>estimation</em>&quot; effect is. So, we need numbers one through nine as well as named powers of 10. This allows us to cut our lookup table down to the following:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;">  NTW = <span style="color:#006600; font-weight:bold;">&#123;</span>
  <span style="color:#006666;">1</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;one&quot;</span>,
  <span style="color:#006666;">2</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;two&quot;</span>,
  <span style="color:#006666;">3</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;three&quot;</span>,
  <span style="color:#006666;">4</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;four&quot;</span>,
  <span style="color:#006666;">5</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;five&quot;</span>,
  <span style="color:#006666;">6</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;six&quot;</span>,
  <span style="color:#006666;">7</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;seven&quot;</span>,
  <span style="color:#006666;">8</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;eight&quot;</span>,
  <span style="color:#006666;">9</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;nine&quot;</span>,
  <span style="color:#006666;">10</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;ten&quot;</span>,
  <span style="color:#006666;">100</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;hundred&quot;</span>,
  <span style="color:#006666;">1000</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;thousand&quot;</span>,
  <span style="color:#006666;">1000000</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;million&quot;</span>,
  <span style="color:#006666;">1000000000</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;billion&quot;</span>,
  <span style="color:#006666;">1000000000000</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;trillion&quot;</span>
  <span style="color:#006600; font-weight:bold;">&#125;</span></pre></div></div><p>We can use this table to write code that is much more intuitive than our previous attempt. Infact, we don&#39;t even need to know exactly what we&#39;re trying to achieve, instead we let the code guide us to the &quot;<em>correct</em>&quot; solution. We&#39;re using code to derive the requirements for the real world which is the opposite of the usual scenario. The full implementation I came up with <a
href="https://gist.github.com/662350" target="_blank">is in this gist</a>, but here is the main bit:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;">  <span style="color:#9966CC; font-weight:bold;">def</span> convert<span style="color:#006600; font-weight:bold;">&#40;</span>number<span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#0000FF; font-weight:bold;">return</span> <span style="color:#996600;">&quot;zero&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> number == <span style="color:#006666;">0</span>
    word_representation_accumulator = <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006600; font-weight:bold;">&#93;</span>
    number_digits_reversed = number.<span style="color:#9900CC;">to_s</span>.<span style="color:#9900CC;">reverse</span>
    digit_count = <span style="color:#006666;">0</span>
    number_digits_reversed.<span style="color:#9900CC;">chars</span>.<span style="color:#9900CC;">each_with_index</span> <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>digit, index<span style="color:#006600; font-weight:bold;">|</span>
      digit_as_number = <span style="color:#CC0066; font-weight:bold;">Integer</span><span style="color:#006600; font-weight:bold;">&#40;</span>digit<span style="color:#006600; font-weight:bold;">&#41;</span>
      skip_zero<span style="color:#006600; font-weight:bold;">&#40;</span>digit_as_number<span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#9966CC; font-weight:bold;">do</span>
        multiplier = <span style="color:#006666;">10</span><span style="color:#006600; font-weight:bold;">**</span>digit_count
        word_representation = <span style="color:#996600;">&quot;#{NTW[digit_as_number]}&quot;</span>
        word_representation <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#996600;">&quot; #{NTW[multiplier]}&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> multiplier <span style="color:#006600; font-weight:bold;">&gt;</span> <span style="color:#006666;">1</span> <span style="color:#006600; font-weight:bold;">&amp;&amp;</span> NTW<span style="color:#006600; font-weight:bold;">&#91;</span>multiplier<span style="color:#006600; font-weight:bold;">&#93;</span>
        word_representation_accumulator <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> word_representation
      <span style="color:#9966CC; font-weight:bold;">end</span>
      digit_count <span style="color:#006600; font-weight:bold;">+</span>= <span style="color:#006666;">1</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
    word_representation_accumulator.<span style="color:#9900CC;">reverse</span>.<span style="color:#9900CC;">join</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot; &quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
  <span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>You have to admit, it&#39;s much shorter and nicer looking and here is some of the output produced:</p><pre>0 - zero
1 - one
3 - three
5 - five
11 - one ten one
15 - one ten five
25 - two ten five
71 - seven ten one
40 - four ten
100 - one hundred
101 - one hundred one
112 - one hundred one ten two
123 - one hundred two ten three
457 - four hundred five ten seven
999 - nine hundred nine ten nine
1000 - one thousand
1001 - one thousand one
1010 - one thousand one ten
1011 - one thousand one ten one
2117 - two thousand one hundred one ten seven
3001 - three thousand one
13101 - one three thousand one hundred one
14001 - one four thousand one
16000 - one six thousand
25119 - two five thousand one hundred one ten nine
65009 - six five thousand nine
315119 - three one five thousand one hundred one ten nine
1000001 - one million one
1315119 - one million three one five thousand one hundred one ten nine
11315119 - one one million three one five thousand one hundred one ten nine
74315119 - seven four million three one five thousand one hundred one ten nine
174315119 - one seven four million three one five thousand one hundred one ten nine
1174315119 - one billion one seven four million three one five thousand one hundred one ten nine
15174315119 - one five billion one seven four million three one five thousand one hundred one ten nine
35174315119 - three five billion one seven four million three one five thousand one hundred one ten nine
935174315119 - nine three five billion one seven four million three one five thousand one hundred one ten nine
2935174315119 - two trillion nine three five billion one seven four million three one five thousand one hundred one ten nine
</pre><p>As you can see, all the named powers of ten are now treated in exactly the same way i.e.:</p><ul><li>10 is ten to the power of 1</li><li>100 is ten to the power of 2</li><li>etc.</li></ul><p>So 101 is <strong>one hundred one</strong> and 11 is <strong>one ten one</strong>, 203 is <strong>two hundred three</strong>, while 23 is<strong> two ten three</strong> &#8211; logical isn&#39;t it? There is no longer any need for any specially named numbers between 10 and 100 since we can represent all of them just as efficiently with this notation; this means no more special cases for numbers under 100. The other one we had was all the extra usages of the word hundred. We simply eliminate them all together (<em>except for when we are talking about numbers between 100 and 1000</em>), <strong>three one five thousand</strong> conveys the same amount of information as <strong>three hundred and fifteen thousand</strong> &#8211; the word hundred is totally redundant. Now, every named power of 10 appears only once in the English representation of the integer which allows us to still retain the ability to &quot;<em>estimate</em>&quot; the magnitude instantly without any extraneous information:</p><pre>one million three one five thousand one hundred one ten nine</pre><p>Short, consistent and to the point. It does seem a little awkward at first, but after a few minutes it starts to grow on you &#8211; after all, <strong>it is much more logical than the system we have now</strong>. There you go, deriving &quot;<em>better</em>&quot; English through some Ruby code. Not only do we get superior language but it is also easier, faster and less error prone to code it &#8211; FTW brothers, FTW :)!!!</p><p><span
style="font-size: 10px; font-family: trebuchet ms;">Image by <a
href="http://www.flickr.com/photos/plindberg/132364351/" target="_blank">plindberg</a></span></p><p><strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2008/09/tweaking-english-for-fun-and-profit-facilitating-poetry/' rel='bookmark' title='Tweaking English For Fun And Profit &#8230; Facilitating Poetry'>Tweaking English For Fun And Profit &#8230; Facilitating Poetry</a></li><li><a
href='http://www.skorks.com/2008/10/here-are-some-words-that-rhyme-with-orange/' rel='bookmark' title='Here Are Some Words That Rhyme With Orange!'>Here Are Some Words That Rhyme With Orange!</a></li><li><a
href='http://www.skorks.com/2010/03/timing-ruby-code-it-is-easy-with-benchmark/' rel='bookmark' title='Timing Ruby Code &#8211; It Is Easy With Benchmark'>Timing Ruby Code &#8211; It Is Easy With Benchmark</a></li></ol></p><div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=0lwFPukADY4:A-BisPeTGT8:I9og5sOYxJI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=I9og5sOYxJI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=0lwFPukADY4:A-BisPeTGT8:cGdyc7Q-1BI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=cGdyc7Q-1BI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=0lwFPukADY4:A-BisPeTGT8:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=0lwFPukADY4:A-BisPeTGT8:gIN9vFwOqvQ" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=0lwFPukADY4:A-BisPeTGT8:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=0lwFPukADY4:A-BisPeTGT8:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=0lwFPukADY4:A-BisPeTGT8:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=0lwFPukADY4:A-BisPeTGT8:F7zBnMyn0Lo" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/softwaretechandmore/~4/0lwFPukADY4" height="1" width="1"/>]]></content:encoded> <wfw:commentRss>http://www.skorks.com/2010/11/converting-integers-to-words-bringing-order-to-english-through-code/feed/</wfw:commentRss> <slash:comments>27</slash:comments> </item> <item><title>Write A Function To Determine If A Number Is A Power Of 2</title><link>http://www.skorks.com/2010/10/write-a-function-to-determine-if-a-number-is-a-power-of-2/</link> <comments>http://www.skorks.com/2010/10/write-a-function-to-determine-if-a-number-is-a-power-of-2/#comments</comments> <pubDate>Mon, 18 Oct 2010 11:59:57 +0000</pubDate> <dc:creator>Alan Skorkin</dc:creator> <category><![CDATA[coding]]></category> <category><![CDATA[bit hacking]]></category> <category><![CDATA[interviews]]></category> <category><![CDATA[recursion]]></category> <guid isPermaLink="false">http://www.skorks.com/?p=1882</guid> <description><![CDATA[One of my friends always asks that question when interviewing developers. Apparently it&#8217;s quite ambiguous and many people misunderstand &#8211; which is curious since I thought it was rather straight forward, but that&#8217;s not what this story is about. When he first told me about this question my brain immediately went ahead and tried to [...]
<strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2010/03/writing-a-more-ruby-ish-array-intersection-function-and-sorting-structs/' rel='bookmark' title='Writing A More Ruby-ish Array Intersection Function And Sorting Structs'>Writing A More Ruby-ish Array Intersection Function And Sorting Structs</a></li><li><a
href='http://www.skorks.com/2009/07/how-to-write-a-web-crawler-in-ruby/' rel='bookmark' title='How To Write A Simple Web Crawler In Ruby'>How To Write A Simple Web Crawler In Ruby</a></li><li><a
href='http://www.skorks.com/2008/10/are-you-using-the-full-power-of-spring-when-injecting-your-dependencies/' rel='bookmark' title='Are You Using The Full Power Of Spring When Injecting Your Dependencies?'>Are You Using The Full Power Of Spring When Injecting Your Dependencies?</a></li></ol>]]></description> <content:encoded><![CDATA[<p></p><p><img
align="left" alt="Power of 2" class="alignleft size-medium wp-image-1888" height="300" hspace="20" src="http://www.skorks.com/wp-content/uploads/2010/10/powerof2-225x300.jpg" title="Power of 2" vspace="5" width="225" />One of my friends always asks that question when interviewing developers. Apparently it&rsquo;s quite ambiguous and many people misunderstand &#8211; which is curious since I thought it was rather straight forward, but that&rsquo;s not what this story is about.</p><p>When he first told me about this question my brain immediately went ahead and tried to solve it, as your brain probably did as soon as you read the title of this post, if you&rsquo;re a developer :). After thinking about it for a couple of minutes I said that <strong>people should get <span
style="font-style: italic;">&quot;</span><em>extra points</em>&quot; if their function uses bit hackery to solve the problem.</strong> I later realised that the most interesting thing about this was how I arrived at that conclusion.</p><h2>The First Thing I Thought Of Was&#8230;</h2><p>Something along the lines of the following:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">def</span> power_of_2?<span style="color:#006600; font-weight:bold;">&#40;</span>number<span style="color:#006600; font-weight:bold;">&#41;</span>
 <span style="color:#0000FF; font-weight:bold;">return</span> <span style="color:#0000FF; font-weight:bold;">false</span> <span style="color:#9966CC; font-weight:bold;">if</span> number == <span style="color:#006666;">0</span>
 <span style="color:#9966CC; font-weight:bold;">while</span><span style="color:#006600; font-weight:bold;">&#40;</span>number <span style="color:#006600; font-weight:bold;">%</span> <span style="color:#006666;">2</span> == <span style="color:#006666;">0</span><span style="color:#006600; font-weight:bold;">&#41;</span>
   number = number <span style="color:#006600; font-weight:bold;">/</span> <span style="color:#006666;">2</span>
 <span style="color:#9966CC; font-weight:bold;">end</span>
 <span style="color:#0000FF; font-weight:bold;">return</span> <span style="color:#0000FF; font-weight:bold;">false</span> <span style="color:#9966CC; font-weight:bold;">if</span> number <span style="color:#006600; font-weight:bold;">&gt;</span> <span style="color:#006666;">1</span>
 <span style="color:#0000FF; font-weight:bold;">true</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>Of course it didn&rsquo;t spring forth out of my head fully formed like that, but the general gist was the same. This code solves the problem and, as you can see it is an iterative solution. The first programming language I learned was Java, the second was C, so you might say I was weaned on the &ldquo;<em>iterative approach</em>&rdquo;. So, whenever I am presented with a new problem, my mind always gropes for the iterative solution first. You might say I find it the most &ldquo;<em>natural</em>&rdquo; way to solve a problem.</p><h2>The Second Thing I Though Of Was&#8230;</h2><p>Something that looked like this:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">def</span> power_of_2?<span style="color:#006600; font-weight:bold;">&#40;</span>number<span style="color:#006600; font-weight:bold;">&#41;</span>
 <span style="color:#0000FF; font-weight:bold;">return</span> <span style="color:#0000FF; font-weight:bold;">true</span> <span style="color:#9966CC; font-weight:bold;">if</span> number == <span style="color:#006666;">1</span>
 <span style="color:#0000FF; font-weight:bold;">return</span> <span style="color:#0000FF; font-weight:bold;">false</span> <span style="color:#9966CC; font-weight:bold;">if</span> number == <span style="color:#006666;">0</span> <span style="color:#006600; font-weight:bold;">||</span> number <span style="color:#006600; font-weight:bold;">%</span> <span style="color:#006666;">2</span> != <span style="color:#006666;">0</span>
 power_of_2?<span style="color:#006600; font-weight:bold;">&#40;</span>number <span style="color:#006600; font-weight:bold;">/</span> <span style="color:#006666;">2</span><span style="color:#006600; font-weight:bold;">&#41;</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>Once again, it solves the problem, but this time recursively. Since those early days of Java and C, I&rsquo;ve played around with a whole bunch of languages, some just as <a
href="http://en.wikipedia.org/wiki/Imperative_programming" target="_blank">imperative</a>, but others were <a
href="http://en.wikipedia.org/wiki/Functional_programming" target="_blank">functional</a> or had functional capabilities. In such languages, recursion is the norm or at least a lot more prevalent. I find that I really like recursive solutions, it is often a very elegant way to get around some ugly code. Even though it is not the first stop my brain makes when it comes to solving programming problems, I find that these days I almost always consider if there might be a recursive way to approach a problem. I don&rsquo;t yet always instinctively feel if a problem has a recursive answer, but in many cases (<em>like this one</em>) it is fairly simple. As long as you can divide the initial problem into two smaller parts which are &ldquo;<em>equivalent</em>&rdquo; to the original problem, you can solve it recursively (<em>like when you split a tree you get two smaller trees</em>).</p><p><strong>I do wonder if programmers who were &ldquo;<em>weaned</em>&rdquo; on functional languages would go for the recursive solution first and if they find one do they bother to consider an iterative one at all</strong>? If you&rsquo;re such a developer, please share how you think.</p><h2>The Next Thing I Thought Of Wasn&rsquo;t Code</h2><p>It is only recently that I have started to give bit hacking the attention it deserves. It always used to seem like it wasn&rsquo;t worth the trouble, but since I&rsquo;ve learned a bit more about <a
href="http://en.wikipedia.org/wiki/Bloom_filter" target="_blank">Bloom Filters</a> (I&rsquo;ll <a
href="http://feeds.feedburner.com/softwaretechandmore">write it up one of these days</a>, hopefully make it a bit easier to grok) and finally got my hands on a copy of &ldquo;<a
href="http://www.amazon.com/Programming-Pearls-2nd-Jon-Bentley/dp/0201657880" target="_blank">Programming Pearls</a>&rdquo;, I&rsquo;ve become a bit of a convert. So, whenever I hear &ldquo;<em>power of 2</em>&rdquo; or anything similar these days, I always think that there is surely some bit hacking to be done :), which is exactly what my brain latched on to after the iterative and recursive approaches.</p><p>Consider that any number that is a power of 2 has exactly one bit set in its binary representation e.g.:</p><pre>2&nbsp;&nbsp;&nbsp; =&nbsp; 00000010
4&nbsp;&nbsp;&nbsp; =&nbsp; 00000100
8&nbsp;&nbsp;&nbsp; =&nbsp; 00001000
16&nbsp;&nbsp; =&nbsp; 00010000
256&nbsp; = 100000000</pre><p>If we subtract 1 from any of these numbers we get the following:</p><pre>2 - 1 = 1 = 00000001
4 - 1 = 3 = 00000011
8 - 1 = 7 = 00000111</pre><p>Every bit that was less significant than the original interesting bit (<em>the one that was set</em>), is now set while the original bit is unset. If we now do a bitwise and (<em>&amp; operator</em>) on the original number and the number which results when 1 is subtracted, we always get zero:</p><pre>2 &amp; 1 = 00000010 &amp; 00000001 = 0
8 &amp; 7 = 00001000 &amp; 00000111 = 0</pre><p>This will only happen if a number is a power of two. Therefore we can write the following code:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#9966CC; font-weight:bold;">def</span> power_of_2?<span style="color:#006600; font-weight:bold;">&#40;</span>number<span style="color:#006600; font-weight:bold;">&#41;</span>
 number != <span style="color:#006666;">0</span> <span style="color:#006600; font-weight:bold;">&amp;&amp;</span> number <span style="color:#006600; font-weight:bold;">&amp;</span> <span style="color:#006600; font-weight:bold;">&#40;</span>number <span style="color:#006600; font-weight:bold;">-</span> <span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#41;</span> == <span style="color:#006666;">0</span>
<span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div><p>This is apparently a pretty well known way to determine if a number is a power of 2. <strong>In the olden days when computing resources were scarce, every programmer probably knew about this shortcut and others like it, these days &#8211; not so much</strong>. But, you have to admit it is by far the cleanest solution of the three, even the need to treat zero separately makes it only marginally less elegant. Definitely a handy thing to remember (<em>it does still come up occasionally in the real world</em>).</p><p>I do wonder though, is anyone&rsquo;s brain sufficiently indoctrinated into bit hackery that they would jump to this solution first without even considering an iterative or a recursive one? Anyone?</p><h2>If Someone Ever Actually Asks You This In An Interview&#8230;</h2><p>The first thing to do would be to write (<em>or at least mention</em>) some tests. They don&rsquo;t have to be proper unit tests in this case (<em>although for more complex questions it may be warranted</em>), but you will never go wrong if you show that you&rsquo;re aware of testing. In this case something along the lines of this is probably fine:</p><div
class="wp_syntax"><div
class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;fail 2&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> !power_of_2? <span style="color:#006666;">2</span>
<span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;fail 4&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> !power_of_2? <span style="color:#006666;">4</span>
<span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;fail 1024&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> !power_of_2? <span style="color:#006666;">1024</span>
<span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;fail 1&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> !power_of_2? <span style="color:#006666;">1</span>
<span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;fail 1025&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> power_of_2? <span style="color:#006666;">1025</span>
<span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;fail 0&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> power_of_2? <span style="color:#006666;">0</span>
<span style="color:#CC0066; font-weight:bold;">puts</span> <span style="color:#996600;">&quot;fail 3&quot;</span> <span style="color:#9966CC; font-weight:bold;">if</span> power_of_2? <span style="color:#006666;">3</span></pre></div></div><p>0,1,2 and 3 are the cases where our function is likeliest to break the rest are included for completeness. The other thing that may be worth mentioning is the fact that the behaviour of the function for negative integers or floats is undefined, you can of course handle those cases, but it adds extra complexity and I think simply showing that you&rsquo;re aware of them is enough.</p><p>After this you can go ahead and write your function, remember &#8211; bit hackery means extra points (<em>as far as I am concerned anyway :)</em>).</p><p><span
style="font-size: 10px; font-family: trebuchet ms;">Image by <a
href="http://www.flickr.com/photos/matthijs/15828566/" target="_blank">.m for matthijs</a></span></p><p><strong>Related posts:</strong><ol><li><a
href='http://www.skorks.com/2010/03/writing-a-more-ruby-ish-array-intersection-function-and-sorting-structs/' rel='bookmark' title='Writing A More Ruby-ish Array Intersection Function And Sorting Structs'>Writing A More Ruby-ish Array Intersection Function And Sorting Structs</a></li><li><a
href='http://www.skorks.com/2009/07/how-to-write-a-web-crawler-in-ruby/' rel='bookmark' title='How To Write A Simple Web Crawler In Ruby'>How To Write A Simple Web Crawler In Ruby</a></li><li><a
href='http://www.skorks.com/2008/10/are-you-using-the-full-power-of-spring-when-injecting-your-dependencies/' rel='bookmark' title='Are You Using The Full Power Of Spring When Injecting Your Dependencies?'>Are You Using The Full Power Of Spring When Injecting Your Dependencies?</a></li></ol></p><div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=CbnnuLG7tiY:kfKbxoUWsWM:I9og5sOYxJI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=I9og5sOYxJI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=CbnnuLG7tiY:kfKbxoUWsWM:cGdyc7Q-1BI"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=cGdyc7Q-1BI" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=CbnnuLG7tiY:kfKbxoUWsWM:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=CbnnuLG7tiY:kfKbxoUWsWM:gIN9vFwOqvQ" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=CbnnuLG7tiY:kfKbxoUWsWM:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=CbnnuLG7tiY:kfKbxoUWsWM:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/softwaretechandmore?a=CbnnuLG7tiY:kfKbxoUWsWM:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/softwaretechandmore?i=CbnnuLG7tiY:kfKbxoUWsWM:F7zBnMyn0Lo" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/softwaretechandmore/~4/CbnnuLG7tiY" height="1" width="1"/>]]></content:encoded> <wfw:commentRss>http://www.skorks.com/2010/10/write-a-function-to-determine-if-a-number-is-a-power-of-2/feed/</wfw:commentRss> <slash:comments>57</slash:comments> </item> </channel> </rss><!-- Dynamic page generated in 8.825 seconds. --><!-- Cached page generated by WP-Super-Cache on 2012-05-15 16:16:33 --><!-- Compression = gzip -->

