<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:posterous="http://posterous.com/help/rss/1.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:media="http://search.yahoo.com/mrss/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
  <channel>
    <title>Alex Bowe</title>
    <link>http://www.alexbowe.com</link>
    <description>Most recent posts at Alex Bowe</description>
    <generator>posterous.com</generator>
    <link xmlns="http://www.w3.org/2005/Atom" href="http://posterous.com/api/sup_update#806a65dd6" type="application/json" rel="http://api.friendfeed.com/2008/03#sup" />
    
    
    <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/alexbowe" /><feedburner:info uri="alexbowe" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://posterous.superfeedr.com/" /><item>
      <pubDate>Tue, 06 Sep 2011 02:32:00 -0700</pubDate>
      <title>Failing at Google Interviews</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/4oArooSG440/failing-at-google-interviews</link>
      <guid isPermaLink="false">http://www.alexbowe.com/failing-at-google-interviews</guid>
      <description>&lt;p&gt;
	&lt;p&gt;I&amp;rsquo;ve participated in about four &lt;em&gt;sets&lt;/em&gt; of interviews (of about 3 interviews each) for various Google positions. I&amp;rsquo;m still not a Googler though, which I guess indicates that I&amp;rsquo;m not the best person to give this advice. However, I think it&amp;rsquo;s about time I put in my 0.002372757892 bitcoins. I recently did exactly this to help my brother prepare for his interviews and the guy kicked ass. If he gets the job I&amp;rsquo;m going to take as much credit for it as I can ;)&lt;/p&gt;

&lt;p&gt;During my interviews I didn&amp;rsquo;t sign an NDA, but I do respect the effort that interviewers put into preparing their questions, so I&amp;rsquo;m not going to discuss them. That doesn&amp;rsquo;t matter though, because you probably won&amp;rsquo;t get the same questions anyway :) and the algorithm stuff is far from the whole story.&lt;/p&gt;

&lt;p&gt;This post is mainly about the rituals I perform during preparation for the interviews, and the lessons I have learned from them. I am of the strong opinion that everyone should apply for a job at Google.&lt;/p&gt;

&lt;h2&gt;Why Should I?!&lt;/h2&gt;

&lt;p&gt;Not everyone wants to work for Google, but there are valuable side effects to a Google interview. Even if you don&amp;rsquo;t think you want a job there, or think that you are under-qualified, it is a great idea to just try for one. The absolute worst thing that could happen is that you have fun and learn something.&lt;/p&gt;

&lt;p&gt;A couple of the things I learned are algorithms for (weighted) random sampling, queueing, vector calculus, and some cool applications of bloom filters.&lt;/p&gt;

&lt;p&gt;The people you will talk to are smart, and it&amp;rsquo;s a fun experience to be able to solve problems with smart and passionate people. One of my interviews was just a discussion about the good and bad parts (in our opinions) of a bunch of programming languages (Scheme, Python, C, C++, Java, Erlang). We discussed &lt;a href="http://mitpress.mit.edu/sicp/"&gt;SICP&lt;/a&gt; and the current state of education, and he recommended some research papers for me to read. All the intriguing questions and back-and-forth made me feel like I was being taught by a modern &lt;a href="http://en.wikipedia.org/wiki/Socratic_method"&gt;Socrates&lt;/a&gt; (perhaps Google should consider offering a Computer Science degree taught entirely with interviews :P).&lt;/p&gt;

&lt;p&gt;Sadly, a subsequent interview stumped me because I didn&amp;rsquo;t understand the requirements. Even the stumping interviews have given me a great chance to realise some gaps in my knowledge and refine my approach. I knew that it was important to get the requirements right, but this really drove it home.&lt;/p&gt;

&lt;p&gt;I hope I&amp;rsquo;ve got you curious about what you could learn from a Google interview. If you are worried about the possible rejection, treat it as a win in a game of &lt;a href="http://rejectiontherapy.com/rules/"&gt;Rejection Therapy&lt;/a&gt;. You can re-apply as many times as you like, so you could also think of it as &lt;a href="http://en.wikipedia.org/wiki/Test-driven_development"&gt;TDD&lt;/a&gt; for your skills, and you like TDD, right?&lt;/p&gt;

&lt;h2&gt;How To Prepare &amp;ndash; Technical&lt;/h2&gt;

&lt;p&gt;When you are accepted for a phone interview, Google sends you an email giving you tips on how to prepare. Interestingly, this has been a different list each time. I&amp;rsquo;ll discuss the one I liked the most. They only give advice on the technical side. I will also discuss what I think are some other important aspects to be mindful of.&lt;/p&gt;

&lt;p&gt;First of all, you are going to want to practice. Even if you have been coding every day for years, you might not be used to the short question style. &lt;a href="http://projecteuler.net/"&gt;Project Euler&lt;/a&gt; is the bomb for this. You will learn some maths too, which will come in handy, and it builds confidence. Do at least one of these every day until your interview.&lt;/p&gt;

&lt;p&gt;You will also want some reading material. Google recommended &lt;a href="http://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.html"&gt;this post by Steve Yegge&lt;/a&gt;, which does a good job of calming you. They also recommended &lt;a href="http://sites.google.com/site/steveyegge2/five-essential-phone-screen-questions"&gt;another post by Steve Yegge&lt;/a&gt; where he covers some styles of questions that are likely to be asked. Yegge recommends
a particular book very highly &amp;ndash; &lt;a href="http://www.algorist.com/"&gt;The Algorithm Design Manual&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="" src="http://img138.imageshack.us/img138/3826/thealgorithmdesignmanua.jpg" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;blockquote class="posterous_medium_quote"&gt;&lt;p&gt;More than any other book it helped me understand just how astonishingly commonplace (and important)
graph problems are – they should be part of every working programmer&amp;rsquo;s toolkit. The book also covers
basic data structures and sorting algorithms, which is a nice bonus. But the gold mine is the second half
of the book, which is a sort of encyclopedia of 1-pagers on zillions of useful problems and various ways to
solve them, without too much detail. Almost every 1-pager has a simple picture, making it easy to remember.
This is a great way to learn how to identify hundreds of problem types.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;I haven&amp;rsquo;t read the whole thing, but what I have read of it is eye and mind opening. This wasn&amp;rsquo;t recommended to me directly by Google recruiting staff, but one of my interviewers emailed me a bunch of links after, including a link to the page for this book. There was a recent review of this book featured on &lt;a href="http://eriwen.com/books/best-algorithms-book/"&gt;Hacker News&lt;/a&gt;. It is very good. The author, Steve Skiena, also offers his &lt;a href="http://www.cs.sunysb.edu/~algorith/video-lectures/"&gt;lecture videos and slides&lt;/a&gt; &amp;ndash; kick back and watch them with a beer after work/uni.&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpcmbelllabsc_rtdya" height="254" src="http://getfile5.posterous.com/getfile/files.posterous.com/alexbowe/fAEjFCHfwGHIhqlCBitwDJiugCxpdxCenwxJyatdzDtrbDCJCbdhttzmkHbB/media_httpcmbelllabsc_rtdya.jpg.scaled500.jpg" width="202" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;If the size of The Algorithm Design Manual is daunting and you want a short book to conquer quickly (for morale reasons), give &lt;a href="http://cm.bell-labs.com/cm/cs/pearls/"&gt;Programming Pearls&lt;/a&gt; a read. Answer as many questions in it as you can.&lt;/p&gt;

&lt;p&gt;The phone interviews are usually accompanied by a Google doc for you to program into. I nominate Python as my preferred language, but they usually make me use C or C++ (they often say I can use Java too). I was rusty with my C++ syntax, but they didn&amp;rsquo;t seem to mind. I just explained things like using templates, even though I can never remember the syntax for the tricks.&lt;/p&gt;

&lt;p&gt;Speaking of tricks, you get style points for using features of the language that are less well known. I had an interviewer say he was impressed because I used Python&amp;rsquo;s tuple unpacking (a simple kind of pattern matching, e.g. &lt;code&gt;(a, b) = (b, a)&lt;/code&gt;), list comprehensions, map/reduce, generators and lambdas. I guess decorators could help make you look cool, too. Only use them if they are &lt;em&gt;useful&lt;/em&gt; though!&lt;/p&gt;
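
&lt;p&gt;As a rough illustration of the features mentioned above, here is a small sketch. The names &lt;code&gt;logged&lt;/code&gt;, &lt;code&gt;square&lt;/code&gt; and &lt;code&gt;evens&lt;/code&gt; are made up for this example and have nothing to do with any actual interview question.&lt;/p&gt;

```python
from functools import reduce


def logged(func):
    """Decorator: record the arguments of every call on the wrapper itself."""
    def wrapper(*args):
        wrapper.calls.append(args)
        return func(*args)
    wrapper.calls = []
    return wrapper


@logged
def square(x):
    return x * x


def evens(limit):
    """Generator: lazily yield the even numbers below limit."""
    for n in range(0, limit, 2):
        yield n


a, b = 1, 2
a, b = b, a                                          # tuple unpacking swaps in one line
squares = [square(n) for n in evens(10)]             # list comprehension over a generator
total = reduce(lambda acc, n: acc + n, squares, 0)   # reduce with a lambda
```

&lt;p&gt;In real code &lt;code&gt;sum(squares)&lt;/code&gt; would be the clearer choice than &lt;code&gt;reduce&lt;/code&gt;, which is the point about only using a feature when it is useful.&lt;/p&gt;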

&lt;h2&gt;How To Prepare &amp;ndash; Non-Technical&lt;/h2&gt;

&lt;p&gt;There will also be a few non-technical questions. When I did my first one, a friend recommended that I have answers ready for cookie-cutter questions like &amp;ldquo;Where do you see yourself in ten years?&amp;rdquo; and &amp;ldquo;Why do you want to work for Google?&amp;rdquo;. Don&amp;rsquo;t bother with that! Do you really think one of the biggest companies in the world will waste their time asking questions like that? Every candidate would give the same answer, something about leading a team and how Google would let you contribute to society, or whatever (great, but everyone wants that).&lt;/p&gt;

&lt;p&gt;They &lt;em&gt;will&lt;/em&gt; ask you about your previous work and education, though, and pretty much always ask about a technical challenge you overcame. I like to talk about a fun iterative A* search I did at my first job (and why we needed it to be iterative). You can probably think of something, don&amp;rsquo;t stress, but better to think of it before the interview.&lt;/p&gt;

&lt;p&gt;And have a question ready for when they let you have your turn. Don&amp;rsquo;t search for &lt;em&gt;&amp;ldquo;good questions to ask in technical interviews&amp;rdquo;&lt;/em&gt;, because if it isn&amp;rsquo;t &lt;em&gt;your&lt;/em&gt; question, you might be uninterested if the interviewer talks about it for a long time. Think of something that you could have a discussion about, something you are opinionated about. Think of something you hated at a previous job (but don&amp;rsquo;t come across as bitter), how you would improve that, and then ask them if they do that. For me, I was interested in the code review process at Google, and what sort of project a beginner would be assigned to.&lt;/p&gt;

&lt;p&gt;I know someone who asked questions from &lt;a href="http://www.joelonsoftware.com/articles/fog0000000043.html"&gt;The Joel Test&lt;/a&gt;. The interviewer might recognise these questions and either congratulate you on reading blogs about your field, or quietly yawn to themselves. It&amp;rsquo;s up to you if you want to take that risk. I definitely think it&amp;rsquo;s better to ask about something that has the potential to annoy you on a personal level if they don&amp;rsquo;t give you the answer you want ;) It&amp;rsquo;s subtle, but people can detect your healthy arrogance and passion.&lt;/p&gt;

&lt;p&gt;If you have a tech blog, refer to it. I&amp;rsquo;ve had interviewers discuss my posts with me (which they found from my resume). Blogs aren&amp;rsquo;t hard to write, and even a few posts on an otherwise barren blog will make you look more thoughtful.&lt;/p&gt;

&lt;p&gt;Finally, the absolute best way to prepare for a Google interview is to do more Google interviews, so if you fail, good for you! ;)&lt;/p&gt;

&lt;h2&gt;Just Before the Interview&lt;/h2&gt;

&lt;p&gt;Here are a few things that help me handle the pressure before an interview.&lt;/p&gt;

&lt;p&gt;One time I was walking to an interview in the city (not a Google interview) and I was really nervous, even though I didn&amp;rsquo;t care either way if I got the job. I thought about how the nerves wouldn&amp;rsquo;t be an issue after the interview, because I&amp;rsquo;d have already done the scary thing by then. I couldn&amp;rsquo;t time travel, but I instead wondered if there is a way to use up the nerves on something else. There was a girl walking next to me, so I
turned to her and said she was dressed very nicely. She said a timid &amp;ldquo;thank you&amp;rdquo; and picked up pace to get away from me. I laughed at my failure, but suddenly I didn&amp;rsquo;t feel so scared about the interview. I think this is a great example of why &lt;a href="http://rejectiontherapy.com/rules/"&gt;Rejection Therapy&lt;/a&gt; is worthwhile.&lt;/p&gt;

&lt;p&gt;So yeah, talk to a stranger. If you are waiting at home for a phone call though, another thing I do is jumping jacks, dancing, or jogging on the spot, just to make myself forget the other reason my heart is pounding so fast.&lt;/p&gt;

&lt;h2&gt;During the Interview&lt;/h2&gt;

&lt;p&gt;If you are doing a phone interview, answer it standing up (you can sit down after) and pace around a little bit. Smile as you talk, as well. You should also take down their name on paper ready to use a few times casually. These are tricks from the infamous &lt;a href="http://en.wikipedia.org/wiki/How_to_Win_Friends_and_Influence_People"&gt;How to Win Friends and Influence People&lt;/a&gt;. Maybe these alone won&amp;rsquo;t make you likeable, but I think they cause you to think about the other person and stop being so self-conscious, which helps you to relax. You&amp;rsquo;ll be &lt;a href="http://www.youtube.com/watch?v=Fx93E0wTih4"&gt;one charming motherfucking pig&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Take some time to think before answering, and especially to seek clarification on the questions. Ask what the data representation is. I&amp;rsquo;ve found that they tend to say &amp;ldquo;whatever you want&amp;rdquo;. In a graph question, I said &amp;ldquo;Okay, then it&amp;rsquo;s an adjacency matrix&amp;rdquo;, which made the question over and done with in ten seconds. The interviewer seemed to like that, so don&amp;rsquo;t be afraid to be a (humble) smart ass.&lt;/p&gt;

&lt;p&gt;You might recognise the adjacency matrix as potentially being a very poor choice, depending on the nature of the graph. I did discuss when this might not be a good option. In fact, for every question, I start off by describing a naive approach, and then refine it. This helps to verify the question requirements, and gives you an easy starting point.&lt;/p&gt;
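
&lt;p&gt;To make the trade-off concrete, here is a minimal sketch comparing the two common representations on a hypothetical &amp;ldquo;is there an edge between u and v?&amp;rdquo; query (not the actual interview question). The function names are my own for illustration.&lt;/p&gt;

```python
def adjacency_matrix(n, edges):
    """n x n boolean matrix for an undirected graph; always O(n^2) space."""
    matrix = [[False] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = True
        matrix[v][u] = True
    return matrix


def adjacency_list(n, edges):
    """One neighbour list per vertex; O(n + number of edges) space."""
    neighbours = [[] for _ in range(n)]
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    return neighbours


edges = [(0, 1), (1, 2), (2, 3)]
matrix = adjacency_matrix(4, edges)
lists = adjacency_list(4, edges)

has_edge_matrix = matrix[0][1]   # constant-time lookup
has_edge_list = 1 in lists[0]    # scans the neighbours of vertex 0
```

&lt;p&gt;The matrix answers edge queries in constant time but wastes space on sparse graphs, which is the kind of caveat worth saying out loud after giving the cheeky answer.&lt;/p&gt;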

&lt;p&gt;One last thing! Google schedules the interview to be from 45 minutes to an hour. I have had awkward moments at the end of interviews where the interviewer mentions that our time is nearly up, and &lt;em&gt;then&lt;/em&gt; asks another question, or asks if I have any questions. It made me feel like he was in a rush, so I didn&amp;rsquo;t feel like expanding on things much. Now, I recommend taking as much time as they will give you. Keep talking until they hang up on you if you have to :) Although it might help to say &amp;ldquo;I don&amp;rsquo;t mind if we go over, as long as I&amp;rsquo;m not keeping you from something&amp;rdquo; when the interviewer mentions the time.&lt;/p&gt;

&lt;h2&gt;Reflect&lt;/h2&gt;

&lt;p&gt;&lt;a href="http://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.html"&gt;Steve Yegge&lt;/a&gt; says there are lots of smart Googlers who didn&amp;rsquo;t get in until their third attempt (I still haven&amp;rsquo;t gotten in after my fourth, and I don&amp;rsquo;t think I&amp;rsquo;m stupid). As I mentioned, I&amp;rsquo;m writing this post because I found the process of doing a Google interview at all to be very rewarding.&lt;/p&gt;

&lt;p&gt;It is important to reflect afterwards in order to reap the full benefits of interviewing at Google. If you did well, why? But more importantly, if you feel you did poorly, why? Google won&amp;rsquo;t give feedback, which can be a bit depressing at times. After each interview write notes about what you felt went well and what didn&amp;rsquo;t &amp;ndash; this way you can look back if you don&amp;rsquo;t get the job, and decide what you need to work on. This post is the culmination of my reflections and the notes &amp;ndash; if you decide to write a blog post, I&amp;rsquo;d enjoy reading it and will link it here.&lt;/p&gt;

&lt;p&gt;If you want more blog posts to read about how to get better at Computer Science, I recently found &lt;a href="http://matt.might.net/articles/what-cs-majors-should-know/"&gt;this post by Matt Might&lt;/a&gt; to be a good target to aim for. Check out &lt;a href="http://profshonle.blogspot.com/2010/08/ten-things-every-computer-science-major.html"&gt;Ten Things Every Computer Science Major Should Learn&lt;/a&gt; as well, and my previous post &lt;a href="http://www.alexbowe.com/advice-to-cs-undergrads"&gt;Advice to CS Undergrads&lt;/a&gt; (the links at the end in particular).&lt;/p&gt;

&lt;p&gt;Have you really read this far? Consider adding me to &lt;a href="http://www.twitter.com/alexbowe"&gt;Twitter&lt;/a&gt; and telling me what you thought :)&lt;/p&gt;
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/failing-at-google-interviews"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/failing-at-google-interviews#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/jpeg" height="421" width="300" url="http://getfile5.posterous.com/getfile/files.posterous.com/alexbowe/fqjFupvAlkdAtHpCwopGDtBotunspdxeiggnyDnzHEbqfByatcxkkpGetuJC/media_httpimg138image_cCzIm.jpg">
        <media:thumbnail height="421" width="300" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/fqjFupvAlkdAtHpCwopGDtBotunspdxeiggnyDnzHEbqfByatcxkkpGetuJC/media_httpimg138image_cCzIm.jpg.scaled500.jpg" />
      </media:content>
      <media:content type="image/jpeg" height="254" width="202" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/fAEjFCHfwGHIhqlCBitwDJiugCxpdxCenwxJyatdzDtrbDCJCbdhttzmkHbB/media_httpcmbelllabsc_rtdya.jpg">
        <media:thumbnail height="254" width="202" url="http://getfile5.posterous.com/getfile/files.posterous.com/alexbowe/fAEjFCHfwGHIhqlCBitwDJiugCxpdxCenwxJyatdzDtrbDCJCbdhttzmkHbB/media_httpcmbelllabsc_rtdya.jpg.scaled500.jpg" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/failing-at-google-interviews</feedburner:origLink></item>
    <item>
      <pubDate>Tue, 23 Aug 2011 07:02:00 -0700</pubDate>
      <title>FM-Indexes and Backwards Search</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/ff6FTc8v-NI/fm-indexes-and-backwards-search-32172</link>
      <guid isPermaLink="false">http://www.alexbowe.com/fm-indexes-and-backwards-search-32172</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/bkDoDCBtcxiGvGFyqglwfxHjyoCuyxwmemiaGarwkCpqqnlFBAGcgBJxgAvI/media_httpalexbowes3a_mxdzx.png.scaled1000.png"&gt;&lt;img alt="Media_httpalexbowes3a_mxdzx" height="193" src="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/bkDoDCBtcxiGvGFyqglwfxHjyoCuyxwmemiaGarwkCpqqnlFBAGcgBJxgAvI/media_httpalexbowes3a_mxdzx.png.scaled500.png" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;Last time (way back in June! I have got to start blogging consistently again) I discussed a gorgeous data structure called the Wavelet Tree. When a Wavelet Tree is stored using RRR sequences, it can answer rank and select operations in O(log A) time, where A is the size of the alphabet. If the size of the alphabet is 2, we could just use RRR by itself, which answers rank and select in O(1) time for binary strings. RRR also compresses the binary strings, and hence compresses a Wavelet Tree which is stored using RRR.&lt;/p&gt;
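
&lt;p&gt;As a reminder of what the two operations mean, here is a naive sketch that works on a plain Python string. It runs in linear time and is only meant to pin down the definitions; it is not how RRR or a Wavelet Tree answers these queries. Positions are 1-based to match the rest of this post.&lt;/p&gt;

```python
def rank(text, c, i):
    """Number of occurrences of c among the first i symbols of text."""
    return text[:i].count(c)


def select(text, c, k):
    """1-based position of the k-th occurrence of c, or None if there is none."""
    position = 0
    for _ in range(k):
        # str.find returns -1 when c does not occur again, making position 0.
        position = text.find(c, position) + 1
        if position == 0:
            return None
    return position


text = "mississippi"
s_in_first_four = rank(text, "s", 4)   # counts the 's' symbols in "miss"
second_s = select(text, "s", 2)        # position of the second 's'
```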

&lt;p&gt;So far so good, but I suspect rank and select queries seem to be of limited use right now (although once you are familiar with the available structures, applications show up often). One of the neatest uses of rank that I&amp;rsquo;ve seen is in substring search, which is certainly a wide-reaching problem (for a very recent application to genome assembly, see Jared Simpson&amp;rsquo;s paper from 2010 called &lt;em&gt;Efficient construction of an assembly string graph using the FM-index&lt;/em&gt;).&lt;/p&gt;

&lt;p&gt;Also, my apologies, but I am using 1-basing instead of 0-basing, because that is how I did my diagrams a year ago :)
(and bear with my lack of nicely typeset math; I am migrating to GitHub Pages, where I will be able to use MathJax soon)&lt;/p&gt;

&lt;h1&gt;Suffix Arrays&lt;/h1&gt;

&lt;p&gt;There is a variety of Suffix Array construction algorithms, including some O(N) ones (Puglisi et al. 2007). However, I will explain it from the most common (and intuitive) angle.&lt;/p&gt;

&lt;p&gt;In its simplest form, a suffix array can be constructed for a string S[1..N] like so:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Construct an array of pointers to all suffixes S[1..N], S[2..N], &amp;hellip;, S[N..N].&lt;/li&gt;
&lt;li&gt;Sort these pointers by the lexicographical (i.e. alphabetical) ordering of their associated suffixes.&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_giuel" height="38" src="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/eIsactEgjhxsuDablqiHddeBFpurqIbDBgFwivEyfncpvsjcHygbyzaBIcck/media_httpalexbowes3a_gIuEl.png.scaled500.png" width="264" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;For example, the sorting of the string &amp;lsquo;mississippi&amp;rsquo; with terminating character $ would look like this:&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_qbwxw" height="218" src="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/oAAFCAAhHnEegHeChHhwsEcfvsodBJumnvyEvuarCvGmqDbAzerBshrpGCzu/media_httpalexbowes3a_qBwxw.png.scaled500.png" width="332" /&gt;
&lt;/div&gt;
&lt;/p&gt;
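
&lt;p&gt;The two steps above can be sketched directly in Python. This is the naive construction (sorting whole suffixes), not one of the O(N) algorithms, and the function name is my own. It relies on &amp;lsquo;$&amp;rsquo; sorting before the lowercase letters, which holds for ASCII.&lt;/p&gt;

```python
def suffix_array(s):
    """1-based start positions of the suffixes of s, in lexicographical order."""
    # Step 1: the "pointers" are the positions 1..N.
    # Step 2: sort them by the suffix that starts at each position.
    return sorted(range(1, len(s) + 1), key=lambda i: s[i - 1:])


S = "mississippi$"
SA = suffix_array(S)
sorted_suffixes = [S[p - 1:] for p in SA]
```

&lt;p&gt;For &amp;lsquo;mississippi$&amp;rsquo; this gives SA = [12, 11, 8, 5, 2, 1, 10, 9, 7, 4, 6, 3].&lt;/p&gt;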

&lt;h1&gt;Burrows-Wheeler Transform&lt;/h1&gt;

&lt;p&gt;The &lt;a href="http://en.wikipedia.org/wiki/Burrows-Wheeler_transform"&gt;Burrows-Wheeler Transform&lt;/a&gt; (BWT) was developed by Burrows and Wheeler to reversibly permute a string in such a way that characters from repeated substrings would be clustered together. This makes it useful as a first step before compression schemes such as run-length encoding.&lt;/p&gt;

&lt;p&gt;It is not the point of this post to explain how it works, but it is closely linked to Suffix Arrays: BWT[i] = S[SA[i] &amp;ndash; 1] for the original string S, Suffix Array SA, and Burrows-Wheeler Transform string BWT, wrapping around so that BWT[i] = $ when SA[i] = 1. In other words, the ith symbol of the BWT is the symbol &lt;em&gt;just before&lt;/em&gt; the ith suffix. See the image below:&lt;/p&gt;

&lt;p&gt;In particular, BWT[1] = S[SA[1] &amp;ndash; 1] = S[12 &amp;ndash; 1] = S[11] = &amp;lsquo;i&amp;rsquo; (or the 11th symbol from the original string &amp;lsquo;mississippi&amp;rsquo;)&lt;/p&gt;

&lt;p&gt;Ferragina and Manzini (2000) recommended that a BWT be paired with a Suffix Array, creating the so-called FM-Index, which enables backward search. The BWT also lets us reconstruct the original string S (not covered in this post), allowing us to discard the original document &amp;ndash; indexes with this property are known as &lt;em&gt;self-indexes&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_rejrx" height="219" src="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/yBvqdHucrfpmApHcjgFrGlaHDgDJlGJdbeddmmoiuaDztlGtHCrJaqtemdxn/media_httpalexbowes3a_reJrx.png.scaled500.png" width="259" /&gt;
&lt;/div&gt;
&lt;/p&gt;
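
&lt;p&gt;Given a Suffix Array, the relationship BWT[i] = S[SA[i] &amp;ndash; 1] is a one-liner. The sketch below uses my own function name and rebuilds the Suffix Array naively so that it runs on its own.&lt;/p&gt;

```python
def bwt_from_suffix_array(s, sa):
    """BWT[i] is the symbol just before the i-th sorted suffix, wrapping around."""
    # sa holds 1-based positions, so the preceding symbol is at 0-based index
    # p - 2. For p == 1 that index is -1, which Python wraps to the final '$'.
    return "".join(s[p - 2] for p in sa)


S = "mississippi$"
SA = sorted(range(1, len(S) + 1), key=lambda i: S[i - 1:])
BWT = bwt_from_suffix_array(S, SA)
```

&lt;p&gt;For &amp;lsquo;mississippi$&amp;rsquo; this produces &amp;lsquo;ipssm$pissii&amp;rsquo;, with the &amp;lsquo;$&amp;rsquo; landing at position 6 because SA[6] = 1.&lt;/p&gt;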

&lt;h1&gt;Backward Search&lt;/h1&gt;

&lt;p&gt;This is where rank comes in. If it is hard to follow (it is certainly not easy to explain) then hang in there until the example, which should clear things up.&lt;/p&gt;

&lt;p&gt;Since any pattern P in S (the original string) is a &lt;em&gt;prefix&lt;/em&gt; of a &lt;em&gt;suffix&lt;/em&gt; (our Suffix Array stores suffixes), and because the suffixes are lexicographically ordered, all occurrences of a search pattern P lie in a contiguous portion of the Suffix Array. One way to home in on our search term is to use successive binary searches. Storing the BWT lets us use a cooler way, though&amp;hellip;&lt;/p&gt;
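
&lt;p&gt;For comparison, the binary search approach can be sketched with Python&amp;rsquo;s &lt;code&gt;bisect&lt;/code&gt; module. This is an illustration only: it materialises a truncated copy of every suffix, which a real implementation would avoid by comparing against S in place, and &lt;code&gt;find_range&lt;/code&gt; is a name I made up.&lt;/p&gt;

```python
from bisect import bisect_left, bisect_right


def find_range(s, sa, pattern):
    """1-based inclusive (start, end) of the suffixes prefixed by pattern, or None."""
    # Truncating each sorted suffix to len(pattern) symbols keeps the list sorted,
    # so every match sits in one contiguous run that two binary searches can find.
    prefixes = [s[p - 1:p - 1 + len(pattern)] for p in sa]
    start = bisect_left(prefixes, pattern)
    end = bisect_right(prefixes, pattern)
    if start == end:
        return None
    return (start + 1, end)


S = "mississippi$"
SA = sorted(range(1, len(S) + 1), key=lambda i: S[i - 1:])
iss_range = find_range(S, SA, "iss")
```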

&lt;p&gt;Backward search instead utilises the BWT in a series of paired &lt;a href="http://www.alexbowe.com/wavelet-trees"&gt;rank queries&lt;/a&gt; (which can be answered with a Wavelet Tree, for example), improving the query performance considerably.
Backward search issues p pairs of rank queries, where p denotes the length of the pattern P. The paired rank queries are:&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpmathurlcom4_aychb" height="43" src="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/lgBawjynwzikxkeevbCzDlEsJbliwmotcjfFaHvuEFeJvJtEevxnkjhufuhB/media_httpmathurlcom4_ayChb.png.scaled500.png" width="253" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;Where s denotes the start of the range and e is the end of the range. Initially s = 1 and e = N. If at any stage e &amp;lt; s, then P doesn&amp;rsquo;t exist in S.&lt;/p&gt;

&lt;p&gt;As for C&amp;hellip; C is a lookup table where C[c] is the number of symbols in S which sort lexicographically before the symbol c (in the queries above, c = P[i]). What does this mean? Well, C would look like this:&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_dauos" height="62" src="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/euyCgesfserHbEtuvzvfpqkxnFieakDfacDswansminBEcykDgazyAgafAzn/media_httpalexbowes3a_DAuos.png.scaled500.png" width="168" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;Which means that there aren&amp;rsquo;t any characters in S that sort before &amp;lsquo;$&amp;rsquo;, one that sorts before &amp;lsquo;i&amp;rsquo; (the &amp;lsquo;$&amp;rsquo;), five that sort before &amp;lsquo;m&amp;rsquo; (the &amp;lsquo;$&amp;rsquo; and the &amp;lsquo;i&amp;rsquo;s) and so on. In the example I store it in a less compact way as the column &lt;em&gt;F&lt;/em&gt; (which contains the first symbol of each suffix &amp;ndash; essentially the same information, since the suffixes are sorted), so it might be easier to follow (wishful thinking).&lt;/p&gt;
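
&lt;p&gt;A minimal sketch of building C from symbol counts, assuming the string already ends with &amp;lsquo;$&amp;rsquo; (the function name &lt;code&gt;build_c&lt;/code&gt; is my own):&lt;/p&gt;

```python
from collections import Counter


def build_c(s):
    """C[c] = number of symbols in s that sort before c."""
    counts = Counter(s)
    c_table = {}
    total = 0
    for symbol in sorted(counts):
        c_table[symbol] = total    # everything counted so far sorts before symbol
        total += counts[symbol]
    return c_table


C = build_c("mississippi$")
```

&lt;p&gt;Equivalently, C[c] is the 0-based index of the first occurrence of c in the column F, which is why the two hold the same information.&lt;/p&gt;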

&lt;p&gt;Why is this called backwards search? Well, our index variable &lt;em&gt;i&lt;/em&gt; actually starts at |P| (the last character of our search pattern), and decreases to 1. This maintains the invariant that SA[s..e] contains all the suffixes of which P[i..|P|] is a prefix, and hence all locations of P[i..|P|] in S.&lt;/p&gt;

&lt;h1&gt;Example&lt;/h1&gt;

&lt;p&gt;Let&amp;rsquo;s practice this magic spell&amp;hellip;
Let our search pattern P be &amp;lsquo;iss&amp;rsquo;, and our string S be &amp;lsquo;mississippi&amp;rsquo;. Starting with i = 3, c = P[i] = &amp;lsquo;s&amp;rsquo;. The working for each rank query is shown below each figure. I&amp;rsquo;m representing the current symbol as &lt;em&gt;c&lt;/em&gt; to avoid confusion between &amp;lsquo;s&amp;rsquo; and s and s′.&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_cvian" height="203" src="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/ihHjbgnieymvfnoqpBjGgiqFjhvcxvratxpjjHnCypCuqedCAjdDdvemolaC/media_httpalexbowes3a_cvian.png.scaled500.png" width="252" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;Here (above) we are at the first stage of backwards search for &amp;lsquo;iss&amp;rsquo; on &amp;lsquo;mississippi&amp;rsquo; string &amp;ndash; before any rank queries have been made.
&lt;strong&gt;Note&lt;/strong&gt;: we do not store the document anymore &amp;ndash; the gray text &amp;ndash; and we don&amp;rsquo;t store F, but instead store C &amp;ndash; see section on &lt;strong&gt;Backward Search&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Starting from s = 1 and e = 12 (as above) and c = P[i] = &amp;lsquo;s&amp;rsquo; where i = 3, we make our first two rank queries:&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpmathurlcom3_jqexj" height="43" src="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/rrpegeCGyptDyeHqeymlsbrCAAyHHFsICpFkAptgeccIduqsuCxmgquDFEuC/media_httpmathurlcom3_JqExj.png.scaled500.png" width="303" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_ycebo" height="204" src="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/fijsEdkIfsodwlypjvglyytipbChnkcoChrqhuHHwpFHjvuxqweusGmExiFm/media_httpalexbowes3a_ycEBo.png.scaled500.png" width="253" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;After the above, we are now at the &lt;em&gt;second&lt;/em&gt; stage of backwards search for &amp;lsquo;iss&amp;rsquo; on &amp;lsquo;mississippi&amp;rsquo; string. All the occurrences of &amp;lsquo;s&amp;rsquo; lie in SA[9..12].&lt;/p&gt;

&lt;p&gt;From s = 9 and e = 12, and c = P[i] = &amp;lsquo;s&amp;rsquo; where i = 2, our next two rank queries are:&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpmathurlcom3_sdxgn" height="43" src="http://getfile7.posterous.com/getfile/files.posterous.com/alexbowe/buGjfwfbbrABgDljdijpnzecxrjhnrhtmIyEgJfaqeFJvshrrHaxhEchnlDp/media_httpmathurlcom3_sdxGn.png.scaled500.png" width="314" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_arabe" height="205" src="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/ocjcqwnBfClusugoDgACjnbqDplycpECkqiolqkdmmplvnrEifoswnankAws/media_httpalexbowes3a_arAbE.png.scaled500.png" width="260" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;We are now at the &lt;em&gt;third&lt;/em&gt; stage of backwards search for &amp;lsquo;iss&amp;rsquo; on &amp;lsquo;mississippi&amp;rsquo; string. All the occurrences of &amp;lsquo;ss&amp;rsquo; lie in SA[11..12].&lt;/p&gt;

&lt;p&gt;From s = 11 and e = 12, and c = P[i] = &amp;lsquo;i&amp;rsquo; where i = 1, our final two rank queries are:&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpmathurlcom3_jocuc" height="43" src="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/rJIjcCwEBklrhemGykACmghiysJFuwrtoJjABylzdlmjjzrpDajJIJolHFeI/media_httpmathurlcom3_joCuC.png.scaled500.png" width="319" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_pjtdo" height="209" src="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/lqvAHisiFdkakrDdzwuphkCJsrGjflsavFvzFtIDqswxjnInHsIcIxkdxmhh/media_httpalexbowes3a_pJtDo.png.scaled500.png" width="252" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;This is the &lt;em&gt;fourth&lt;/em&gt; and final stage of our backwards search for &amp;lsquo;iss&amp;rsquo; in the string &amp;lsquo;mississippi&amp;rsquo;. All the occurrences of &amp;lsquo;iss&amp;rsquo; lie in SA[4..5].&lt;/p&gt;

&lt;p&gt;It impresses me every time&amp;hellip;&lt;/p&gt;
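&lt;p&gt;If you would like to replay the walkthrough yourself, here is a minimal sketch of backward search in Python. It is my own illustration under a few assumptions: the suffix array is built naively by sorting, &lt;code&gt;rank&lt;/code&gt; scans the BWT instead of using a fast rank structure, and intervals are 0-based and half-open, so the result &lt;code&gt;[3, 5)&lt;/code&gt; should correspond to SA[4..5] in the numbering used above. The function names are made up for this example.&lt;/p&gt;

```python
def suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix itself.
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_from_sa(text, sa):
    # BWT[i] is the character that precedes the i-th smallest suffix
    # (text[-1] wraps around to the terminator for the suffix at 0).
    return "".join(text[i - 1] for i in sa)


def rank(bwt, c, i):
    # Occurrences of c in bwt[0:i]. A real index would answer this
    # with a rank structure rather than a linear scan.
    return bwt[:i].count(c)


def backward_search(text, pattern):
    sa = suffix_array(text)
    bwt = bwt_from_sa(text, sa)

    # C[c] = number of characters in text that sort before c.
    C = {}
    total = 0
    for c in sorted(set(text)):
        C[c] = total
        total += text.count(c)

    s, e = 0, len(text)  # half-open interval [s, e) over the suffix array
    for c in reversed(pattern):
        if c not in C:
            return None
        s = C[c] + rank(bwt, c, s)
        e = C[c] + rank(bwt, c, e)
        if s == e:  # empty interval: the pattern does not occur
            return None
    return s, e


text = "mississippi$"
interval = backward_search(text, "iss")
sa = suffix_array(text)
positions = sorted(sa[i] for i in range(interval[0], interval[1]))
print(interval, positions)
```

&lt;p&gt;Each pattern character costs exactly two rank queries, which is why swapping the scan in &lt;code&gt;rank&lt;/code&gt; for a fast rank structure makes the whole search independent of the text length.&lt;/p&gt;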

&lt;h1&gt;Play Time&lt;/h1&gt;

&lt;p&gt;No doubt you want to get your hands dirty. I have played around with &lt;a href="http://code.google.com/p/libdivsufsort/"&gt;libdivsufsort&lt;/a&gt; before, although I &lt;em&gt;think&lt;/em&gt; you may have to implement backward search yourself (it&amp;rsquo;d be a good exercise), since it doesn&amp;rsquo;t appear to come with fast rank structures. For rank structures over your BWT you might want to check out &lt;a href="http://libcds.recoded.cl/"&gt;libcds&lt;/a&gt;. In fact there are heaps of libraries out there, but I haven&amp;rsquo;t used any others. Hopefully someone will comment below with a good recommendation.&lt;/p&gt;

&lt;p&gt;Also, please comment here if you develop something cool with it  :) and as always, if you have journeyed this far, consider following me on Twitter: &lt;a href="http://www.twitter.com/alexbowe"&gt;@alexbowe&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;Bibliography&lt;/h1&gt;

&lt;p&gt;Ferragina, P. and Manzini, G. (2000). Opportunistic data structures with applications. Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science, pages 390–398.&lt;/p&gt;

&lt;p&gt;Puglisi, S. J., Smyth, W. F., and Turpin, A. (2007). A taxonomy of suffix array construction algorithms. ACM Computing Surveys, 39(2):1–31.&lt;/p&gt;

&lt;p&gt;Simpson, J. T. and Durbin, R. (2010). Efficient construction of an assembly string graph using the FM-index. Bioinformatics, 26(12):i367–i373.&lt;/p&gt;
	
&lt;/p&gt;

</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/png" height="200" width="519" url="http://getfile3.posterous.com/getfile/files.posterous.com/alexbowe/bkDoDCBtcxiGvGFyqglwfxHjyoCuyxwmemiaGarwkCpqqnlFBAGcgBJxgAvI/media_httpalexbowes3a_mxdzx.png">
        <media:thumbnail height="193" width="500" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/bkDoDCBtcxiGvGFyqglwfxHjyoCuyxwmemiaGarwkCpqqnlFBAGcgBJxgAvI/media_httpalexbowes3a_mxdzx.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="38" width="264" url="http://getfile5.posterous.com/getfile/files.posterous.com/alexbowe/eIsactEgjhxsuDablqiHddeBFpurqIbDBgFwivEyfncpvsjcHygbyzaBIcck/media_httpalexbowes3a_gIuEl.png">
        <media:thumbnail height="38" width="264" url="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/eIsactEgjhxsuDablqiHddeBFpurqIbDBgFwivEyfncpvsjcHygbyzaBIcck/media_httpalexbowes3a_gIuEl.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="218" width="332" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/oAAFCAAhHnEegHeChHhwsEcfvsodBJumnvyEvuarCvGmqDbAzerBshrpGCzu/media_httpalexbowes3a_qBwxw.png">
        <media:thumbnail height="218" width="332" url="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/oAAFCAAhHnEegHeChHhwsEcfvsodBJumnvyEvuarCvGmqDbAzerBshrpGCzu/media_httpalexbowes3a_qBwxw.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="219" width="259" url="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/yBvqdHucrfpmApHcjgFrGlaHDgDJlGJdbeddmmoiuaDztlGtHCrJaqtemdxn/media_httpalexbowes3a_reJrx.png">
        <media:thumbnail height="219" width="259" url="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/yBvqdHucrfpmApHcjgFrGlaHDgDJlGJdbeddmmoiuaDztlGtHCrJaqtemdxn/media_httpalexbowes3a_reJrx.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="43" width="253" url="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/lgBawjynwzikxkeevbCzDlEsJbliwmotcjfFaHvuEFeJvJtEevxnkjhufuhB/media_httpmathurlcom4_ayChb.png">
        <media:thumbnail height="43" width="253" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/lgBawjynwzikxkeevbCzDlEsJbliwmotcjfFaHvuEFeJvJtEevxnkjhufuhB/media_httpmathurlcom4_ayChb.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="62" width="168" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/euyCgesfserHbEtuvzvfpqkxnFieakDfacDswansminBEcykDgazyAgafAzn/media_httpalexbowes3a_DAuos.png">
        <media:thumbnail height="62" width="168" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/euyCgesfserHbEtuvzvfpqkxnFieakDfacDswansminBEcykDgazyAgafAzn/media_httpalexbowes3a_DAuos.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="203" width="252" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/ihHjbgnieymvfnoqpBjGgiqFjhvcxvratxpjjHnCypCuqedCAjdDdvemolaC/media_httpalexbowes3a_cvian.png">
        <media:thumbnail height="203" width="252" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/ihHjbgnieymvfnoqpBjGgiqFjhvcxvratxpjjHnCypCuqedCAjdDdvemolaC/media_httpalexbowes3a_cvian.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="43" width="303" url="http://getfile3.posterous.com/getfile/files.posterous.com/alexbowe/rrpegeCGyptDyeHqeymlsbrCAAyHHFsICpFkAptgeccIduqsuCxmgquDFEuC/media_httpmathurlcom3_JqExj.png">
        <media:thumbnail height="43" width="303" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/rrpegeCGyptDyeHqeymlsbrCAAyHHFsICpFkAptgeccIduqsuCxmgquDFEuC/media_httpmathurlcom3_JqExj.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="204" width="253" url="http://getfile5.posterous.com/getfile/files.posterous.com/alexbowe/fijsEdkIfsodwlypjvglyytipbChnkcoChrqhuHHwpFHjvuxqweusGmExiFm/media_httpalexbowes3a_ycEBo.png">
        <media:thumbnail height="204" width="253" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/fijsEdkIfsodwlypjvglyytipbChnkcoChrqhuHHwpFHjvuxqweusGmExiFm/media_httpalexbowes3a_ycEBo.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="43" width="314" url="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/buGjfwfbbrABgDljdijpnzecxrjhnrhtmIyEgJfaqeFJvshrrHaxhEchnlDp/media_httpmathurlcom3_sdxGn.png">
        <media:thumbnail height="43" width="314" url="http://getfile7.posterous.com/getfile/files.posterous.com/alexbowe/buGjfwfbbrABgDljdijpnzecxrjhnrhtmIyEgJfaqeFJvshrrHaxhEchnlDp/media_httpmathurlcom3_sdxGn.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="205" width="260" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/ocjcqwnBfClusugoDgACjnbqDplycpECkqiolqkdmmplvnrEifoswnankAws/media_httpalexbowes3a_arAbE.png">
        <media:thumbnail height="205" width="260" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/ocjcqwnBfClusugoDgACjnbqDplycpECkqiolqkdmmplvnrEifoswnankAws/media_httpalexbowes3a_arAbE.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="43" width="319" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/rJIjcCwEBklrhemGykACmghiysJFuwrtoJjABylzdlmjjzrpDajJIJolHFeI/media_httpmathurlcom3_joCuC.png">
        <media:thumbnail height="43" width="319" url="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/rJIjcCwEBklrhemGykACmghiysJFuwrtoJjABylzdlmjjzrpDajJIJolHFeI/media_httpmathurlcom3_joCuC.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="209" width="252" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/lqvAHisiFdkakrDdzwuphkCJsrGjflsavFvzFtIDqswxjnInHsIcIxkdxmhh/media_httpalexbowes3a_pJtDo.png">
        <media:thumbnail height="209" width="252" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/lqvAHisiFdkakrDdzwuphkCJsrGjflsavFvzFtIDqswxjnInHsIcIxkdxmhh/media_httpalexbowes3a_pJtDo.png.scaled500.png" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/fm-indexes-and-backwards-search-32172</feedburner:origLink></item>
    <item>
      <pubDate>Tue, 28 Jun 2011 06:35:00 -0700</pubDate>
      <title>Wavelet Trees</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/Xe19kFbPwHc/wavelet-trees</link>
      <guid isPermaLink="false">http://www.alexbowe.com/wavelet-trees</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_qjefe" height="220" src="http://posterous.com/getfile/files.posterous.com/alexbowe/rwpkuuutnzsHeHmmtaGdiFFDxsfqepCzdzInqdoBBsbyosjAezuoGevieIhH/media_httpalexbowes3a_qjEfe.png.scaled500.png" width="336" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;In &lt;a href="http://www.alexbowe.com/yarrr-me-hearties"&gt;my last post&lt;/a&gt; I introduced a data structure called RRR, which is used to quickly answer rank queries on binary sequences, and provides implicit compression.&lt;/p&gt;

&lt;p&gt;Today I will talk about an elegant way of answering rank queries on sequences over larger alphabets.
The structure is called the &lt;em&gt;Wavelet Tree&lt;/em&gt;, and it organises a string into a hierarchy of bit vectors. A rank query takes O(log_2 A) time,
where A is the size of the alphabet. It was introduced by Grossi, Gupta and Vitter in their 2003 paper &lt;em&gt;High-order entropy-compressed text indexes&lt;/em&gt; [4] (see the &lt;em&gt;Further Reading&lt;/em&gt; section for more papers). It has since been featured in many papers [1, 2, 3, 5, 6].&lt;/p&gt;

&lt;p&gt;If you store the bit vectors in RRR sequences, it may take less space than the original sequence. Alternatively, you could store the bit vectors in the rank indexes proposed by Okanohara and Sadakane [7], which take a different approach to compression. I will talk about them another time ;) &amp;ndash; fortunately, I will be studying under Sadakane-san at a later date.&lt;/p&gt;

&lt;p&gt;In a different future post, I will show how Suffix Arrays can be used to find arbitrary patterns of length P, by issuing 2P rank queries. If using a Wavelet Tree, this means a pattern search has O(P log_2 A) time complexity. That is, the size of the &amp;lsquo;haystack&amp;rsquo; doesn&amp;rsquo;t matter; the search time depends instead on the size of the &amp;lsquo;needle&amp;rsquo; and the size of the alphabet.&lt;/p&gt;

&lt;h2&gt;Construction&lt;/h2&gt;

&lt;p&gt;A Wavelet Tree converts a string into a balanced binary-tree of bit vectors, where a 0 replaces half of the symbols, and a 1 replaces the other half. This creates ambiguity, but at each level this alphabet is filtered and re-encoded, so the ambiguity lessens, until there is no ambiguity at all.&lt;/p&gt;

&lt;p&gt;The tree is defined recursively as follows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Take the alphabet of the string, and encode the first half as 0, the 2nd half as 1: &lt;code&gt;{ a, b, c, d }&lt;/code&gt; would become &lt;code&gt;{ 0, 0, 1, 1 }&lt;/code&gt;;&lt;/li&gt;
&lt;li&gt;Group each 0-encoded symbol, &lt;code&gt;{ a, b }&lt;/code&gt;, as a sub-tree;&lt;/li&gt;
&lt;li&gt;Group each 1-encoded symbol, &lt;code&gt;{ c, d }&lt;/code&gt;, as a sub-tree;&lt;/li&gt;
&lt;li&gt;Reapply this to each subtree recursively until there are only one or two symbols left (when a &lt;code&gt;0&lt;/code&gt; or &lt;code&gt;1&lt;/code&gt; can only mean one thing).&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;For the string &amp;ldquo;Peter Piper picked a peck of pickled peppers&amp;rdquo; (spaces and a string terminator have been represented as &lt;code&gt;_&lt;/code&gt; and &lt;code&gt;$&lt;/code&gt; respectively, due to convention in the literature) the Wavelet Tree would look like this:&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_odjnn" height="261" src="http://posterous.com/getfile/files.posterous.com/alexbowe/xsjspxCFdcfwhmfibExxHGmInapmguyiibgtefljAfJqfvdFJfmucfDoardH/media_httpalexbowes3a_oDJnn.png.scaled500.png" width="440" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;&lt;em&gt;note: the strings aren&amp;rsquo;t actually stored, but are shown here for convenience&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;It has the alphabet &lt;code&gt;{ $, P, _, a, c, d, e, f, i, k, l, o, p, r, s, t }&lt;/code&gt;, which would be mapped to &lt;code&gt;{ 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1 }&lt;/code&gt;. So, for example, &lt;code&gt;$&lt;/code&gt; would map to &lt;code&gt;0&lt;/code&gt;, and &lt;code&gt;r&lt;/code&gt; would map to &lt;code&gt;1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The left subtree is created by taking just the 0-encoded symbols &lt;code&gt;{ $, P, _, a, c, d, e, f }&lt;/code&gt; and then re-encoding them by dividing this &lt;em&gt;new&lt;/em&gt; alphabet: &lt;code&gt;{ 0, 0, 0, 0, 1, 1, 1, 1 }&lt;/code&gt;. Note that on the first level an &lt;code&gt;e&lt;/code&gt; would be encoded as a &lt;code&gt;0&lt;/code&gt;, but now it is encoded as a &lt;code&gt;1&lt;/code&gt; (it becomes a &lt;code&gt;0&lt;/code&gt; again at a leaf node).&lt;/p&gt;

&lt;p&gt;We can store the bit vectors in RRR structures for fast binary rank queries (which are needed, as described below), and compression :)
In fact, since it is a balanced tree, we can concatenate each of the levels and store it as one single bit vector.&lt;/p&gt;
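&lt;p&gt;The encoding step described above can be sketched in a few lines of Python. This is only an illustration: &lt;code&gt;encode_level&lt;/code&gt; is a name I made up, plain lists stand in for the bit vectors (no RRR, no concatenation of levels), and I assume the alphabet is ordered by character code, which happens to give the ordering listed above.&lt;/p&gt;

```python
def encode_level(text, alphabet):
    # Encode the first half of the alphabet as 0 and the second half as 1.
    half = (len(alphabet) + 1) // 2
    zeros = set(alphabet[:half])
    bits = [0 if c in zeros else 1 for c in text]

    # Filter the symbols into the sequences that each child will re-encode.
    left = [c for c in text if c in zeros]
    right = [c for c in text if c not in zeros]
    return bits, (left, alphabet[:half]), (right, alphabet[half:])


text = "Peter_Piper_picked_a_peck_of_pickled_peppers$"
alphabet = sorted(set(text))

root_bits, (left_text, left_alphabet), _ = encode_level(text, alphabet)
left_bits, _, _ = encode_level(left_text, left_alphabet)

print("".join(alphabet))
print(root_bits[:6])  # bits for "Peter_"
print(left_bits[:4])  # bits for the first four symbols of the 0-child
```

&lt;p&gt;Printing the two prefixes shows the re-encoding at work: an &lt;code&gt;e&lt;/code&gt; gets a &lt;code&gt;0&lt;/code&gt; in the root but a &lt;code&gt;1&lt;/code&gt; in the left child.&lt;/p&gt;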

&lt;h2&gt;Querying&lt;/h2&gt;

&lt;p&gt;Recall from &lt;a href="http://www.alexbowe.com/yarrr-me-hearties"&gt;my last post&lt;/a&gt; that a rank query is the count of 1-bits up to a specified position. Rank queries over larger alphabets are analogous &amp;ndash; instead of a 1, it may be any other symbol:&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_wfllf" height="129" src="http://posterous.com/getfile/files.posterous.com/alexbowe/usAhwpahvtaFfgvbskdGikbbeyegGhnnxeHJDrzeIjfmwBHyjeqcAlpCJvmE/media_httpalexbowes3a_wFllF.png.scaled500.png" width="315" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;After the tree is constructed, a rank query can be done with log A (A = alphabet size) &lt;em&gt;binary&lt;/em&gt; rank queries on the bit vectors, each of which takes &lt;code&gt;O(1)&lt;/code&gt; time if you store them in RRR or another binary rank index. The encoding at each internal node may be ambiguous, but of course it isn&amp;rsquo;t useless &amp;ndash; we use the ambiguous encoding to guide us to the appropriate sub-tree, and keep doing so until we have our answer.&lt;/p&gt;

&lt;p&gt;For example, if we wanted to know rank(5, e), we use the following procedure which is illustrated below. We know that &lt;code&gt;e&lt;/code&gt; is encoded as &lt;code&gt;0&lt;/code&gt; at this level, so we take the &lt;em&gt;binary&lt;/em&gt; rank query of 0 at position 5:&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_ahscp" height="136" src="http://posterous.com/getfile/files.posterous.com/alexbowe/njnHBincsyxcHgDHfCkFhgAJeFaaCnClBaJqDxzovdqCtsgHielhqGbdimmH/media_httpalexbowes3a_aHsCp.png.scaled500.png" width="378" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;The answer is 4, which tells us where to rank in the 0-child: the 4th bit (or the bit at position 3, due to 0-basing). We know to query the 0-child, since that is what &lt;code&gt;e&lt;/code&gt; was encoded as at the parent level. We then repeat this recursively:&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_bdgct" height="288" src="http://posterous.com/getfile/files.posterous.com/alexbowe/AjcGuBJeGsesrnAnvxJBDxBqIlDcoGeFCfyHpgxczJGbplvlfBCarqEElEFy/media_httpalexbowes3a_BDGCt.png.scaled500.png" width="378" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;At a leaf node we have our answer. I would love to explain why this works, but it is fun and rewarding to think about it yourself ;)&lt;/p&gt;
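&lt;p&gt;For those who prefer to poke at code while thinking it through, here is a small Python sketch of the whole procedure. It is an assumption-laden toy: the &lt;code&gt;WaveletTree&lt;/code&gt; class is hypothetical, the bit vectors are plain lists, and the binary rank is a linear scan where RRR would give O(1). Positions are 0-based and inclusive, as in the rank(5, e) example.&lt;/p&gt;

```python
class WaveletTree:
    def __init__(self, text, alphabet=None):
        self.alphabet = sorted(set(text)) if alphabet is None else alphabet
        half = (len(self.alphabet) + 1) // 2
        self.zeros = set(self.alphabet[:half])  # symbols encoded as 0 here
        self.bits = [0 if c in self.zeros else 1 for c in text]
        self.children = None
        # With one or two symbols left, a bit is no longer ambiguous.
        if len(self.alphabet) not in (0, 1, 2):
            left = [c for c in text if c in self.zeros]
            right = [c for c in text if c not in self.zeros]
            self.children = (
                WaveletTree(left, self.alphabet[:half]),
                WaveletTree(right, self.alphabet[half:]),
            )

    def rank(self, i, c):
        # Occurrences of c in positions 0..i (inclusive) of the text.
        if c not in self.alphabet:
            return 0
        node = self
        count = i + 1  # how many leading positions of this node to inspect
        while True:
            bit = 0 if c in node.zeros else 1
            # Binary rank: occurrences of `bit` among the first `count` bits.
            count = node.bits[:count].count(bit)
            if node.children is None or count == 0:
                return count
            node = node.children[bit]


text = "Peter_Piper_picked_a_peck_of_pickled_peppers$"
tree = WaveletTree(text)
print(tree.rank(5, "e"))
```

&lt;p&gt;The binary rank at each node says how many of the positions seen so far were sent down to the child we are about to visit, so it doubles as the prefix length to inspect there.&lt;/p&gt;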

&lt;p&gt;There are also ways to provide fast select queries, but once again I will leave that up to you to research. The curious among you might also be interested in the Huffman-Shaped Wavelet Tree described by Mäkinen and Navarro [5].&lt;/p&gt;

&lt;h2&gt;Using Your New Powers for Good&lt;/h2&gt;

&lt;p&gt;Feel free to implement this yourself, but if you want to get your hands dirty right away, all-around-clever-guy &lt;a href="http://fclaude.recoded.cl"&gt;Francisco Claude&lt;/a&gt; has made an implementation available in his &lt;a href="http://libcds.recoded.cl"&gt;Compressed Data Structure Library (libcds)&lt;/a&gt;. If you create something neat with it be sure to report back ;)&lt;/p&gt;

&lt;p&gt;And if you read this far, consider following me on Twitter: &lt;a href="http://www.twitter.com/alexbowe"&gt;@alexbowe&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Further Reading&lt;/h2&gt;

&lt;p&gt;I didn&amp;rsquo;t want to saturate this blog with proofs and so on, as it was meant to be a light introduction. It is also a pain typesetting math on this blog :/ If you want to learn more about this awesome structure, check out the following papers:&lt;/p&gt;

&lt;p&gt;[1]  F. Claude and G. Navarro. Practical rank/select queries over arbitrary sequences. In Proceedings of the 15th International Symposium on String Processing and Information Retrieval (SPIRE), LNCS 5280, pages 176–187. Springer, 2008.&lt;/p&gt;

&lt;p&gt;[2]  P. Ferragina, R. Giancarlo, and G. Manzini. The myriad virtues of wavelet trees. Information and Computation, 207(8):849–866, 2009.&lt;/p&gt;

&lt;p&gt;[3]  P. Ferragina, G. Manzini, V. Mäkinen, and G. Navarro. Compressed representations of sequences and full-text indexes. ACM Transactions on Algorithms, 3(2):20, 2007.&lt;/p&gt;

&lt;p&gt;[4]  R. Grossi, A. Gupta, and J. Vitter. High-order entropy-compressed text indexes. In Proceedings of the 14th annual ACM-SIAM symposium on Discrete algorithms, pages 841–850. Society for Industrial and Applied Mathematics, 2003.&lt;/p&gt;

&lt;p&gt;[5]  V. Mäkinen and G. Navarro. Succinct suffix arrays based on run-length encoding. Nordic Journal of Computing, 12(1):40–66, 2005.&lt;/p&gt;

&lt;p&gt;[6]  V. Mäkinen and G. Navarro. Implicit compression boosting with applications to self-indexing. In Proceedings of the 14th International Symposium on String Processing and Information Retrieval (SPIRE), LNCS 4726, pages 214–226. Springer, 2007.&lt;/p&gt;

&lt;p&gt;[7]  D. Okanohara and K. Sadakane. Practical entropy-compressed rank/select dictionary. Arxiv Computing Research Repository, abs/cs/0610001, 2006.&lt;/p&gt;
	
&lt;/p&gt;

</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/png" height="220" width="336" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/rwpkuuutnzsHeHmmtaGdiFFDxsfqepCzdzInqdoBBsbyosjAezuoGevieIhH/media_httpalexbowes3a_qjEfe.png">
        <media:thumbnail height="220" width="336" url="http://getfile7.posterous.com/getfile/files.posterous.com/alexbowe/rwpkuuutnzsHeHmmtaGdiFFDxsfqepCzdzInqdoBBsbyosjAezuoGevieIhH/media_httpalexbowes3a_qjEfe.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="261" width="440" url="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/xsjspxCFdcfwhmfibExxHGmInapmguyiibgtefljAfJqfvdFJfmucfDoardH/media_httpalexbowes3a_oDJnn.png">
        <media:thumbnail height="261" width="440" url="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/xsjspxCFdcfwhmfibExxHGmInapmguyiibgtefljAfJqfvdFJfmucfDoardH/media_httpalexbowes3a_oDJnn.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="129" width="315" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/usAhwpahvtaFfgvbskdGikbbeyegGhnnxeHJDrzeIjfmwBHyjeqcAlpCJvmE/media_httpalexbowes3a_wFllF.png">
        <media:thumbnail height="129" width="315" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/usAhwpahvtaFfgvbskdGikbbeyegGhnnxeHJDrzeIjfmwBHyjeqcAlpCJvmE/media_httpalexbowes3a_wFllF.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="136" width="378" url="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/njnHBincsyxcHgDHfCkFhgAJeFaaCnClBaJqDxzovdqCtsgHielhqGbdimmH/media_httpalexbowes3a_aHsCp.png">
        <media:thumbnail height="136" width="378" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/njnHBincsyxcHgDHfCkFhgAJeFaaCnClBaJqDxzovdqCtsgHielhqGbdimmH/media_httpalexbowes3a_aHsCp.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="288" width="378" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/AjcGuBJeGsesrnAnvxJBDxBqIlDcoGeFCfyHpgxczJGbplvlfBCarqEElEFy/media_httpalexbowes3a_BDGCt.png">
        <media:thumbnail height="288" width="378" url="http://getfile0.posterous.com/getfile/files.posterous.com/alexbowe/AjcGuBJeGsesrnAnvxJBDxBqIlDcoGeFCfyHpgxczJGbplvlfBCarqEElEFy/media_httpalexbowes3a_BDGCt.png.scaled500.png" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/wavelet-trees</feedburner:origLink></item>
    <item>
      <pubDate>Tue, 31 May 2011 22:12:00 -0700</pubDate>
      <title>YaRRR Me Hearties - a post about a succinct data structure (not pirates, sorry)</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/mEfUH8OSMR8/yarrr-me-hearties</link>
      <guid isPermaLink="false">http://www.alexbowe.com/yarrr-me-hearties</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_igeiq" height="210" src="http://posterous.com/getfile/files.posterous.com/alexbowe/jvzgIFyDmdbqFwyIdHigkqwxFiozxfqwdBvefBobcnGxsJgxCuswxDCwyddz/media_httpalexbowes3a_IGEiq.png.scaled500.png" width="476" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;This blog post will give an overview of a static bitsequence data structure known as RRR, which answers arbitrary length rank queries in O(1) time, and provides implicit compression.&lt;/p&gt;

&lt;p&gt;As my blog is informal, I give an introduction to this structure from a bird&amp;rsquo;s-eye view. If you want, &lt;a href="https://github.com/alexbowe/honours-thesis/downloads"&gt;read my thesis&lt;/a&gt; for a version with better markup, and follow the citations for proofs by people smarter than myself :)&lt;/p&gt;

&lt;p&gt;My intended future posts will cover the other aspects of my thesis, including generalising RRR (for sequences over small alphabets), Wavelet Trees (which answer rank queries over bigger alphabets), and Suffix Arrays (a text index which &amp;ndash; when combined with the above structures &amp;ndash; can answer queries in O(P log A) time, where P is the length of the search pattern, and A is the alphabet size).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; I have now posted about Wavelet Trees! Check it out &lt;a href="http://www.alexbowe.com/wavelet-trees"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Example Problem&lt;/h2&gt;

&lt;p&gt;Cracking the Oyster, the first column of
&lt;a href="http://www.cs.bell-labs.com/cm/cs/pearls/cto.html"&gt;Programming Pearls&lt;/a&gt;, opens
with a programmer asking for advice when sorting around ten million unique
seven-digit integers &amp;ndash; phone numbers.&lt;/p&gt;

&lt;p&gt;After some discussion, the author
&lt;a href="http://www.cs.bell-labs.com/cm/cs/pearls/sec014.html"&gt;concludes&lt;/a&gt; that a
&lt;a href="http://en.wikipedia.org/wiki/Bit_array"&gt;bitmap&lt;/a&gt; should be used. If we wanted to
store ten million integers, we could use an array of 32-bit integers, consuming
38 MB, or we could represent our numbers as positions on a number line.&lt;/p&gt;

&lt;p&gt;All of these phone numbers will be within the range &lt;code&gt;[0000000, 9999999]&lt;/code&gt;. To
represent the presence of these numbers, we only need a bitmap 10&lt;sup&gt;7&lt;/sup&gt; bits long,
about 1.2 MB, which would represent our number line. Then, for a bitmap M, if we
want to store phone number p, we set the bit M[p] to 1. Sorting would involve
setting the bits of the numbers that are present to 1, then iterating over the
bitmap, printing the positions of the 1-bits &amp;ndash; O(N) time.&lt;/p&gt;
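&lt;p&gt;As a rough sketch of that idea (my own, not the code from Programming Pearls), the following Python packs eight flags into each byte of a &lt;code&gt;bytearray&lt;/code&gt;. The name &lt;code&gt;bitmap_sort&lt;/code&gt; and the &lt;code&gt;universe_size&lt;/code&gt; parameter are assumptions made for the example, and the input is assumed to contain unique values inside the universe.&lt;/p&gt;

```python
def bitmap_sort(numbers, universe_size):
    # One bit per possible value, packed eight to a byte.
    bitmap = bytearray((universe_size + 7) // 8)

    # Mark each number that is present; 2 ** bit is the single-bit mask.
    for p in numbers:
        byte, bit = divmod(p, 8)
        bitmap[byte] |= 2 ** bit

    # Walk the number line in order and collect the positions of the 1-bits.
    result = []
    for p in range(universe_size):
        byte, bit = divmod(p, 8)
        if (bitmap[byte] // (2 ** bit)) % 2 == 1:
            result.append(p)
    return result


print(bitmap_sort([42, 7, 99, 0, 15], 100))
```

&lt;p&gt;For the phone number case the universe size would be 10&lt;sup&gt;7&lt;/sup&gt;, giving a &lt;code&gt;bytearray&lt;/code&gt; of 1,250,000 bytes.&lt;/p&gt;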

&lt;p&gt;&lt;em&gt;In the following sections, I will detail operations that can be done on bitmaps, named rank and select, and explain how to answer rank queries in O(1) time, and implicitly compress the bitmap. Using rank and select, a compressed bitmap can be a very powerful way to store sets. This isn&amp;rsquo;t limited to sets of numbers; all sorts of things can be stored this way, such as sets of graph nodes. A friend of mine is using succinct bitmaps to represent De Bruijn graphs, which are used in genome assembly.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Extension: Rank&lt;/h2&gt;

&lt;p&gt;Allow me to extend the problem. I want to query our simple phone number
database to see how many phone numbers are allocated within the range
&lt;code&gt;[0005000, 0080000]&lt;/code&gt;. I could iterate over that range and update a counter
whenever I encounter a 1-bit. Actually, this operation is what is known as a
&lt;strong&gt;rank&lt;/strong&gt; operation.&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_trqyt" height="124" src="http://posterous.com/getfile/files.posterous.com/alexbowe/FwzcjqsEdeFkubwwrpmrtElrtyDFeIqHuJcdCwuiAeAvJCAatABqChFfvGrs/media_httpalexbowes3a_trqyt.png.scaled500.png" width="250" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;The operation &lt;code&gt;rank(i)&lt;/code&gt; is defined as the number of set bits (&lt;code&gt;1&lt;/code&gt;s) in the range &lt;code&gt;[0, i]&lt;/code&gt; (or &lt;code&gt;[0, i)&lt;/code&gt; in some papers). In the bitstring above, the answer to &lt;code&gt;rank(5)&lt;/code&gt; is 3. This is a generalisation of the &lt;a href="http://en.wikipedia.org/wiki/Popcount"&gt;popcount&lt;/a&gt; operation (which counts all set bits, and which I have discussed before). Numbering bits from the least significant end, &lt;code&gt;rank(i)&lt;/code&gt; over &lt;code&gt;[0, i)&lt;/code&gt; can be implemented by left-shifting by &lt;code&gt;L - i&lt;/code&gt; bits (where L is the width in bits of the datatype you are using: int, long, etc.) to remove the unwanted bits, then calling &lt;code&gt;popcount&lt;/code&gt; on the resulting value; for the inclusive range &lt;code&gt;[0, i]&lt;/code&gt;, shift by one bit fewer. This could be done iteratively over an array if you want, but I will discuss a much faster way below.&lt;/p&gt;

&lt;p&gt;Then, the above question can be answered as:  &lt;code&gt;rank(0080000) - rank(0005000 - 1)&lt;/code&gt;. This will give us just the number of 1s between 0005000 and 0080000.&lt;/p&gt;
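&lt;p&gt;Here is a small sketch of that subtraction in Python, using an arbitrary-precision integer as the bitmap. It rests on a few assumptions of mine: position 0 is the least significant bit, &lt;code&gt;rank&lt;/code&gt; is inclusive, and a modulo by a power of two stands in for the shift described above, since Python integers have no fixed width to shift bits out of. The helper names are made up for the example, and this is the slow popcount approach rather than the O(1) structure.&lt;/p&gt;

```python
def popcount(x):
    # Number of set bits in a non-negative integer.
    return bin(x).count("1")


def rank(bitmap, i):
    # Set bits in positions [0, i], position 0 being the least significant.
    # Taking the value modulo 2 ** (i + 1) discards every bit above i.
    return popcount(bitmap % (2 ** (i + 1)))


def count_in_range(bitmap, lo, hi):
    # Set bits in positions [lo, hi], both inclusive.
    return rank(bitmap, hi) - rank(bitmap, lo - 1)


# A toy "database" with the numbers 3, 5, 6 and 10 present.
bitmap = sum(2 ** p for p in [3, 5, 6, 10])
print(rank(bitmap, 5), count_in_range(bitmap, 4, 10))
```

&lt;p&gt;Note that &lt;code&gt;rank(bitmap, -1)&lt;/code&gt; comes out as 0 without a special case, so a range starting at position 0 works too.&lt;/p&gt;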

&lt;p&gt;This isn&amp;rsquo;t the only place we would use popcounts; it happens that popcounts are common enough that we want to optimise them. Check out &lt;a href="http://www.valuedlessons.com/2009/01/popcount-in-python-with-benchmarks.html"&gt;this blog post at valuedlessons.com&lt;/a&gt; for a discussion and empirical comparison of several fast approaches.&lt;/p&gt;

&lt;h2&gt;RRR&lt;/h2&gt;

&lt;p&gt;As it happens, we can build a data structure for static bitmaps that answers rank queries in O(1) time, &lt;em&gt;and&lt;/em&gt; provides implicit compression. It is what is known as a succinct data structure, which means that even though it is compressed, we don&amp;rsquo;t need to decompress the whole thing to operate on it efficiently.  Sadakane (a big name in succinct data structures) gives a nice analogy in his &lt;a href="http://www.nii.ac.jp/userimg/intro/en/sadakane_en.pdf"&gt;introduction of the field&lt;/a&gt;, likening it to forcing dehydrated noodles apart with your chopsticks (decompression) as you are rehydrating them, but before the whole thing is fully cooked and separated. This allows you to keep some of the noodles compressed while you eat the decompressed fragment.&lt;/p&gt;

&lt;p&gt;Since it is static, it isn&amp;rsquo;t well suited for a bitmap which you want to update (although work has been done toward this), but it is still really cool :)&lt;/p&gt;

&lt;p&gt;The structure I&amp;rsquo;m referring to is named RRR. It sounds like a radio station, but it is named after its creators: &lt;a href="http://portal.acm.org/citation.cfm?id=545411"&gt;Raman, Raman, Rao, from their 2002 paper &lt;em&gt;Succinct indexable dictionaries with applications to encoding k-ary trees and multisets&lt;/em&gt;&lt;/a&gt;. It&amp;rsquo;s a data structure I had to become intimately involved with for &lt;a href="https://github.com/alexbowe/honours-thesis/downloads"&gt;my honours thesis&lt;/a&gt;, where I extended it for sequences of larger (but still small) alphabets. If you want to answer rank queries on large alphabets, a wavelet tree might be what you are after, but that will be covered in a different blog post (or you could read my thesis!).&lt;/p&gt;

&lt;p&gt;In my &lt;a href="http://www.alexbowe.com/48392639"&gt;last post (Generating Binary Permutations in Popcount Order)&lt;/a&gt; I discussed how to compress a bitstring by replacing blocks of a certain blocksize with their corresponding pop number, and (variable length) offset into a lookup table. I briefly mentioned building an index over it to improve lookup as well.&lt;/p&gt;

&lt;h2&gt;RRR: Construction&lt;/h2&gt;

&lt;p&gt;To construct an RRR sequence we divide our bitmap into blocks, as I mentioned &lt;a href="http://www.alexbowe.com/generating-binary-permutations-in-popcount-or"&gt;in my previous blog post&lt;/a&gt;. These blocks are also grouped into &lt;em&gt;superblocks&lt;/em&gt;, which allows us to construct an index to enable O(1) rank queries. In the following image, I have fragmented the bitmap using a blocksize of b = 5, and grouped the blocks with a superblock factor of f = 3 &amp;ndash; so each superblock is three blocks.&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/alexbowe/ilCGGmgulAdvarAAyvcflpmbdBmwyeBsHnyEsIvjGybnIJydkuFExulHicpl/media_httpalexbowes3a_qEoIf.png.scaled1000.png"&gt;&lt;img alt="Media_httpalexbowes3a_qeoif" height="81" src="http://posterous.com/getfile/files.posterous.com/alexbowe/ilCGGmgulAdvarAAyvcflpmbdBmwyeBsHnyEsIvjGybnIJydkuFExulHicpl/media_httpalexbowes3a_qEoIf.png.scaled500.png" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;First we replace each block with a pair of values, a &lt;em&gt;class&lt;/em&gt; value C and an offset value O, which are used together as a lookup key into a table of precomputed ranks (a small table, since it only needs an entry for each possible block) &amp;ndash; this is demonstrated in the figure below. This is the same scheme as in &lt;a href="http://www.alexbowe.com/generating-binary-permutations-in-popcount-or"&gt;the previous blog post&lt;/a&gt;, although there I called the &amp;ldquo;class&amp;rdquo; P, because the class of a block is defined as its popcount &amp;ndash; the number of set bits in the block: &lt;code&gt;class(B) = popcount(B)&lt;/code&gt; for block B.&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_gbopg" height="235" src="http://posterous.com/getfile/files.posterous.com/alexbowe/iupqqjJeIwxansBrhbwgsJIGadAAfhGeAekjrGCmjlzCyieIDInIjzFEDIjk/media_httpalexbowes3a_gBopg.png.scaled500.png" width="379" /&gt;
&lt;/div&gt;
&lt;/p&gt;
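&lt;p&gt;To make the class/offset idea concrete, here is a minimal Python sketch for b = 5. It is an illustration under assumptions rather than a real implementation: the names &lt;code&gt;build_table&lt;/code&gt;, &lt;code&gt;encode_block&lt;/code&gt; and &lt;code&gt;decode_block&lt;/code&gt; are made up, the sub-tables are kept in ascending numeric order, and for simplicity the table stores the blocks themselves rather than precomputed ranks.&lt;/p&gt;

```python
B = 5  # block size in bits, matching the figures (a demo value)

def build_table(b):
    """table[c] lists every b-bit block with popcount c, in ascending order."""
    table = [[] for _ in range(b + 1)]
    for value in range(2 ** b):
        table[bin(value).count("1")].append(value)
    return table

def encode_block(block, table):
    """Replace a block with its (class, offset) pair."""
    c = bin(block).count("1")   # class(B) = popcount(B)
    o = table[c].index(block)   # offset of the block inside sub-table c
    return c, o

def decode_block(c, o, table):
    """Recover the original block from its (class, offset) pair."""
    return table[c][o]

TABLE = build_table(B)
print(encode_block(0b01011, TABLE))  # (3, 1)
```

&lt;p&gt;Here the block 01011 has three set bits and is the second entry of its sub-table, so it is stored as the pair (3, 1).&lt;/p&gt;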

&lt;p&gt;The table is shared among all your RRR sequences, and is in fact a table of tables, where C points to the first element for the ranks of a given popcount:&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_cjgfp" height="399" src="http://posterous.com/getfile/files.posterous.com/alexbowe/jDmHJuohvqpBdBBtjbaApmhayzhsEBtlwdHmmuoCDqipGHoabkDEfzggqvln/media_httpalexbowes3a_cJgfp.png.scaled500.png" width="390" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;For this table (let&amp;rsquo;s call it G), for a given class C, the sub-table at G[C] has b Choose C entries, which correspond to all possible permutations that have a popcount of C. This means that our C values will always take log(b + 1) bits, rounded up (the number of bits needed to represent the values 0, 1, 2&amp;hellip; b &amp;ndash; these are all possible popcount values for the blocksize), while our O values will vary in size, requiring log(b Choose C) bits, again rounded up (oh yeah, and of course I&amp;rsquo;m using log_2 here :)). During a query, we can use our C values to work out how many bits will follow for the O values.&lt;/p&gt;
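&lt;p&gt;As a quick sanity check of those widths, the sketch below computes them for b = 5. The helper names &lt;code&gt;class_bits&lt;/code&gt; and &lt;code&gt;offset_bits&lt;/code&gt; are hypothetical, and it relies on the integer identity that log_2(n) rounded up equals &lt;code&gt;(n - 1).bit_length()&lt;/code&gt; for n of at least 1, which avoids floating point.&lt;/p&gt;

```python
from math import comb

def class_bits(b):
    """Bits needed to store a class value in the range 0..b."""
    return b.bit_length()                 # same as ceil(log2(b + 1))

def offset_bits(b, c):
    """Bits needed to store an offset into the sub-table for class c."""
    return (comb(b, c) - 1).bit_length()  # same as ceil(log2(b choose c))

b = 5
print(class_bits(b))                               # 3
print([offset_bits(b, c) for c in range(b + 1)])   # [0, 3, 4, 4, 3, 0]
```

&lt;p&gt;Blocks that are all zeros or all ones need no offset bits at all, which is where sparse or dense bitmaps save the most space.&lt;/p&gt;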

&lt;p&gt;Using this approach alone we get the compression, but not O(1) ranks. C is fixed width; the compression comes from O being variable width.&lt;/p&gt;

&lt;p&gt;In order to get the O(1) ranks we use a method discussed by &lt;a href="http://www.springerlink.com/content/yv33538123433477/"&gt;Munro in &lt;em&gt;Tables&lt;/em&gt;, 1996&lt;/a&gt;. This is where the superblocks come into play:&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_jjdii" height="304" src="http://posterous.com/getfile/files.posterous.com/alexbowe/HoriAGckpnbdrtBgnszxFtipFtAbHkxhJExebokBgdBIxlxAktipHEFeeFHg/media_httpalexbowes3a_JJdii.png.scaled500.png" width="476" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;For each superblock boundary we store the global rank up to that position. We also store a prefix sum of the bits used so far, which gives us the address of the first block in the next superblock (we need this because the encoding is variable length!). This means we don&amp;rsquo;t have to iterate over the whole RRR sequence; we can go straight to the required superblock instead. We only need to iterate over the blocks within a superblock, so the work is now bounded by whatever your superblock factor is.&lt;/p&gt;
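&lt;p&gt;The two arrays can be sketched in a few lines of Python. This is only an illustration: &lt;code&gt;build_index&lt;/code&gt; is a made-up name, the input is just the list of block classes (which is all the index needs), and b = 5, f = 3 are the demo values from the figures.&lt;/p&gt;

```python
from math import comb

B, F = 5, 3                    # block size and superblock factor (demo values)
CLASS_BITS = B.bit_length()    # fixed width of every class value

def offset_bits(c):
    """Variable width of the offset that follows a class value c."""
    return (comb(B, c) - 1).bit_length()

def build_index(classes):
    """For each superblock: the rank before it and the bit address of its first block."""
    ranks, pointers = [], []
    rank = address = 0
    for i, c in enumerate(classes):
        if i % F == 0:                       # superblock boundary
            ranks.append(rank)
            pointers.append(address)
        rank += c                            # a block contributes its popcount
        address += CLASS_BITS + offset_bits(c)
    return ranks, pointers

classes = [2, 3, 0, 5, 1, 4, 2]              # popcounts of seven example blocks
ranks, pointers = build_index(classes)
print(ranks)     # [0, 5, 15]
print(pointers)  # [0, 17, 32]
```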

&lt;h2&gt;RRR: Querying&lt;/h2&gt;

&lt;p&gt;To calculate rank(i):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Calculate which block our index is in as i_b = i/b, rounded down. (i_b is the global index of the block)&lt;/li&gt;
&lt;li&gt;Calculate which superblock our block resides in as i_s = i_b/f, rounded down. (i_s is the index of the superblock)&lt;/li&gt;
&lt;li&gt;Set result to the sum of previous ranks at the i_s boundary (which is precalculated).&lt;/li&gt;
&lt;li&gt;Using each block&amp;rsquo;s class-offset pair (c, o) after the boundary at i_s, add the rank for that entire block to result.&lt;/li&gt;
&lt;li&gt;Repeat the previous step until we reach i_b. We then add rank(j, c) (local to i_b, not the global rank) to our result, where j = i mod b is the position we are querying local to i_b. Our final answer is result.&lt;/li&gt;
&lt;/ol&gt;
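&lt;p&gt;The steps above can be sketched end to end in Python. Treat this as a toy under stated assumptions, not a tuned implementation: the encoded sequence is a string of '0' and '1' characters rather than packed machine words, the table stores blocks instead of precomputed ranks, rank(i) is taken to count the set bits in positions 0 to i inclusive (position 0 being the leftmost bit), and the names &lt;code&gt;build&lt;/code&gt;, &lt;code&gt;rank&lt;/code&gt; and &lt;code&gt;to_bits&lt;/code&gt; are made up.&lt;/p&gt;

```python
from math import comb

B, F = 5, 3                     # block size and superblock factor (demo values)
CLASS_BITS = B.bit_length()     # bits needed to store a class value 0..B

# TABLE[c] lists every B-bit block with popcount c, in ascending order.
TABLE = [[] for _ in range(B + 1)]
for value in range(2 ** B):
    TABLE[bin(value).count("1")].append(value)

def offset_bits(c):
    """Bits needed for an offset into TABLE[c]."""
    return (comb(B, c) - 1).bit_length()

def to_bits(value, width):
    """Fixed-width binary string; empty when width is 0."""
    return format(value, "b").zfill(width) if width else ""

def build(bitmap):
    """Encode a '0'/'1' string as (stream, superblock ranks, superblock pointers)."""
    n_blocks = (len(bitmap) + B - 1) // B
    bitmap = bitmap.ljust(n_blocks * B, "0")     # pad the last block with zeros
    stream, ranks, pointers, rank_so_far = "", [], [], 0
    for k in range(n_blocks):
        if k % F == 0:                           # superblock boundary
            ranks.append(rank_so_far)
            pointers.append(len(stream))
        block = int(bitmap[k * B:(k + 1) * B], 2)
        c = bin(block).count("1")
        o = TABLE[c].index(block)
        stream += to_bits(c, CLASS_BITS) + to_bits(o, offset_bits(c))
        rank_so_far += c
    return stream, ranks, pointers

def rank(i, stream, ranks, pointers):
    """Number of set bits in positions 0..i (inclusive) of the original bitmap."""
    i_b = i // B                 # step 1: global block index
    i_s = i_b // F               # step 2: superblock index
    result = ranks[i_s]          # step 3: precomputed rank at the boundary
    pos = pointers[i_s]          # bit address of the first block in the superblock
    for block_index in range(i_s * F, i_b + 1):   # steps 4 and 5
        c = int(stream[pos:pos + CLASS_BITS], 2)
        pos += CLASS_BITS
        width = offset_bits(c)   # the class tells us how wide the offset is
        o = int(stream[pos:pos + width], 2) if width else 0
        pos += width
        if block_index == i_b:
            j = i % B            # position local to the final block
            block = to_bits(TABLE[c][o], B)
            result += block[:j + 1].count("1")
        else:
            result += c          # a whole block contributes its class
    return result

bitmap = "0110010011000001111100010"
stream, ranks, pointers = build(bitmap)
print(rank(7, stream, ranks, pointers))  # 3
```

&lt;p&gt;The loop never visits more than f blocks, which is the bound the superblock index buys us. A real implementation would replace the string slicing with word-level bit reads and a table of precomputed block ranks.&lt;/p&gt;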


&lt;h2&gt;Select&lt;/h2&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_heucn" height="103" src="http://posterous.com/getfile/files.posterous.com/alexbowe/nvmnDhhcuGjwbvmwhuEIDCmbgkgFlvojCDsDktywvnBbtqDfaHAnBGJyHoci/media_httpalexbowes3a_heucn.png.scaled500.png" width="250" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;Select is the inverse operation to rank; it answers the question &amp;ldquo;at which position is the ith set bit?&amp;rdquo;. To tie this in with the phone numbers example, maybe we want  to find out the fiftieth phone number in the set (excluding unassigned numbers). This is a way we can index just the present elements of a bitmap. It turns out select can be answered in O(1) time as well.  I won&amp;rsquo;t cover select here, as my future posts (and thesis) will mainly use rank. You can read about it in the &lt;a href="http://portal.acm.org/citation.cfm?id=545411"&gt;RRR paper&lt;/a&gt;.&lt;/p&gt;
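&lt;p&gt;If you just want to see how select relates to rank, a simple (and slower) route is to binary search over rank. The sketch below does that; it is &lt;em&gt;not&lt;/em&gt; the O(1) method from the RRR paper, the naive &lt;code&gt;rank&lt;/code&gt; here is only a stand-in for a proper rank structure, and both function names are made up for illustration.&lt;/p&gt;

```python
def rank(bits, i):
    """Number of set bits in positions 0..i (inclusive); stand-in for a fast rank."""
    return bits[:i + 1].count("1")

def select(bits, k):
    """Position of the k-th set bit (k counts from 1), or -1 if there is none."""
    total = rank(bits, len(bits) - 1)
    if 1 > k or k > total:
        return -1
    lo, hi = 0, len(bits) - 1
    while hi > lo:
        mid = (lo + hi) // 2
        if rank(bits, mid) >= k:
            hi = mid          # the k-th set bit is at mid or to its left
        else:
            lo = mid + 1      # not enough set bits yet, look to the right
    return lo

print(select("0110010011", 3))  # 5
```

&lt;p&gt;With an O(1) rank underneath, this takes a logarithmic number of rank queries in the length of the bitmap.&lt;/p&gt;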

&lt;h2&gt;Go Forth and&amp;hellip;&lt;/h2&gt;

&lt;p&gt;Feel free to implement this (somewhat complicated) data structure yourself, or you can use a pre-rolled one by my friend &lt;a href="http://fclaude.recoded.cl"&gt;Francisco Claude&lt;/a&gt; in his &lt;a href="http://libcds.recoded.cl"&gt;LIBCDS &amp;ndash; Compressed Data Structure Library&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you read this far, consider &lt;a href="http://www.twitter.com/alexbowe"&gt;following me on Twitter&lt;/a&gt; :) or you may enjoy reading &lt;a href="http://www.alexbowe.com/wavelet-trees"&gt;my post on Wavelet Trees&lt;/a&gt;.&lt;/p&gt;
	
&lt;/p&gt;

</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/png" height="333" width="363" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/tHhlksakfwHewdpghsGjrJDCvwyCxlivsqgrrljejBcxnkxoADHiDanIgyvm/media_httpalexbowes3a_dGikc.png">
        <media:thumbnail height="333" width="363" url="http://getfile0.posterous.com/getfile/files.posterous.com/alexbowe/tHhlksakfwHewdpghsGjrJDCvwyCxlivsqgrrljejBcxnkxoADHiDanIgyvm/media_httpalexbowes3a_dGikc.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="124" width="250" url="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/FwzcjqsEdeFkubwwrpmrtElrtyDFeIqHuJcdCwuiAeAvJCAatABqChFfvGrs/media_httpalexbowes3a_trqyt.png">
        <media:thumbnail height="124" width="250" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/FwzcjqsEdeFkubwwrpmrtElrtyDFeIqHuJcdCwuiAeAvJCAatABqChFfvGrs/media_httpalexbowes3a_trqyt.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="149" width="917" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/ilCGGmgulAdvarAAyvcflpmbdBmwyeBsHnyEsIvjGybnIJydkuFExulHicpl/media_httpalexbowes3a_qEoIf.png">
        <media:thumbnail height="81" width="500" url="http://getfile7.posterous.com/getfile/files.posterous.com/alexbowe/ilCGGmgulAdvarAAyvcflpmbdBmwyeBsHnyEsIvjGybnIJydkuFExulHicpl/media_httpalexbowes3a_qEoIf.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="235" width="379" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/iupqqjJeIwxansBrhbwgsJIGadAAfhGeAekjrGCmjlzCyieIDInIjzFEDIjk/media_httpalexbowes3a_gBopg.png">
        <media:thumbnail height="235" width="379" url="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/iupqqjJeIwxansBrhbwgsJIGadAAfhGeAekjrGCmjlzCyieIDInIjzFEDIjk/media_httpalexbowes3a_gBopg.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="399" width="390" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/jDmHJuohvqpBdBBtjbaApmhayzhsEBtlwdHmmuoCDqipGHoabkDEfzggqvln/media_httpalexbowes3a_cJgfp.png">
        <media:thumbnail height="399" width="390" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/jDmHJuohvqpBdBBtjbaApmhayzhsEBtlwdHmmuoCDqipGHoabkDEfzggqvln/media_httpalexbowes3a_cJgfp.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="304" width="476" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/HoriAGckpnbdrtBgnszxFtipFtAbHkxhJExebokBgdBIxlxAktipHEFeeFHg/media_httpalexbowes3a_JJdii.png">
        <media:thumbnail height="304" width="476" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/HoriAGckpnbdrtBgnszxFtipFtAbHkxhJExebokBgdBIxlxAktipHEFeeFHg/media_httpalexbowes3a_JJdii.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="103" width="250" url="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/nvmnDhhcuGjwbvmwhuEIDCmbgkgFlvojCDsDktywvnBbtqDfaHAnBGJyHoci/media_httpalexbowes3a_heucn.png">
        <media:thumbnail height="103" width="250" url="http://getfile7.posterous.com/getfile/files.posterous.com/alexbowe/nvmnDhhcuGjwbvmwhuEIDCmbgkgFlvojCDsDktywvnBbtqDfaHAnBGJyHoci/media_httpalexbowes3a_heucn.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="210" width="476" url="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/jvzgIFyDmdbqFwyIdHigkqwxFiozxfqwdBvefBobcnGxsJgxCuswxDCwyddz/media_httpalexbowes3a_IGEiq.png">
        <media:thumbnail height="210" width="476" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/jvzgIFyDmdbqFwyIdHigkqwxFiozxfqwdBvefBobcnGxsJgxCuswxDCwyddz/media_httpalexbowes3a_IGEiq.png.scaled500.png" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/yarrr-me-hearties</feedburner:origLink></item>
    <item>
      <pubDate>Sun, 08 May 2011 07:11:00 -0700</pubDate>
      <title>Generating Binary Permutations in Popcount Order</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/N6fAq4AIgGA/generating-binary-permutations-in-popcount-or</link>
      <guid isPermaLink="false">http://www.alexbowe.com/generating-binary-permutations-in-popcount-or</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_mpcoj" height="114" src="http://posterous.com/getfile/files.posterous.com/alexbowe/IwoyyHhJxvbnjmHchmlJmacFcrcgIvICbEswbxHDGvCAFGyqipCxAdBrlfCC/media_httpalexbowes3a_mpcoj.png.scaled500.png" width="241" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;ve been keeping an eye on the search terms that land people at my site, and although I get the occasional &amp;ldquo;alex bowe: fact or fiction&amp;rdquo; and &amp;ldquo;alex bowe bad ass phd student&amp;rdquo; queries (the frequency strangely increased when I mentioned this on &lt;a href="http://www.twitter.com/alexbowe"&gt;Twitter&lt;/a&gt;) I also get some queries that relate to the actual content.&lt;/p&gt;

&lt;p&gt;One query I received recently was &amp;ldquo;generating integers in popcount order&amp;rdquo;, I guess because I mentioned popcounts (the number of 1-bits in a binary string) in a previous post, but the post wasn&amp;rsquo;t able to answer that visitor&amp;rsquo;s question.&lt;/p&gt;

&lt;p&gt;What would this be used for? Among other applications, I have used it for generating a table of numbers ordered by popcount, which I used in a compression algorithm. The idea is to break a bitstring into fixed-length chunks (of B bits) and replace each one with a (P, O) pair, where P is the block&amp;rsquo;s popcount, which selects the subtable in which every entry has popcount P, and O is the block&amp;rsquo;s offset in that subtable. Then P can be stored with log2(B + 1) bits &amp;ndash; we need to represent all possible P values from 0 to B &amp;ndash; and O can be stored with log2(binomial(B, P)) bits (both rounded up).&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_adipi" height="235" src="http://posterous.com/getfile/files.posterous.com/alexbowe/jjwEwjlsIjhHBHiBnpEawJiHxBqBfqgBBumjukiwdtbmxuIeyvwabziDDFyv/media_httpalexbowes3a_ADipi.png.scaled500.png" width="379" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;Note that the bit-length of O varies; binomial represents the binomial coefficient, which can be seen in Pascal&amp;rsquo;s triangle expanded to row 5:&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpalexbowes3a_szwrn" height="155" src="http://posterous.com/getfile/files.posterous.com/alexbowe/HxtDnzstgtCxkwJqHHcerCmbjHnqAohGstIxdDAExmcqjuzklfxkbtGuoffD/media_httpalexbowes3a_szwrn.png.scaled500.png" width="198" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;So binomial(5, x) for x = 0, 1, &amp;hellip; , 5 yields the sequence 1, 5, 10, 10, 5, 1 &amp;ndash; some offsets take more bits than others. Once you know the P value, you know how many bits the O value takes, so you can read it that way. This means access is O(N) (since each value&amp;rsquo;s position relies on the previous P values), but you can build an index on top of that to allow O(1) lookup. But all this is a story for another time ;) [1].&lt;/p&gt;
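&lt;p&gt;To see why reading has to be sequential, here is a small Python sketch that walks a packed stream of (P, O) pairs for B = 5. It is only an illustration: the stream is a string of '0' and '1' characters, and &lt;code&gt;o_bits&lt;/code&gt; and &lt;code&gt;read_pairs&lt;/code&gt; are made-up names.&lt;/p&gt;

```python
from math import comb

B = 5
P_BITS = B.bit_length()   # 3 bits cover the popcounts 0..5

def o_bits(p):
    """Bits needed for an offset into the subtable of popcount p."""
    return (comb(B, p) - 1).bit_length()

def read_pairs(stream):
    """Walk a '0'/'1' string of packed (P, O) pairs from the start."""
    pairs, pos = [], 0
    while len(stream) > pos:
        p = int(stream[pos:pos + P_BITS], 2)
        pos += P_BITS
        width = o_bits(p)                 # P tells us how wide O is
        o = int(stream[pos:pos + width], 2) if width else 0
        pos += width
        pairs.append((p, o))
    return pairs

# The pairs (3, 1), (0, 0), (5, 0) and (1, 4) packed back to back.
stream = "011" + "0001" + "000" + "101" + "001" + "100"
print(read_pairs(stream))  # [(3, 1), (0, 0), (5, 0), (1, 4)]
```

&lt;p&gt;There is no way to know where the fourth pair starts without decoding the three P values before it, which is exactly what an index on top would fix.&lt;/p&gt;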

&lt;p&gt;Here is the code:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;next_perm&lt;/span&gt;(v):
    &lt;span class="s"&gt;&lt;span class="dl"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="k"&gt;    Generates next permutation with a given amount of set bits,&lt;/span&gt;
&lt;span class="k"&gt;    given the previous lexicographical value.&lt;/span&gt;
&lt;span class="k"&gt;    Taken from http://graphics.stanford.edu/~seander/bithacks.html&lt;/span&gt;
&lt;span class="k"&gt;    &lt;/span&gt;&lt;span class="dl"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;/span&gt;
    t = (v | ( v - &lt;span class="i"&gt;1&lt;/span&gt;)) + &lt;span class="i"&gt;1&lt;/span&gt;
    w = t | ((((t &amp;amp; -t) / (v&amp;amp;-v)) &amp;gt;&amp;gt; &lt;span class="i"&gt;1&lt;/span&gt;) - &lt;span class="i"&gt;1&lt;/span&gt;)
    &lt;span class="kw"&gt;return&lt;/span&gt; w&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;This will take a number with a certain popcount, and generate the next number with the same popcount. For example, if you feed it 7, which is 111 in binary,
you will get 1011 back (11 in decimal) &amp;ndash; the next number with the same popcount (lexicographically speaking).&lt;/p&gt;

&lt;p&gt;To find the first number of a given popcount, you can use this:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;element_0&lt;/span&gt;(c):
    &lt;span class="s"&gt;&lt;span class="dl"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="k"&gt;Generates first permutation with a given amount&lt;/span&gt;
&lt;span class="k"&gt;       of set bits, which is used to generate the rest.&lt;/span&gt;&lt;span class="dl"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;/span&gt;
    &lt;span class="kw"&gt;return&lt;/span&gt; (&lt;span class="i"&gt;1&lt;/span&gt; &amp;lt;&amp;lt; c) - &lt;span class="i"&gt;1&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;I should clear up what lexicographically means in the context of numbers. It&amp;rsquo;s actually the same as for any other symbols (such as an alphabet), with 0 being the symbol that comes before 1 (if it helps, you can picture 0 as a and 1 as b):&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;00111 aabbb
01011 ababb
01101 abbab
01110 abbba
10011 baabb
10101 babab
10110 babba
11001 bbaab
11010 bbaba
11100 bbbaa&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
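&lt;p&gt;If you want to reproduce this listing without any bit tricks, a brute-force cross-check is enough. The sketch below is a different technique from the bithacks (it simply picks the positions of the 1s with &lt;code&gt;itertools.combinations&lt;/code&gt; and sorts the resulting strings), and &lt;code&gt;blocks_with_popcount&lt;/code&gt; is a made-up name.&lt;/p&gt;

```python
from itertools import combinations

def blocks_with_popcount(p, b):
    """All b-bit strings with exactly p ones, in lexicographic order."""
    strings = []
    for ones in combinations(range(b), p):   # choose which positions hold a 1
        strings.append("".join("1" if i in ones else "0" for i in range(b)))
    return sorted(strings)

for s in blocks_with_popcount(3, 5):
    # second column: the same string with 0 written as a and 1 as b
    print(s, s.replace("0", "a").replace("1", "b"))
```

&lt;p&gt;Because every string has the same width, sorting the strings lexicographically gives the same order as sorting the numbers they represent.&lt;/p&gt;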


&lt;h2&gt;Pseudocode&lt;/h2&gt;

&lt;p&gt;Looking at the above pattern, here is some loose pseudocode that may help us understand how the above bithacks work:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Set i to the position of the rightmost set bit&lt;/li&gt;
&lt;li&gt;Stop if there are no set bits, or if we have looked at all the bits (i &amp;gt;= length of bitstring)&lt;/li&gt;
&lt;li&gt;If the (i+1)th bit (one place to the left) is 0: move the ith bit left&lt;/li&gt;
&lt;li&gt;Otherwise, if the (i+1)th bit (one place to the left) is 1: set i to i + 1 and repeat from 2.&lt;/li&gt;
&lt;li&gt;Shift the set bits remaining on the range [0, i] right so that the rightmost set bit is in position 0&lt;/li&gt;
&lt;/ol&gt;
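&lt;p&gt;One way to make these loose steps concrete is to work on the bit string directly. The sketch below is an assumption-laden illustration rather than the bithack itself: it finds the rightmost 1 that has a 0 directly to its left, moves it one place left, and then pushes the remaining 1s on its right to the far right. The name &lt;code&gt;next_perm_str&lt;/code&gt; is made up.&lt;/p&gt;

```python
def next_perm_str(s):
    """Next string with the same number of 1s, or None if s is the last one."""
    i = s.rfind("01")        # rightmost 1 that has a 0 directly to its left
    if i == -1:
        return None          # no 1 can move left, so there is no next string
    tail = s[i + 2:]         # everything to the right of the moved bit
    # Move that 1 one place left, then push the remaining 1s to the far right.
    return s[:i] + "10" + "".join(sorted(tail))

s = "00111"
while s is not None:
    print(s)
    s = next_perm_str(s)
```

&lt;p&gt;Starting from 00111 this prints the same ten strings as the listing above, ending at 11100.&lt;/p&gt;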


&lt;h2&gt;Explanation&lt;/h2&gt;

&lt;p&gt;Understanding &lt;code&gt;element_0()&lt;/code&gt; is pretty easy. &lt;code&gt;1 &amp;lt;&amp;lt; c&lt;/code&gt; is the same as moving a &lt;code&gt;1&lt;/code&gt; to position c, then -1 sets all the bits from 0 to c &amp;ndash; 1, giving c set bits:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;c = 4
1 &amp;lt;&amp;lt; c = 10000
10000 - 1 = 01111&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;&lt;code&gt;next_perm()&lt;/code&gt; is a bit more complicated. The &lt;code&gt;v | (v - 1)&lt;/code&gt; in &lt;code&gt;t = (v | (v - 1)) + 1&lt;/code&gt; right-propagates the rightmost set bit. Allow me to show an example: &lt;code&gt;01110 | 01101 = 01111&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;In the case of 0, this isn&amp;rsquo;t quite correct: &lt;code&gt;00000 | 11111 = 11111&lt;/code&gt;. But it&amp;rsquo;s okay because we proceed to add 1 to this value (which returns it to zero). This increment, combined with the right-propagation, will do steps 2 and 3, and part of step 5 above. For example, &lt;code&gt;10100 -&amp;gt; 10111 -&amp;gt; 11000&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We are on our way to generating integers by popcount in lexicographical order.&lt;/p&gt;

&lt;p&gt;Now let&amp;rsquo;s break down the next line, &lt;code&gt;w = t | ((((t &amp;amp; -t) / (v&amp;amp;-v)) &amp;gt;&amp;gt; 1) - 1)&lt;/code&gt;. Bitwise expressions of the form &lt;code&gt;(x &amp;amp; -x)&lt;/code&gt; isolate the rightmost set bit:
&lt;code&gt;01110 &amp;amp; 10010 (two's complement) = 00010&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;If you take the two&amp;rsquo;s complement, you invert a number&amp;rsquo;s bits and then add 1. If you think about it, this means there are 0s where there were 1s, and 1s where there were 0s. Adding 1 then carries through the trailing 1s (which were the original trailing 0s), setting them back to 0, and sets the bit where the original rightmost 1 was. This means that the only position that will remain set in both numbers is the rightmost 1-bit.&lt;/p&gt;

&lt;p&gt;So let R(x) denote the isolated rightmost bit of x, then for &lt;code&gt;x = 01110&lt;/code&gt; we calculate &lt;code&gt;t = 10000&lt;/code&gt;, &lt;code&gt;R(x) = 00010&lt;/code&gt; and &lt;code&gt;R(t) = 10000&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Following the calculation of w, we need to divide them: &lt;code&gt;R(t) / R(x) = 10000 / 00010 = 01000&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Shift to the right by 1: &lt;code&gt;00100&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Subtract 1: &lt;code&gt;00011&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Then we bitwise-or them to stick them together: &lt;code&gt;10000 | 00011 = 10011&lt;/code&gt;. This corresponds to our table above :)&lt;/p&gt;

&lt;p&gt;So &lt;code&gt;w = t | ((((t &amp;amp; -t) / (v&amp;amp;-v)) &amp;gt;&amp;gt; 1) - 1)&lt;/code&gt; corresponds to the rest of step 5 (the moving part, t was the zeroing part of moving the sub-range) in our pseudocode. Well, kind of anyway, there are a few steps happening in parallel, but the pseudocode was only loosely explaining what was happening :)&lt;/p&gt;

&lt;h2&gt;Testing it out&lt;/h2&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;gen_blocks&lt;/span&gt;(p, b):
    &lt;span class="s"&gt;&lt;span class="dl"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="k"&gt;    Generates all blocks of a given popcount and blocksize&lt;/span&gt;
&lt;span class="k"&gt;    &lt;/span&gt;&lt;span class="dl"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;/span&gt;
    v = initial = element_0(p)
    block_mask = element_0(b)

    &lt;span class="kw"&gt;while&lt;/span&gt; (v &amp;gt;= initial):
        &lt;span class="kw"&gt;yield&lt;/span&gt; v
        v = next_perm(v) &amp;amp; block_mask



&amp;gt;&amp;gt;&amp;gt; &lt;span class="kw"&gt;for&lt;/span&gt; x &lt;span class="kw"&gt;in&lt;/span&gt; gen_blocks(&lt;span class="i"&gt;3&lt;/span&gt;, &lt;span class="i"&gt;5&lt;/span&gt;): &lt;span class="kw"&gt;print&lt;/span&gt; &lt;span class="pd"&gt;bin&lt;/span&gt;(x, &lt;span class="i"&gt;5&lt;/span&gt;)
... 
&lt;span class="oc"&gt;00111&lt;/span&gt;
&lt;span class="oc"&gt;01011&lt;/span&gt;
&lt;span class="oc"&gt;01101&lt;/span&gt;
&lt;span class="oc"&gt;01110&lt;/span&gt;
&lt;span class="i"&gt;10011&lt;/span&gt;
&lt;span class="i"&gt;10101&lt;/span&gt;
&lt;span class="i"&gt;10110&lt;/span&gt;
&lt;span class="i"&gt;11001&lt;/span&gt;
&lt;span class="i"&gt;11010&lt;/span&gt;
&lt;span class="i"&gt;11100&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;&lt;em&gt;Note&lt;/em&gt;: &lt;code&gt;bin&lt;/code&gt; is just a function I found online for printing binary numbers and isn&amp;rsquo;t important to this post, but you can find it &lt;a href="http://www.gossamer-threads.com/lists/python/python/645216"&gt;here&lt;/a&gt; if you need one.&lt;/p&gt;

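&lt;p&gt;Then of course you can loop through all values of P from 0 to B to build the complete table. Handle P = 0 separately, though: its only block is 0, and &lt;code&gt;next_perm(0)&lt;/code&gt; divides by zero because 0 has no rightmost set bit.&lt;/p&gt;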

&lt;p&gt;Questions? Comments? Flames? I wanna hear em :)&lt;/p&gt;

&lt;p&gt;[1] &amp;ndash; Check out &lt;a href="http://www.springerlink.com/content/yv33538123433477/"&gt;Tables by Munro, 1996&lt;/a&gt;, and &lt;a href="http://portal.acm.org/citation.cfm?id=545411"&gt;Succinct indexable dictionaries with applications to encoding k-ary trees and multisets by Raman et al., 2002&lt;/a&gt; for a first step into this stuff. Also &lt;a href="https://github.com/alexbowe/honours-thesis/downloads"&gt;check out my honours thesis, 2010&lt;/a&gt; for a recent look at succinct data structures. I will write a blog post about this stuff sooner or later :P&lt;/p&gt;
	
&lt;/p&gt;

</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/png" height="155" width="198" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/HxtDnzstgtCxkwJqHHcerCmbjHnqAohGstIxdDAExmcqjuzklfxkbtGuoffD/media_httpalexbowes3a_szwrn.png">
        <media:thumbnail height="155" width="198" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/HxtDnzstgtCxkwJqHHcerCmbjHnqAohGstIxdDAExmcqjuzklfxkbtGuoffD/media_httpalexbowes3a_szwrn.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="114" width="241" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/IwoyyHhJxvbnjmHchmlJmacFcrcgIvICbEswbxHDGvCAFGyqipCxAdBrlfCC/media_httpalexbowes3a_mpcoj.png">
        <media:thumbnail height="114" width="241" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/IwoyyHhJxvbnjmHchmlJmacFcrcgIvICbEswbxHDGvCAFGyqipCxAdBrlfCC/media_httpalexbowes3a_mpcoj.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="235" width="379" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/jjwEwjlsIjhHBHiBnpEawJiHxBqBfqgBBumjukiwdtbmxuIeyvwabziDDFyv/media_httpalexbowes3a_ADipi.png">
        <media:thumbnail height="235" width="379" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/jjwEwjlsIjhHBHiBnpEawJiHxBqBfqgBBumjukiwdtbmxuIeyvwabziDDFyv/media_httpalexbowes3a_ADipi.png.scaled500.png" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/generating-binary-permutations-in-popcount-or</feedburner:origLink></item>
    <item>
      <pubDate>Sat, 23 Apr 2011 12:17:00 -0700</pubDate>
      <title>Some Lazy Fun</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/oKqmXMJtHJU/some-lazy-fun</link>
      <guid isPermaLink="false">http://www.alexbowe.com/some-lazy-fun</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpianumcesedu_imcvu" height="188" src="http://posterous.com/getfile/files.posterous.com/alexbowe/hssHaJuwdpajgFecIaEesBsaxuanivclrujffytbAiwtAslnwAFffdxupofC/media_httpianumcesedu_imCvu.png.scaled500.png" width="400" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Image taken from&amp;nbsp;&lt;a href="http://ian.umces.edu/"&gt;http://ian.umces.edu/&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update:&amp;nbsp;&lt;/strong&gt;fellow algorithms researcher&amp;nbsp;&lt;a href="http://fclaude.recoded.cl"&gt;Francisco Claude&lt;/a&gt;&amp;nbsp;just posted&amp;nbsp;&lt;a href="http://fclaude.recoded.cl/archives/177"&gt;a great article about using lazy evaluation to solve Tic Tac Toe games in Common Lisp&lt;/a&gt;. &lt;a href="http://niki.code-karma.com"&gt;Niki&lt;/a&gt;&amp;nbsp;(my brother) also wrote a post using &lt;a href="http://niki.code-karma.com/2011/05/hiding-io-latency-in-generators-by-async-prefetching/"&gt;generators with asynchronous prefetching to hide IO latency&lt;/a&gt;.&amp;nbsp;Worth a read I say!&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve recently been obsessing over this programming idea called &lt;em&gt;streams&lt;/em&gt; (also known as &lt;em&gt;infinite lists&lt;/em&gt; or &lt;em&gt;generators&lt;/em&gt;), which I learned about from the &lt;a href="http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-24.html"&gt;Structure and Interpretation of Computer Programs&lt;/a&gt; book. It is kind of like an iterator that creates its own data as you go along, and it can lead to performance increases and wonderfully readable code when you utilise them with &lt;a href="http://en.wikipedia.org/wiki/Higher-order_function"&gt;higher order functions&lt;/a&gt; such as &lt;a href="http://en.wikipedia.org/wiki/Map_(higher-order_function)"&gt;map&lt;/a&gt; and &lt;a href="http://en.wikipedia.org/wiki/Fold_(higher-order_function)"&gt;reduce&lt;/a&gt; (which many things can be rewritten in). It also allows you to express infinitely large data structures.&lt;/p&gt;
&lt;p&gt;When regular lists are processed with higher order functions, you need to compute the entire list at each stage; if you have 100 elements, and you map a function to them, then filter them, then partially reduce them, you may be doing up to 300 operations, but what if you only want to take the first 5 elements of the result? That would be a waste, hence streams are sometimes a better choice.&lt;/p&gt;
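&lt;p&gt;To put numbers on that, here is a small Python sketch that counts how many elements actually get processed when we only want the first five results. The &lt;code&gt;double&lt;/code&gt; function and the call counter are made up for the illustration, and only a single map stage is shown; the lazy version uses a generator expression with &lt;code&gt;itertools.islice&lt;/code&gt;.&lt;/p&gt;

```python
from itertools import islice

calls = 0

def double(x):
    """Toy stage of a pipeline that records how often it runs."""
    global calls
    calls += 1
    return 2 * x

# Eager: the whole list is mapped before we slice off five results.
eager = [double(x) for x in range(100)][:5]
eager_calls = calls

# Lazy: the generator only does work for the elements islice asks for.
calls = 0
lazy = list(islice((double(x) for x in range(100)), 5))
lazy_calls = calls

print(eager_calls, lazy_calls)  # 100 5
```

&lt;p&gt;Both versions produce the same five values; the lazy one just never touches the other 95 elements.&lt;/p&gt;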
&lt;p&gt;Although SICP details how to do it in Scheme, in this blog post I will show some languages that have it built in - Haskell and Python - and how to implement streams yourself if you ever find yourself in a language without it&lt;a title="see footnote" class="footnote"&gt;1&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Haskell&lt;/h2&gt;
&lt;p&gt;Haskell is a lazy language. It didn&amp;rsquo;t earn this reputation from not doing the dishes when you ask it to (although that is another reason it is lazy). What it means in the context of formal languages is that evaluation is postponed until &lt;em&gt;absolutely necessary&lt;/em&gt; (&lt;a href="http://blog.ezyang.com/2011/04/the-haskell-heap/"&gt;Here&lt;/a&gt; is a cute (illustrated) blog post describing this lazy evaluation stuff). Take this code for example:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;Prelude&amp;gt; let x = [1..10]&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;At this stage you might be tempted to say that x is the list of numbers from 1 to 10. Actually it only represents a &lt;em&gt;promise&lt;/em&gt; that when you need that list, x is your guy. The above code that creates a list from 1 to 10 isn&amp;rsquo;t executed until I finally ask for it (by referring to x):&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;Prelude&amp;gt; x
[1,2,3,4,5,6,7,8,9,10]&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;It is kind of like telling your mum you&amp;rsquo;ll do the dishes, but waiting until she shouts your name out again before you put down your DS. Actually, it is sliiiiightly different - if I instead wrote:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;Prelude&amp;gt; let x = [1..10]
Prelude&amp;gt; let y = x ++ [11..20]&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;I have referred to x again when I declared y, but x &lt;em&gt;still&lt;/em&gt; hasn&amp;rsquo;t evaluated. Only after I shout y&amp;rsquo;s name will y shout x&amp;rsquo;s name and give me back my whole list:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;Prelude&amp;gt; y
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Here you ask your robot to wash half the dishes, but he is too busy playing DS too (stupid robot). Finally when your mum shouts, you shout at the robot, and he does his set of dishes, and you do yours. But what is the benefit here? It isn&amp;rsquo;t that I can get more DS time in&amp;hellip;&lt;/p&gt;
&lt;p&gt;Take for example the list of positive integers starting from 1. Yes, all of them. In other languages it might be hard to express this, but in Haskell it is as simple as &lt;code&gt;[1..]&lt;/code&gt;. This gives us an infinite list of integers, but we will only calculate as far as we need:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;Prelude&amp;gt; let z = [1..]
Prelude&amp;gt; head z
1
Prelude&amp;gt; take 5 z
[1,2,3,4,5]&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;The syntax here is amazingly terse, and it may make your code more efficient. But even if we don&amp;rsquo;t have the syntax for it in another language, we can provide it ourselves very easily.&lt;/p&gt;
&lt;h2&gt;Python&lt;/h2&gt;
&lt;p&gt;Python has a similar concept called &lt;em&gt;generators&lt;/em&gt;, which are functions that use the &lt;code&gt;yield&lt;/code&gt; keyword in place of &lt;code&gt;return&lt;/code&gt;, usually more than once or inside a loop:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;integers_from&lt;/span&gt;(N):
     &lt;span class="kw"&gt;while&lt;/span&gt; &lt;span class="pc"&gt;True&lt;/span&gt;:
         &lt;span class="kw"&gt;yield&lt;/span&gt; N
         N += &lt;span class="i"&gt;1&lt;/span&gt;

&amp;gt;&amp;gt;&amp;gt; z = integers_from(&lt;span class="i"&gt;1&lt;/span&gt;)
&amp;gt;&amp;gt;&amp;gt; &lt;span class="pd"&gt;next&lt;/span&gt;(z)
&lt;span class="i"&gt;1&lt;/span&gt;
&amp;gt;&amp;gt;&amp;gt; &lt;span class="pd"&gt;next&lt;/span&gt;(z)
&lt;span class="i"&gt;2&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Python generators are stateful and are hence slightly different to an infinite list in Haskell. For example, &lt;code&gt;next(z)&lt;/code&gt; (spelled &lt;code&gt;z.next()&lt;/code&gt; in Python 2) returns a different value each time it is called, so its result depends on when we call it - we cannot get &lt;code&gt;z&lt;/code&gt; to &amp;lsquo;rewind&amp;rsquo; like we could in Haskell, where &lt;code&gt;z&lt;/code&gt; is stateless. Statelessness can lead to code that is easier to understand, among other benefits.&lt;/p&gt;
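&lt;p&gt;As a small sketch of what that statefulness means in practice: a generator cannot be rewound, but we can either build a fresh one or split it with &lt;code&gt;itertools.tee&lt;/code&gt;. The variable names below are just for illustration.&lt;/p&gt;

```python
from itertools import islice, tee

def integers_from(n):
    while True:
        yield n
        n += 1

z = integers_from(1)
first = next(z)    # 1
second = next(z)   # 2: the generator has moved on and cannot go back

# Option 1: build a fresh generator whenever we want to start again.
restarted = list(islice(integers_from(1), 3))

# Option 2: itertools.tee splits one iterator into independent copies.
a, b = tee(integers_from(1))
from_a = list(islice(a, 3))
from_b = list(islice(b, 3))

print(first, second, restarted, from_a, from_b)
```

&lt;p&gt;Neither option gives us the value semantics of a Haskell list, which is part of the motivation for rolling our own below.&lt;/p&gt;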
&lt;h2&gt;Rolling Our Own&lt;/h2&gt;
&lt;p&gt;Let&amp;rsquo;s reinvent this wheel in Python (but in a stateless manner), so if we ever find ourselves craving infinite lists we can easily roll our own in pretty much any language with &lt;a href="http://en.wikipedia.org/wiki/Lambda_(programming)"&gt;Lambdas&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I have chosen Python to implement this, even though it already supports infinite lists through generators, simply because its syntax is more accessible. Indeed, the code below can already be done with Python&amp;rsquo;s built-in functions (although with state). It is probably &lt;em&gt;not a great idea to do it this way in Python&lt;/em&gt;, as it doesn&amp;rsquo;t have &lt;a href="http://stackoverflow.com/questions/310974/what-is-tail-call-optimization"&gt;tail call optimisation&lt;/a&gt; (unless you use &lt;a href="http://code.activestate.com/recipes/474088/"&gt;this hack&lt;/a&gt; using decorators and exceptions), so deeply recursive calls will eventually hit the interpreter&amp;rsquo;s recursion limit.&lt;/p&gt;
&lt;p&gt;First we&amp;rsquo;ll add lazy evaluation, although Python&amp;rsquo;s syntax requires us to be explicit about it:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; x = &lt;span class="kw"&gt;lambda&lt;/span&gt;: &lt;span class="i"&gt;5&lt;/span&gt;
&amp;gt;&amp;gt;&amp;gt; y = &lt;span class="kw"&gt;lambda&lt;/span&gt;: &lt;span class="i"&gt;2&lt;/span&gt; + x()&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Here, &lt;code&gt;x&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt; 5, and &lt;code&gt;y&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt; 7. They are both functions that will evaluate to those values when we finally call them; the expression inside the lambda won&amp;rsquo;t be evaluated until we do so explicitly:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; x()
&lt;span class="i"&gt;5&lt;/span&gt;
&amp;gt;&amp;gt;&amp;gt; y()
&lt;span class="i"&gt;7&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;And that&amp;rsquo;s pretty much all the heavy lifting. To make an infinite list, we basically make a linked list where we generate each node as we need it:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;integers_from&lt;/span&gt;(N): &lt;span class="kw"&gt;return&lt;/span&gt; (N, &lt;span class="kw"&gt;lambda&lt;/span&gt;: integers_from(N+&lt;span class="i"&gt;1&lt;/span&gt;))

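&lt;span class="c"&gt;# Index the pair directly: tuple parameters such as head((H, _)) only work in Python 2&lt;/span&gt;
&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;head&lt;/span&gt;(stream): &lt;span class="kw"&gt;return&lt;/span&gt; stream[&lt;span class="i"&gt;0&lt;/span&gt;]

&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;tail&lt;/span&gt;(stream): &lt;span class="kw"&gt;return&lt;/span&gt; stream[&lt;span class="i"&gt;1&lt;/span&gt;]()&lt;/pre&gt;&lt;/div&gt;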
&lt;/div&gt;

&lt;p&gt;And there is our infinite list. To access it use &lt;code&gt;head()&lt;/code&gt; and &lt;code&gt;tail()&lt;/code&gt; (recursively if necessary):&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; z = integers_from(&lt;span class="i"&gt;1&lt;/span&gt;)
&amp;gt;&amp;gt;&amp;gt; head(z)
&lt;span class="i"&gt;1&lt;/span&gt;
&amp;gt;&amp;gt;&amp;gt; head(tail(z))
&lt;span class="i"&gt;2&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
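&lt;p&gt;Nesting &lt;code&gt;tail()&lt;/code&gt; calls by hand gets old quickly. One possible helper, sketched here under the assumption that streams are the (head, thunk) pairs defined above, walks the tails in a loop. The name &lt;code&gt;nth&lt;/code&gt; is made up for this example, and &lt;code&gt;head&lt;/code&gt; and &lt;code&gt;tail&lt;/code&gt; are written with plain indexing so the snippet runs on its own.&lt;/p&gt;

```python
def integers_from(n):
    return (n, lambda: integers_from(n + 1))

def head(stream):
    return stream[0]

def tail(stream):
    return stream[1]()

def nth(stream, index):
    """Return the element at position `index` (0-based) by walking the tails."""
    for _ in range(index):
        stream = tail(stream)
    return head(stream)

z = integers_from(1)
print(nth(z, 0))     # first element
print(nth(z, 5000))  # deep into the stream, no recursion involved
print(head(z))       # z itself is unchanged: the stream is stateless
```

&lt;p&gt;Because it loops instead of recursing, this does not run into the recursion limit however far into the stream we look.&lt;/p&gt;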

&lt;h2&gt;Helper Functions&lt;/h2&gt;
&lt;p&gt;First we should make a way for us to look at our streams:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;to_array&lt;/span&gt;(stream):
    &lt;span class="kw"&gt;return&lt;/span&gt; &lt;span class="pd"&gt;reduce&lt;/span&gt;(&lt;span class="kw"&gt;lambda&lt;/span&gt; a, x: a + [x], [], stream)&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;This is a &lt;a href="http://en.wikipedia.org/wiki/Fold_(higher-order_function)"&gt;reduce&lt;/a&gt; operation that puts each head element into an array (which is carried along as a parameter to &lt;code&gt;reduce()&lt;/code&gt;). Here is &lt;code&gt;reduce()&lt;/code&gt; (&lt;code&gt;map()&lt;/code&gt; can be found in this &lt;a href="https://gist.github.com/938886"&gt;gist&lt;/a&gt;):&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;null_stream = (&lt;span class="pc"&gt;None&lt;/span&gt;, &lt;span class="pc"&gt;None&lt;/span&gt;)
&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;reduce&lt;/span&gt;(f, result, stream):
    &lt;span class="kw"&gt;if&lt;/span&gt; stream &lt;span class="kw"&gt;is&lt;/span&gt; null_stream: &lt;span class="kw"&gt;return&lt;/span&gt; result
    &lt;span class="kw"&gt;return&lt;/span&gt; &lt;span class="pd"&gt;reduce&lt;/span&gt;(f, f(result, head(stream)), tail(stream))&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;We needed some way to tell if we had reached the end of a stream - not all streams are infinitely long. Meet our next function, which will help us terminate a stream:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;take&lt;/span&gt;(N, stream):
    &lt;span class="kw"&gt;if&lt;/span&gt; N &amp;lt;= &lt;span class="i"&gt;0&lt;/span&gt; &lt;span class="kw"&gt;or&lt;/span&gt; stream &lt;span class="kw"&gt;is&lt;/span&gt; null_stream: &lt;span class="kw"&gt;return&lt;/span&gt; null_stream
    &lt;span class="kw"&gt;return&lt;/span&gt; (head(stream), &lt;span class="kw"&gt;lambda&lt;/span&gt;: take(N-&lt;span class="i"&gt;1&lt;/span&gt;, tail(stream)))&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;This will take the first &lt;code&gt;N&lt;/code&gt; elements from the specified stream. So now we can inspect the first &lt;code&gt;N&lt;/code&gt; elements:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; to_array(take(&lt;span class="i"&gt;10&lt;/span&gt;, integers_from(&lt;span class="i"&gt;1&lt;/span&gt;)))
[&lt;span class="i"&gt;1&lt;/span&gt;, &lt;span class="i"&gt;2&lt;/span&gt;, &lt;span class="i"&gt;3&lt;/span&gt;, &lt;span class="i"&gt;4&lt;/span&gt;, &lt;span class="i"&gt;5&lt;/span&gt;, &lt;span class="i"&gt;6&lt;/span&gt;, &lt;span class="i"&gt;7&lt;/span&gt;, &lt;span class="i"&gt;8&lt;/span&gt;, &lt;span class="i"&gt;9&lt;/span&gt;, &lt;span class="i"&gt;10&lt;/span&gt;]&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
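&lt;p&gt;The &lt;code&gt;map()&lt;/code&gt; mentioned earlier lives in the gist, so the following is only a guess at what a lazy map in the same style could look like, not a copy of that code. The name &lt;code&gt;stream_map&lt;/code&gt; is hypothetical, and the helpers are repeated so the snippet is self-contained.&lt;/p&gt;

```python
null_stream = (None, None)

def head(stream):
    return stream[0]

def tail(stream):
    return stream[1]()

def integers_from(n):
    return (n, lambda: integers_from(n + 1))

def stream_map(f, stream):
    """Apply f to the head now and defer mapping the rest until it is asked for."""
    if stream is null_stream:
        return null_stream
    return (f(head(stream)), lambda: stream_map(f, tail(stream)))

doubled = stream_map(lambda x: 2 * x, integers_from(1))
print(head(doubled), head(tail(doubled)), head(tail(tail(doubled))))
```

&lt;p&gt;Only the head is computed eagerly; everything else sits behind a lambda, exactly as in &lt;code&gt;take()&lt;/code&gt;.&lt;/p&gt;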

&lt;p&gt;For our upcoming example, we also need a &lt;code&gt;filter()&lt;/code&gt; function, which keeps only the elements that satisfy a provided predicate:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;filter&lt;/span&gt;(pred, stream):
    &lt;span class="c"&gt;# A finite stream may run out before anything matches&lt;/span&gt;
    &lt;span class="kw"&gt;if&lt;/span&gt; stream &lt;span class="kw"&gt;is&lt;/span&gt; null_stream: &lt;span class="kw"&gt;return&lt;/span&gt; null_stream
    &lt;span class="kw"&gt;if&lt;/span&gt; pred(head(stream)):
        &lt;span class="kw"&gt;return&lt;/span&gt; (head(stream), &lt;span class="kw"&gt;lambda&lt;/span&gt;: &lt;span class="pd"&gt;filter&lt;/span&gt;(pred, tail(stream)))
    &lt;span class="kw"&gt;return&lt;/span&gt; &lt;span class="pd"&gt;filter&lt;/span&gt;(pred, tail(stream))&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Now onto our example :)&lt;/p&gt;
&lt;h2&gt;Textbook Example&lt;/h2&gt;
&lt;p&gt;Here is the standard example to demonstrate the terseness of streams:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;sieve&lt;/span&gt;(stream):
    h = head(stream)
    &lt;span class="kw"&gt;return&lt;/span&gt; (h, &lt;span class="kw"&gt;lambda&lt;/span&gt;: sieve(
        &lt;span class="pd"&gt;filter&lt;/span&gt;(&lt;span class="kw"&gt;lambda&lt;/span&gt; x: x%h != &lt;span class="i"&gt;0&lt;/span&gt;, tail(stream))))&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Here is a function which recursively filters out anything that is divisible by any number we have previously seen in our stream. Math aficionados will recognise this as the textbook stream version of the &lt;a href="http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes"&gt;Sieve of Eratosthenes&lt;/a&gt; (strictly speaking it tests divisibility rather than crossing off multiples, so it does more work than the classic sieve).&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; primes = sieve(integers_from(&lt;span class="i"&gt;2&lt;/span&gt;))
&amp;gt;&amp;gt;&amp;gt; to_array(take(&lt;span class="i"&gt;10&lt;/span&gt;, primes))
[&lt;span class="i"&gt;2&lt;/span&gt;, &lt;span class="i"&gt;3&lt;/span&gt;, &lt;span class="i"&gt;5&lt;/span&gt;, &lt;span class="i"&gt;7&lt;/span&gt;, &lt;span class="i"&gt;11&lt;/span&gt;, &lt;span class="i"&gt;13&lt;/span&gt;, &lt;span class="i"&gt;17&lt;/span&gt;, &lt;span class="i"&gt;19&lt;/span&gt;, &lt;span class="i"&gt;23&lt;/span&gt;, &lt;span class="i"&gt;29&lt;/span&gt;]&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
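&lt;p&gt;For comparison, the same idea can be sketched with Python&amp;rsquo;s built-in generators instead of our hand-rolled streams. This is an alternative formulation rather than part of the code above; it assumes Python 3 (for &lt;code&gt;yield from&lt;/code&gt;) and uses &lt;code&gt;itertools.count&lt;/code&gt; and &lt;code&gt;itertools.islice&lt;/code&gt; in place of &lt;code&gt;integers_from()&lt;/code&gt; and &lt;code&gt;take()&lt;/code&gt;.&lt;/p&gt;

```python
from itertools import count, islice

def sieve(numbers):
    """Yield primes lazily from an iterator of integers starting at 2."""
    h = next(numbers)
    yield h
    # Everything after h that h does not divide goes to the next sieve layer.
    yield from sieve(n for n in numbers if n % h != 0)

primes = sieve(count(2))
print(list(islice(primes, 10)))
```

&lt;p&gt;Like the stream version, every prime found adds another layer of filtering, so this is a demonstration of laziness rather than a fast way to generate primes.&lt;/p&gt;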

&lt;p&gt;Recursively defined data, and only as much of it as we want - pretty neat.&lt;/p&gt;
&lt;h2&gt;And Now For Something Almost Completely Different&lt;/h2&gt;
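&lt;p&gt;When I first saw this, I wondered what application there might be for a stream of functions. Here I have defined a stream in which each element is the previous function composed with itself, so successive elements apply &lt;code&gt;f&lt;/code&gt; once, twice, four times, eight times, and so on:&lt;/p&gt;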
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;rec_stream&lt;/span&gt;(f):
    &lt;span class="kw"&gt;return&lt;/span&gt; (f, &lt;span class="kw"&gt;lambda&lt;/span&gt;: rec_stream(&lt;span class="kw"&gt;lambda&lt;/span&gt; x: f(f(x))))&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;When might this be useful? It might yield speed improvements if you commonly want to recursively apply a function a certain number of times, but have costly branching (so the condition check at each level is slow). It could also be used as an abstraction for recursive iteration&lt;a title="see footnote" class="footnote"&gt;2&lt;/a&gt;, which gives you back the function 'already recursed' so to speak (although lazily).&lt;/p&gt;
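&lt;p&gt;It is worth checking how many times each element of &lt;code&gt;rec_stream&lt;/code&gt; actually applies the function. The sketch below counts applications using a simple increment function; &lt;code&gt;iterate_stream&lt;/code&gt; and &lt;code&gt;first_results&lt;/code&gt; are hypothetical helpers added for comparison and are not part of the code above.&lt;/p&gt;

```python
def head(stream):
    return stream[0]

def tail(stream):
    return stream[1]()

def rec_stream(f):
    # As in the post: the next element composes the current function with itself.
    return (f, lambda: rec_stream(lambda x: f(f(x))))

def iterate_stream(f, g=None):
    # Hypothetical alternative: the next element applies f exactly once more.
    g = f if g is None else g
    return (g, lambda: iterate_stream(f, lambda x: f(g(x))))

def increment(x):
    return x + 1

def first_results(stream, count):
    """Call the first `count` functions in the stream on 0 and collect the results."""
    results = []
    for _ in range(count):
        results.append(head(stream)(0))
        stream = tail(stream)
    return results

print(first_results(rec_stream(increment), 4))      # applications: 1, 2, 4, 8
print(first_results(iterate_stream(increment), 4))  # applications: 1, 2, 3, 4
```

&lt;p&gt;So &lt;code&gt;rec_stream&lt;/code&gt; doubles the number of applications at each step, while the alternative adds one application per step. Which one you want depends on whether you care about reaching a high iteration count quickly or about stepping through every count.&lt;/p&gt;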
&lt;p&gt;One such recursive process I can think of is &lt;a href="http://en.wikipedia.org/wiki/Newton's_method"&gt;Newton&amp;rsquo;s method for approximating roots&lt;/a&gt;, defined recursively as:&lt;/p&gt;
&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpmathurlcom3_ghfkj" height="39" src="http://posterous.com/getfile/files.posterous.com/alexbowe/fsluenbqqIrpuqqmoiJcfjuhyIlanJwFIAhlxxtCxuucABGclvAioEHxFlIi/media_httpmathurlcom3_gHFkj.png.scaled500.png" width="146" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;where we are looking for an &lt;code&gt;x&lt;/code&gt; such that &lt;code&gt;f(x) = 0&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The more iterations you do the more accurate the solution becomes. One way to use Newton&amp;rsquo;s method is to iterate until you have reached a certain error tolerance. Another way, which I learned about recently when reading about the &lt;a href="http://en.wikipedia.org/wiki/Fast_inverse_square_root"&gt;fast inverse square root algorithm&lt;/a&gt;, is to use just one step of Newton&amp;rsquo;s method as a cheap way to improve an (already pretty good) initial guess. There is a really great article &lt;a href="http://betterexplained.com/articles/understanding-quakes-fast-inverse-square-root/"&gt;here&lt;/a&gt; which explains it very well.&lt;/p&gt;
&lt;p&gt;After reading that, I wondered about a stream of functions, each of which applies more steps of Newton&amp;rsquo;s method than the last and so gives a more accurate result.&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;newton&lt;/span&gt;(f, fdash):
    &lt;span class="kw"&gt;return&lt;/span&gt; &lt;span class="kw"&gt;lambda&lt;/span&gt; x: x - f(x)/&lt;span class="pd"&gt;float&lt;/span&gt;(fdash(x))&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;newton()&lt;/code&gt; function accepts &lt;code&gt;f(x)&lt;/code&gt; and &lt;code&gt;f'(x)&lt;/code&gt;, and returns a function that accepts a guess and returns an improved one (a single Newton step).&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;newton_solver&lt;/span&gt;(iters, f, fdash):
    &lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;solve&lt;/span&gt;(v):
        n = newton(&lt;span class="kw"&gt;lambda&lt;/span&gt; x: f(x) - v, fdash)
        stream = rec_stream(n)
        &lt;span class="kw"&gt;return&lt;/span&gt; to_array(take(iters, stream))[-&lt;span class="i"&gt;1&lt;/span&gt;]
    &lt;span class="kw"&gt;return&lt;/span&gt; solve&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;This one is a little more complicated. Newton&amp;rsquo;s method finds where &lt;code&gt;f(x)&lt;/code&gt; equals zero, so to solve for a value other than zero I could have baked that value into &lt;code&gt;f(x)&lt;/code&gt; itself, but I didn&amp;rsquo;t want the user to have to define a new &lt;code&gt;f(x)&lt;/code&gt; and build a new solver each time they wanted to compute the square root of a different number, say. To allow it to return a function that solves for square roots in the general case, I made the internal function &lt;code&gt;solve()&lt;/code&gt;, which binds the value the caller specifies, hence solving &lt;code&gt;f(x) = v&lt;/code&gt; for &lt;code&gt;x&lt;/code&gt;. Hopefully this becomes clearer with an example:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; sqrt = newton_solver(&lt;span class="i"&gt;1&lt;/span&gt;, &lt;span class="kw"&gt;lambda&lt;/span&gt; x: x**&lt;span class="i"&gt;2&lt;/span&gt;, &lt;span class="kw"&gt;lambda&lt;/span&gt; x: &lt;span class="i"&gt;2&lt;/span&gt;*x) &lt;span class="c"&gt;# 1 iter&lt;/span&gt;
&amp;gt;&amp;gt;&amp;gt; sqrt(&lt;span class="i"&gt;64&lt;/span&gt;)(&lt;span class="i"&gt;4&lt;/span&gt;) &lt;span class="c"&gt;# Sqrt of 64 with initial guess of 4&lt;/span&gt;
&lt;span class="fl"&gt;10.0&lt;/span&gt;
&amp;gt;&amp;gt;&amp;gt; sqrt = newton_solver(&lt;span class="i"&gt;3&lt;/span&gt;, &lt;span class="kw"&gt;lambda&lt;/span&gt; x: x**&lt;span class="i"&gt;2&lt;/span&gt;, &lt;span class="kw"&gt;lambda&lt;/span&gt; x: &lt;span class="i"&gt;2&lt;/span&gt;*x) &lt;span class="c"&gt;# 3 stream elements = 4 iters&lt;/span&gt;
&amp;gt;&amp;gt;&amp;gt; sqrt(&lt;span class="i"&gt;64&lt;/span&gt;)(&lt;span class="i"&gt;4&lt;/span&gt;)
&lt;span class="fl"&gt;8.000000371689179&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Now we can pass around this square root function and it will always do the same number of iterations of Newton's method. Note that because each element of &lt;code&gt;rec_stream&lt;/code&gt; composes the previous one with itself, taking 3 elements gives 4 iterations rather than 3, which is why the result is already accurate to six decimal places.&lt;/p&gt;
&lt;p&gt;This may not be practical unless compilers can optimise the resulting function (or if there is a way to do the reduction myself easily), but it was fun to do :) As always comments and suggestions are appreciated. If anyone who reads this is good with compilers, advice would be great :D&lt;/p&gt;
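&lt;p&gt;For comparison, the tolerance-based use of Newton&amp;rsquo;s method mentioned earlier can be written as a plain loop. This is only a sketch: the names &lt;code&gt;newton_step&lt;/code&gt; and &lt;code&gt;solve&lt;/code&gt;, and the &lt;code&gt;tol&lt;/code&gt; and &lt;code&gt;max_iters&lt;/code&gt; defaults, are assumptions chosen for the example rather than anything from the code above.&lt;/p&gt;

```python
import math

def newton_step(f, fdash):
    # One Newton update, using the same formula as newton() in the post.
    return lambda x: x - f(x) / float(fdash(x))

def solve(f, fdash, guess, tol=1e-12, max_iters=50):
    """Apply Newton steps until two successive guesses agree to within tol."""
    step = newton_step(f, fdash)
    for _ in range(max_iters):
        new_guess = step(guess)
        if math.isclose(new_guess, guess, rel_tol=0.0, abs_tol=tol):
            return new_guess
        guess = new_guess
    return guess  # gave up after max_iters steps

# Solve x**2 - 64 = 0 starting from a guess of 4.
root = solve(lambda x: x**2 - 64, lambda x: 2 * x, 4)
print(root)
```

&lt;p&gt;The loop pays for a comparison on every step, which is exactly the branching cost the stream-of-functions idea was trying to avoid.&lt;/p&gt;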
&lt;p&gt;What you can do now is read &lt;a href="http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-24.html"&gt;SICP&lt;/a&gt; for more cool things like streams and functional programming, or check out &lt;a href="http://graphics.stanford.edu/~seander/bithacks.html"&gt;Sean Anderson&amp;rsquo;s bit hacks page&lt;/a&gt; for more cool hacks like the fast inverse square root. Or refactor your code to use map, reduce and streams :)&lt;/p&gt;
&lt;div class="footnotes"&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;I have chosen Python for this exercise because it is accessible. &lt;a href="http://chneukirchen.org/blog/archive/2005/05/lazy-streams-for-ruby.html"&gt;Here&lt;/a&gt; is a post about implementing streams in Ruby, and &lt;a href="http://khigia.wordpress.com/2007/05/07/44/"&gt;here&lt;/a&gt; is one for Erlang :) but of course it&amp;rsquo;s all pretty much the same deal.&lt;a title="return to article" class="reversefootnote"&gt;&amp;nbsp;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If a compiler could optimise this, simplifying the reapplied function, but keeping the generality, that&amp;rsquo;d be really cool :) I don&amp;rsquo;t think many compilers would/could do that for lambdas though. Any information would be great.&lt;a title="return to article" class="reversefootnote"&gt;&amp;nbsp;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;&lt;/div&gt;
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/some-lazy-fun"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/some-lazy-fun#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/png" height="188" width="400" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/hssHaJuwdpajgFecIaEesBsaxuanivclrujffytbAiwtAslnwAFffdxupofC/media_httpianumcesedu_imCvu.png">
        <media:thumbnail height="188" width="400" url="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/hssHaJuwdpajgFecIaEesBsaxuanivclrujffytbAiwtAslnwAFffdxupofC/media_httpianumcesedu_imCvu.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="39" width="146" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/fsluenbqqIrpuqqmoiJcfjuhyIlanJwFIAhlxxtCxuucABGclvAioEHxFlIi/media_httpmathurlcom3_gHFkj.png">
        <media:thumbnail height="39" width="146" url="http://getfile7.posterous.com/getfile/files.posterous.com/alexbowe/fsluenbqqIrpuqqmoiJcfjuhyIlanJwFIAhlxxtCxuucABGclvAioEHxFlIi/media_httpmathurlcom3_gHFkj.png.scaled500.png" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/some-lazy-fun</feedburner:origLink></item>
    <item>
      <pubDate>Sun, 17 Apr 2011 05:03:00 -0700</pubDate>
      <title>Design Pattern Flash Cards</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/qmK2DvoFSdw/design-pattern-flash-cards</link>
      <guid isPermaLink="false">http://www.alexbowe.com/design-pattern-flash-cards</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/temp-2011-04-17/hJFJGmGwrBFnhBICsqFDntttstGagIbJaDGytfaiyFaAEBqGjCxGBqrpHyAH/flashcards.png.scaled1000.png"&gt;&lt;img alt="Flashcards" height="287" src="http://posterous.com/getfile/files.posterous.com/temp-2011-04-17/hJFJGmGwrBFnhBICsqFDntttstGagIbJaDGytfaiyFaAEBqGjCxGBqrpHyAH/flashcards.png.scaled500.png" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;Last year I studied a subject which required me to memorise design patterns. I tried online flash card web sites, but I was irritated that I didn't own the data I put up (they had no export option). So I wrote something in Python to generate flash cards for me using LaTeX and the Cheetah templating library. The repository is hosted &lt;a href="https://github.com/alexbowe/cardgen"&gt;here&lt;/a&gt;, although it could do with a refactor.&lt;/p&gt;
&lt;p&gt;If you don't want to generate your own, you can download the pre-generated design pattern intent flash cards &lt;a href="https://github.com/alexbowe/cardgen/raw/master/intents.pdf"&gt;here&lt;/a&gt;, which contain the 23 original design patterns from the Gang Of Four.&lt;/p&gt;
&lt;p&gt;To generate your own flash cards, create an input text file with this structure:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;Front text (such as pattern name):
Definition line 1.
Definition line 2.&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;For example:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;Abstract Factory:
Provides an interface for creating families of related or
dependent objects without specifying their concrete classes.&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Currently the front text is single-line only. The regex could be updated of course (if you do, feel free to send a pull request!).&lt;/p&gt;
&lt;p&gt;To compile this:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;./cardgen.py -i inputfile -o outputfile
pdflatex outputfile&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Then just print it out on a double-sided printer (or glue the two sheets together). I carried these around with me all the time during the lead-up to the exam, and I was scary-fast when it came to recalling which design pattern did what. Just flick through them (shuffle first) in forward or reverse order when you are next on the train :)&lt;/p&gt;
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/design-pattern-flash-cards"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/design-pattern-flash-cards#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/png" height="388" width="675" url="http://getfile9.posterous.com/getfile/files.posterous.com/temp-2011-04-17/hJFJGmGwrBFnhBICsqFDntttstGagIbJaDGytfaiyFaAEBqGjCxGBqrpHyAH/flashcards.png">
        <media:thumbnail height="287" width="500" url="http://getfile8.posterous.com/getfile/files.posterous.com/temp-2011-04-17/hJFJGmGwrBFnhBICsqFDntttstGagIbJaDGytfaiyFaAEBqGjCxGBqrpHyAH/flashcards.png.scaled500.png" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/design-pattern-flash-cards</feedburner:origLink></item>
    <item>
      <pubDate>Mon, 04 Apr 2011 00:49:00 -0700</pubDate>
      <title>Metaprogramming Erlang the Easy Way</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/o2HMLZk-DhE/48392639</link>
      <guid isPermaLink="false">http://www.alexbowe.com/48392639</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/alexbowe/aboJwDhddnFznrrbgsifFsjcEdldgsvozdhfyozfetDrCoqsAsqpwaGqghsv/media_httpimprovisazn_zdzss.png.scaled1000.png"&gt;&lt;img alt="Media_httpimprovisazn_zdzss" height="265" src="http://posterous.com/getfile/files.posterous.com/alexbowe/aboJwDhddnFznrrbgsifFsjcEdldgsvozdhfyozfetDrCoqsAsqpwaGqghsv/media_httpimprovisazn_zdzss.png.scaled500.png" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve recently taken &lt;a href="http://www.erlang.org/"&gt;Erlang&lt;/a&gt; back up&lt;a title="see footnote" class="footnote"&gt;1&lt;/a&gt;, and I wanted to use this blog post to talk about something cool I learned over the weekend.&lt;/p&gt;
&lt;p&gt;I am implementing a data structure. Reimplementing actually, as it is the structure from my &lt;a href="https://github.com/alexbowe/honours-thesis/downloads"&gt;thesis&lt;/a&gt; - a succinct text index (I will post a blog on this soon).&lt;/p&gt;
&lt;p&gt;Why am I reimplementing it in Erlang? The structure involves many bit-level operations, and I wanted to try out Erlang&amp;rsquo;s &lt;a href="http://www.erlang.org/doc/programming_examples/bit_syntax.html"&gt;primitive Binary type&lt;/a&gt;, which seems to allow efficient splitting and concatenation (which I require). Erlang&amp;rsquo;s approach to concurrency will hopefully help me experiment with distributing the structure, too. As a bonus, the functional approach has lent itself well to this math-centric data structure, and my code is much, MUCH cleaner because of it (I love &lt;a href="http://en.wikipedia.org/wiki/Fold_(higher-order_function)"&gt;reduce&lt;/a&gt; and other &lt;a href="http://en.wikipedia.org/wiki/Higher-order_function"&gt;higher order functions&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;One function I needed to implement was &lt;a href="http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetTable"&gt;popcount&lt;/a&gt; (the number of set bits in a bitvector). For example, if &lt;code&gt;b = 1010&lt;/code&gt; then &lt;code&gt;pop(b)&lt;/code&gt; is 2.&lt;/p&gt;
&lt;p&gt;There are many methods listed on &lt;a href="http://www.valuedlessons.com/2009/01/popcount-in-python-with-benchmarks.html"&gt;this blog&lt;/a&gt;. One of them is the table method, which precomputes all popcounts for 16 bit integers (or any length you have space for):&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;POPCOUNT_TABLE16 = [&lt;span class="i"&gt;0&lt;/span&gt;] * &lt;span class="i"&gt;2&lt;/span&gt;**&lt;span class="i"&gt;16&lt;/span&gt;
&lt;span class="kw"&gt;for&lt;/span&gt; index &lt;span class="kw"&gt;in&lt;/span&gt; &lt;span class="pd"&gt;range&lt;/span&gt;(&lt;span class="pd"&gt;len&lt;/span&gt;(POPCOUNT_TABLE16)):
    &lt;span class="c"&gt;# lowest bit of index + popcount of the remaining bits (already computed)&lt;/span&gt;
    POPCOUNT_TABLE16[index] = (index &amp;amp; &lt;span class="i"&gt;1&lt;/span&gt;) + POPCOUNT_TABLE16[index &amp;gt;&amp;gt; &lt;span class="i"&gt;1&lt;/span&gt;]&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
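&lt;p&gt;Once the table exists, counting the bits of a wider value is a matter of splitting it into 16 bit halves and adding the lookups. The sketch below is one way this might look; &lt;code&gt;popcount32&lt;/code&gt; is a hypothetical name, and the arithmetic operators (modulo, floor division and &lt;code&gt;divmod&lt;/code&gt;) stand in for the equivalent bitwise mask and shift, which give the same result for non-negative integers.&lt;/p&gt;

```python
# Build the 16-bit table: popcount(i) = lowest bit of i + popcount(i // 2).
POPCOUNT_TABLE16 = [0] * 2**16
for index in range(1, len(POPCOUNT_TABLE16)):
    POPCOUNT_TABLE16[index] = index % 2 + POPCOUNT_TABLE16[index // 2]

def popcount32(value):
    """Count set bits in a 32-bit unsigned integer using two table lookups."""
    high, low = divmod(value, 2**16)
    return POPCOUNT_TABLE16[high] + POPCOUNT_TABLE16[low]

print(popcount32(0b1010))
print(popcount32(2**32 - 1))
```

&lt;p&gt;The table costs 65536 entries up front, but each query afterwards is just two list lookups and an addition.&lt;/p&gt;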

&lt;p&gt;I translated this to Erlang:&lt;/p&gt;
&lt;p&gt;&lt;div class="data type-erlang"&gt;
    
      &lt;table class="lines" cellspacing="0" cellpadding="0"&gt;
        &lt;tr&gt;
          &lt;td&gt;
            &lt;pre class="line_numbers"&gt;&lt;/pre&gt;
          &lt;/td&gt;
          &lt;td width="100%"&gt;
            
              
                &lt;div class="highlight"&gt;&lt;pre /&gt;&lt;div class="line" id="LC1"&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="ni"&gt;module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;popcount_table&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC2"&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="ni"&gt;export&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;gen_table&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC3"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="line" id="LC4"&gt;&lt;span class="nf"&gt;gen_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Bits&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;gen_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="ow"&gt;bsl&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Bits&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;}).&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC5"&gt;&lt;span class="nf"&gt;gen_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Stop&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Stop&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC6"&gt;&lt;span 
class="nf"&gt;gen_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Stop&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC7"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;span class="nv"&gt;New&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Value&lt;/span&gt; &lt;span class="ow"&gt;band&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nn"&gt;erlang&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;element&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nv"&gt;Value&lt;/span&gt; &lt;span class="ow"&gt;bsr&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC8"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;span class="nv"&gt;NewTable&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;erlang&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;append_element&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;New&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC9"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;span class="n"&gt;gen_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span 
class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Stop&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;NewTable&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;/div&gt;&lt;/pre&gt;&lt;/div&gt;
              
            
          &lt;/td&gt;
        &lt;/tr&gt;
      &lt;/table&gt;
    
  &lt;/div&gt;&lt;/p&gt;
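&lt;p&gt;For readers less familiar with Erlang, the same recurrence can be sketched in Python. This is only an illustration and not part of the module: the Python &lt;code&gt;gen_table&lt;/code&gt; below is a hypothetical port that returns a list instead of a tuple, and it uses &lt;code&gt;% 2&lt;/code&gt; and &lt;code&gt;// 2&lt;/code&gt; in place of &lt;code&gt;band 1&lt;/code&gt; and &lt;code&gt;bsr 1&lt;/code&gt;.&lt;/p&gt;

```python
def gen_table(bits):
    """Return a list holding the popcount of every value from 0 to 2**bits - 1."""
    table = [0]  # popcount(0) is 0
    for value in range(1, 2 ** bits):
        # Lowest bit of value, plus the already-computed popcount of the rest.
        table.append(value % 2 + table[value // 2])
    return table


print(gen_table(3))  # [0, 1, 1, 2, 1, 2, 2, 3]
```

&lt;p&gt;Each entry only needs an entry that was computed earlier, which is why the Erlang version can build the tuple in a single pass.&lt;/p&gt;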
&lt;p&gt;So now &lt;code&gt;gen_table(Bits)&lt;/code&gt; will generate a tuple for me containing the popcount of every value from 0 to $2^{Bits} - 1$. However, we may gain performance&lt;a title="see footnote" class="footnote"&gt;2&lt;/a&gt; from knowing how big we want the table to be at compile time. If we know the number of bits we want our popcount table to cover, we could type the table directly into the source. But that would cost us flexibility and make our code ugly.&lt;/p&gt;
&lt;p&gt;Enter &lt;a href="https://github.com/esl/parse_trans"&gt;&lt;code&gt;ct_expand&lt;/code&gt;&lt;/a&gt;. We can use &lt;code&gt;ct_expand:term( &amp;lt;code to execute&amp;gt; )&lt;/code&gt; to run &lt;code&gt;gen_table()&lt;/code&gt; for us at compile time! Now we only run &lt;code&gt;gen_table()&lt;/code&gt; whenever we compile the module - after that the table is embedded in the binary.&lt;/p&gt;
&lt;p&gt;First, check out ct_expand (&lt;code&gt;git clone https://github.com/esl/parse_trans.git&lt;/code&gt;), and compile it with &lt;code&gt;make&lt;/code&gt;&lt;a title="see footnote" class="footnote"&gt;3&lt;/a&gt;. Then just move the &lt;code&gt;.beam&lt;/code&gt; files into your project directory. Now we&amp;rsquo;re ready for some metaprogramming :)&lt;/p&gt;
&lt;p&gt;I created a new module for generating our table at compile time:&lt;/p&gt;
&lt;p&gt;&lt;div class="data type-erlang"&gt;
    
      &lt;table class="lines" cellspacing="0" cellpadding="0"&gt;
        &lt;tr&gt;
          &lt;td&gt;
            &lt;pre class="line_numbers"&gt;&lt;span rel="#L1" id="L1"&gt;1&lt;/span&gt;
&lt;span rel="#L2" id="L2"&gt;2&lt;/span&gt;
&lt;span rel="#L3" id="L3"&gt;3&lt;/span&gt;
&lt;span rel="#L4" id="L4"&gt;4&lt;/span&gt;
&lt;span rel="#L5" id="L5"&gt;5&lt;/span&gt;
&lt;span rel="#L6" id="L6"&gt;6&lt;/span&gt;
&lt;span rel="#L7" id="L7"&gt;7&lt;/span&gt;
&lt;span rel="#L8" id="L8"&gt;8&lt;/span&gt;
&lt;span rel="#L9" id="L9"&gt;9&lt;/span&gt;
&lt;span rel="#L10" id="L10"&gt;10&lt;/span&gt;
&lt;span rel="#L11" id="L11"&gt;11&lt;/span&gt;
&lt;/pre&gt;
          &lt;/td&gt;
          &lt;td width="100%"&gt;
            
              
                &lt;div class="highlight"&gt;&lt;pre /&gt;&lt;div class="line" id="LC1"&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="ni"&gt;module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;popcount&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC2"&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="ni"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;parse_transform&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ct_expand&lt;/span&gt;&lt;span class="p"&gt;}).&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC3"&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="ni"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;popcount_table&lt;/span&gt;&lt;span class="p"&gt;}).&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC4"&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="ni"&gt;export&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;popcount16&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;popcount32&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC5"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="line" id="LC6"&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="ni"&gt;define&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;TABLE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;B&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nn"&gt;ct_expand&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;term&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="nn"&gt;popcount_table&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;gen_table&lt;/span&gt;&lt;span 
class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;)).&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC7"&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="ni"&gt;define&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;TABLE16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;ct_expand&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;term&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="nv"&gt;TABLE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;)).&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC8"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="line" id="LC9"&gt;&lt;span class="nf"&gt;popcount16&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;V&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nn"&gt;erlang&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;element&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;V&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="nv"&gt;TABLE16&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC10"&gt;&lt;span class="nf"&gt;popcount32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;V&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;popcount16&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="nv"&gt;V&lt;/span&gt; &lt;span class="ow"&gt;band&lt;/span&gt; &lt;span class="mi"&gt;16#ffff&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;&lt;/div&gt;&lt;div 
class="line" id="LC11"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;popcount16&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;V&lt;/span&gt; &lt;span class="ow"&gt;bsr&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;band&lt;/span&gt; &lt;span class="mi"&gt;16#ffff&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;/div&gt;&lt;/pre&gt;&lt;/div&gt;
              
            
          &lt;/td&gt;
        &lt;/tr&gt;
      &lt;/table&gt;
    
  &lt;/div&gt;&lt;/p&gt;
&lt;p&gt;The table generator must live in its own module because &lt;code&gt;ct_expand:term()&lt;/code&gt; requires its parameter to be something it knows about at compile time. This could be an inline fun (which I tried, but it wasn&amp;rsquo;t pretty), or it can be something you compile before &lt;code&gt;ct_expand:term()&lt;/code&gt; is executed (see line 3). Note that line 2 tells the compiler to apply the &lt;code&gt;ct_expand&lt;/code&gt; parse transform to our module.&lt;/p&gt;
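&lt;p&gt;The lookup side can be sketched in Python as well. This is a rough, hypothetical port rather than a translation of the module: the table is built when the script starts (Python has no equivalent of &lt;code&gt;ct_expand&lt;/code&gt; here), lists are indexed from 0 so the &lt;code&gt;V+1&lt;/code&gt; offset disappears, and &lt;code&gt;%&lt;/code&gt; and &lt;code&gt;//&lt;/code&gt; stand in for &lt;code&gt;band&lt;/code&gt; and &lt;code&gt;bsr&lt;/code&gt;.&lt;/p&gt;

```python
# Build the 16-bit table once, up front (the Erlang version does this at compile time).
TABLE16 = [0]
for value in range(1, 2 ** 16):
    TABLE16.append(value % 2 + TABLE16[value // 2])


def popcount16(v):
    """Look up the popcount of a 16-bit value."""
    return TABLE16[v]


def popcount32(v):
    """Split a 32-bit value into two 16-bit halves and add their popcounts."""
    low = v % 2 ** 16                # same as V band 16#ffff
    high = (v // 2 ** 16) % 2 ** 16  # same as (V bsr 16) band 16#ffff
    return popcount16(low) + popcount16(high)


print(popcount16(0b101011))    # 4
print(popcount32(0xFFFF0001))  # 17
```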
&lt;p&gt;Let&amp;rsquo;s test it out:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;1&amp;gt; c(popcount).
{ok,popcount}
2&amp;gt; popcount:popcount16(2#1010).
2
3&amp;gt; popcount:popcount16(2#101011).
4&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Nice! Thanks to &lt;a href="http://ulf.wiger.net/weblog/"&gt;Ulf Wiger&lt;/a&gt; for making that so easy :)&lt;/p&gt;
&lt;p&gt;( Image stolen from&amp;nbsp;&lt;a href="http://improvisazn.wordpress.com/2010/02/22/meta/"&gt;http://improvisazn.wordpress.com/2010/02/22/meta/&lt;/a&gt; )&lt;/p&gt;
&lt;div class="footnotes"&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;My only other encounter with it was when I wrote an essay on the rationale of Erlang, which is something I will convert to blog format and post later if anyone is interested.&lt;a title="return to article" class="reversefootnote"&gt;&amp;nbsp;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The compiler may be able to apply further optimisations, so table access might be faster itself, but the main benefit comes from not having to generate the table while your code is running. Note that I haven&amp;rsquo;t experimented with it. I will try to run some tests this week and update the blog post.&lt;a title="return to article" class="reversefootnote"&gt;&amp;nbsp;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If rebar gives you an error about crypto being undefined, you will need the &lt;code&gt;erlang-crypto&lt;/code&gt; package. On Mac OS X: &lt;code&gt;sudo port install erlang +ssl&lt;/code&gt;. (I had this issue)&lt;a title="return to article" class="reversefootnote"&gt;&amp;nbsp;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;&lt;/div&gt;
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/48392639"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/48392639#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/png" height="594" width="1121" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/aboJwDhddnFznrrbgsifFsjcEdldgsvozdhfyozfetDrCoqsAsqpwaGqghsv/media_httpimprovisazn_zdzss.png">
        <media:thumbnail height="265" width="500" url="http://getfile0.posterous.com/getfile/files.posterous.com/alexbowe/aboJwDhddnFznrrbgsifFsjcEdldgsvozdhfyozfetDrCoqsAsqpwaGqghsv/media_httpimprovisazn_zdzss.png.scaled500.png" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/48392639</feedburner:origLink></item>
    <item>
      <pubDate>Sat, 19 Mar 2011 21:44:00 -0700</pubDate>
      <title>Au Naturale</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/pMi9XUz-Szo/au-naturale</link>
      <guid isPermaLink="false">http://www.alexbowe.com/au-naturale</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpdatawhicdnc_apuda" height="400" src="http://posterous.com/getfile/files.posterous.com/alexbowe/boqcnCxevdvmHwBwnrCACEHibzxAgyrhufhmbxABDxxfznFytHbEtdorivwm/media_httpdatawhicdnc_apuDA.jpg.scaled500.jpg" width="400" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;This blog post is an introduction to building a key phrase extractor in Python, using the &lt;a href="http://www.nltk.org/"&gt;Natural Language Toolkit (NLTK)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But how will a search engine know what it is about? How will this document be indexed correctly?
A human can read it and tell that it is about programming, but no search engine company has the money to
pay thousands of people to classify the entire Internet for them. Instead they must reasonably predict
what a human may decide to be the key points of a document. And they must automate this.&lt;/p&gt;

&lt;p&gt;Remember how proper sentences need to be structured with a &lt;a href="http://hubpages.com/hub/Grammar_Mishaps__Building_a_Sentence"&gt;subject and a predicate&lt;/a&gt;?
A subject could be a noun, or an adjective followed by a noun, or a pronoun&amp;hellip; A predicate may be or include a verb&amp;hellip;
We can take a similar approach by defining our key phrases in terms of what types of words (or parts-of-speech) they are, and the pattern in which they occur.&lt;/p&gt;

&lt;p&gt;But how do we know what words are nouns or verbs in an automated fashion?&lt;/p&gt;

&lt;p&gt;Throughout this post I will use an excerpt from &lt;a href="http://www.amazon.com/gp/product/0061673730/alexbowecom-20"&gt;Zen and the Art of Motorcycle Maintenance&lt;/a&gt; as an example:&lt;/p&gt;

&lt;blockquote class="posterous_medium_quote"&gt;&lt;p&gt;The Buddha, the Godhead, resides quite as comfortably in the circuits of a digital
computer or the gears of a cycle transmission as he does at the top of a mountain
or in the petals of a flower. To think otherwise is to demean the Buddha&amp;hellip;which is
to demean oneself.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Before proceeding, make a (mental) note of the key phrases here. What is the document about?&lt;/p&gt;

&lt;h2&gt;Tokenizing&lt;/h2&gt;

&lt;p&gt;In a program, text is represented as a string of characters. How can we go about moving &lt;em&gt;one&lt;/em&gt;
level of abstraction up, to the level of words, or &lt;em&gt;tokens&lt;/em&gt;? To tokenize a sentence you may
be tempted to use Python&amp;rsquo;s &lt;code&gt;.split()&lt;/code&gt; method, but this means you will need to code additional rules
to remove hyphens, newlines and punctuation when appropriate.&lt;/p&gt;
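&lt;p&gt;As a quick illustration of the problem (the sample string is just the start of the excerpt above), splitting on whitespace leaves the punctuation glued to the words:&lt;/p&gt;

```python
text = "The Buddha, the Godhead, resides"

# str.split() only splits on whitespace, so the commas stay attached.
print(text.split())  # ['The', 'Buddha,', 'the', 'Godhead,', 'resides']
```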

&lt;p&gt;Thankfully the &lt;a href="http://www.nltk.org/"&gt;Natural Language Toolkit (NLTK)&lt;/a&gt; for Python provides a regular expression
tokenizer. There is an example of it (including how it fares against Python&amp;rsquo;s regular expression tokenization method)
in &lt;a href="http://nltk.googlecode.com/svn/trunk/doc/book/ch03.html"&gt;Chapter 3&lt;/a&gt; of the &lt;a href="http://www.nltk.org/book"&gt;NLTK book&lt;/a&gt;.
The verbose flag &lt;code&gt;(?x)&lt;/code&gt; also lets you put comments inside the regex:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="c"&gt;#Word Tokenization Regex adapted from NLTK book&lt;/span&gt;
sentence_re = &lt;span class="s"&gt;&lt;span class="mod"&gt;r&lt;/span&gt;&lt;span class="dl"&gt;'''&lt;/span&gt;&lt;span class="k"&gt;(?x)      # set flag to allow comments in regexps&lt;/span&gt;
&lt;span class="k"&gt;# groups are non-capturing, (?:...), so that whole tokens are returned&lt;/span&gt;
&lt;span class="k"&gt;# abbreviations, e.g. U.S.A. (with optional last period)&lt;/span&gt;
&lt;span class="k"&gt;(?:[A-Z])(?:&lt;/span&gt;&lt;span class="k"&gt;\.&lt;/span&gt;&lt;span class="k"&gt;[A-Z])+&lt;/span&gt;&lt;span class="k"&gt;\.&lt;/span&gt;&lt;span class="k"&gt;?&lt;/span&gt;
&lt;span class="k"&gt;# words with optional internal hyphens&lt;/span&gt;
&lt;span class="k"&gt;| &lt;/span&gt;&lt;span class="k"&gt;\w&lt;/span&gt;&lt;span class="k"&gt;+(?:-&lt;/span&gt;&lt;span class="k"&gt;\w&lt;/span&gt;&lt;span class="k"&gt;+)*&lt;/span&gt;
&lt;span class="k"&gt;# currency and percentages, e.g. $12.40, 82%&lt;/span&gt;
&lt;span class="k"&gt;| &lt;/span&gt;&lt;span class="k"&gt;\$&lt;/span&gt;&lt;span class="k"&gt;?&lt;/span&gt;&lt;span class="k"&gt;\d&lt;/span&gt;&lt;span class="k"&gt;+(?:&lt;/span&gt;&lt;span class="k"&gt;\.&lt;/span&gt;&lt;span class="k"&gt;\d&lt;/span&gt;&lt;span class="k"&gt;+)?%?&lt;/span&gt;
&lt;span class="k"&gt;# ellipsis&lt;/span&gt;
&lt;span class="k"&gt;| &lt;/span&gt;&lt;span class="k"&gt;\.&lt;/span&gt;&lt;span class="k"&gt;\.&lt;/span&gt;&lt;span class="k"&gt;\.&lt;/span&gt;
&lt;span class="k"&gt;# these are separate tokens (the hyphen goes last so it is literal, not a range)&lt;/span&gt;
&lt;span class="k"&gt;| [][.,;&amp;quot;'?():_`-]&lt;/span&gt;
&lt;span class="dl"&gt;'''&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Once we have a regex describing what our tokens should look like, we pass it to the tokenizer like so:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;import&lt;/span&gt; &lt;span class="ic"&gt;nltk&lt;/span&gt;
&lt;span class="c"&gt;#doc is a string containing our document&lt;/span&gt;
toks = nltk.regexp_tokenize(doc, sentence_re)

&amp;gt;&amp;gt;&amp;gt; toks
[&lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;The&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;Buddha&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;,&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;the&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;Godhead&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;,&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;resides&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;, ...&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
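&lt;p&gt;If NLTK isn&amp;rsquo;t at hand, a similar pattern can be tried out with the standard library. The sketch below is a stand-in, not NLTK&amp;rsquo;s implementation: it uses &lt;code&gt;re.findall&lt;/code&gt; in place of &lt;code&gt;nltk.regexp_tokenize&lt;/code&gt;, writes every group as non-capturing so that whole tokens come back, and the sample sentence is a shortened version of the excerpt.&lt;/p&gt;

```python
import re

# Token pattern with non-capturing groups, so re.findall returns whole tokens
# rather than the contents of the groups.
sentence_re = r'''(?x)          # verbose mode: whitespace and comments are ignored
      (?:[A-Z])(?:\.[A-Z])+\.?  # abbreviations, e.g. U.S.A.
    | \w+(?:-\w+)*              # words with optional internal hyphens
    | \$?\d+(?:\.\d+)?%?        # currency and percentages
    | \.\.\.                    # ellipsis
    | [\]\[.,;"'?():_`-]        # punctuation as separate tokens
'''

doc = "The Buddha, the Godhead, resides quite as comfortably in the circuits."
toks = re.findall(sentence_re, doc)
print(toks[:7])  # ['The', 'Buddha', ',', 'the', 'Godhead', ',', 'resides']
```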


&lt;h2&gt;Tagging&lt;/h2&gt;

&lt;p&gt;The next step is tagging. This uses statistical data to apply a part-of-speech tag to each token, e.g. JJ (adjective), NN (noun), and so on.
Since it is statistical, we need to either train our model or use a pre-trained model. NLTK comes with a pretty good one for
general use, but if you are looking at a certain kind of document you may want to train your own tagger, since it may greatly
affect the accuracy (think about very vocabulary-dense fields such as biology).&lt;/p&gt;

&lt;p&gt;Note that to train your own tagger you will need either a pre-tagged corpus (NLTK comes with some) or a &lt;em&gt;bootstrapped&lt;/em&gt; method (which can take a long time). Check out &lt;a href="http://streamhacker.com/2010/04/12/pos-tag-nltk-brill-classifier/"&gt;Streamhacker&lt;/a&gt; and &lt;a href="http://nltk.googlecode.com/svn/trunk/doc/book/ch05.html"&gt;Chapter 5 of the NLTK book&lt;/a&gt; for a good discussion on training your own (and how to test it empirically).&lt;/p&gt;

&lt;p&gt;For the sake of this introduction, we will use the default one. The result is a list of token-tag pairs:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; postoks = nltk.tag.pos_tag(toks)
&amp;gt;&amp;gt;&amp;gt; postoks
[(&lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;The&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;DT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;), (&lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;Buddha&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;NNP&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;), (&lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;,&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;,&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;), (&lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;the&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;, &lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;DT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;), ...&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;h2&gt;Chunking&lt;/h2&gt;

&lt;p&gt;Now we can use the part-of-speech tags to lift out noun phrases (NP) based on patterns of tags.&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/alexbowe/pxFukEBGmryefepHqchehjJBcppmtDBbbjqAjgjClgHrFpibnuriAjCFbmrs/media_httpnltkgooglec_Amtzp.png.scaled1000.png"&gt;&lt;img alt="Media_httpnltkgooglec_amtzp" height="103" src="http://posterous.com/getfile/files.posterous.com/alexbowe/pxFukEBGmryefepHqchehjJBcppmtDBbbjqAjgjClgHrFpibnuriAjCFbmrs/media_httpnltkgooglec_Amtzp.png.scaled500.png" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: All diagrams have been stolen from the NLTK book (which is available under the Creative Commons Attribution Noncommercial No Derivative Works 3.0 US License).&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This is called chunking. We can define the form of our chunks using a regular expression, and build a chunker from that:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="c"&gt;# This grammar is described in the paper by S. N. Kim,&lt;/span&gt;
&lt;span class="c"&gt;# T. Baldwin, and M.-Y. Kan.&lt;/span&gt;
&lt;span class="c"&gt;# Evaluating n-gram based evaluation metrics for automatic&lt;/span&gt;
&lt;span class="c"&gt;# keyphrase extraction.&lt;/span&gt;
&lt;span class="c"&gt;# Technical report, University of Melbourne, Melbourne 2010.&lt;/span&gt;
grammar = &lt;span class="s"&gt;&lt;span class="mod"&gt;r&lt;/span&gt;&lt;span class="dl"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class="k"&gt;    NBAR:&lt;/span&gt;
&lt;span class="k"&gt;        # Nouns and Adjectives, terminated with Nouns&lt;/span&gt;
&lt;span class="k"&gt;        {&amp;lt;NN.*|JJ&amp;gt;*&amp;lt;NN.*&amp;gt;}&lt;/span&gt;

&lt;span class="k"&gt;    NP:&lt;/span&gt;
&lt;span class="k"&gt;        {&amp;lt;NBAR&amp;gt;}&lt;/span&gt;
&lt;span class="k"&gt;        # Above, connected with in/of/etc...&lt;/span&gt;
&lt;span class="k"&gt;        {&amp;lt;NBAR&amp;gt;&amp;lt;IN&amp;gt;&amp;lt;NBAR&amp;gt;}&lt;/span&gt;
&lt;span class="dl"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;/span&gt;

chunker = nltk.RegexpParser(grammar)
tree = chunker.parse(postoks)&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
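&lt;p&gt;To get a feel for what the &lt;code&gt;NBAR&lt;/code&gt; rule matches, here is a rough plain-Python sketch. It is not how NLTK&amp;rsquo;s &lt;code&gt;RegexpParser&lt;/code&gt; works internally and it ignores the &lt;code&gt;NP&lt;/code&gt; rule; the helper name &lt;code&gt;noun_chunks&lt;/code&gt; and the hand-written tags are assumptions made for the example.&lt;/p&gt;

```python
def noun_chunks(tagged):
    """Group runs of adjectives and nouns that end in a noun (the NBAR rule)."""
    chunks, run = [], []
    # The sentinel pair at the end flushes the final run.
    for word, tag in list(tagged) + [("", "END")]:
        if tag == "JJ" or tag.startswith("NN"):
            run.append((word, tag))
            continue
        # The run has ended: drop trailing adjectives so the chunk ends in a noun.
        while run and not run[-1][1].startswith("NN"):
            run.pop()
        if run:
            chunks.append([w for w, _ in run])
        run = []
    return chunks


# Hand-tagged sample, not real tagger output.
tagged = [("The", "DT"), ("Buddha", "NNP"), (",", ","), ("the", "DT"),
          ("Godhead", "NNP"), (",", ","), ("resides", "VBZ"), ("in", "IN"),
          ("the", "DT"), ("circuits", "NNS"), ("of", "IN"), ("a", "DT"),
          ("digital", "JJ"), ("computer", "NN")]
print(noun_chunks(tagged))
# [['Buddha'], ['Godhead'], ['circuits'], ['digital', 'computer']]
```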


&lt;p&gt;It is also possible to describe a Context Free Grammar (CFG) to do this, and help deal with ambiguity &amp;ndash; information can be found in &lt;a href="http://nltk.googlecode.com/svn/trunk/doc/book/ch08.html"&gt;Chapter 8 of the NLTK book&lt;/a&gt;. Chunk regexes can be much more complicated if needed, and support
&lt;em&gt;chinking&lt;/em&gt;, which allows you to specify patterns in terms of &lt;em&gt;what you don&amp;rsquo;t want&lt;/em&gt; &amp;ndash; see &lt;a href="http://nltk.googlecode.com/svn/trunk/doc/book/ch07.html"&gt;Chapter 7 of the NLTK book&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The output of chunking is a tree, where the noun phrase nodes are located just one level before the leaves, which are the words that constitute the noun phrase:&lt;/p&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/alexbowe/DfvIBCpfjwkfbDyFcrsuuHyslvAketJIxBJqBvwfAnbkCvcckDwAdeJrHuxg/media_httpnltkgooglec_HwqkC.png.scaled1000.png"&gt;&lt;img alt="Media_httpnltkgooglec_hwqkc" height="180" src="http://posterous.com/getfile/files.posterous.com/alexbowe/DfvIBCpfjwkfbDyFcrsuuHyslvAketJIxBJqBvwfAnbkCvcckDwAdeJrHuxg/media_httpnltkgooglec_HwqkC.png.scaled500.png" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;To access the leaves, we can use this code:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;leaves&lt;/span&gt;(tree):
    &lt;span class="s"&gt;&lt;span class="dl"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="k"&gt;Finds NP (nounphrase) leaf nodes of a chunk tree.&lt;/span&gt;&lt;span class="dl"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;/span&gt;
    &lt;span class="kw"&gt;for&lt;/span&gt; subtree &lt;span class="kw"&gt;in&lt;/span&gt; tree.subtrees(filter = &lt;span class="kw"&gt;lambda&lt;/span&gt; t: t.node==&lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;NP&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;):
        &lt;span class="kw"&gt;yield&lt;/span&gt; subtree.leaves()&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;h2&gt;Walking the tree and Normalisation&lt;/h2&gt;

&lt;p&gt;We can now walk the tree to get the terms, applying normalisation if we want to:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;get_terms&lt;/span&gt;(tree):
    &lt;span class="kw"&gt;for&lt;/span&gt; leaf &lt;span class="kw"&gt;in&lt;/span&gt; leaves(tree):
        term = [ normalise(word) &lt;span class="kw"&gt;for&lt;/span&gt; word, tag &lt;span class="kw"&gt;in&lt;/span&gt; leaf
            &lt;span class="kw"&gt;if&lt;/span&gt; acceptable_word(word) ]
        &lt;span class="kw"&gt;yield&lt;/span&gt; term&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Normalisation may consist of lower-casing words, removing stop-words which appear in many documents (e.g. if, the, a&amp;hellip;),
stemming (e.g. cars &amp;ndash;&gt; car), and lemmatizing (e.g. drove, drives, driven &amp;ndash;&gt; drive). We normalise so that at later stages we can treat similar key phrases as the same; &lt;code&gt;'the man drove the truck'&lt;/code&gt; should be comparable to &lt;code&gt;'The man drives the truck'&lt;/code&gt;. This will allow us to better rank our key phrases :)&lt;/p&gt;
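&lt;p&gt;A toy sketch of the idea, using only plain Python: the tiny stop-word set and lemma lookup below are made up for this example and stand in for the real NLTK resources.&lt;/p&gt;

```python
STOP_WORDS = {"the", "a", "if"}                 # toy stop-word list
LEMMAS = {"drove": "drive", "drives": "drive"}  # toy lemma lookup


def normalise_phrase(phrase):
    """Lower-case the words, drop stop-words, and map words to their lemma."""
    words = [word.lower() for word in phrase.split()]
    return [LEMMAS.get(word, word) for word in words if word not in STOP_WORDS]


first = normalise_phrase("the man drove the truck")
second = normalise_phrase("The man drives the truck")
print(first)            # ['man', 'drive', 'truck']
print(first == second)  # True
```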

&lt;p&gt;Functions for normalising and checking for stop-words are described below:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="kw"&gt;from&lt;/span&gt; &lt;span class="ic"&gt;nltk.corpus&lt;/span&gt; &lt;span class="kw"&gt;import&lt;/span&gt; &lt;span class="ic"&gt;stopwords&lt;/span&gt;

lemmatizer = nltk.WordNetLemmatizer()
stemmer = nltk.stem.porter.PorterStemmer()
&lt;span class="c"&gt;# Load the stop-word list once, as a set for fast membership tests&lt;/span&gt;
stop_words = &lt;span class="pd"&gt;set&lt;/span&gt;(stopwords.words(&lt;span class="s"&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="k"&gt;english&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;/span&gt;))

&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;normalise&lt;/span&gt;(word):
    &lt;span class="s"&gt;&lt;span class="dl"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="k"&gt;Lower-cases, stems and lemmatizes a word.&lt;/span&gt;&lt;span class="dl"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;/span&gt;
    word = word.lower()
    word = stemmer.stem(word)
    word = lemmatizer.lemmatize(word)
    &lt;span class="kw"&gt;return&lt;/span&gt; word

&lt;span class="kw"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;acceptable_word&lt;/span&gt;(word):
    &lt;span class="s"&gt;&lt;span class="dl"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="k"&gt;Checks conditions for acceptable word: length, stopword.&lt;/span&gt;&lt;span class="dl"&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;/span&gt;
    accepted = &lt;span class="pd"&gt;bool&lt;/span&gt;(&lt;span class="i"&gt;2&lt;/span&gt; &amp;lt;= &lt;span class="pd"&gt;len&lt;/span&gt;(word) &amp;lt;= &lt;span class="i"&gt;40&lt;/span&gt;
        &lt;span class="kw"&gt;and&lt;/span&gt; word.lower() &lt;span class="kw"&gt;not&lt;/span&gt; &lt;span class="kw"&gt;in&lt;/span&gt; stop_words)
    &lt;span class="kw"&gt;return&lt;/span&gt; accepted&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;And the result is:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; terms = get_terms(tree)
&amp;gt;&amp;gt;&amp;gt; &lt;span class="kw"&gt;for&lt;/span&gt; term &lt;span class="kw"&gt;in&lt;/span&gt; terms:
...    &lt;span class="kw"&gt;for&lt;/span&gt; word &lt;span class="kw"&gt;in&lt;/span&gt; term:
...        &lt;span class="kw"&gt;print&lt;/span&gt; word,
...    &lt;span class="kw"&gt;print&lt;/span&gt;
buddha
godhead
circuit
digit comput
gear
cycl transmiss
mountain
petal
flower
buddha
demean oneself&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Are these similar to the key phrases you chose? There are lots of areas above that can be tweaked. Let me know what you come up with :) (the code can be found in &lt;a href="https://gist.github.com/879414"&gt;this gist&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;In future posts I will talk about how to rank key phrases. I will also discuss how to scale this to
process many documents at once using MapReduce.&lt;/p&gt;

&lt;p&gt;In the meantime, check out the &lt;a href="http://text-processing.com/demo/"&gt;demos&lt;/a&gt; on &lt;a href="http://streamhacker.com/"&gt;Streamhacker&lt;/a&gt;, solve the problems in the &lt;a href="http://www.nltk.org/"&gt;NLTK book&lt;/a&gt;, or read the &lt;a href="http://www.amazon.com/gp/product/1849513600/alexbowecom-20"&gt;NLTK Cookbook&lt;/a&gt; :)&lt;/p&gt;
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/au-naturale"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/au-naturale#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/jpeg" height="400" width="400" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/boqcnCxevdvmHwBwnrCACEHibzxAgyrhufhmbxABDxxfznFytHbEtdorivwm/media_httpdatawhicdnc_apuDA.jpg">
        <media:thumbnail height="400" width="400" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/boqcnCxevdvmHwBwnrCACEHibzxAgyrhufhmbxABDxxfznFytHbEtdorivwm/media_httpdatawhicdnc_apuDA.jpg.scaled500.jpg" />
      </media:content>
      <media:content type="image/png" height="405" width="1975" url="http://getfile3.posterous.com/getfile/files.posterous.com/alexbowe/pxFukEBGmryefepHqchehjJBcppmtDBbbjqAjgjClgHrFpibnuriAjCFbmrs/media_httpnltkgooglec_Amtzp.png">
        <media:thumbnail height="103" width="500" url="http://getfile5.posterous.com/getfile/files.posterous.com/alexbowe/pxFukEBGmryefepHqchehjJBcppmtDBbbjqAjgjClgHrFpibnuriAjCFbmrs/media_httpnltkgooglec_Amtzp.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="696" width="1934" url="http://getfile5.posterous.com/getfile/files.posterous.com/alexbowe/DfvIBCpfjwkfbDyFcrsuuHyslvAketJIxBJqBvwfAnbkCvcckDwAdeJrHuxg/media_httpnltkgooglec_HwqkC.png">
        <media:thumbnail height="180" width="500" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/DfvIBCpfjwkfbDyFcrsuuHyslvAketJIxBJqBvwfAnbkCvcckDwAdeJrHuxg/media_httpnltkgooglec_HwqkC.png.scaled500.png" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/au-naturale</feedburner:origLink></item>
    <item>
      <pubDate>Sat, 05 Mar 2011 05:44:00 -0800</pubDate>
      <title>Avian Autopilot</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/4DxjX_Cq5_Q/avian-autopilot</link>
      <guid isPermaLink="false">http://www.alexbowe.com/avian-autopilot</guid>
      <description>&lt;p&gt;
	&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_http2bpblogspot_imjvj" height="360" src="http://posterous.com/getfile/files.posterous.com/alexbowe/ineDcuDrkrEjllDdDamFqwmtFGmlemcirvrDrzcgFhvEGucFjEssmIzFEsGh/media_http2bpblogspot_imjvJ.jpg.scaled500.jpg" width="360" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;In this post I will detail how to remove yourself from your social life. Well, just the link sharing part on Twitter.&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;At an event held last September, &lt;a href="http://techcrunch.com/2010/09/14/twitter-seeing-90-million-tweets-per-day/"&gt;Twitter released some usage stats&lt;/a&gt;, including this:&amp;nbsp;around &lt;strong&gt;25%&lt;/strong&gt; of Tweets contain links.&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;One must wonder if Twitter spammers have skewed this data. How would my own account compare against this? Using&amp;nbsp;&lt;a href="http://snapbird.org/"&gt;SnapBird&lt;/a&gt;&amp;nbsp;to search my tweets for&amp;nbsp;&lt;code&gt;http&lt;/code&gt;, I have calculated that about &lt;strong&gt;34%&lt;/strong&gt; of my activity on Twitter is link sharing...&lt;/p&gt;
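&lt;p style="font-family: Times; font-size: medium;"&gt;If you would rather compute this yourself than count search results, the arithmetic is just the number of tweets containing &lt;code&gt;http&lt;/code&gt; divided by the total. A minimal sketch, assuming you already have your tweets as a list of strings (the &lt;code&gt;link_share&lt;/code&gt; name and the sample tweets are made up for illustration):&lt;/p&gt;

```python
def link_share(tweets):
    """Fraction of tweets whose text contains 'http' (hypothetical helper)."""
    if not tweets:
        return 0.0
    with_links = sum(1 for text in tweets if "http" in text)
    return with_links / len(tweets)


# Hypothetical sample; not the author's real tweets.
sample = [
    "Reading about suffix arrays http://example.com/sa",
    "Coffee time",
    "Nice NLTK demo http://example.com/nltk",
    "Anyone tried Pipes?",
]
print("{:.0%}".format(link_share(sample)))
```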
&lt;p&gt;&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;I don't think&amp;nbsp;&lt;a href="http://www.twitter.com/alexbowe"&gt;my tweets&lt;/a&gt;&amp;nbsp;are that spammy, but please correct me if I'm wrong. Most of the enjoyment I get from Twitter is from&amp;nbsp;&lt;em&gt;useful and interesting&lt;/em&gt;&amp;nbsp;link exchange, and having questions answered by people who write popular programming books and blogs.&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;On to the focus of this blog post, which will either reduce the friction of doing what may be a significant chunk of your Twitter workflow, or may just serve to drive these stats up...&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;Once upon a time in&amp;nbsp;&lt;em&gt;an attempt&lt;/em&gt;&amp;nbsp;at clever laziness, I set my Delicious bookmarks to publish a weekly blog post aggregating the new links I had added. The trouble is that people don't go to a blog to see a list of links plastered there by a cron job.&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;But what if you could take this idea and deliver the links in limited chunks (i.e. one link at a time, with a maximum of three per thirty minutes) on a platform where linking is accepted? If it is any indicator, none of my followers complained. Just make sure you keep the human touch by manually Tweeting thoughts, engaging people with questions, and the occasional Retweet.&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;The idea is simple: gather your&amp;nbsp;&lt;a href="http://www.instapaper.com/"&gt;Instapaper&lt;/a&gt;&amp;nbsp;and&amp;nbsp;&lt;a href="http://reader.google.com/"&gt;Google Reader&lt;/a&gt;&amp;nbsp;starred items into an RSS feed, and set it to post new items to your Twitter (or Facebook).&lt;/p&gt;
&lt;h2 style="font-family: Times; font-size: medium;"&gt;Instapaper&lt;/h2&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;Log in to&amp;nbsp;&lt;a href="http://www.instapaper.com/"&gt;Instapaper&lt;/a&gt;&amp;nbsp;and go to your&amp;nbsp;&lt;strong&gt;Starred&lt;/strong&gt;&amp;nbsp;folder,&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Insta-starred" height="37" src="http://posterous.com/getfile/files.posterous.com/alexbowe/Vn6gHavLy0wm5PjlKk7foGEJeW003P0KPjUzgCw70VIeEINTQgmxZpkAx55b/insta-starred.png" width="362" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;Right click the RSS link and copy the address into a text file (for later use).&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Insta-rss" height="78" src="http://posterous.com/getfile/files.posterous.com/alexbowe/3RuLix87B1hlQSnjHwyiqPHitQDgcNQVgXSPkOm2SRWZUk7SjXOJQhhnZCbd/insta-rss.png" width="201" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;h2 style="font-family: Times; font-size: medium;"&gt;Reader&lt;/h2&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;Now go to your&amp;nbsp;&lt;a href="http://reader.google.com/"&gt;Google Reader&lt;/a&gt;&amp;nbsp;account and open the Google Reader settings,&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Reader-settings" height="61" src="http://posterous.com/getfile/files.posterous.com/alexbowe/zYwfstUkFYqHrrSNPYXP43U2xUSBpbodKWaVvxWUJVV8wWaF1Lfln9u52Uoq/reader-settings.png" width="260" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;Open the&amp;nbsp;&lt;em&gt;Folders and Tags&lt;/em&gt;&amp;nbsp;tab, and select&amp;nbsp;&lt;em&gt;Your Starred Items&lt;/em&gt;,&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Reader-selectstarred" height="25" src="http://posterous.com/getfile/files.posterous.com/alexbowe/g8kK01tCEjQhmwoEUWrEyKT2cv0A1Uyeat5AEDPvl1AOfleF19zdrGE2Gg8x/reader-selectstarred.png" width="158" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;Change the sharing settings to&amp;nbsp;&lt;em&gt;Public&lt;/em&gt;,&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;span style="font-family: Arial, Helvetica, sans-serif; font-size: 13px;"&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Reader-sharing" height="66" src="http://posterous.com/getfile/files.posterous.com/alexbowe/XR1Q5sMS7mI85dF6aq9yYoDj8q58N6SDbGX3RU72FfldImIjctz2kg2Fxxul/reader-sharing.png" width="144" /&gt;
&lt;/div&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;span style="font-family: Arial, Helvetica, sans-serif; font-size: 13px;"&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/alexbowe/8YOt2orSugoNRiyZrMCASc5FxOcy9iJizJPvm2tqCbtCqSME0n8VjoGFGM2K/reader-starredrow.png"&gt;&lt;img alt="Reader-starredrow" height="44" src="http://posterous.com/getfile/files.posterous.com/alexbowe/eldewIIFlkJ8h5iEyvM9kBrxViaN9OGPFrk5Q5DULw3MV00aGYxjsvTTgUF3/reader-starredrow.png.scaled.500.jpg" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;Click&amp;nbsp;&lt;em&gt;View Public Page&lt;/em&gt;&amp;nbsp;and right click the Atom feed link and copy the address into your text file.&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Reader-atom" height="78" src="http://posterous.com/getfile/files.posterous.com/alexbowe/FkAapprYqSUPsFf7znwS6eMHrZjghgEdotE9R2FV0xl0RVPrplvOEc92xJAS/reader-atom.png" width="229" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;h2 style="font-family: Times; font-size: medium;"&gt;Pipes&lt;/h2&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;With&amp;nbsp;&lt;a href="http://pipes.yahoo.com/"&gt;Yahoo Pipes&lt;/a&gt;&amp;nbsp;we create a&amp;nbsp;&lt;em&gt;master feed&lt;/em&gt;. Sign up there and create a new pipe. Add a&amp;nbsp;&lt;em&gt;Fetch Feed&lt;/em&gt;&amp;nbsp;module to it,&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Pipes-fetchempty" height="115" src="http://posterous.com/getfile/files.posterous.com/alexbowe/z4Rop0sKIMJyUw108GcR1iPVuWZXbiJejBAzKFOJUXCG5ecdP5yAVRcSEVEd/pipes-fetchempty.png" width="341" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;Fill in the feed addresses that you copied down before, and connect the Fetch Feed module to the&amp;nbsp;&lt;em&gt;Pipe Output&lt;/em&gt;,&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Pipes-complete" height="271" src="http://posterous.com/getfile/files.posterous.com/alexbowe/1YMNpFE2WM9AJxGnTOozbtyI0shOx3fNEJtpVrrFiPe3tKYfaeIPxB2gyzgY/pipes-complete.png" width="329" /&gt;
&lt;/div&gt;
&lt;/p&gt;
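&lt;p style="font-family: Times; font-size: medium;"&gt;Conceptually, the master feed is just the union of the feeds you entered. The sketch below shows that idea on items that have already been parsed into dictionaries. The &lt;code&gt;merge_feeds&lt;/code&gt; name, the item fields, and the choice to drop duplicate links and sort newest first are all my own assumptions for illustration, not a description of what Pipes does internally.&lt;/p&gt;

```python
def merge_feeds(*feeds):
    """Union several feeds, dropping repeated links, newest item first."""
    seen = set()
    merged = []
    for feed in feeds:
        for item in feed:
            if item["link"] not in seen:
                seen.add(item["link"])
                merged.append(item)
    # ISO 8601 date strings sort correctly as plain text.
    return sorted(merged, key=lambda item: item["date"], reverse=True)


# Hypothetical, already-parsed items standing in for the two feeds.
instapaper = [
    {"link": "http://example.com/a", "date": "2011-03-01"},
    {"link": "http://example.com/b", "date": "2011-03-04"},
]
reader = [
    {"link": "http://example.com/b", "date": "2011-03-04"},
    {"link": "http://example.com/c", "date": "2011-03-02"},
]
master = merge_feeds(instapaper, reader)
print([item["link"] for item in master])
```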
&lt;p style="font-family: Times; font-size: medium;"&gt;Save it, right click&amp;nbsp;&lt;em&gt;Get as RSS&lt;/em&gt;, and copy that address down too :) (&lt;strong&gt;Note&lt;/strong&gt;: not the&amp;nbsp;&lt;em&gt;Pipe Web Address&lt;/em&gt;)&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/alexbowe/HPbpKalmuAiBbAHpu5zG665bQorQL3nDmxPSiVzARrmoPWL2ljc6qqsdRcGx/pipes-address.png"&gt;&lt;img alt="Pipes-address" height="208" src="http://posterous.com/getfile/files.posterous.com/alexbowe/7yWIq011x0YeezFsiOvlmhL9v5lXuX1EtyPD37HxypebaBKzt3cBMyliquKk/pipes-address.png.scaled.500.jpg" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;h2 style="font-family: Times; font-size: medium;"&gt;TwitterFeed&lt;/h2&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;a href="http://www.twitterfeed.com/"&gt;TwitterFeed&lt;/a&gt;&amp;nbsp;is used to take your master feed and update Twitter, or Facebook, or&amp;nbsp;&lt;a href="http://www.ping.fm/"&gt;Ping.fm&lt;/a&gt;&amp;nbsp;which can subsequently update many more services. Sign up and create a new feed with the master feed URL you copied from Pipes,&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/alexbowe/xKt3SBGw5Kjv7KuGRw3stEirRytFfRcYWLPV3iFoDtWsydNbyotZrl7KBqjm/feed.png"&gt;&lt;img alt="Feed" height="209" src="http://posterous.com/getfile/files.posterous.com/alexbowe/Q8RZMjGprFwhw8vAfTLBVi0XOuglcIKfXR9TAx9DU5vxJt5tabmPA90HI7st/feed.png.scaled.500.jpg" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;Follow the next steps to connect it to your desired service (and tweak how many posts per 30 minutes or whatever...).&lt;/p&gt;
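&lt;p style="font-family: Times; font-size: medium;"&gt;To get a feel for what a limit like three posts per thirty minutes does to a backlog of links, here is a rough sketch. It is not how TwitterFeed is implemented; the &lt;code&gt;schedule&lt;/code&gt; function and its parameters are hypothetical, and it simply assigns each link to the earliest window that still has room.&lt;/p&gt;

```python
def schedule(links, per_window=3, window_minutes=30):
    """Assign each link the start minute of the window it is posted in."""
    return [
        (index // per_window * window_minutes, link)
        for index, link in enumerate(links)
    ]


# Hypothetical backlog of seven starred links.
backlog = ["link{}".format(n) for n in range(1, 8)]
for minute, link in schedule(backlog):
    print(minute, link)
```

&lt;p style="font-family: Times; font-size: medium;"&gt;With these defaults, seven links take three windows to drain, so starring a big batch at once will not flood your followers.&lt;/p&gt;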
&lt;p style="font-family: Times; font-size: medium;"&gt;There you go. Time to test it out:&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;span style="font-family: Arial, Helvetica, sans-serif; font-size: 13px;"&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/alexbowe/7vtbEMFdRgZORlnGwNWtbB3k9spt7rSUgqsFmgW0ChdmPAmJfaEYGMu38Igb/insta-star-ex.png"&gt;&lt;img alt="Insta-star-ex" height="68" src="http://posterous.com/getfile/files.posterous.com/alexbowe/4lmPbW8CIrJY6Z6eiA4Nw6M4V0T8gmdMiFJ6OfagDxAR3SZnwaTbKwsZJ3Gm/insta-star-ex.png.scaled.500.jpg" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/span&gt;&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/temp-2011-03-05/BxjeadswatCxusgccHBDbgvrjJgAcBAuaAJjwEoIorjbwwrlazdzgibGmert/insta-tweet.png.scaled1000.png"&gt;&lt;img alt="Insta-tweet" height="186" src="http://posterous.com/getfile/files.posterous.com/temp-2011-03-05/BxjeadswatCxusgccHBDbgvrjJgAcBAuaAJjwEoIorjbwwrlazdzgibGmert/insta-tweet.png.scaled500.png" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;You may also want to add your&amp;nbsp;&lt;a href="http://www.last.fm/"&gt;Last.fm&lt;/a&gt;&amp;nbsp;loved tracks,&amp;nbsp;&lt;a href="http://www.flickr.com/"&gt;Flickr&lt;/a&gt;&amp;nbsp;photostream or&amp;nbsp;&lt;a href="http://www.weheartit.com/"&gt;WeHeartIt&lt;/a&gt;&amp;nbsp;account. Unfortunately&amp;nbsp;&lt;a href="http://www.pinboard.in"&gt;Pinboard&lt;/a&gt;&amp;nbsp;(a great alternative to Delicious), while providing RSS feeds for certain tags and supporting "starring", does not allow you to get an RSS feed of the starred items. I get around this by using a tag called&amp;nbsp;&lt;code&gt;highlight&lt;/code&gt;. I have requested this feature and hope it gets added :)&lt;/p&gt;
&lt;p style="font-family: Times; font-size: medium;"&gt;Pipes actually allows a lot more than simple combination. Have a play and let us know in the comments what you come up with.&lt;/p&gt;
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/avian-autopilot"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/avian-autopilot#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/png" height="79" width="582" url="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/7vtbEMFdRgZORlnGwNWtbB3k9spt7rSUgqsFmgW0ChdmPAmJfaEYGMu38Igb/insta-star-ex.png">
        <media:thumbnail height="68" width="500" url="http://getfile7.posterous.com/getfile/files.posterous.com/alexbowe/4lmPbW8CIrJY6Z6eiA4Nw6M4V0T8gmdMiFJ6OfagDxAR3SZnwaTbKwsZJ3Gm/insta-star-ex.png.scaled.500.jpg" />
      </media:content>
      <media:content type="image/png" height="115" width="341" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/z4Rop0sKIMJyUw108GcR1iPVuWZXbiJejBAzKFOJUXCG5ecdP5yAVRcSEVEd/pipes-fetchempty.png">
        <media:thumbnail height="115" width="341" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/z4Rop0sKIMJyUw108GcR1iPVuWZXbiJejBAzKFOJUXCG5ecdP5yAVRcSEVEd/pipes-fetchempty.png" />
      </media:content>
      <media:content type="image/png" height="37" width="362" url="http://getfile5.posterous.com/getfile/files.posterous.com/alexbowe/Vn6gHavLy0wm5PjlKk7foGEJeW003P0KPjUzgCw70VIeEINTQgmxZpkAx55b/insta-starred.png">
        <media:thumbnail height="37" width="362" url="http://getfile5.posterous.com/getfile/files.posterous.com/alexbowe/Vn6gHavLy0wm5PjlKk7foGEJeW003P0KPjUzgCw70VIeEINTQgmxZpkAx55b/insta-starred.png" />
      </media:content>
      <media:content type="image/png" height="84" width="955" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/8YOt2orSugoNRiyZrMCASc5FxOcy9iJizJPvm2tqCbtCqSME0n8VjoGFGM2K/reader-starredrow.png">
        <media:thumbnail height="44" width="500" url="http://getfile0.posterous.com/getfile/files.posterous.com/alexbowe/eldewIIFlkJ8h5iEyvM9kBrxViaN9OGPFrk5Q5DULw3MV00aGYxjsvTTgUF3/reader-starredrow.png.scaled.500.jpg" />
      </media:content>
      <media:content type="image/png" height="271" width="329" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/1YMNpFE2WM9AJxGnTOozbtyI0shOx3fNEJtpVrrFiPe3tKYfaeIPxB2gyzgY/pipes-complete.png">
        <media:thumbnail height="271" width="329" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/1YMNpFE2WM9AJxGnTOozbtyI0shOx3fNEJtpVrrFiPe3tKYfaeIPxB2gyzgY/pipes-complete.png" />
      </media:content>
      <media:content type="image/png" height="66" width="144" url="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/XR1Q5sMS7mI85dF6aq9yYoDj8q58N6SDbGX3RU72FfldImIjctz2kg2Fxxul/reader-sharing.png">
        <media:thumbnail height="66" width="144" url="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/XR1Q5sMS7mI85dF6aq9yYoDj8q58N6SDbGX3RU72FfldImIjctz2kg2Fxxul/reader-sharing.png" />
      </media:content>
      <media:content type="image/png" height="356" width="850" url="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/xKt3SBGw5Kjv7KuGRw3stEirRytFfRcYWLPV3iFoDtWsydNbyotZrl7KBqjm/feed.png">
        <media:thumbnail height="209" width="500" url="http://getfile7.posterous.com/getfile/files.posterous.com/alexbowe/Q8RZMjGprFwhw8vAfTLBVi0XOuglcIKfXR9TAx9DU5vxJt5tabmPA90HI7st/feed.png.scaled.500.jpg" />
      </media:content>
      <media:content type="image/png" height="25" width="158" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/g8kK01tCEjQhmwoEUWrEyKT2cv0A1Uyeat5AEDPvl1AOfleF19zdrGE2Gg8x/reader-selectstarred.png">
        <media:thumbnail height="25" width="158" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/g8kK01tCEjQhmwoEUWrEyKT2cv0A1Uyeat5AEDPvl1AOfleF19zdrGE2Gg8x/reader-selectstarred.png" />
      </media:content>
      <media:content type="image/png" height="313" width="751" url="http://getfile5.posterous.com/getfile/files.posterous.com/alexbowe/HPbpKalmuAiBbAHpu5zG665bQorQL3nDmxPSiVzARrmoPWL2ljc6qqsdRcGx/pipes-address.png">
        <media:thumbnail height="208" width="500" url="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/7yWIq011x0YeezFsiOvlmhL9v5lXuX1EtyPD37HxypebaBKzt3cBMyliquKk/pipes-address.png.scaled.500.jpg" />
      </media:content>
      <media:content type="image/png" height="61" width="260" url="http://getfile0.posterous.com/getfile/files.posterous.com/alexbowe/zYwfstUkFYqHrrSNPYXP43U2xUSBpbodKWaVvxWUJVV8wWaF1Lfln9u52Uoq/reader-settings.png">
        <media:thumbnail height="61" width="260" url="http://getfile0.posterous.com/getfile/files.posterous.com/alexbowe/zYwfstUkFYqHrrSNPYXP43U2xUSBpbodKWaVvxWUJVV8wWaF1Lfln9u52Uoq/reader-settings.png" />
      </media:content>
      <media:content type="image/png" height="78" width="229" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/FkAapprYqSUPsFf7znwS6eMHrZjghgEdotE9R2FV0xl0RVPrplvOEc92xJAS/reader-atom.png">
        <media:thumbnail height="78" width="229" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/FkAapprYqSUPsFf7znwS6eMHrZjghgEdotE9R2FV0xl0RVPrplvOEc92xJAS/reader-atom.png" />
      </media:content>
      <media:content type="image/png" height="78" width="201" url="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/3RuLix87B1hlQSnjHwyiqPHitQDgcNQVgXSPkOm2SRWZUk7SjXOJQhhnZCbd/insta-rss.png">
        <media:thumbnail height="78" width="201" url="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/3RuLix87B1hlQSnjHwyiqPHitQDgcNQVgXSPkOm2SRWZUk7SjXOJQhhnZCbd/insta-rss.png" />
      </media:content>
      <media:content type="image/jpeg" height="360" width="360" url="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/ineDcuDrkrEjllDdDamFqwmtFGmlemcirvrDrzcgFhvEGucFjEssmIzFEsGh/media_http2bpblogspot_imjvJ.jpg">
        <media:thumbnail height="360" width="360" url="http://getfile3.posterous.com/getfile/files.posterous.com/alexbowe/ineDcuDrkrEjllDdDamFqwmtFGmlemcirvrDrzcgFhvEGucFjEssmIzFEsGh/media_http2bpblogspot_imjvJ.jpg.scaled500.jpg" />
      </media:content>
      <media:content type="image/png" height="218" width="587" url="http://getfile5.posterous.com/getfile/files.posterous.com/temp-2011-03-05/BxjeadswatCxusgccHBDbgvrjJgAcBAuaAJjwEoIorjbwwrlazdzgibGmert/insta-tweet.png">
        <media:thumbnail height="186" width="500" url="http://getfile5.posterous.com/getfile/files.posterous.com/temp-2011-03-05/BxjeadswatCxusgccHBDbgvrjJgAcBAuaAJjwEoIorjbwwrlazdzgibGmert/insta-tweet.png.scaled500.png" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/avian-autopilot</feedburner:origLink></item>
    <item>
      <pubDate>Mon, 21 Feb 2011 03:58:00 -0800</pubDate>
      <title>Stop Force-Feeding Your Brain</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/w6dNNIrfRIY/stop-force-feeding-your-brain</link>
      <guid isPermaLink="false">http://www.alexbowe.com/stop-force-feeding-your-brain</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpimages20x20_qghzd" height="375" src="http://posterous.com/getfile/files.posterous.com/alexbowe/sidmndrvCIfqzFsGDjBpmegAwDAhBjtsuhrmBhicigGvnszxkiGCqCihyAzD/media_httpimages20x20_qgHzd.jpg.scaled500.jpg" width="500" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;I used to read less than one book per year. According to &lt;a href="http://www.amazon.com/gp/product/0735619670/alexbowecom-20"&gt;Code Complete&lt;/a&gt;, this is the same as the average programmer.&lt;/p&gt;
&lt;p&gt;That was while I was at uni though. I sort of expected to learn slowly at uni,  and anticipated rapid, mind-blowing learning when I entered the industry. My  first job had me working under some of the smartest people I have ever met - it  was at a game company. Unfortunately they ran out of money and I had to find a  job at a slower-paced place, where I didn't have the luxury of mentorship.&lt;/p&gt;
&lt;p&gt;My mind was going stale and I had to do something. I looked up recommended reading lists online and the first programming book I picked up (that wasn't for a uni course) was &lt;a href="http://www.amazon.com/gp/product/020161622X/alexbowecom-20"&gt;The Pragmatic Programmer&lt;/a&gt;. Are you sick of me mentioning this book in every post yet? ;)&lt;/p&gt;
&lt;p&gt;PragProg recommends that you treat your learning as an investment, and your knowledge as a portfolio: you should invest regularly, diversify, balance it between high-risk and conservative, and buy low and sell high (i.e. take an interest in emerging technologies). It recommends several goals, such as reading a technical book each quarter and, once you've developed the habit, one per month. It also recommends reading non-technical books. I totally fell in love with this idea, and have been reading like a mofo since.&lt;/p&gt;
&lt;p&gt;There is a slight problem though...&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The final point PragProg gives is to review and rebalance: you need to stop learning things that aren't providing value, and occasionally revise things you have forgotten. I want to focus on the &lt;strong&gt;STOP&lt;/strong&gt; part of this. For the love of God, take it easy! &lt;a href="http://norvig.com/21-days.html"&gt;It takes years to learn all this stuff&lt;/a&gt; - you don't have to do it all at once.&lt;/p&gt;
&lt;p&gt;I am guilty of bingeing on blogs and books. I picture learning like a  tree-traversal, with each (sub)topic having a different branch. The problem is,  I keep getting excited about new books or programming languages I find, and  never reach the leaves. I really need to learn focus, because I'm swinging from  branch to branch like a monkey.&lt;/p&gt;
&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/alexbowe/8fZvJNTP01xHyDVimTAj0Qm2rq3o8oCaKRDqABQbdiTc6ex4kbmrlRheNxzH/monkey.jpg"&gt;&lt;img alt="Monkey" height="337" src="http://posterous.com/getfile/files.posterous.com/alexbowe/427xOPbgCqbrIqHlUAVNaGQS4sSB5VmwbowZbnIBAY2F8L1a1sofnJZdSGUU/monkey.jpg.scaled.500.jpg" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;During my reading, I have realised that other smart people are with me on this. In &lt;a href="http://www.amazon.com/gp/product/0307465357/alexbowecom-20"&gt;The 4-Hour Workweek&lt;/a&gt;, Tim Ferriss says that you should only ever read one factual book at a time, along with one fictional book. You aren't productive if you are reading more than one factual book - context switches are expensive.&lt;/p&gt;
&lt;p&gt;The current non-tech book that I'm reading is &lt;a href="http://www.amazon.com/gp/product/0061673730/alexbowecom-20"&gt;Zen and the Art of Motorcycle Maintenance&lt;/a&gt; - this may be a bit of a cheat, because it is recommended on  some programmer &lt;a href="http://www.joelonsoftware.com/navLinks/fog0000000262.html"&gt;reading&lt;/a&gt; &lt;a href="http://stackoverflow.com/questions/1711/what-is-the-single-most-influential-book-every-programmer-should-read"&gt;lists&lt;/a&gt;. The narrator talks  about his coming to enjoy the journey more than the actual arrival. The process  of learning is itself something to be enjoyed, and isn't just &lt;a href="http://en.wiktionary.org/wiki/yak_shaving"&gt;yak shaving&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.softwarebyrob.com/"&gt;Rob Walling&lt;/a&gt; says in his startup book &lt;a href="http://www.startupbook.net/"&gt;Start Small, Stay Small&lt;/a&gt; that entrepreneurs should stop reading so much. He says  "Information Consumption is Only Good When it Produces Something", although it  can be done for pure enjoyment too - as long as you pay attention to when you &lt;strong&gt;NEED TO DO IT&lt;/strong&gt; and when you just want to (which is more often than you  think).&lt;/p&gt;
&lt;p&gt;How do we know what input will produce good output? We can use heuristics when choosing (which is what we all do anyway) - such as author name - but we can't know for sure until we've read it. Swinging from branch to branch is a necessary evil, then. We need to work out a balance between input and output, or that which Monkey Sees and that which Monkey Does.&lt;/p&gt;
&lt;p&gt;Here are some of my own tips for curing this  information-eating-disorder:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Allow yourself to quit&lt;/strong&gt;. Sometimes projects just aren't worth your effort.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Only read two books - one technical, one non-technical - at a time&lt;/strong&gt;. As    above, you can quit a book at any time, but avoid constant interleaved     reading.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Install &lt;a href="http://www.rescuetime.com/"&gt;RescueTime&lt;/a&gt;&lt;/strong&gt;. It passively monitors and graphs your     activity breakdown. This will awaken you to how much useless crap you do in a     day.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote class="posterous_short_quote"&gt;
&lt;p&gt;What gets measured gets managed - Peter Drucker&lt;/p&gt;
&lt;/blockquote&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.instapaper.com/"&gt;Instapaper&lt;/a&gt;&lt;/strong&gt;. Use it (on your computer and phone), love it. If you aren't reading this blog post during down-time (e.g. during a train commute) then I'm looking at you in particular. I won't read a blog post unless it goes through my Instapaper account first.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Periodically cull your RSS feeds&lt;/strong&gt;. Eddie Smith of    &lt;a href="http://www.practicallyefficient.com/2010/12/31/thinking-about-new-years-resolutions/"&gt;Practically Efficient&lt;/a&gt; suggests eliminating feeds that produce more than    ten posts per day. I recommend scheduling reminders to cull your feeds on    a regular basis. Hacker News should come to mind - give one of    &lt;a href="http://jeffmiller.github.com/2010/07/23/a-cure-for-hacker-news-overload"&gt;these popularity-sensitive feeds&lt;/a&gt; a try instead, to make use of human     filtering.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Write todo lists&lt;/strong&gt;. You probably do this anyway, except this time I want you to write two of them side by side... let's call this the &lt;em&gt;"twodolist"&lt;/em&gt; method. The first is for things that you absolutely have to finish - maybe you are obligated to do them for work, or maybe they count toward your primary objective, like writing a thesis or starting a startup. The second list is for the things you want to do that seem important enough to put on a list. This may include reading a certain book.&lt;/p&gt;
&lt;p&gt;The act of applying this mental scalpel really helps me. When I am &lt;em&gt;"busy"&lt;/em&gt;,     it is often the case that I am working off the second list, which means I'm     avoiding the first.&lt;/p&gt;
&lt;div&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/alexbowe/xLGEQxdKRTlNbTJ1Gu41myoykksbtHW90axmdZrKorCJsKiXhcDcjlNDWqGs/twodo.jpg"&gt;&lt;img alt="Twodo" height="308" src="http://posterous.com/getfile/files.posterous.com/alexbowe/TEI8c53ORRFg4fAiFPZuuUUV0aG5BFS2Y9WwOGbr7IDeO0YdPDQQC1BTyq25/twodo.jpg.scaled.500.jpg" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Turn off push notifications on your phone&lt;/strong&gt;. Each one is just another unnecessary context switch - you don't really need to know the exact moment you get an email.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Now it's your turn! Please leave any advice (big or small) in the comments.&lt;/p&gt;
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/stop-force-feeding-your-brain"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/stop-force-feeding-your-brain#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/jpeg" height="540" width="876" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/xLGEQxdKRTlNbTJ1Gu41myoykksbtHW90axmdZrKorCJsKiXhcDcjlNDWqGs/twodo.jpg">
        <media:thumbnail height="308" width="500" url="http://getfile3.posterous.com/getfile/files.posterous.com/alexbowe/TEI8c53ORRFg4fAiFPZuuUUV0aG5BFS2Y9WwOGbr7IDeO0YdPDQQC1BTyq25/twodo.jpg.scaled.500.jpg" />
      </media:content>
      <media:content type="image/jpeg" height="655" width="973" url="http://getfile7.posterous.com/getfile/files.posterous.com/alexbowe/8fZvJNTP01xHyDVimTAj0Qm2rq3o8oCaKRDqABQbdiTc6ex4kbmrlRheNxzH/monkey.jpg">
        <media:thumbnail height="337" width="500" url="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/427xOPbgCqbrIqHlUAVNaGQS4sSB5VmwbowZbnIBAY2F8L1a1sofnJZdSGUU/monkey.jpg.scaled.500.jpg" />
      </media:content>
      <media:content type="image/jpeg" height="375" width="500" url="http://getfile3.posterous.com/getfile/files.posterous.com/alexbowe/sidmndrvCIfqzFsGDjBpmegAwDAhBjtsuhrmBhicigGvnszxkiGCqCihyAzD/media_httpimages20x20_qgHzd.jpg">
        <media:thumbnail height="375" width="500" url="http://getfile0.posterous.com/getfile/files.posterous.com/alexbowe/sidmndrvCIfqzFsGDjBpmegAwDAhBjtsuhrmBhicigGvnszxkiGCqCihyAzD/media_httpimages20x20_qgHzd.jpg.scaled500.jpg" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/stop-force-feeding-your-brain</feedburner:origLink></item>
    <item>
      <pubDate>Mon, 07 Feb 2011 02:35:00 -0800</pubDate>
      <title>Advice to CS Undergrads</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/OwgURKSbRwY/advice-to-cs-undergrads</link>
      <guid isPermaLink="false">http://www.alexbowe.com/advice-to-cs-undergrads</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_https3prodwehea_dmuam" height="313" src="http://posterous.com/getfile/files.posterous.com/alexbowe/vghmvcBcHyACzdcGhceiyuBjfHgcsFuuueonoecojyoximBJjbwpgzafnukw/media_https3prodwehea_DmuAm.jpg.scaled500.jpg" width="500" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;Since I&amp;rsquo;m starting my PhD this year, I have been reflecting on what I would do differently if I went back in time and started my degree all over again. I am also continuing to tutor, now in my 4th year, and students occasionally approach me for general advice about their studies.&lt;/p&gt;
&lt;p&gt;I repeat the same advice to most students, so I&amp;rsquo;ll attempt to distill it  into the points below. Bear in mind that I am writing from a Computer Science  perspective, although some of the advice can be applied to any field.&lt;/p&gt;
&lt;p&gt;I didn&amp;rsquo;t do most of this stuff during my undergrad years. I still did well, but I think I would have had more fun if I followed this advice. If you&amp;rsquo;re not  doing all the stuff on this list, that&amp;rsquo;s okay. Come back and try again later.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s the advice in no particular order:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Decide to get Good.&lt;/strong&gt; I know a lot of students who aren&amp;rsquo;t sure if they are in the right degree. Computer Science isn&amp;rsquo;t for everyone, but if you are far enough in (&amp;gt; 1 year) I recommend riding it out. I even had my own doubts: while programming was a hobby for me prior to uni, most subjects at uni left me disappointed. When I started searching for the parts of it that interested me, I found it much more creative and fun.&lt;/p&gt;
&lt;p&gt;After your degree, you can do a Master&amp;rsquo;s in something else, and it will only take 1.5 or 2 years. Or you can study other things in your spare time. Don&amp;rsquo;t bite off too much though&amp;hellip;&lt;/p&gt;
&lt;p&gt;Supposedly it takes &lt;a href="http://norvig.com/21-days.html"&gt;10,000 hours to become an expert at anything&lt;/a&gt;, but after that I bet it is faster to become an expert at other things; much of what you spend those 10,000 hours on is learning how to teach yourself, which carries over to the next field.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Learn More Languages.&lt;/strong&gt; Everyone gives this advice, but they usually suggest  it because it will give you a perspective on different ways to do things. I think you should learn Python (or Ruby) in order to remove the friction from  learning (try the book &lt;a href="http://learnpythonthehardway.org/"&gt;Learn Python The Hard Way&lt;/a&gt; by Zed Shaw).&lt;/p&gt;
&lt;p&gt;Python has been dubbed &amp;ldquo;executable pseudocode&amp;rdquo;, so this will help you when reading algorithm books. Python and Ruby are both dynamically typed, which means you don&amp;rsquo;t have to declare whether a variable holds a number or a string. They also come with an interactive shell where you can test ideas and easily enter/modify examples from books.&lt;/p&gt;
&lt;p&gt;Since their control structures map closely onto those of C-family languages, you can start your assignments in one of these languages, and easily translate them by hand to Java or C. Build one to throw away (thanks, &lt;a href="http://www.amazon.com/gp/product/0201835959/alexbowecom-20"&gt;Fred Brooks&lt;/a&gt;) so you can learn about the pitfalls of the problem before you have to deal with pointers or boiler-plate code.&lt;/p&gt;
&lt;p&gt;Lisp is also recommended. It can be daunting to choose a dialect when you know  nothing about it&amp;hellip; Just learn Scheme (you can learn Common Lisp or Clojure  later). There is really amazing free material that you&amp;rsquo;ll probably want to  &lt;a href="http://mitpress.mit.edu/sicp/"&gt;read&lt;/a&gt; or &lt;a href="http://groups.csail.mit.edu/mac/classes/6.001/abelson-sussman-lectures/"&gt;watch&lt;/a&gt; at some point anyway. It will be less  practical for your uni years, but will give you a depth of knowledge that will  carry you through your entire career.&lt;/p&gt;
&lt;p&gt;Oh yeah, and learn and use regular expressions the next time you need to process  text.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Manage Your Time.&lt;/strong&gt; Okay, this is a no-brainer, but I still didn&amp;rsquo;t do it that well. The day you get an assignment, put it on your calendar and set a reminder.  After you&amp;rsquo;ve done that, start working on it right away.&lt;/p&gt;
&lt;p&gt;Remember, you aren&amp;rsquo;t committed to whatever you write down until you hand it in; once again, build one to throw away. Even a mind-map or just writing headings can help a lot. Inevitably you&amp;rsquo;ll rush it at the last minute, but having previous work to reference rather than a blank page is so much more comforting. I&amp;rsquo;ll write more productivity advice in a later post, so keep checking back.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Read!&lt;/strong&gt; I don&amp;rsquo;t mean the prescribed text-book - I only read a few of those,  and sometimes they are only recommended because the lecturer is somehow invested  in it (don&amp;rsquo;t get me wrong, some are great, but you&amp;rsquo;ll find out about those books  anyway).&lt;/p&gt;
&lt;p&gt;I recommend starting with &lt;a href="http://www.amazon.com/gp/product/020161622X/alexbowecom-20"&gt;The Pragmatic Programmer&lt;/a&gt;; it&amp;rsquo;s nice and small, very practical, acclaimed, and it will point you in the direction of other good books to read when you&amp;rsquo;ve finished.&lt;/p&gt;
&lt;p&gt;Other than that, check out some recommended reading lists &lt;a class="footnote" title="see footnote"&gt;1&lt;/a&gt;, and  subscribe to blogs &lt;a class="footnote" title="see footnote"&gt;2&lt;/a&gt;. I&amp;rsquo;ll also discuss my reading workflow at a later  date, as you have to be careful of information-glut.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sharpen Your Tools.&lt;/strong&gt; Here are some tools that will help you make learning an enjoyable process:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Get a laptop. I feel bad saying that coz it will cost you money, but it  really helps to have your coding environment ready to go.&lt;/li&gt;
&lt;li&gt;Use Unix. I only did this after I switched to Mac OS X, but there are free alternatives. It&amp;rsquo;s just a nicer environment for coding. It will also reduce the amount of time you spend on games.&lt;/li&gt;
&lt;li&gt;Use Git and GitHub. Often, I would try to fix a bug in an assignment only to introduce more. Sometimes I would make backups of the project folder, but this was usually confusing and has led to me losing major chunks of work. Every project you do should be on GitHub: it helps you manage this, lets you take pride in your work (by displaying it publicly), and gives you an off-site backup.&lt;/li&gt;
&lt;li&gt;Start learning Vim during a break. It has a steep learning curve, but will  make you faster in the end.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Start Your Career Early.&lt;/strong&gt; Another thing I didn&amp;rsquo;t do. Here are some things you  can do to help you hit the ground running when you finish uni:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Contribute to open source. It looks good on your resume, and (as japerk commented) will help you learn about real world programming and software in the wild. You can find beginner-level bugs on  &lt;a href="https://openhatch.org/search/?toughness=bitesize"&gt;openhatch&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Maintain a resume and apply for jobs (even if they are out of your league). Worst-case, you don&amp;rsquo;t get the job, but applying may help you spot holes in your knowledge and experience (failing a Google interview was a big reality check for me). Best-case, you get relevant work while at uni.&lt;/li&gt;
&lt;li&gt;Tutor people. Studies have shown that we learn best when we are teaching.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Don&amp;rsquo;t Forget: People Matter!&lt;/strong&gt; It&amp;rsquo;s easy for a geek to forget this, but I  think it&amp;rsquo;s the lynchpin to good education.&lt;/p&gt;
&lt;p&gt;The friends you make, even if less intelligent, will drive you to  do Computer Science for fun. They will either show you the cool stuff they  learn, or challenge you to do better than them. But it doesn&amp;rsquo;t stop with your  classmates.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Write a blog. This will refine your communication skills, and it can help other people too: I still get emails from people thanking me for my (simple) &lt;a href="http://www.alexbowe.com/education/nanosecond-timing"&gt;nanosecond time wrapper&lt;/a&gt;. It&amp;rsquo;s fun and rewarding. If you are having trouble keeping it going, &lt;a href="mailto:bowe.alexander@gmail.com"&gt;email me&lt;/a&gt; and I&amp;rsquo;ll tell you about a relevant project of mine that might help&amp;hellip;&lt;/li&gt;
&lt;li&gt;Tweet about your programming interests. This will build a network outside of uni that will remind you why you like doing this. Follow me on twitter: &lt;a href="http://www.twitter.com/alexbowe"&gt;@alexbowe&lt;/a&gt; :)&lt;/li&gt;
&lt;li&gt;Email your idols. It can be really helpful and exciting to get a response  from someone whose blog you read, or who has written a book or paper you read. This really helped me with my thesis last year. It might also give  you something to blog about ;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To me, these are the most important pieces of advice to take while studying (or as soon as you are ready). Criticism, suggestions, and comments of any other type are welcome, as always. It&amp;rsquo;d also be cool to hear about things you think paint CS in a cool light for students (The game of life is one example).&lt;/p&gt;
&lt;p&gt;For more blog posts on similar matters by much better writers and programmers  than me, check out:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.catb.org/~esr/faqs/hacker-howto.html"&gt;Eric Raymond - How To Become A Hacker&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.paulgraham.com/college.html"&gt;Paul Graham - Undergraduation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.paulgraham.com/undergrad2.html"&gt;Paul Graham - More Advice for Undergrads&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.joelonsoftware.com/articles/CollegeAdvice.html"&gt;Joel Spolsky - Advice for Computer Science College Students&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://programmers.stackexchange.com/questions/44177/what-is-the-single-most-effective-thing-you-did-to-improve-your-programming-skill"&gt;Question on Programmers Stack Exchange: What is the single most effective thing you did to improve your programming skills?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="footnotes"&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;After &lt;a href="http://www.amazon.com/gp/product/020161622X/alexbowecom-20"&gt;The Pragmatic Programmer&lt;/a&gt;, try &lt;a href="http://www.amazon.com/gp/product/1934356344/alexbowecom-20"&gt;The Passionate Programmer&lt;/a&gt;. If you want a good book to help you with your people skills, try &lt;a href="http://www.amazon.com/gp/product/1439167346/alexbowecom-20"&gt;How To Win Friends And Influence People&lt;/a&gt;. &lt;a href="http://www.codinghorror.com/blog/2004/02/recommended-reading-for-developers.html"&gt;Coding Horror&lt;/a&gt; and &lt;a href="http://stackoverflow.com/questions/1711/what-is-the-single-most-influential-book-every-programmer-should-read"&gt;Stack Overflow&lt;/a&gt; have great book recommendations too. And a list of &lt;a href="http://stackoverflow.com/questions/194812/list-of-freely-available-programming-books"&gt;FREE books&lt;/a&gt;.&lt;a class="reversefootnote" title="return to article"&gt;&amp;nbsp;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Too many great blogs to list. Start at &lt;a href="http://news.ycombinator.com"&gt;Hacker News&lt;/a&gt; and you&amp;rsquo;ll soon find some great blogs to follow. There&amp;rsquo;s also a recommendations &lt;a href="http://news.ycombinator.com/item?id=128762"&gt;thread there already&lt;/a&gt;.&lt;a class="reversefootnote" title="return to article"&gt;&amp;nbsp;↩&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;&lt;/div&gt;
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/advice-to-cs-undergrads"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/advice-to-cs-undergrads#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/jpeg" height="313" width="500" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/vghmvcBcHyACzdcGhceiyuBjfHgcsFuuueonoecojyoximBJjbwpgzafnukw/media_https3prodwehea_DmuAm.jpg">
        <media:thumbnail height="313" width="500" url="http://getfile0.posterous.com/getfile/files.posterous.com/alexbowe/vghmvcBcHyACzdcGhceiyuBjfHgcsFuuueonoecojyoximBJjbwpgzafnukw/media_https3prodwehea_DmuAm.jpg.scaled500.jpg" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/advice-to-cs-undergrads</feedburner:origLink></item>
    <item>
      <pubDate>Sat, 27 Nov 2010 21:00:00 -0800</pubDate>
      <title>Awesome is Better than Good</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/YQ5_-SaOO60/awesome-is-better</link>
      <guid isPermaLink="false">http://www.alexbowe.com/quote/awesome-is-better</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;a href="http://www.nataliedee.com/index.php?date=091805"&gt;&lt;img src="http://www.nataliedee.com/091805/way-better.jpg" alt="Awesome" width="300" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I am currently reading &lt;a href="http://www.amazon.com/gp/product/0816031789/alexbowecom-20"&gt;Edward De Bono's Thinking Course&lt;/a&gt; (by the author of &lt;a href="http://www.amazon.com/gp/product/0316178314/alexbowecom-20"&gt;Six Thinking Hats&lt;/a&gt;), which has an anecdote from one of his seminars: he describes a problem involving two pieces of wood, and the task of crossing a room without touching the ground. He accompanies this with a discussion of three approaches, one very inefficient, and one seemingly good. The third approach is much faster, but everyone settles on the second solution, because it seems good enough...&lt;/p&gt;
&lt;blockquote class="posterous_medium_quote"&gt;
&lt;p&gt;The 'shuffle' solution seems so obvious and so adequate that there never seems  any need to set out to look for an alternative. Contentment with an 'adequate'  solution or approach is the biggest block there is to any search for a better  alternative... It is only through realisation of this and an act of  &lt;em&gt;will&lt;/em&gt; that we can set out to look for alternatives - knowing that in  most cases we shall not find anything better, but still being willing to invest that thinking time.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I feel that this advice should be taken in reverse too. If you have a new idea  for how to do something, don't dismiss it because people are already happy with  the current methodology; they could be ecstatic with yours. Maybe a solution you  think is too obvious isn't so clear to them.&lt;/p&gt;
&lt;p&gt;I think many people sell their good ideas short, and become too attached to  ideas that maybe aren't all that great. Regardless of the perceived value of  your idea (within reason), you should test the waters. Run it by some  non-friends (they don't mind hurting your feelings). It may be better, or worse,  than you think.&lt;/p&gt;
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/quote/awesome-is-better"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/quote/awesome-is-better#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/jpeg" height="225" width="300" url="http://getfile3.posterous.com/getfile/files.posterous.com/import-jipc-nczi/ECyqqBAxDhdtGaEcvEIGHiaqxBhxfgxCeDhvibwiGIItwagqiaojzbGozgcn/media_httpwwwalexbowe_xIBpq.jpg">
        <media:thumbnail height="225" width="300" url="http://getfile0.posterous.com/getfile/files.posterous.com/import-jipc-nczi/ECyqqBAxDhdtGaEcvEIGHiaqxBhxfgxCeDhvibwiGIItwagqiaojzbGozgcn/media_httpwwwalexbowe_xIBpq.jpg.scaled500.jpg" />
      </media:content>
      <media:content type="image/jpeg" height="525" width="700" url="http://getfile7.posterous.com/getfile/files.posterous.com/alexbowe/yEHotnFBCzJxsaksnFzBssBqcypDfCduwqizivDBaievxvdADteiDvwbDdfg/media_httpwwwnatalied_fdxcF.jpg">
        <media:thumbnail height="375" width="500" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/yEHotnFBCzJxsaksnFzBssBqcypDfCduwqizivDBaievxvdADteiDvwbDdfg/media_httpwwwnatalied_fdxcF.jpg.scaled500.jpg" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/quote/awesome-is-better</feedburner:origLink></item>
    <item>
      <pubDate>Tue, 23 Nov 2010 12:26:00 -0800</pubDate>
      <title>Regularly Divisible</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/7rWquEhzQi4/regularly-divisible</link>
      <guid isPermaLink="false">http://www.alexbowe.com/education/regularly-divisible</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpwwwalexbowe_bbizv" height="96" src="http://posterous.com/getfile/files.posterous.com/alexbowe/oztfcdvjmbGdauqzqHCCFnIxEvAIECEEskhqaGtkdgmCvzpFFmhayAlrGqhG/media_httpwwwalexbowe_bbizv.png.scaled500.png" width="373" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: read the comments at  &lt;a href="http://news.ycombinator.com/item?id=1937062"&gt;Hacker News&lt;/a&gt; to see some succinct approaches to this, as discussed by &lt;em&gt;gjm11&lt;/em&gt;,  &lt;em&gt;qntm&lt;/em&gt; and &lt;em&gt;patio11&lt;/em&gt;. Thanks to &lt;em&gt;Robin&lt;/em&gt; for providing &lt;a href="http://s3.boskent.com/divisibility-regex/divisibility-regex.html"&gt;this  demonstration&lt;/a&gt; that can find a regex for testing divisibility of any number, in any base (he  also made the code available, nice).&lt;/p&gt;
&lt;p&gt;Earlier this year, on the advice (once more) of &lt;a href="http://www.amazon.com/gp/product/1934356344/alexbowecom-20"&gt;Chad Fowler&lt;/a&gt;, I took to the idea of practicing programming every day. Perhaps this appealed to me because it echoed the rituals of my better musician friends, and allowed me to draw parallels between programming and my fading dream of becoming a famous rockstar.&lt;/p&gt;
&lt;p&gt;Possibly because of my failed interview at Google (hey, I wouldn't have hired  the back-then me either, so no hard feelings!), I was also interested in  job-interview styled problems [1]. &lt;em&gt;Not&lt;/em&gt; &lt;a href="http://niki.code-karma.com/2010/08/fizzbuzz/"&gt;FizzBuzz&lt;/a&gt; though, more like  the computer science 'riddles' found on &lt;a href="http://wuriddles.com/cs.shtml"&gt;this page&lt;/a&gt; [2].&lt;/p&gt;
&lt;p&gt;At the time I was teaching Computing Theory [3], 80% of which was &lt;a href="http://en.wikipedia.org/wiki/Formal_language"&gt;formal  languages&lt;/a&gt;: &lt;a href="http://en.wikipedia.org/wiki/Regular_expression"&gt;regular expressions&lt;/a&gt;, &lt;a href="http://en.wikipedia.org/wiki/Context-free_grammar"&gt;context free&lt;/a&gt; and  &lt;a href="http://en.wikipedia.org/wiki/Context-sensitive_grammar"&gt;context sensitive grammars&lt;/a&gt;, &lt;a href="http://en.wikipedia.org/wiki/Turing_machine"&gt;Turing machines&lt;/a&gt; and &lt;a href="http://en.wikipedia.org/wiki/Automata_theory"&gt;other  automata&lt;/a&gt;, and their locations in the &lt;a href="http://en.wikipedia.org/wiki/Chomsky_hierarchy"&gt;Chomsky Hierarchy&lt;/a&gt;.  So, this problem appealed to me:&lt;/p&gt;
&lt;blockquote class="posterous_short_quote"&gt;
&lt;p&gt;Construct a finite state machine (or equivalently, write a regular  expression) which accepts all strings over the alphabet {0,1} which are  divisible by 3 when interpreted in binary.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It is pretty interesting that languages can be defined to communicate patterns  in binary sequences that are divisible by 3. Let's solve it in more detail than  necessary :)...&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;We will be developing this regex incrementally. I will update the regex at each stage. Since it is easier to construct this regex using a finite state machine  (which is how I worked this out the first time), I'll also include a few  diagrams along the way. You can test your regexes using the Ruby code  &lt;a href="https://gist.github.com/711644"&gt;here&lt;/a&gt;.&lt;/p&gt;
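&lt;p&gt;If you want an independent reference to compare a candidate regex against, here is a minimal sketch of the finite state machine view. It is an illustration only, not the code from the gist, and the method name &lt;code&gt;divisible_by_3?&lt;/code&gt; is made up for this example. It assumes the string is read most significant bit first, so reading a bit &lt;code&gt;b&lt;/code&gt; turns the value so far, &lt;code&gt;v&lt;/code&gt;, into &lt;code&gt;2v + b&lt;/code&gt;; the machine only needs to remember that value modulo 3.&lt;/p&gt;

```ruby
# Three states, one per remainder modulo 3 of the bits read so far.
# Reading bit b moves the machine from remainder r to (2 * r + b) % 3.
def divisible_by_3?(bits)
  state = 0
  bits.each_char do |c|
    raise ArgumentError, "not a binary string: #{bits.inspect}" unless '01'.include?(c)
    state = (2 * state + c.to_i) % 3
  end
  # Accept only in state 0. The empty string is treated as zero here.
  state == 0
end

puts divisible_by_3?('1001')  # binary for 9
puts divisible_by_3?('1010')  # binary for 10
```

&lt;p&gt;Any regex we end up with should agree with this method on every binary string.&lt;/p&gt;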
&lt;p&gt;To get an idea for the pattern, I started by listing out a few multiples of 3  and their binary representations. You can use the following Ruby code if you  want.&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;# Fixnum was merged into Integer in Ruby 2.4 and removed in 3.2,
# so reopen Integer instead.
class Integer
  # Binary representation as a string. For non-negative integers the
  # built-in to_s(2) gives the same result.
  def to_bin_s
    return '0' if self == 0
    s = ''
    n = self

    while n &amp;gt; 0
      # Prepend the lowest bit, then shift it off.
      s = (n % 2).to_s &amp;lt;&amp;lt; s
      n = n &amp;gt;&amp;gt; 1
    end
    s
  end
end&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&amp;gt;&amp;gt; 30.times {|n| puts &amp;quot;#{n}: #{n.to_bin_s}&amp;quot; if n%3 == 0}
0: 0
3: 11
6: 110
9: 1001
12: 1100
15: 1111
18: 10010
21: 10101
24: 11000
27: 11011&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Observe that the binary representations for 3, 6, 12 and 24 have something in  common: they have the same sequence, with a little bit of zero-padding to the  right. &lt;em&gt;If &lt;code&gt;X&lt;/code&gt; is a binary string divisible by &lt;code&gt;3&lt;/code&gt;, then the binary string &lt;code&gt;X0&lt;/code&gt; is divisible by &lt;code&gt;3&lt;/code&gt; as well&lt;/em&gt;. This makes sense, as adding zeros to the right is  equivalent to multiplying by &lt;code&gt;2&lt;/code&gt;, and we know that &lt;code&gt;n * 3 * 2&lt;/code&gt; is divisible by  &lt;code&gt;3&lt;/code&gt; for any integer &lt;code&gt;n&lt;/code&gt;. This gives us:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;r = A0*&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;&lt;em&gt;A will represent the leftover regex at each stage&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;This can trivially be applied to left zero-padding as well, because a binary  string &lt;code&gt;X&lt;/code&gt; is numerically equivalent to the binary string &lt;code&gt;0X&lt;/code&gt; (issues of  &lt;a href="http://en.wikipedia.org/wiki/Endianness"&gt;endianness&lt;/a&gt; aside):&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;r = 0*A0*&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
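&lt;p&gt;Both padding rules are easy to sanity-check numerically. The sketch below is only an illustration: the helper name &lt;code&gt;pad_both_ways&lt;/code&gt; is hypothetical, and it leans on Ruby's built-in &lt;code&gt;to_s(2)&lt;/code&gt; and &lt;code&gt;to_i(2)&lt;/code&gt; to convert between integers and binary strings.&lt;/p&gt;

```ruby
# Returns [value of x with a 0 appended, value of x with a 0 prepended].
def pad_both_ways(x)
  [(x + '0').to_i(2), ('0' + x).to_i(2)]
end

(1..10).each do |k|
  x = (3 * k).to_s(2)
  right, left = pad_both_ways(x)
  # right is double the original value, left is the original value,
  # so both remainders printed here are 0.
  puts "#{x}: X0 = #{right} (mod 3: #{right % 3}), 0X = #{left} (mod 3: #{left % 3})"
end
```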

&lt;p&gt;Since we already covered even multiples, to work out &lt;code&gt;A&lt;/code&gt; we just need to look at odd multiples of &lt;code&gt;3&lt;/code&gt;. Here are a few:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&amp;gt;&amp;gt; 50.times {|n| puts &amp;quot;#{n}: #{n.to_bin_s}&amp;quot; if n%3 == 0 and n%2 != 0}
3: 11
9: 1001
15: 1111
21: 10101
27: 11011
33: 100001
39: 100111
45: 101101&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;You might have noticed that &lt;code&gt;1111&lt;/code&gt; (15) is just &lt;code&gt;11&lt;/code&gt; (3) concatenated with itself. That's the same as saying &lt;code&gt;15 = 3 * 2 * 2 + 3&lt;/code&gt;. If you add two multiples of 3, you will of course get another multiple of 3 (the same holds for any divisor), and concatenating just doubles the first number once for each bit of the second before adding. Hence, concatenating two multiples of 3 always gives another multiple of 3.&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;r = (0*A0*)+&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
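&lt;p&gt;The concatenation argument can be checked numerically with a rough Python sketch. The helper name &lt;code&gt;concat_binary&lt;/code&gt; is made up for this illustration, and it assumes non-negative integers written without leading zeros.&lt;/p&gt;

```python
def concat_binary(a, b):
    """Return the number whose binary digits are those of a followed by those of b."""
    width = len(format(b, "b"))  # number of bits contributed by b
    return a * 2 ** width + b    # one doubling of a per bit of b, then add b


# 11 (3) followed by 11 (3) is 1111 (15), i.e. 3 * 2 * 2 + 3.
print(concat_binary(3, 3))

# Every pair of small multiples of 3 concatenates to another multiple of 3.
multiples = range(0, 300, 3)
print(all(concat_binary(a, b) % 3 == 0 for a in multiples for b in multiples))
```

&lt;p&gt;This only samples small values, so it illustrates the claim rather than proving it.&lt;/p&gt;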

&lt;p&gt;From observation, here are the above numbers that are &lt;em&gt;not&lt;/em&gt; concatenations:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;3: 11
9: 1001
21: 10101
33: 100001
45: 101101&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Note that each of them starts and ends with a &lt;code&gt;1&lt;/code&gt;. It has to end with a &lt;code&gt;1&lt;/code&gt; to be odd, and it has to start with a separate &lt;code&gt;1&lt;/code&gt; to be greater than &lt;code&gt;1&lt;/code&gt;. Our updated regex becomes:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;r = (0*1A10*)+&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpwwwalexbowe_ebuuh" height="203" src="http://posterous.com/getfile/files.posterous.com/alexbowe/IFAumbooeDCHnmraoskuHmbueiBGoyaJldHyByvrschoHmbFiByxtfmuCGdk/media_httpwwwalexbowe_ebuuH.png.scaled500.png" width="362" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;It also appears that we can optionally insert an even number of zeros between  the end &lt;code&gt;1&lt;/code&gt;s. A proof of this is attached at the end, if you're into that sort of thing.&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;r = (0*1(0A0)*10*)+&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpwwwalexbowe_ucfck" height="244" src="http://posterous.com/getfile/files.posterous.com/alexbowe/pdafayJaFzrgbehIlchmHgaqFcBDDoDCsdgqhxsgwugpkIvvECiiEjJuIvjH/media_httpwwwalexbowe_uCFCk.png.scaled500.png" width="362" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;So what about those &lt;code&gt;1&lt;/code&gt;s in the middle then? It appears that we're allowed to  have 0, 1 or 2... maybe more consecutive 1s? Maybe we can say our regex is &lt;code&gt;r = (0*1(01*0)*10*)+&lt;/code&gt;. I'll &lt;a href="https://gist.github.com/711644"&gt;test&lt;/a&gt; the regex for the first 5000 integers:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;False Negatives:
  0&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Oops! I forgot about &lt;code&gt;0&lt;/code&gt; being evenly divisible by &lt;code&gt;3&lt;/code&gt; exactly &lt;code&gt;0&lt;/code&gt; times. Our  regex should account for this:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;r = 0|(0*1(01*0)*10*)+&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpwwwalexbowe_fhfjr" height="319" src="http://posterous.com/getfile/files.posterous.com/alexbowe/uaBhfyuJuoDkovoBkwlDwxifikrfkAphrIvdkeeteqlfAyvgwCvCowIfAFng/media_httpwwwalexbowe_FhFjr.png.scaled500.png" width="362" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;Running the test again we get this output:&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;Pass =]&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;So there you have it, the regex works for the first 5000 integers. I'll leave a  proof of this for all multiples of 3 as an exercise to the reader ;)&lt;/p&gt;
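&lt;p&gt;For anyone who wants to repeat the check without the linked gist, here is a minimal Python sketch of the same idea. It is an assumption about how such a test could look, not the gist itself: &lt;code&gt;format(n, "b")&lt;/code&gt; stands in for &lt;code&gt;to_bin_s&lt;/code&gt;, and the name &lt;code&gt;mismatches&lt;/code&gt; is invented for this example.&lt;/p&gt;

```python
import re

# The final regex from above; fullmatch() requires the whole string to match.
DIVISIBLE_BY_3 = re.compile(r"0|(0*1(01*0)*10*)+")


def mismatches(limit):
    """Return every n below limit where the regex disagrees with n % 3 == 0."""
    bad = []
    for n in range(limit):
        matched = DIVISIBLE_BY_3.fullmatch(format(n, "b")) is not None
        if matched != (n % 3 == 0):
            bad.append(n)
    return bad


# An empty list means no false positives and no false negatives.
print(mismatches(5000))
```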
&lt;hr /&gt;
&lt;p&gt;[1] I'm not sure I want to work for someone else anymore - I'd rather chase my  own stupid dreams. Probably not the rockstar one though. I'll write about these later.&lt;/p&gt;
&lt;p&gt;[2] Other good sources of practice questions are &lt;a href="http://www.amazon.com/gp/product/0201657880/alexbowecom-20"&gt;Programming Pearls&lt;/a&gt;, &lt;a href="http://projecteuler.net/"&gt;Project Euler&lt;/a&gt;, or you could take a paper from &lt;a href="http://scholar.google.com.au/scholar?as_q=&amp;amp;num=10&amp;amp;btnG=Search+Scholar&amp;amp;as_epq=&amp;amp;as_oq=&amp;amp;as_eq=&amp;amp;as_occt=any&amp;amp;as_sauthors=&amp;amp;as_publication=acm+transactions+on+algorithms&amp;amp;as_ylo=&amp;amp;as_yhi=&amp;amp;as_sdt=1.&amp;amp;as_sdtp=on&amp;amp;as_sdts=5&amp;amp;hl=en"&gt;ACM Transactions on Algorithms&lt;/a&gt; (for example) and implement the algorithm/data structure. While you're at it, why not learn &lt;a href="http://www.erlang.org/"&gt;Erlang&lt;/a&gt; (or any other programming language guaranteed to make you more appealing to the opposite sex) and implement it in that?&lt;/p&gt;
&lt;p&gt;[3] Recently, a past student of mine told me that they have never found Computing Theory to be of any use. I said "What about regular expressions?" and they shook their head. This was a smart student too, but I find that regular expressions are so damn useful &lt;a href="http://m68k.net/2010/09/01/stop-searching-for-regexes.html"&gt;(perhaps a hammer I swing too often)&lt;/a&gt;. Ironically, Computing &lt;em&gt;Theory&lt;/em&gt; was the most practically useful subject in my (watered down) degree. In the near future I intend to climb onto my high horse and write a blog post about my ideal CS degree.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Appendix&lt;/h2&gt;
&lt;p&gt;Here is a proof that shows we can optionally insert an even number of zeros between two &lt;code&gt;1&lt;/code&gt;s to get an odd multiple of 3. This is equivalent to saying that our left-most &lt;code&gt;1&lt;/code&gt; has to sit at an odd bit index (counting from 0 at the rightmost bit).&lt;/p&gt;
&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;RTP: 2^(2i + 1) + 1 = 3m, for m odd integer and i positive integer or 0

Let i = 0, then
LHS = 2^1 + 1
    = 2 + 1
    = 3 * 1
    = 3m (1 is an odd integer)
    = RHS
    =&amp;gt; it holds for i = 0

Assume it holds for i = k, for some integer k &amp;gt;= 0, i.e.
2^(2k + 1) + 1 = 3m', for some odd integer m'

Let i = k + 1, then
LHS = 2^[2(k + 1) + 1] + 1
    = 2^(2k + 1 + 2) + 1
    = 4 * 2^(2k + 1) + 1
    = 3 * 2^(2k + 1) + [2^(2k + 1) + 1]
    = 3 * 2^(2k + 1) + 3m'
    = 3[ 2^(2k + 1) + m']
    = 3[ 2^(2k + 1) + 1 + (m' - 1)]
    = 3[3m' + (m' - 1)]
    = 3(4m' - 1)
    = 3m, with m = 4m' - 1 (an even number minus 1 will be odd)
    = RHS
QED&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Yes, 2^n + 1 always yields an odd multiple of 3, for n odd.  I should really set  up a math plugin...&lt;/p&gt;
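&lt;p&gt;As a numeric sanity check alongside the induction, a short Python sketch can confirm the claim for small exponents. The function name &lt;code&gt;is_odd_multiple_of_3&lt;/code&gt; and the range of exponents are arbitrary choices for this illustration.&lt;/p&gt;

```python
def is_odd_multiple_of_3(value):
    """True when value = 3m for some odd integer m."""
    quotient, remainder = divmod(value, 3)
    return remainder == 0 and quotient % 2 == 1


# Odd exponents 2i + 1: every 2^(2i + 1) + 1 should be an odd multiple of 3.
print(all(is_odd_multiple_of_3(2 ** (2 * i + 1) + 1) for i in range(100)))

# Even exponents never give a multiple of 3, which is why the zero count matters.
print(any((2 ** (2 * i) + 1) % 3 == 0 for i in range(100)))
```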
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/education/regularly-divisible"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/education/regularly-divisible#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/dugb4I3O5Wxc381laICx9_gbHKo/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/dugb4I3O5Wxc381laICx9_gbHKo/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/dugb4I3O5Wxc381laICx9_gbHKo/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/dugb4I3O5Wxc381laICx9_gbHKo/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/alexbowe/~4/7rWquEhzQi4" height="1" width="1"/&gt;</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/png" height="96" width="373" url="http://getfile5.posterous.com/getfile/files.posterous.com/alexbowe/oztfcdvjmbGdauqzqHCCFnIxEvAIECEEskhqaGtkdgmCvzpFFmhayAlrGqhG/media_httpwwwalexbowe_bbizv.png">
        <media:thumbnail height="96" width="373" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/oztfcdvjmbGdauqzqHCCFnIxEvAIECEEskhqaGtkdgmCvzpFFmhayAlrGqhG/media_httpwwwalexbowe_bbizv.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="203" width="362" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/IFAumbooeDCHnmraoskuHmbueiBGoyaJldHyByvrschoHmbFiByxtfmuCGdk/media_httpwwwalexbowe_ebuuH.png">
        <media:thumbnail height="203" width="362" url="http://getfile8.posterous.com/getfile/files.posterous.com/alexbowe/IFAumbooeDCHnmraoskuHmbueiBGoyaJldHyByvrschoHmbFiByxtfmuCGdk/media_httpwwwalexbowe_ebuuH.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="244" width="362" url="http://getfile1.posterous.com/getfile/files.posterous.com/alexbowe/pdafayJaFzrgbehIlchmHgaqFcBDDoDCsdgqhxsgwugpkIvvECiiEjJuIvjH/media_httpwwwalexbowe_uCFCk.png">
        <media:thumbnail height="244" width="362" url="http://getfile9.posterous.com/getfile/files.posterous.com/alexbowe/pdafayJaFzrgbehIlchmHgaqFcBDDoDCsdgqhxsgwugpkIvvECiiEjJuIvjH/media_httpwwwalexbowe_uCFCk.png.scaled500.png" />
      </media:content>
      <media:content type="image/png" height="319" width="362" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/uaBhfyuJuoDkovoBkwlDwxifikrfkAphrIvdkeeteqlfAyvgwCvCowIfAFng/media_httpwwwalexbowe_FhFjr.png">
        <media:thumbnail height="319" width="362" url="http://getfile0.posterous.com/getfile/files.posterous.com/alexbowe/uaBhfyuJuoDkovoBkwlDwxifikrfkAphrIvdkeeteqlfAyvgwCvCowIfAFng/media_httpwwwalexbowe_FhFjr.png.scaled500.png" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/education/regularly-divisible</feedburner:origLink></item>
    <item>
      <pubDate>Sun, 08 Aug 2010 15:19:00 -0700</pubDate>
      <title>Hero Typing</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/sku-aJ2BdzM/hero-typing</link>
      <guid isPermaLink="false">http://www.alexbowe.com/miscellaneous/hero-typing</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpwwwalexbowe_auvth" height="256" src="http://posterous.com/getfile/files.posterous.com/alexbowe/aeIlpzjnAiBEkzHbdeFnDtthcffgsvywnbpbihrbqusHlwqshrusistBiwiv/media_httpwwwalexbowe_Auvth.jpg.scaled500.jpg" width="401" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;"Who is your hero?" is a question I've been asked, but never had an answer for.  Why is this a question that people are compelled to ask? Are we expected to have  a hero, like a favourite colour or number?&lt;/p&gt;
&lt;p&gt;In his book &lt;a href="http://www.amazon.com/gp/product/1934356344/alexbowecom-20"&gt;The Passionate Programmer&lt;/a&gt;, &lt;a href="http://www.chadfowler.com"&gt;Chad Fowler&lt;/a&gt; quotes jazz musician Pat Metheny when he gives this advice to aspiring developers: "Be the worst guy [or girl] in every band you&amp;rsquo;re in" (an excerpt is available &lt;a href="http://media.pragprog.com/titles/cfcar2/worst.pdf"&gt;here&lt;/a&gt;). Chad argues that being in proximity to more talented programmers can make you "better via osmosis".&lt;/p&gt;
&lt;p&gt;As programmers, we have many opportunities to find a better band (e.g. open source) - however, I think that this advice can be reapplied to nearly any domain, and the benefit gained remotely. In general, if it is against your current nature to be as awesome as X, consider X your hero. This is why we should have heroes; not to be in awe of them, but to hypnotise ourselves to want to &lt;a href="http://en.wikipedia.org/wiki/Duck_typing"&gt;quack like a duck&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I'm writing this at the risk of quacking like a motivational speaker. I'm not a fan of people who try to motivate you by saying obvious, non-actionable things like "believe in yourself". How the fuck am I meant to do that? No, I'm talking about heroes who can actually stir some change of behaviour in you, for the better. Teachers, if you will.&lt;/p&gt;
&lt;p&gt;One of the reasons I found it hard to answer "Who is your hero?" is because I didn't know what a hero was. Take Superman for example: Superman is not a hero in my sense of the word, because he only saved us from the baddies. He didn't improve the human race, he didn't encourage our evolution. He didn't teach mankind to fish, so we only ate for a day.&lt;/p&gt;
&lt;p&gt;In case you want to know who my heroes are, I like the way &lt;a href="http://www.paulgraham.com"&gt;Paul Graham&lt;/a&gt; quacks. If you know me then I have probably mentioned him to you before. The guy has &lt;a href="http://www.paulgraham.com/say.html"&gt;ideas&lt;/a&gt; and he knows how to write.&lt;/p&gt;
&lt;p&gt;Then there's &lt;a href="http://en.wikipedia.org/wiki/Evariste_Galois"&gt;Galois&lt;/a&gt;. His legend says that he stayed up all night writing everything he knew about &lt;a href="http://en.wikipedia.org/wiki/Group_theory"&gt;group theory&lt;/a&gt; before dying in a gun duel the next day. He was 20 and the duel was over a girl. I like that someone who contributed so much to mathematics was also crazy enough to die for a girl. It's chivalry, it's destructive and it's rock and roll.&lt;/p&gt;
&lt;p&gt;I didn't have to choose these heroes, they are just people who I admire. I think that choosing to admire someone is a strong commitment, one that forces you to consider what you care about (&lt;a href="http://pragprog.com/the-pragmatic-programmer/extracts/tips"&gt;tip #1&lt;/a&gt; in &lt;a href="http://www.amazon.com/gp/product/020161622X/alexbowecom-20"&gt;The Pragmatic Programmer&lt;/a&gt;). Think about who your hero is; if it doesn't make you more awesome, then at least you'll have an answer when people ask you "Who is your hero?"&lt;/p&gt;
&lt;p&gt;So, who is your hero?&lt;/p&gt;
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/miscellaneous/hero-typing"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/miscellaneous/hero-typing#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/TDOyl2nBN438ZsBdWbxhBXiJ-YQ/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/TDOyl2nBN438ZsBdWbxhBXiJ-YQ/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/TDOyl2nBN438ZsBdWbxhBXiJ-YQ/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/TDOyl2nBN438ZsBdWbxhBXiJ-YQ/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/alexbowe/~4/sku-aJ2BdzM" height="1" width="1"/&gt;</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/jpeg" height="256" width="401" url="http://getfile2.posterous.com/getfile/files.posterous.com/alexbowe/aeIlpzjnAiBEkzHbdeFnDtthcffgsvywnbpbihrbqusHlwqshrusistBiwiv/media_httpwwwalexbowe_Auvth.jpg">
        <media:thumbnail height="256" width="401" url="http://getfile6.posterous.com/getfile/files.posterous.com/alexbowe/aeIlpzjnAiBEkzHbdeFnDtthcffgsvywnbpbihrbqusHlwqshrusistBiwiv/media_httpwwwalexbowe_Auvth.jpg.scaled500.jpg" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/miscellaneous/hero-typing</feedburner:origLink></item>
    <item>
      <pubDate>Sun, 27 Jun 2010 13:40:00 -0700</pubDate>
      <title>Things Smarter People Said</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/JCTJ83gJ-fo/things-smarter-people-said</link>
      <guid isPermaLink="false">http://www.alexbowe.com/programming/things-smarter-people-said</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpwwwalexbowe_gmgnh" height="156" src="http://posterous.com/getfile/files.posterous.com/import-jipc-nczi/omCFucwcvyloqgIeEmfBufuqEGHHzmzavcsmojritdEnngBwpGDpoJuIiCsr/media_httpwwwalexbowe_GmGnH.png.scaled500.png" width="300" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;I am currently reading &lt;a href="http://www.amazon.com/gp/product/0735619670/alexbowecom-20"&gt;Code Complete 2&lt;/a&gt; as per Jeff Atwood's &lt;a href="http://www.codinghorror.com/blog/2004/02/recommended-reading-for-developers.html"&gt;Recommended Reading for Developers&lt;/a&gt; list, where I came across this interesting quote by Glenford Myers:&lt;/p&gt;
&lt;blockquote class="posterous_medium_quote"&gt;
&lt;p&gt;We try to solve the problem by rushing through the design process so that enough time is left at the end of the project to uncover the errors that were made because we rushed through the design process.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I think it succinctly illustrates the need for balance in software design; it is  just vague enough to communicate exactly what you need to hear.&lt;/p&gt;
&lt;p&gt;I'm interested in hearing any other cool quotes, or book recommendations if  you've got some =]&lt;/p&gt;
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/programming/things-smarter-people-said"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/programming/things-smarter-people-said#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/ZNSjGYplnPeFW6qu9IYQ8G9SuDE/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/ZNSjGYplnPeFW6qu9IYQ8G9SuDE/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/ZNSjGYplnPeFW6qu9IYQ8G9SuDE/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/ZNSjGYplnPeFW6qu9IYQ8G9SuDE/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/alexbowe/~4/JCTJ83gJ-fo" height="1" width="1"/&gt;</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/png" height="156" width="300" url="http://getfile3.posterous.com/getfile/files.posterous.com/import-jipc-nczi/omCFucwcvyloqgIeEmfBufuqEGHHzmzavcsmojritdEnngBwpGDpoJuIiCsr/media_httpwwwalexbowe_GmGnH.png">
        <media:thumbnail height="156" width="300" url="http://getfile1.posterous.com/getfile/files.posterous.com/import-jipc-nczi/omCFucwcvyloqgIeEmfBufuqEGHHzmzavcsmojritdEnngBwpGDpoJuIiCsr/media_httpwwwalexbowe_GmGnH.png.scaled500.png" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/programming/things-smarter-people-said</feedburner:origLink></item>
    <item>
      <pubDate>Tue, 25 May 2010 14:08:00 -0700</pubDate>
      <title>I don't know what the F*** I'm doing</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/rIobedwb7lY/no-one-knows-what-the-f-theyre-doing-or-the-3-types-of-knowledge</link>
      <guid isPermaLink="false">http://www.alexbowe.com/education/no-one-knows-what-the-f-theyre-doing-or-the-3-types-of-knowledge</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;img alt="Media_httpmediatumblr_eqzrv" height="370" src="http://posterous.com/getfile/files.posterous.com/import-jipc-nczi/DHmsgviBfcxcmzDDyztEzxmsycqfCyClrwvnqDrGyerkaIspkhjgrFvIFugp/media_httpmediatumblr_eqzrv.png.scaled500.png" width="500" /&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://jangosteve.com/post/380926251/no-one-knows-what-theyre-doing"&gt;This&lt;/a&gt; is a nice article I read back in February which discusses why it's a good thing when you realize how very little you know. I feel just like that right now, and I'm enjoying it because there is so  much territory left to explore...&lt;/p&gt;
&lt;p&gt;&lt;a href="http://jangosteve.com/post/380926251/no-one-knows-what-theyre-doing"&gt;No One Knows What the F*** They're Doing (or "The 3 Types of Knowledge")&lt;/a&gt;&lt;/p&gt;
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/education/no-one-knows-what-the-f-theyre-doing-or-the-3-types-of-knowledge"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/education/no-one-knows-what-the-f-theyre-doing-or-the-3-types-of-knowledge#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/aWsoGVlprQHK0n629kDHtaGAA7s/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/aWsoGVlprQHK0n629kDHtaGAA7s/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/aWsoGVlprQHK0n629kDHtaGAA7s/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/aWsoGVlprQHK0n629kDHtaGAA7s/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/alexbowe/~4/rIobedwb7lY" height="1" width="1"/&gt;</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/png" height="370" width="500" url="http://getfile6.posterous.com/getfile/files.posterous.com/import-jipc-nczi/DHmsgviBfcxcmzDDyztEzxmsycqfCyClrwvnqDrGyerkaIspkhjgrFvIFugp/media_httpmediatumblr_eqzrv.png">
        <media:thumbnail height="370" width="500" url="http://getfile4.posterous.com/getfile/files.posterous.com/import-jipc-nczi/DHmsgviBfcxcmzDDyztEzxmsycqfCyClrwvnqDrGyerkaIspkhjgrFvIFugp/media_httpmediatumblr_eqzrv.png.scaled500.png" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/education/no-one-knows-what-the-f-theyre-doing-or-the-3-types-of-knowledge</feedburner:origLink></item>
    <item>
      <pubDate>Sun, 28 Mar 2010 11:47:00 -0700</pubDate>
      <title>How to Win Friends and Generate People</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/24R7nH5m8XU/generators</link>
      <guid isPermaLink="false">http://www.alexbowe.com/education/generators</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/alexbowe/yxvbfmdsgFCEfnEAlygrqsComlsuxfjhqspFAtHjyIrrAipFeucCFrrHafoD/media_httpimages20x20_eeusq.jpg.scaled1000.jpg"&gt;&lt;img alt="Media_httpimages20x20_eeusq" height="369" src="http://posterous.com/getfile/files.posterous.com/alexbowe/yxvbfmdsgFCEfnEAlygrqsComlsuxfjhqspFAtHjyIrrAipFeucCFrrHafoD/media_httpimages20x20_eeusq.jpg.scaled500.jpg" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;I'm doing a project for a subject at RMIT which needs to manage thousands of patient records for a hospital. We haven't been given any sample data though, so I wanted to write a generator (so we can test it with small or large data sets whenever needed).&lt;/p&gt;
&lt;p&gt;I started with the name generator (in Python), which selected a random male/female/last name from a file [1]. I then realised an address generator would behave similarly (street, city, country lists), so I decided to make a Generator base class. You can get the source code for these &lt;a href="http://github.com/alexbowe/generators"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I wanted the Generator base class to create methods dynamically: it would be instantiated with a bunch of methodname-filename pairs, and a method for randomly selecting an entry from each file would be created on the fly. Exactly how to do this wasn't easy to find, as it wasn't well documented... the solution ended up being:&lt;/p&gt;
&lt;p&gt;&lt;div class="data type-python"&gt;
    
      &lt;table class="lines" cellspacing="0" cellpadding="0"&gt;
        &lt;tr&gt;
          &lt;td width="100%"&gt;
            
              
                &lt;div class="highlight"&gt;&lt;pre /&gt;&lt;div class="line" id="LC1"&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add_method&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC2"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;Adds the instance method from function f to the object obj, callable by fname (i.e. obj.fname())&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC3"&gt;&lt;span class="sd"&gt;    example:&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC4"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="line" id="LC5"&gt;&lt;span class="sd"&gt;    def func(self):&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC6"&gt;&lt;span class="sd"&gt;        print &amp;#39;test&amp;#39;&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC7"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="line" id="LC8"&gt;&lt;span class="sd"&gt;    add_method(myObject, func, &amp;#39;newmethodname&amp;#39;)&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC9"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="line" id="LC10"&gt;&lt;span class="sd"&gt;    myObject.newmethodname()&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC11"&gt;&lt;span class="sd"&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC12"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;new&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;instancemethod&lt;/span&gt;&lt;/div&gt;&lt;div class="line" id="LC13"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span 
class="n"&gt;__dict__&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;fname&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;instancemethod&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__class__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/pre&gt;&lt;/div&gt;
              
            
          &lt;/td&gt;
        &lt;/tr&gt;
      &lt;/table&gt;
    
  &lt;/div&gt;&lt;/p&gt;
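&lt;p&gt;The snippet above relies on the &lt;code&gt;new&lt;/code&gt; module, which only exists in Python 2. A rough Python 3 equivalent, assuming &lt;code&gt;types.MethodType&lt;/code&gt; as the replacement for &lt;code&gt;instancemethod&lt;/code&gt;, might look like the sketch below. The &lt;code&gt;Patient&lt;/code&gt; class and &lt;code&gt;greet&lt;/code&gt; function are hypothetical and exist only for the demonstration.&lt;/p&gt;

```python
import types


def add_method(obj, f, fname):
    """Bind function f to obj so it can be called as obj.fname()."""
    # Assumption: Python 3, where types.MethodType takes (function, instance).
    setattr(obj, fname, types.MethodType(f, obj))


class Patient:
    """Hypothetical class used only for this illustration."""


def greet(self):
    return "hello from " + type(self).__name__


patient = Patient()
add_method(patient, greet, "greet")
print(patient.greet())
```

&lt;p&gt;The method is attached to that one instance only, so other &lt;code&gt;Patient&lt;/code&gt; objects are unaffected.&lt;/p&gt;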
&lt;p&gt;&lt;a href="http://niki.code-karma.com/"&gt;Niki&lt;/a&gt; (my brother) helped me get 70% of the way there... (after lots of scope and decorator issues) thanks man :)&lt;/p&gt;
&lt;p&gt;[1] The name lists were taken from &lt;a href="http://www.census.gov/genealogy/www/data/1990surnames/names_files.html"&gt;this census data&lt;/a&gt;, which actually provides percentages for each name too, if you wanted to make the name distribution more realistic...&lt;/p&gt;
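&lt;p&gt;If you did want to use those percentages, one possible approach in Python 3 is a weighted pick with &lt;code&gt;random.choices&lt;/code&gt;. This is only a sketch: the names and weights below are placeholders rather than real census figures, and &lt;code&gt;random_name&lt;/code&gt; is a made-up helper, not part of the generators repository.&lt;/p&gt;

```python
import random

# Hypothetical (name, percentage) pairs; placeholders, not real census figures.
WEIGHTED_NAMES = [("Alice", 50.0), ("Bob", 30.0), ("Carol", 20.0)]


def random_name(weighted_names, rng=random):
    """Pick one name with probability proportional to its percentage."""
    names = [name for name, _ in weighted_names]
    weights = [weight for _, weight in weighted_names]
    return rng.choices(names, weights=weights, k=1)[0]


print(random_name(WEIGHTED_NAMES))
```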
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/education/generators"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/education/generators#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/21LE10d0jg_cd8XDCRhJL-7ME7Q/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/21LE10d0jg_cd8XDCRhJL-7ME7Q/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/21LE10d0jg_cd8XDCRhJL-7ME7Q/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/21LE10d0jg_cd8XDCRhJL-7ME7Q/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/alexbowe/~4/24R7nH5m8XU" height="1" width="1"/&gt;</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/jpeg" height="590" width="800" url="http://getfile0.posterous.com/getfile/files.posterous.com/alexbowe/yxvbfmdsgFCEfnEAlygrqsComlsuxfjhqspFAtHjyIrrAipFeucCFrrHafoD/media_httpimages20x20_eeusq.jpg">
        <media:thumbnail height="369" width="500" url="http://getfile3.posterous.com/getfile/files.posterous.com/alexbowe/yxvbfmdsgFCEfnEAlygrqsComlsuxfjhqspFAtHjyIrrAipFeucCFrrHafoD/media_httpimages20x20_eeusq.jpg.scaled500.jpg" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/education/generators</feedburner:origLink></item>
    <item>
      <pubDate>Fri, 12 Mar 2010 10:51:09 -0800</pubDate>
      <title>Amorphism</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/jVHjO3fH7pk/amorphism</link>
      <guid isPermaLink="false">http://www.alexbowe.com/photography/amorphism</guid>
      <description>&lt;p&gt;
	&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/import-jipc-nczi/bhAtohGABCBwbcnhFrwtHctchzdArEvDdCGviEGkBanDCpvnjdfBDpblvCyu/media_httpkitsunenoir_ibIaj.jpg.scaled1000.jpg"&gt;&lt;img alt="Media_httpkitsunenoir_ibiaj" height="667" src="http://posterous.com/getfile/files.posterous.com/import-jipc-nczi/bhAtohGABCBwbcnhFrwtHctchzdArEvDdCGviEGkBanDCpvnjdfBDpblvCyu/media_httpkitsunenoir_ibIaj.jpg.scaled500.jpg" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
This guy, &lt;strong&gt;&lt;a href="http://www.behance.net/indiffident"&gt;Alberto Seveso&lt;/a&gt;&lt;/strong&gt;, made this by dropping varnish into a fishbowl. I like how such a simple idea can yield such complex results :)
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/photography/amorphism"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/photography/amorphism#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/AAWP_dxN0atlKEb0AYXdLuox_Iw/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/AAWP_dxN0atlKEb0AYXdLuox_Iw/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/AAWP_dxN0atlKEb0AYXdLuox_Iw/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/AAWP_dxN0atlKEb0AYXdLuox_Iw/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/alexbowe/~4/jVHjO3fH7pk" height="1" width="1"/&gt;</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/jpeg" height="768" width="576" url="http://getfile0.posterous.com/getfile/files.posterous.com/import-jipc-nczi/bhAtohGABCBwbcnhFrwtHctchzdArEvDdCGviEGkBanDCpvnjdfBDpblvCyu/media_httpkitsunenoir_ibIaj.jpg">
        <media:thumbnail height="667" width="500" url="http://getfile8.posterous.com/getfile/files.posterous.com/import-jipc-nczi/bhAtohGABCBwbcnhFrwtHctchzdArEvDdCGviEGkBanDCpvnjdfBDpblvCyu/media_httpkitsunenoir_ibIaj.jpg.scaled500.jpg" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/photography/amorphism</feedburner:origLink></item>
    <item>
      <pubDate>Tue, 23 Feb 2010 01:37:00 -0800</pubDate>
      <title>Nanosecond Timing</title>
      <link>http://feedproxy.google.com/~r/alexbowe/~3/JoCu3bZKcyU/nanosecond-timing</link>
      <guid isPermaLink="false">http://www.alexbowe.com/education/nanosecond-timing</guid>
      <description>&lt;p&gt;
	&lt;p&gt;&lt;div class='p_embed p_image_embed'&gt;
&lt;a href="http://posterous.com/getfile/files.posterous.com/alexbowe/cJujeiypgemjoGobaEhghxuHzdwbHfnGkCbieFxGwkekenslHHsnyaClJwqs/media_httpprotomagcom_qDfqt.jpg.scaled1000.jpg"&gt;&lt;img alt="Media_httpprotomagcom_qdfqt" height="290" src="http://posterous.com/getfile/files.posterous.com/alexbowe/cJujeiypgemjoGobaEhghxuHzdwbHfnGkCbieFxGwkekenslHHsnyaClJwqs/media_httpprotomagcom_qDfqt.jpg.scaled500.jpg" width="500" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;/p&gt;
&lt;p&gt;My uni (RMIT) runs a mixture of Solaris, Mac OS X, Linux, and Windows computer labs, but our programs are nearly always tested on Solaris. Sometimes we are required to report nanosecond timings in our experiments using the Solaris high-resolution timer function &lt;code&gt;gethrtime()&lt;/code&gt;, which is not part of POSIX and is not available on most of the other systems.&lt;/p&gt;
&lt;p&gt;Depending on which lab I work in, or if I&amp;rsquo;m working from home, I might need to comment out the timing calls (or wrap them in compile guards) just to get my code to compile. This makes the code less readable, and makes its behavior on Solaris (particularly its output) less obvious while I&amp;rsquo;m testing on a foreign system.&lt;/p&gt;
&lt;p&gt;Although the granularity, accuracy and possible side effects (such as function overhead, or sensitivity to changes in the system clock) may differ between the timing functions available on each system, I find it helps to at least get a ball-park figure wherever I am working. &lt;em&gt;Experiments should still be run on Solaris&lt;/em&gt;. If you want to use or modify my wrapper around these nanosecond timing calls, get it on &lt;a href="http://github.com/alexbowe/nanotime_wrapper"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Not thoroughly tested on all target operating systems or compilers. It is only intended as a convenience.&lt;/p&gt;
	
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.alexbowe.com/education/nanosecond-timing"&gt;Permalink&lt;/a&gt; 

	| &lt;a href="http://www.alexbowe.com/education/nanosecond-timing#comment"&gt;Leave a comment&amp;nbsp;&amp;nbsp;&amp;raquo;&lt;/a&gt;

&lt;/p&gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/1uRoUhHQxza13HviEntlFOvQ4Vc/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/1uRoUhHQxza13HviEntlFOvQ4Vc/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/1uRoUhHQxza13HviEntlFOvQ4Vc/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/1uRoUhHQxza13HviEntlFOvQ4Vc/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/alexbowe/~4/JoCu3bZKcyU" height="1" width="1"/&gt;</description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/1084320/Me.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/Ztktix09Ofv</posterous:profileUrl>
        <posterous:firstName>Alex</posterous:firstName>
        <posterous:lastName>Bowe</posterous:lastName>
        <posterous:nickName>Alex</posterous:nickName>
        <posterous:displayName>Alex Bowe</posterous:displayName>
      </posterous:author>
      <media:content type="image/jpeg" height="360" width="620" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/cJujeiypgemjoGobaEhghxuHzdwbHfnGkCbieFxGwkekenslHHsnyaClJwqs/media_httpprotomagcom_qDfqt.jpg">
        <media:thumbnail height="290" width="500" url="http://getfile4.posterous.com/getfile/files.posterous.com/alexbowe/cJujeiypgemjoGobaEhghxuHzdwbHfnGkCbieFxGwkekenslHHsnyaClJwqs/media_httpprotomagcom_qDfqt.jpg.scaled500.jpg" />
      </media:content>
    <feedburner:origLink>http://www.alexbowe.com/education/nanosecond-timing</feedburner:origLink></item>
  </channel>
</rss>

