<h1>Honest to a Segfault syndication</h1><p>http://blog.cdleary.com, 2013-07-27T20:00:00-07:00</p><a name="what-couldn-t-you-ship"></a><h2>What couldn't you ship?</h2><p>2013-07-27T20:00:00-07:00, Chris Leary, https://twitter.com/cdleary</p><p>Great excerpt from Jason Hong's article in this month's <em>Communications of the
ACM</em>:</p><blockquote><p>The most impressive story I have ever heard about owning your research is
from Ron Azuma's retrospective "So Long, and Thanks for the Ph.D." Azuma
tells the story of how one graduate student needed a piece of equipment for
his research, but the shipment was delayed due to a strike. The graduate
student flew out to where the hardware was, rented a truck, and drove it
back, just to get his work done.</p></blockquote><p>Stories like that pluck at my heart strings. Best part of <a href="http://5by5.tv/b2w/1">Back to Work,
Episode 1</a> was this, when around 19 minutes in Merlin Mann said:</p><blockquote><p>I was drinking, which I don't usually do, but I was with a guy who likes to
drink, who is a friend of mine, and actually happens to be a client. And, we
were talking about what we're both really interested in and fascinated by,
which is <strong>culture</strong>. What is it that makes some environments such a petri dish
for great stuff, and what is it about that makes people wanna run away
from the petri dish stealing office supplies and peeing in someone's desk?
<strong>What is it, what makes that difference, and can you change it?</strong></p><p>In time, I found myself moving more towards this position — as we
had more drinks — that it kind of doesn't really matter what people do,
given that ultimately you're the one who's gotta be the animus. You're the
one who's actually going to have to go ship, right?</p><p>And, my sense was — great guy — he kept moving further toward, "Yeah,
but...". "This person does this", and "that person does that", and "I need
this to do that". And I found myself saying, "Well, okay, but <em>what</em>?" <strong>What
are you gonna do as a result of that?</strong> Do you just give up? Do you spend all
of your time trying to fix these things that these other people are doing
wrong?</p><p>And, to get to the nut of the nut; apparently — I'm told by the security
guards who removed me from the room — that it ended with me basically yelling
over and over, "What couldn't you ship?!" "What <em>couldn't</em> you ship?!" "What
couldn't <em>you</em> ship?!"</p><p>... If we really, really are honest with ourselves, there's really not that
much stuff we can't ship because of other people...</p><p>... When are you ever gonna get enough change in other people to satisfy
you? When are you ever gonna get enough of exactly how you need it to be to
make <em>one</em> thing?</p><p>Well, you know, that is always gonna be there. You're
always gonna find some reason to not run today. You're always gonna find
some reason to eat crap from a machine today. You're always gonna find a
reason for everything.</p><p>To quote that wonderful Renoir film, <em>Rules of the
Game</em>, something along the lines of, "The trouble in life is that every man
has his reasons." Everybody's got their reasons. And the thing that
separates the people who make cool stuff from the people who don't make
cool stuff is not whether they live in San Francisco. And it's not whether
they have a cool system. It's whether they made it. That's it, end of
story. Did you make it or didn't you make it?</p></blockquote><p>The way I see it, you should never stop asking yourself:</p><ul><li><p>What's really going to be different about tomorrow that you couldn't go make
happen today? Why isn't past inaction indicative of what's going to happen
today, or tomorrow?</p></li><li><p>What reason do you have to believe that appropriate steps to deliver on your
vision are in flight, and what would it take for you to go drive them harder?</p></li><li><p>What losses might you have to cut in order to get <em>some thing</em> done,
rather than a theoretically more perfect <em>no thing</em>? For some outcomes, it
really does take a village. I wouldn't expect anybody to single-handedly ship
the Great Pyramid.</p></li></ul><p>Of course, sunk costs are a powerful siren, so you have to be very careful to
evaluate whether compromises still allow you to hit the marks you care about as
<em>true goals</em>. But, at the end of the day, all those trade-offs roll up into one
subtly simple question:</p><p><em>What couldn't you ship?</em></p><a name="big-design-vs-simple-solutions"></a><h2>Big design vs simple solutions</h2><p>2013-01-20T16:00:00-08:00, Chris Leary, https://twitter.com/cdleary</p><p>The distinction between essential complexity and accidental complexity is a
useful one — it allows you to identify the parts of your design where you're
stumbling over <em>yourself</em> instead of working against something truly reflected
in the <em>problem domain</em>.</p><p>The simplest-solution-that-could-possibly-work (SSTCPW) concept is inherently
appealing in that, by design, you're trying to minimize these pieces that you
may come to stumble over. Typically, when you take this approach, you
acknowledge that an unanticipated change in requirements will entail major
rework, and accept that fact in light of the perceived benefits.</p><p>Benefits cited typically include:</p><ul><li><p>Less design to validate.</p></li><li><p>Less implementation to perform.</p></li><li><p>Less surface area to debug.</p></li><li><p>Increased confidence the resulting product executes properly (though perhaps
modestly in scope).</p></li></ul><p>As a more quantifiable example: if an SSTCPW contains comparatively fewer code
paths than an alternative solution, you can see how some of the above merits
could fall out of it.</p><p>This also demonstrates some of the appeal of fail-fast and crash-only
approaches to software implementation, in that cutting out unanticipated
program inputs and states, via an acceptance of "failure" as a concept, tends
to home in on SSTCPW.</p><a name="contrast"></a><h3>Contrast</h3><p>In my head, this approach is contrasted most starkly against an approach called
big-design-up-front (BDUF). The essence of BDUF is that, in the design process,
one attempts to consider the whole set of <em>possible</em> requirements (typically
both currently-known and projected) and build into the initial design and
implementation the flexibility and structure to accommodate large swaths of
them in the future, if not in the current version.</p><p>In essence, this approach acknowledges that the target is likely moving, tries
to anticipate the target's movement, and takes steps to remain one step ahead
of the game by building in flexibility, genericity, and a more 1:1-looking
mapping between the problem domain and the code constructs.</p><p>Benefits cited usually relate to ongoing maintenance in some sense and
typically include:</p><ul><li><p>Reuse via genericity.</p></li><li><p>Flexibility for feature addition.</p></li><li><p>A more robust model of the problem domain imbued in the program.</p></li></ul><a name="head-to-head"></a><h3>Head to head</h3><p>In a lot of software engineering doctrine that I've read, been taught, and
toyed with throughout the years, the prevalence of unknown and ever-changing
business requirements for application software has lent a lot of credence to
BDUF, especially in that space.</p><p>There have also been enabling trends for this mentality; for example, the
introduction of indirection through abstractions has monumentally less cost on
today's JVM than on the Java interpreter of yore. In that same sense, C++ has
attempted to satisfy an interesting niche in the middle ground with its design
concept of "zero cost abstractions", which intend to be known-reducible to more
easily understood and more predictable underlying code forms at compile time.
On the hardware side, the steady provisioning of single-thread performance and
memory capacity throughout the years has also played an enabling role.</p><p>By contrast, the system-software implementation doctrine and conventional
wisdom skew heavily towards SSTCPW, in that any "additional" design reflected
in the implementation tends to come under higher levels of duress from a
{performance, code-size, debuggability, correctness} perspective. Ideas like
"depending on concretions" — which I specifically use because it's denounced
by the D in SOLID — are wholly accepted in SSTCPW given that it (a) makes the
resulting artifact simpler to understand in some sense (b) without sacrificing
the ability to meet necessary requirements.</p><p>So what's the underlying trick in acting on an SSTCPW philosophy? You have to do
enough design work (and detailed engineering legwork) to distinguish between
what is <em>necessary</em> and what is <em>wanted</em>, and have some good-taste arbitration
process to distinguish between the two when there's disagreement about the
classification. As part of that process, you have to make the most difficult
decisions: what you definitely <em>will not</em> do and what the design <em>will not</em>
accommodate without major rework.</p><a name="quick-tips-for-getting-into-systems-programming"></a><h2>Quick tips for getting into systems programming</h2><p>2012-12-18T18:30:00-08:00, Chris Leary, https://twitter.com/cdleary</p><a name="in-reply"></a><h3>In reply</h3><p><a href="https://twitter.com/ndrwdn">Andrew (@ndrwdn)</a> asked a great followup question to the last entry on
<a href="http://blog.cdleary.com/2012/12/systems-programming-at-my-alma-mater/">systems programming at my alma mater</a>:</p><a href="https://twitter.com/ndrwdn/status/279949951096721409"><img src="http://static.cdleary.com/images/andrew-tweet.png" alt="@cdleary Just read your blog post. Are there any resources you would recommend for a Java guy interested in doing systems programming?" /></a><p>What follows are a few quick-and-general pointers on "I want to start doing
lower level stuff, but need a motivating direction for a starter project."
They're somewhat un-tested because I haven't mentored any apps-to-systems
transitions, but, as somebody who plays on both sides of that fence, I think
they all sound pretty fun.</p><p>A word of warning: systems programming may feel crude at first compared
to the managed languages and application-level design you're used to. However,
<em>even among experts</em>, the prevalence of footguns motivates simple designs and APIs, which can be a beautiful thing.
<strong>As a heuristic, when starting out, just code it the simple, ungeneralized way.</strong>
If you're doing something interesting, hard problems are likely to present
themselves anyhow!</p><a name="microcontrollers-rock"></a><h4>Microcontrollers rock</h4><p>Check out sites like <a href="http://hackaday.com/">hackaday.com</a> to see the incredible feats that
people accomplish through microcontrollers and hobby time.
When starting out, it's great to get the tactile feedback of lighting up a
bright blue LED or successfully sending that first UDP packet to your desktop
at four in the morning.</p><p>Microcontroller-based development is also nice because you can build up your understanding of C code, if you're feeling rusty, from basic usage — say, keeping everything you need to store as a global variable or array — to fancier techniques as you improve and gain experience with what works well.</p><p>Although I haven't played with them specifically, I understand that <a href="http://www.amazon.com/gp/product/B0051QHPJM/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B0051QHPJM&linkCode=as2&tag=honetoasegf-20">Arduino
boards are all the rage</a> these days — there are great tutorials and support
communities out on the web that love to help newbies get started with
microcontrollers. <a href="http://www.avrfreaks.net/">AVR freaks</a> was around even when I was programming on my
STK500. I would recommend reading some forums to figure out which board looks right for you and your intended projects.</p><p>At school, people really took to <a href="http://people.ece.cornell.edu/land/courses/ece4760/">Bruce Land's microcontroller class</a>,
because you can't help but feel the <a href="http://www.whatgamesare.com/fiero.html">fiero</a> as you work towards more and more
ambitious project goals.
Since that class is still being taught, look to
the exercises and projects (link above) as good examples of what's possible with bright
students and four credits worth of time. <tt><a href="#quick-tips-for-getting-into-systems-programming-0" name="quick-tips-for-getting-into-systems-programming-0-ref">[*]</a></tt></p><a name="start-fixing-bugs-on-low-level-open-source-projects"></a><h4>Start fixing bugs on low-level open source projects</h4><p>Many open source projects love to see willing new contributors. <em>Especially</em> check out projects a) that are known for having good/friendly mentoring and
b) that you think are cool (which will help you stay motivated).</p><p>I know <a href="https://blog.mozilla.org/mrbkap/2011/03/23/how-i-got-started-at-mozilla/">one amazing person I worked with at Mozilla</a> got into the project by
taking his time to figure out how to properly patch some open bugs.
If you take that route, either compare your patch to what the project
member has already posted, or request that somebody give you feedback on your
patch.
This is another good way to pick up mentor-like connections.</p><a name="check-out-open-courseware-for-conceptual-background"></a><h4>Check out open courseware for conceptual background</h4><p>I personally love the rapid evolution of open courseware we're seeing. If you're feeling confident, pick a random low-level thing you've heard-of-but-never-quite-understood, type it into a search engine, and do a deep dive on a lecture or series. If you want a more structured approach, a simple search for <a href="http://www.lmgtfy.com/?q=systems+programming+open+courseware">systems programming open courseware</a> has quite educational looking results.</p><a name="general-specifics-oses-and-reversing"></a><h3>General specifics: OSes and reversing</h3><a href="https://twitter.com/ndrwdn/status/280093709213777921"><img src="http://static.cdleary.com/images/andrew-tweet2.png" alt="@cdleary Some general but also OS implementation and perhaps malware analysis/RE." /></a><a name="oses"></a><h4>OSes</h4><p>If you're really into OSes, I think you should just dive in and try writing a little kernel on top of your hardware of choice in qemu (a hardware emulator). Quick searches turn up some <a href="https://singpolyma.net/category/singpolyma-kernel/">seemingly excellent tutorials on writing simple OS kernels on qemu</a>, and writing simple OSes for microcontrollers is often a student project topic in courses like the one I mention above. <tt><a href="#quick-tips-for-getting-into-systems-programming-1" name="quick-tips-for-getting-into-systems-programming-1-ref">[†]</a></tt></p><p>With some confidence, patience, maybe a programming guide, and recall of some low-level background from school, I think this should be doable. Some research will be required on effective methods of debugging, though — that's always the trick with bare metal coding.</p><p>Or, for something less audacious sounding: build your own Linux kernel with some modifications to figure out what's going on.
There are plenty of guides on how to do this for your Linux distribution of choice, and you can learn a great deal just by fiddling around with code paths and using <tt class="literal">printk</tt>.
Try doing something on the system (in userspace) that's simple to isolate in the kernel source using <tt class="literal">grep</tt> — like <tt class="literal">mmap</tt>ping <tt class="literal">/dev/mem</tt> or accessing an entry in <tt class="literal">/proc</tt> — to figure out how it works, and leave no stone unturned.</p><p>I recommend taking copious notes, because I find that's the best way to trace out any complex system. Taking notes makes it easy to refer back to previous realizations and backtrack at will.</p><p>Read everything that interests you on <a href="http://kernelnewbies.org/">Linux Kernel Newbies</a>, and subscribe to kernel changelog summaries. Attempt to understand things that interest you in the source tree's <a href="https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=tree;f=Documentation;h=9db920be164d978581854f22374e53782aa2168b;hb=HEAD">/Documentation</a>. Write a really simple Linux Kernel Module. Then, refer to <a href="https://lwn.net/Kernel/LDD3/">freely available texts</a> for help in making it do progressively more interesting things. Another favorite read of mine was <a href="http://www.amazon.com/gp/product/0596005652/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=0596005652&linkCode=as2&tag=honetoasegf-20">Understanding the Linux Kernel</a>, if you have a hobby budget or a local library that carries it.</p><a name="reversing"></a><h4>Reversing</h4><p>This I know less about — pretty much everybody I know that has done significant reversing is an <a href="http://www.hex-rays.com/products/ida/index.shtml">IDA</a> wizard, and I, at this point, am not. They are also typically Win32 experts, which I am not. Understanding obfuscated assembly is probably a lot easier with powerful and scriptable tools of that sort, which ideally also have a good understanding of the OS. 
<tt><a href="#quick-tips-for-getting-into-systems-programming-2" name="quick-tips-for-getting-into-systems-programming-2-ref">[‡]</a></tt></p><p>However, one of the things that struck me when I was doing background research for attack mitigation patches was <strong>how great the security community was at sharing information through papers, blog entries, and proof of concept code</strong>. Also, I found that there are a good number of videos online where security researchers share their insights and methods in the exploit analysis process. Video searches may turn up useful conference proceedings, or it may be more effective to work from the other direction: find conferences that deal with your topic of interest, and see which of those offer video recordings.</p><p>During my research on security-related things, a <a href="http://em386.blogspot.com/2012/04/practical-malware-analysis-review.html">blog entry by Chris Rohlf</a> caused <a href="http://www.amazon.com/gp/product/1593272901/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=1593272901&linkCode=as2&tag=honetoasegf-20">Practical Malware Analysis</a> to end up on my wishlist as an introductory text. Seems to have good reviews all around. Something else to check out on a trip to the library or online forums, perhaps.</p><a name="footnotes"></a><h3>Footnotes</h3><table class="footnote-table"><tbody valign="top"><tr><td><tt><a href="#quick-tips-for-getting-into-systems-programming-0-ref" name="quick-tips-for-getting-into-systems-programming-0">[*]</a></tt></td><td><p>At the end of the page somebody notes: "This page is transmitted using 100% recycled electrons." ;-)</p></td></tr><tr><td><tt><a href="#quick-tips-for-getting-into-systems-programming-1-ref" name="quick-tips-for-getting-into-systems-programming-1">[†]</a></tt></td><td><p>Also, don't pass up a chance to browse through the qemu source. Want to know how to emulate a bunch of different hardware efficiently? Use the source, Luke! 
(Hint: it's a JIT. :-)</p></td></tr><tr><td><tt><a href="#quick-tips-for-getting-into-systems-programming-2-ref" name="quick-tips-for-getting-into-systems-programming-2">[‡]</a></tt></td><td><p>One other neat thing we occasionally used for debugging at Mozilla was a VMware-based time-traveling virtual machine instance. It sounded like they were deprecating it a few years back, so I'm not sure of its status, but if it's still around it would literally allow you to play programs backwards!</p></td></tr></tbody></table><a name="systems-programming-at-my-alma-mater"></a><h2>Systems programming at my alma mater</h2><p>2012-12-13T20:30:00-08:00, Chris Leary, https://twitter.com/cdleary</p><p>Bryan also asked me this at NodeConf last year, where I was chatting with him about the then-in-development <a href="https://blog.mozilla.org/futurereleases/2012/11/26/firefox-beta-adds-ionmonkey-to-improve-javascript-performance/">IonMonkey</a>:</p><a href="https://twitter.com/bcantrill/status/279311462927831042"><img src="http://static.cdleary.com/images/cantrill-tweet.png" alt="An old e-mail to the Cornell CS faculty: https://gist.github.com/4278516 Have things changed in the decade since?" /></a><p>I remembered my talk with Bryan when I went to recruit there last year and asked the same <a href="https://gist.github.com/4278516#file-cornellcs-txt-L28">interview question that he references</a> — except with the pointer uninitialized so candidates would have to enumerate the possibilities — to see what evidence I could collect. 
My thoughts on the issue haven't really changed since that chat, so I'll just repeat them here.</p><p>(And, although I do not speak for my employer, for any programmers back in Ithaca who think systems programming and stuff like Birman's class is cool beans, <a href="http://www.calxeda.com/job-locations/northern-ca/">my team is hiring both full time and interns in the valley</a>, and I would be delighted if you decided to apply.)</p><a name="my-overarching-thought-bring-the-passion"></a><h3>My overarching thought: bring the passion</h3><p><strong>Many of the people I'm really proud that my teams have hired out of undergrad are just "in love" with systems programming,</strong> just as a skilled artisan "cares" about their craft. They work on personal projects and steer their trajectory towards it somewhat independent of the curriculum.</p><p>Passion seems to be pretty key, along with follow-through and the ability to work well with others, in the people I've thumbs-up'd over the years. Of course I always want people who do well in their more systems-oriented curriculum and live in a solid part of the current-ability curve, but I always have an eye out for the passionately interested ones.</p><p>So, I tend to wonder: if an org has a "can systems program" distribution among the candidates, can you predict the existence of the outliers at the career fair from the position of the fat part of that curve?</p><p>Anecdotally, two other systems hackers on the JavaScript engine and I came from the same undergrad program, modulo a few years, although we took radically different paths to get to the team. 
They are among the best and most passionate systems programmers I've ever known, which also pushes me to think passionate interest may be a high-order bit.</p><p>Regardless, it's obviously in systems companies' best interest to try to get the most bang per buck on recruiting trips, so you can see how Bryan's point of order is relevant.</p><a name="my-biased-take-away-from-my-time-there"></a><h3>My biased take-away from my time there</h3><p>I graduated less than a decade ago, so I have my own point of reference. From my time there <strong>several years ago, I got the feeling</strong> that the mentality was:</p><ul><li><p>C/C++ are horrible teaching languages, so they shouldn't really be taught in general curricula in circumstances where they can be avoided.</p></li><li><p>Java and <em>applications-level programming</em> are where most of the well-paying industry jobs are. (Not sure how true this is or was, but it seemed to be the conventional wisdom at the time.)</p></li><li><p>It's a Windows world. And, if it's not a Windows world, you've probably got a VM under you.</p></li></ul><p><strong>This didn't come from any kind of authority;</strong> it's just putting into words the "this is how things are done around here" understanding I had at the time. All of them seemed reasonable in context, though I didn't think I wanted to head down the path alluded to by those rules of thumb. Of course these were, in the end, just rules of thumb: we still had things like a Linux farm used by some courses.</p><p>I feel that the "horrible for teaching" problem extends to other important real-world systems considerations as well: I learned MIPS and Alpha <tt><a href="#systems-programming-at-my-alma-mater-0" name="systems-programming-at-my-alma-mater-0-ref">[*]</a></tt>, presumably due to their clean RISC heritage, but golly do I ever wish I had been taught more about specifics of x86 systems. And POSIX systems. 
<tt><a href="#systems-programming-at-my-alma-mater-1" name="systems-programming-at-my-alma-mater-1-ref">[†]</a></tt></p><p>Of course that kind of thing — picking a "real-world" ISA or compute platform — can be a tricky play for a curriculum: what do you do about the to-be SUN folks? Perhaps you've taught them all this x86-specific nonsense when they only care about SPARC. How many of the "there-be-dragons" lessons from x86 would cross-apply?</p><p>There's a balance between trade and fundamentals, and <strong>I feel I was often reminded that I was there to cultivate excellent fundamentals</strong> which could later be applied appropriately to the trends of industry and academia.</p><a name="but-seriously-it-s-just-writing-c"></a><h3>But seriously, it's just writing C...</h3><p>For my graduating class, CS undergrad didn't really require writing C. The closest you were <em>forced</em> to get was translating C constructs (like loops and function calls) to MIPS and filling in blanks in existing programs. You note the bijection-looking relationship between C and assembly and can pretty much move on.</p><p><strong>I tried to steer to hit as much interesting systems-level programming as possible.</strong> To summarize a path to learning a workable amount of systems programming in my school of yore, in hopes it will translate to something helpful existing today:</p><ul><li><p>You may have read K&R, but as a newbie it makes sense to beef up on fundamentals, so <tt class="literal">CS 116: Introduction to C Programming</tt> doesn't hurt (and you meet other passionate systems programming people in the process).</p></li><li><p><tt class="literal">CS 415: Operating Systems Practicum</tt> made you write C. Sadly, we were given a library for context switching userspace threads on top of the Win32 API in MSVC that we didn't really have to dig into. We had to write things like concurrency primitives, a scheduler, and a rudimentary filesystem that operated in terms of a soft (i.e. 
fake) disk model. I think there may have been some networking in there as well. The course was being revamped at the time, so I hope it's more bare-metal now with something practical like <a href="http://wiki.qemu.org/Main_Page">qemu</a>.</p></li><li><p><tt class="literal">ECE 476: Designing with Microcontrollers</tt> was an amazing class for integrating whatever you were most passionate about from CS and ECE curricula. Though at the time we were using 8-bit Atmels with a proprietary compiler that had no dynamic allocation support, you had to write both assembly and C code and talk to your system board via I/O ports. Plus, I got to be a little sneaky and use <tt class="literal">avr-gcc</tt>.</p></li><li><p><tt class="literal">ECE 473: Optimizing Compilers</tt> targeted Alpha at the time, but was a great big systems project that taught a lot about machine specifics and code generation (interfacing to syscalls, executable and linkable formats).</p></li><li><p><tt class="literal">ECE 575: High-Performance Microprocessor Architecture</tt> made you write real and well-performing C applications for things like cache modeling with static binary translation. This was a very formative course for me.</p></li><li><p>I did a bunch of independent projects to mess around and better understand areas where I was lacking knowledge.</p></li><li><p>I did work with systems researchers at the university. Some were unwilling to take any undergrads as a policy, but some groups were more amenable.</p></li></ul><p>I'm not a good alum in failing to keep up with the goings-on but, if I had a recommendation based on personal experience, it'd be to do stuff like that. 
Unfortunately, I've also been at companies where the most basic interview question is "how does a vtable actually work" or something on the nuances of C++ exceptions, so for some jobs you may want to take an advanced C++ class as well.</p><a name="understanding-a-null-pointer-deref-isn-t-writing-c"></a><h3>Understanding a NULL pointer deref isn't writing C</h3><p>Eh, it kind of is. On my recruiting trip, if people didn't get my uninitialized pointer dereference question, I would ask them questions about MMUs if they had taken the computer organization class. Some knew how an MMU worked (of course, some more roughly than others), but didn't realize that OSes had a policy of keeping the null page mapping invalid.</p><p>So <strong>if you understand an MMU, why don't you know what's going to happen in the NULL pointer deref?</strong> Because you've never actually written a C program and screwed it up. Or you haven't written enough assembly with pointer manipulation. If you've actually written a Java program and screwed it up you might say <tt class="literal">NullPointerException</tt>, but then you remember there are no exceptions in C, so you have to quickly come up with an answer that fits and say zero.</p><p>I think another example might help to illustrate the disconnect: the difference between protected mode and user mode is well understood among people who complete an operating systems course, but the conventions associated with them (something like "tell me about <tt class="literal">init</tt>"), or what a "traditional" physical memory space actually looks like, seem to be out of scope without outside interest.</p><p>This kind of interview scenario is usually <a href="http://blog.cdleary.com/2009/11/thoughts-on-programming-language-fluency/">time to fluency</a> sensitive — wrapping your head around modern C and sane manual memory management isn't trivial, so it does require some time and experience. 
Plus when you're working regularly with footguns, team members want a basic level of trust in coding capability. It's not that you think the person <em>can't</em> do the job, it's just not the right timing if you need to find somebody who can hit the ground running. Bryan also mentions this in his email.</p><p>Thankfully for those of us concerned with the placement of the fat part of the distribution, it sounds like <a href="https://twitter.com/el33th4xor/status/279384174102708225">Professor Sirer is saying</a> it's been moving even more in the right direction in the time since I've departed. And, for the big reveal, I did find good systems candidates on my trip, and at the same time avoided freezing to death despite going soft in California all these years.</p><a name="brain-teaser"></a><h3>Brain teaser</h3><p>I'll round this entry off with a little brain teaser for you systems-minded folks: I contend that the following might <em>not</em> segfault.</p><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><span style="color: #408080; font-style: italic">// ...</span>
<span style="color: #B00040">int</span> <span style="color: #0000FF">main</span>() {
mysterious_function();
A <span style="color: #666666">*</span>a <span style="color: #666666">=</span> <span style="color: #008000">NULL</span>;
printf(<span style="color: #BA2121">"%d</span><span style="color: #BB6622; font-weight: bold">\n</span><span style="color: #BA2121">"</span>, a<span style="color: #666666">-></span>integer_member);
<span style="color: #008000; font-weight: bold">return</span> EXIT_SUCCESS;
}
</pre></div>
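<p>If you want to poke at the teaser on your own machine, here is one possible harness: a minimal sketch for POSIX systems that runs the same dereference in a forked child and reports how the child ended. The <tt class="literal">struct A</tt> definition, the empty <tt class="literal">mysterious_function</tt>, and the <tt class="literal">try_null_deref</tt> helper are hypothetical stand-ins, since the snippet above elides all of those details.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the type the teaser leaves undefined. */
typedef struct A {
    int integer_member;
} A;

/* Hypothetical placeholder: the teaser never says what this does. */
static void mysterious_function(void) {}

/*
 * Run the teaser's dereference in a forked child (POSIX only).
 * Returns 0 if the child exited normally, the number of the signal
 * that killed it otherwise, or -1 if fork/waitpid itself failed.
 */
int try_null_deref(void) {
    fflush(NULL); /* keep buffered output from being duplicated in the child */
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        mysterious_function();
        /* volatile pointer: the compiler cannot assume its value,
         * so the load through it is actually emitted. */
        A *volatile a = NULL;
        printf("%d\n", a->integer_member);
        fflush(stdout);
        _exit(EXIT_SUCCESS);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    /* Without WUNTRACED, a reaped child either exited or was signaled. */
    assert(WIFEXITED(status) || WIFSIGNALED(status));
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

<p>Calling <tt class="literal">try_null_deref()</tt> from your own <tt class="literal">main</tt> yields 0 if the child exited normally, or the number of the signal that killed it. Dereferencing a null pointer is undefined behavior, so treat whatever it reports as an observation about one compiler and one OS rather than a guarantee. Whichever result you see, the teaser stands: the snippet above might <em>not</em> segfault.</p>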
<p>How many reasons can you enumerate as to why? What if we eliminate the call to the mysterious function?</p><a name="footnotes"></a><h3>Footnotes</h3><table class="footnote-table"><tbody valign="top"><tr><td><tt><a href="#systems-programming-at-my-alma-mater-0-ref" name="systems-programming-at-my-alma-mater-0">[*]</a></tt></td><td><p>In an advanced course we had an Alpha 21264 that I came to <a href="http://www.youtube.com/watch?v=NHK9C5cy74c">love deeply</a>.</p></td></tr><tr><td><tt><a href="#systems-programming-at-my-alma-mater-1-ref" name="systems-programming-at-my-alma-mater-1">[†]</a></tt></td><td><p>I'm hoping there's more emphasis on POSIX these days with the mobile growth and Linux/OS X dominance in that space.</p></td></tr></tbody></table><a name="arm-chars-are-unsigned-by-default"></a><h2>ARM chars are unsigned by default</h2><p>2012-11-14T07:00:00-08:00, Chris Leary, https://twitter.com/cdleary</p><p>[Latest from the "I can't believe I'm writing a blog entry about this"
department, but the context and surrounding discussion is interesting. --Ed]</p><p>If you're like me, or one of the other thousands of concerned parents who has borne C code into this cruel, topsy-turvy, and oftentimes undefined world, you read the C standard aloud to your programs each night. It's comforting to know that K&R are out there, somewhere, watching over them, as visions of <a href="http://research.swtch.com/duff">Duff's Devices</a> dance in their wee little heads.</p><a name="the-shocking-truth"></a><h3>The shocking truth</h3><p>In all probability, you're one of <a href="http://xkcd.com/1053/">today's lucky bunch</a> who find out that the
signedness of the <tt class="literal">char</tt> datatype in C is implementation-defined. The implication being, when
you write <tt class="literal">char</tt>, the compiler is implicitly (but consistently) giving it
either the <tt class="literal">signed</tt> or <tt class="literal">unsigned</tt> modifier. From the spec: <tt><a href="#arm-chars-are-unsigned-by-default-0" name="arm-chars-are-unsigned-by-default-0-ref">[*]</a></tt></p><blockquote><p>The three types char, signed char, and unsigned char are collectively called
the character types. The implementation shall define char to have the same range,
representation, and behavior as either signed char or unsigned char.</p><p>...</p><p>Irrespective of the choice made, char is a separate type from the
other two and is not compatible with either.</p><p class="attribution">—ISO 9899:1999, section "6.2.5 Types"</p></blockquote><p>Why is <tt class="literal">char</tt> distinct from the explicitly-signed variants to begin with? A
great discussion of historical portability questions is given here:</p><blockquote><p>Fast forward [to 1993] and you'll find no single "load character from
memory and sign extend" in the ARM instruction set. That's why, for
performance reasons, every compiler I'm aware of makes the default char
type signed on x86, but unsigned on ARM. (A workaround for the GNU GCC
compiler is the <tt class="literal">-fsigned-char</tt> parameter, which forces all chars to
become signed.)</p><p class="attribution">—<a href="http://www.drdobbs.com/architecture-and-design/portability-the-arm-processor/184405435">Portability and the ARM Processor</a>, Trevor Harmon, 2003</p></blockquote><p>It's worth noting, though, that in modern times there are both <tt class="literal">LDRB</tt> (Load
Register Byte) and <tt class="literal">LDRSB</tt> (Load Register Signed Byte) instructions available
in the ISA; the latter sign-extends the loaded byte as part of the load, in a single
instruction. <tt><a href="#arm-chars-are-unsigned-by-default-1" name="arm-chars-are-unsigned-by-default-1-ref">[†]</a></tt></p><p><strong>So what does this mean in practice?</strong> Conventional wisdom is that you use
<em>unsigned</em> values when you're bit bashing (although you have to <a href="https://www.securecoding.cert.org/confluence/display/seccode/INT02-C.+Understand+integer+conversion+rules#INT02-CUnderstandintegerconversionrules-NoncompliantCodeExample">be extra careful
bit-bashing types smaller than int</a> due to promotion rules) and <em>signed</em> values
when you're doing math, <tt><a href="#arm-chars-are-unsigned-by-default-2" name="arm-chars-are-unsigned-by-default-2-ref">[‡]</a></tt> but now we have this third type, the
implicit-signedness <tt class="literal">char</tt>. What's the conventional wisdom on that?</p><a name="signedness-un-decorated-char-is-for-ascii-text"></a><h3>Signedness-un-decorated char is for ASCII text</h3><p>If you find yourself writing:</p><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><span style="color: #B00040">char</span> some_char <span style="color: #666666">=</span> NUMERIC_VALUE;
</pre></div>
<p>You should probably reconsider. In that case, when you're clearly doing
something numeric, spring for a <tt class="literal">signed char</tt> so the effect of arithmetic
expressions across platforms is clearer. But the more typical usage is still
good:</p><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><span style="color: #B00040">char</span> some_char <span style="color: #666666">=</span> <span style="color: #BA2121">'a'</span>;
</pre></div>
<p>For numeric uses, also consider adopting a fixed-width or minimum-width
datatype from <tt class="literal"><stdint.h></tt>. <strong>You really don't want to hold the additional
complexity of char signedness in your head</strong>, as <a href="http://blog.regehr.org/archives/268">integer promotion rules are
already quite tricky</a>.</p><a name="examples-to-consider"></a><h3>Examples to consider</h3><p>Some of the following mistakes will trigger warnings, but you should realize there's
something to be aware of in the warning spew (or a compiler option to consider
changing) when you're cross-compiling for ARM.</p><a name="example-of-badness-testing-the-high-bit"></a><h4>Example of badness: testing the high bit</h4><p>Let's say you wanted to see if the high bit were set on a <tt class="literal">char</tt>. If you assume signed chars, this easy-to-write comparison seems legit:</p><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><span style="color: #008000; font-weight: bold">if</span> (some_char <span style="color: #666666"><</span> <span style="color: #666666">0</span>)
</pre></div>
<p>But if your <tt class="literal">char</tt> type is unsigned that test will never pass.</p><a name="example-of-badness-comparison-to-negative-numeric-literals"></a><h4>Example of badness: comparison to negative numeric literals</h4><p>You could also make the classic mistake:</p><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><span style="color: #B00040">char</span> c <span style="color: #666666">=</span> getchar(); <span style="color: #408080; font-style: italic">// Should actually be placed in an int!</span>
<span style="color: #008000; font-weight: bold">while</span> (c <span style="color: #666666">!=</span> EOF)
</pre></div>
<p>With an 8-bit unsigned <tt class="literal">char</tt>
datatype and a 32-bit <tt class="literal">int</tt> datatype, <tt class="literal">c</tt> would never compare equal to <tt class="literal">EOF</tt>, so this loop would never terminate. Here's the breakdown:</p><p>When <tt class="literal">getchar()</tt> returns <tt class="literal">((signed int) -1)</tt> to represent <tt class="literal">EOF</tt>, you'll
truncate that value into <tt class="literal">0xFFu</tt> (because chars are an unsigned 8-bit datatype).
Then, when you compare against <tt class="literal">EOF</tt>, you'll promote that unsigned value to a
signed integer without sign extension (preserving the bit pattern of the
original, unsigned <tt class="literal">char</tt> value), and get comparison between <tt class="literal">0xFF</tt> (<tt class="literal">255</tt> in
decimal) and <tt class="literal">0xFFFFFFFF</tt> (<tt class="literal">-1</tt> in decimal). For all the values in the unsigned
<tt class="literal">char</tt> range, I hope it's clear that this test will never pass. <tt><a href="#arm-chars-are-unsigned-by-default-3" name="arm-chars-are-unsigned-by-default-3-ref">[§]</a></tt></p><p>To make the example a little more obvious we can replace the call to
<tt class="literal">getchar()</tt> and the <tt class="literal">EOF</tt> with a numeric <tt class="literal">-1</tt> literal and the same thing
will happen.</p><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><span style="color: #B00040">char</span> c <span style="color: #666666">=</span> <span style="color: #666666">-1</span>;
assert(c <span style="color: #666666">==</span> <span style="color: #666666">-1</span>); <span style="color: #408080; font-style: italic">// This assertion fails when char is unsigned. Yikes.</span>
</pre></div>
<p>That last snippet can be tested by compiling in GCC with <tt class="literal">-fsigned-char</tt> and
<tt class="literal">-funsigned-char</tt> if you'd like to see the difference in action.</p><a name="footnotes"></a><h3>Footnotes</h3><table class="footnote-table"><tbody valign="top"><tr><td><tt><a href="#arm-chars-are-unsigned-by-default-0-ref" name="arm-chars-are-unsigned-by-default-0">[*]</a></tt></td><td><p>The spec goes on to say that you can figure out the underlying signedness
by checking whether <tt class="literal">CHAR_MIN</tt> from <tt class="literal"><limits.h></tt> is <tt class="literal">0</tt> or
<tt class="literal">SCHAR_MIN</tt>. In C++ you could do the <tt class="literal"><limits></tt>-based
<tt class="literal">std::numeric_limits<char>::is_signed</tt> dance.</p></td></tr><tr><td><tt><a href="#arm-chars-are-unsigned-by-default-1-ref" name="arm-chars-are-unsigned-by-default-1">[†]</a></tt></td><td><p>Although the same encodings exist in the Thumb sub-ISA, the ARM sub-ISA encoding for <tt class="literal">LDRSB</tt> lacks a shift capability on the load output as a result of this <a href="http://www.davespace.co.uk/arm/efficient-c-for-arm/memaccess.html">historical artifact</a>.</p></td></tr><tr><td><tt><a href="#arm-chars-are-unsigned-by-default-2-ref" name="arm-chars-are-unsigned-by-default-2">[‡]</a></tt></td><td><p>Although sometimes the tradeoffs can be more subtle. <a href="http://www.aristeia.com/Papers/C++ReportColumns/sep95.pdf">Scott Meyers discusses more issues</a> quite well, per usual.</p></td></tr><tr><td><tt><a href="#arm-chars-are-unsigned-by-default-3-ref" name="arm-chars-are-unsigned-by-default-3">[§]</a></tt></td><td><p>Notably, if you make the same mistake in the signed <tt class="literal">char</tt> case you can
breathe easier, because you'll sign extend for the comparison, making the test
passable.</p></td></tr></tbody></table>Simple, selfish, and unscientific shootout2012-06-03T18:00:00-07:00Chris Learyhttps://twitter.com/cdlearyab30fd63-3469-475a-bd5a-99c828dbbd01<a name="simple-selfish-and-unscientific-shootout"></a><a name="disclaimer"></a><h3>Disclaimer</h3><p>I've caught some flak over publishing my "selfish"
(read: <strong>empirical testing that yields results which are only relevant to me</strong>)
multi-language-engine-and-standard-library "shootout"
(read: I wrote the same basic functionality across multiple languages,
somewhat like on the <a href="http://shootout.alioth.debian.org/">shootout.alioth.debian.org</a> site,
the Computer Language Benchmarks Game).
I value the concept and process of <em>learning in the open</em>,
but it may require more time and consideration of clarity
than I had given in this entry.
Taking it down would <a href="http://www.reddit.com/r/programming/comments/ukbd8/simple_selfish_and_unscientific_shootout_look_at/c4wh2mg">apparently be a breach of etiquette</a>,
so please read the following TL;DR as a primer.</p><p><strong>TL;DR:</strong> I encourage you to <em>personally</em> try writing small utilities
against a variety of language engines when you have the opportunity.
Consider how much tweaking of the original code you have to do
in order to obtain a well-performing implementation.
Weigh the development effort and your natural proficiency
against the performance, clarity, and safety
of the resulting program.
Gather evidence and be eager to test your cost assumptions.
Commit to learning about sources of overhead and
unforeseen characteristics of your libraries.
You may be surprised which engines give the best bang per time spent.</p><p>It has also been suggested to me that
all native languages are within ~3x of one another
on generated code performance,
and the rest of the difference is generally attributable
to the library or algorithm,
so that's an interesting rule of thumb to keep in mind.</p><p>If you'd like to see how to write a <a href="https://github.com/cdleary/rand_int_file">small utility against a variety of language engines</a>, you can check out the Github repo.</p><a name="introduction"></a><h3>Introduction</h3><p>We tend to throw around "orders of magnitude" when it comes to "programming language speeds",
even though we know that the concept of a programming language having a speed for arbitrary programs makes little sense.
But, when I'm coding up something small, I find myself pondering a very concrete question:
which available language engine (language implementation and libraries) could <em>I</em> reasonably write this little program against
that would give the best speed over development effort?</p><p>I'm not looking to optimize all the buttery nooks and crannies of this program,
nor do I want to drill into potential deficiencies in the I/O subsystem:
I just want to make a painless little utility that doesn't require me to go on a lunch break.</p><p>XKCD knows what I'm talking about:</p><a href="http://xkcd.com/397/"><img src="http://imgs.xkcd.com/comics/unscientific.png" alt="Unscientific" /></a><p>I was writing a very simple, single-threaded program to generate about a billion uniformly random <tt class="literal">int32</tt>s in a text file,
and I decided I would do a selfish little shootout:
write the same program in a set of "viable" languages
(remember, this is all about me :-),
unscientifically use <tt class="literal">time(1)</tt> on the programs a few times,
consider how painful it was to write,
and see what the runtimes come out to be.</p><p>For 100 million integers on my CentOS Bloomfield box,
these were the runtimes
<strong>for my initial, naive implementations
and their lightly tweaked counterparts</strong>:</p><table><thead><tr><th><p>Impl</p></th><th><p>Naive
Runtime</p></th><th><p>Naive
Ratio</p></th><th><p>Tweaked
Runtime</p></th><th><p>Tweaked
Ratio</p></th><th><p>Engine</p></th></tr></thead><tbody><tr><td><p>.cpp</p></td><td><p>~0m 11s</p></td><td></td><td><p>~0m 15s</p></td><td></td><td><p>GCC 4.4.6 -O3</p></td></tr><tr><td><p>.java</p></td><td><p>~0m 18s</p></td><td><p>~1.5x</p></td><td><p>~0m 19s</p></td><td><p>~1.25x</p></td><td><p>JDK 1.7.0.04</p></td></tr><tr><td><p>.go</p></td><td><p>~1m 5s</p></td><td><p>~6x</p></td><td><p>~0m 23s</p></td><td><p>~1.5x</p></td><td><p>go1.0.1</p></td></tr><tr><td><p>.rs</p></td><td><p>~1m 7s</p></td><td><p>~6x</p></td><td><p>~0m 23s</p></td><td><p>~1.5x</p></td><td><p>rustc -O3 0.2 (trunk)</p></td></tr><tr><td><p>.ml</p></td><td><p>~0m 37s</p></td><td><p>~3.3x</p></td><td><p>~0m 35s</p></td><td><p>~2.5x</p></td><td><p>ocamlopt 3.11.2</p></td></tr><tr><td><p>.py</p></td><td><p>~1m 6s</p></td><td><p>~6x</p></td><td><p>~0m 51s</p></td><td><p>~3.5x</p></td><td><p>PyPy 1.9.1 (nightly)</p></td></tr><tr><td><p>.lua</p></td><td><p>~1m 36s</p></td><td><p>~9x</p></td><td><p>~0m 27s (FFI)</p></td><td><p>~1.8x</p></td><td><p>LuaJIT 2.0.0-beta10 (trunk)</p></td></tr><tr><td><p>.rb</p></td><td><p>~1m 50s</p></td><td><p>~10x</p></td><td></td><td></td><td><p>ruby 2.0.0 (trunk)</p></td></tr></tbody></table><p>Like all developers, I have varied levels of expertise across languages and their standard libraries;
but, as I said, this is a selfish shootout,
so <strong>my competence in a given language is considered part of the baseline</strong>.
You'll see in the comments that
many readers identified performance bugs in these code samples.</p><p>There are also caveats for the random numbers I was generating in OCaml (due to tag bit stealing).</p><p>For a billion integers the naive C++0x version took <tt class="literal">1m 42s</tt>
and the naive Java version took <tt class="literal">2m 18s</tt> (1.35x slower).
I didn't want to spend the time to slow down the others by an order of magnitude.</p><p>As a result — with perpetual intent to improve my abilities in all engines I work with,
willful ignorance of the reasoning,
acknowledgement that I need to perform more experiments like this to draw a more reasonable conclusion over time,
and malice aforethought — I'll hereby declare myself guilty of leaning a bit more towards writing things like this in C++
when I want better runtimes in the giga range for little IO-and-compute programs.</p><a name="show-me-the-code"></a><h3>Show me the code!</h3><p>I threw the code up <a href="https://github.com/cdleary/rand_int_file">on github</a>,
but the versions that I wrote naively
(before optimization suggestions)
are duplicated here for convenience.</p><p>C++0x:</p><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><span style="color: #BC7A00">#include <cstdlib></span>
<span style="color: #BC7A00">#include <cstdio></span>
<span style="color: #BC7A00">#include <random></span>
<span style="color: #BC7A00">#include <ctime></span>
<span style="color: #BC7A00">#include <cstdint></span>
<span style="color: #B00040">int</span>
<span style="color: #0000FF">main</span>(<span style="color: #B00040">int</span> argc, <span style="color: #B00040">char</span> <span style="color: #666666">**</span>argv)
{
    <span style="color: #008000; font-weight: bold">if</span> (<span style="color: #666666">2</span> <span style="color: #666666">!=</span> argc) {
        fprintf(stderr, <span style="color: #BA2121">"Usage: %s <elem_count></span><span style="color: #BB6622; font-weight: bold">\n</span><span style="color: #BA2121">"</span>, argv[<span style="color: #666666">0</span>]);
        <span style="color: #008000; font-weight: bold">return</span> EXIT_FAILURE;
    }
    <span style="color: #B00040">long</span> <span style="color: #B00040">long</span> count <span style="color: #666666">=</span> atoll(argv[<span style="color: #666666">1</span>]);
    std<span style="color: #666666">::</span>mt19937 rng;
    rng.seed(time(<span style="color: #008000">NULL</span>));
    std<span style="color: #666666">::</span>uniform_int<span style="color: #666666"><</span><span style="color: #B00040">int32_t</span><span style="color: #666666">></span> dist;
    <span style="color: #B00040">FILE</span> <span style="color: #666666">*</span>file <span style="color: #666666">=</span> fopen(<span style="color: #BA2121">"vec_gen.out"</span>, <span style="color: #BA2121">"w"</span>);
    <span style="color: #008000; font-weight: bold">if</span> (<span style="color: #008000">NULL</span> <span style="color: #666666">==</span> file) {
        perror(<span style="color: #BA2121">"could not open vector file for writing"</span>);
        <span style="color: #008000; font-weight: bold">return</span> EXIT_FAILURE;
    }
    <span style="color: #008000; font-weight: bold">for</span> (<span style="color: #B00040">long</span> <span style="color: #B00040">long</span> i <span style="color: #666666">=</span> <span style="color: #666666">0</span>; i <span style="color: #666666"><</span> count; <span style="color: #666666">++</span>i) {
        <span style="color: #B00040">int32_t</span> r <span style="color: #666666">=</span> dist(rng);
        fprintf(file, <span style="color: #BA2121">"%d</span><span style="color: #BB6622; font-weight: bold">\n</span><span style="color: #BA2121">"</span>, r);
    }
    fclose(file);
    <span style="color: #008000; font-weight: bold">return</span> EXIT_SUCCESS;
}
</pre></div>
<p>Java:</p><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><span style="color: #008000; font-weight: bold">import</span> <span style="color: #0000FF; font-weight: bold">java.io.BufferedWriter</span><span style="color: #666666">;</span>
<span style="color: #008000; font-weight: bold">import</span> <span style="color: #0000FF; font-weight: bold">java.io.FileWriter</span><span style="color: #666666">;</span>
<span style="color: #008000; font-weight: bold">import</span> <span style="color: #0000FF; font-weight: bold">java.util.Random</span><span style="color: #666666">;</span>
<span style="color: #008000; font-weight: bold">import</span> <span style="color: #0000FF; font-weight: bold">java.io.IOException</span><span style="color: #666666">;</span>
<span style="color: #008000; font-weight: bold">class</span> <span style="color: #0000FF; font-weight: bold">VecGen</span>
<span style="color: #666666">{</span>
<span style="color: #008000; font-weight: bold">static</span> <span style="color: #008000; font-weight: bold">final</span> <span style="color: #B00040">int</span> INT_MAX <span style="color: #666666">=</span> <span style="color: #666666">2147483647;</span>
<span style="color: #008000; font-weight: bold">public</span> <span style="color: #008000; font-weight: bold">static</span> <span style="color: #B00040">void</span> <span style="color: #0000FF">main</span><span style="color: #666666">(</span>String args<span style="color: #666666">[])</span> <span style="color: #666666">{</span>
<span style="color: #008000; font-weight: bold">if</span> <span style="color: #666666">(</span>args<span style="color: #666666">.</span><span style="color: #7D9029">length</span> <span style="color: #666666">!=</span> <span style="color: #666666">1)</span> <span style="color: #666666">{</span>
System<span style="color: #666666">.</span><span style="color: #7D9029">err</span><span style="color: #666666">.</span><span style="color: #7D9029">println</span><span style="color: #666666">(</span><span style="color: #BA2121">"Usage: VecGen <elem_count>"</span><span style="color: #666666">);</span>
System<span style="color: #666666">.</span><span style="color: #7D9029">exit</span><span style="color: #666666">(-1);</span>
<span style="color: #666666">}</span>
<span style="color: #B00040">int</span> count <span style="color: #666666">=</span> Integer<span style="color: #666666">.</span><span style="color: #7D9029">parseInt</span><span style="color: #666666">(</span>args<span style="color: #666666">[0]);</span>
<span style="color: #008000; font-weight: bold">try</span> <span style="color: #666666">{</span>
FileWriter fw <span style="color: #666666">=</span> <span style="color: #008000; font-weight: bold">new</span> FileWriter<span style="color: #666666">(</span><span style="color: #BA2121">"vec_gen.out"</span><span style="color: #666666">);</span>
BufferedWriter bw <span style="color: #666666">=</span> <span style="color: #008000; font-weight: bold">new</span> BufferedWriter<span style="color: #666666">(</span>fw<span style="color: #666666">);</span>
Random rng <span style="color: #666666">=</span> <span style="color: #008000; font-weight: bold">new</span> Random<span style="color: #666666">();</span>
<span style="color: #008000; font-weight: bold">for</span> <span style="color: #666666">(</span><span style="color: #B00040">int</span> i <span style="color: #666666">=</span> <span style="color: #666666">0;</span> i <span style="color: #666666"><</span> count<span style="color: #666666">;</span> <span style="color: #666666">++</span>i<span style="color: #666666">)</span> <span style="color: #666666">{</span>
<span style="color: #B00040">int</span> r <span style="color: #666666">=</span> rng<span style="color: #666666">.</span><span style="color: #7D9029">nextInt</span><span style="color: #666666">(</span>INT_MAX<span style="color: #666666">);</span>
bw<span style="color: #666666">.</span><span style="color: #7D9029">write</span><span style="color: #666666">(</span>r <span style="color: #666666">+</span> <span style="color: #BA2121">"\n"</span><span style="color: #666666">);</span>
<span style="color: #666666">}</span>
bw<span style="color: #666666">.</span><span style="color: #7D9029">close</span><span style="color: #666666">();</span>
<span style="color: #666666">}</span> <span style="color: #008000; font-weight: bold">catch</span> <span style="color: #666666">(</span>IOException e<span style="color: #666666">)</span> <span style="color: #666666">{</span>
System<span style="color: #666666">.</span><span style="color: #7D9029">err</span><span style="color: #666666">.</span><span style="color: #7D9029">println</span><span style="color: #666666">(</span><span style="color: #BA2121">"Received I/O exception: "</span> <span style="color: #666666">+</span> e<span style="color: #666666">);</span>
System<span style="color: #666666">.</span><span style="color: #7D9029">exit</span><span style="color: #666666">(-2);</span>
<span style="color: #666666">}</span>
<span style="color: #666666">}</span>
<span style="color: #666666">};</span>
</pre></div>
<p>Python:</p><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><span style="color: #008000; font-weight: bold">import</span> <span style="color: #0000FF; font-weight: bold">random</span>
<span style="color: #008000; font-weight: bold">import</span> <span style="color: #0000FF; font-weight: bold">sys</span>
<span style="color: #008000; font-weight: bold">def</span> <span style="color: #0000FF">main</span>():
    <span style="color: #008000; font-weight: bold">if</span> <span style="color: #008000">len</span>(sys<span style="color: #666666">.</span>argv) <span style="color: #666666">!=</span> <span style="color: #666666">2</span>:
        <span style="color: #008000; font-weight: bold">print</span> <span style="color: #666666">>></span> sys<span style="color: #666666">.</span>stderr, <span style="color: #BA2121">"Usage: </span><span style="color: #BB6688; font-weight: bold">%s</span><span style="color: #BA2121"> <elem_count>"</span> <span style="color: #666666">%</span> sys<span style="color: #666666">.</span>argv[<span style="color: #666666">0</span>]
        <span style="color: #008000">exit</span>(<span style="color: #666666">-1</span>)
    count <span style="color: #666666">=</span> <span style="color: #008000">int</span>(sys<span style="color: #666666">.</span>argv[<span style="color: #666666">1</span>])
    random<span style="color: #666666">.</span>seed()
    <span style="color: #008000; font-weight: bold">with</span> <span style="color: #008000">open</span>(<span style="color: #BA2121">'vec_gen.out'</span>, <span style="color: #BA2121">'w'</span>) <span style="color: #008000; font-weight: bold">as</span> <span style="color: #008000">file</span>:
        <span style="color: #008000; font-weight: bold">for</span> i <span style="color: #AA22FF; font-weight: bold">in</span> <span style="color: #008000">xrange</span>(count):
            r <span style="color: #666666">=</span> random<span style="color: #666666">.</span>getrandbits(<span style="color: #666666">31</span>)
            <span style="color: #008000; font-weight: bold">print</span> <span style="color: #666666">>></span> <span style="color: #008000">file</span>, r
<span style="color: #008000; font-weight: bold">if</span> __name__ <span style="color: #666666">==</span> <span style="color: #BA2121">'__main__'</span>:
    main()
</pre></div>
<p>OCaml:</p><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><span style="color: #008000; font-weight: bold">if</span> <span style="color: #666666">(</span><span style="color: #0000FF; font-weight: bold">Array</span>.length <span style="color: #0000FF; font-weight: bold">Sys</span>.argv<span style="color: #666666">)</span> <span style="color: #666666"><></span> <span style="color: #666666">2</span> <span style="color: #008000; font-weight: bold">then</span> <span style="color: #666666">(</span>
<span style="color: #008000; font-weight: bold">let</span> msg <span style="color: #666666">=</span> <span style="color: #BA2121">"Usage: "</span> <span style="color: #666666">^</span> <span style="color: #666666">(</span><span style="color: #0000FF; font-weight: bold">Array</span>.get <span style="color: #0000FF; font-weight: bold">Sys</span>.argv <span style="color: #666666">0)</span> <span style="color: #666666">^</span> <span style="color: #BA2121">" <elem_count></span><span style="color: #BB6622; font-weight: bold">\n</span><span style="color: #BA2121">"</span> <span style="color: #008000; font-weight: bold">in</span>
prerr_string msg<span style="color: #666666">;</span>
exit <span style="color: #666666">(-1)</span>
<span style="color: #666666">)</span> <span style="color: #008000; font-weight: bold">else</span> <span style="color: #666666">(</span>
<span style="color: #008000; font-weight: bold">let</span> count <span style="color: #666666">=</span> int_of_string <span style="color: #666666">(</span><span style="color: #0000FF; font-weight: bold">Array</span>.get <span style="color: #0000FF; font-weight: bold">Sys</span>.argv <span style="color: #666666">1)</span> <span style="color: #008000; font-weight: bold">in</span>
<span style="color: #008000; font-weight: bold">let</span> file <span style="color: #666666">=</span> open_out <span style="color: #BA2121">"vec_gen.out"</span> <span style="color: #008000; font-weight: bold">in</span>
<span style="color: #008000; font-weight: bold">let</span> <span style="color: #008000; font-weight: bold">rec</span> write_rand_line n <span style="color: #666666">=</span>
<span style="color: #008000; font-weight: bold">if</span> n <span style="color: #666666">=</span> <span style="color: #666666">0</span> <span style="color: #008000; font-weight: bold">then</span> <span style="color: #008000">()</span>
<span style="color: #008000; font-weight: bold">else</span>
<span style="color: #008000; font-weight: bold">let</span> r <span style="color: #666666">=</span> <span style="color: #0000FF; font-weight: bold">Random</span>.bits <span style="color: #008000">()</span> <span style="color: #008000; font-weight: bold">in</span>
output_string file <span style="color: #666666">((</span>string_of_int r<span style="color: #666666">)</span> <span style="color: #666666">^</span> <span style="color: #BA2121">"</span><span style="color: #BB6622; font-weight: bold">\n</span><span style="color: #BA2121">"</span><span style="color: #666666">);</span>
write_rand_line <span style="color: #666666">(</span>n <span style="color: #666666">-</span> <span style="color: #666666">1)</span>
<span style="color: #008000; font-weight: bold">in</span>
write_rand_line count<span style="color: #666666">;</span>
exit <span style="color: #666666">0</span>
<span style="color: #666666">)</span>
</pre></div>
<p>Lua:</p><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><span style="color: #008000; font-weight: bold">function</span> <span style="color: #0000FF">main</span>(args)
<span style="color: #008000; font-weight: bold">if</span> <span style="color: #666666">#</span>args <span style="color: #666666">~=</span> <span style="color: #666666">1</span> <span style="color: #008000; font-weight: bold">then</span>
io.stderr:write(<span style="color: #BA2121">"Usage: "</span> <span style="color: #666666">..</span> args[<span style="color: #666666">0</span>] <span style="color: #666666">..</span> <span style="color: #BA2121">" <elem_count></span><span style="color: #BB6622; font-weight: bold">\n</span><span style="color: #BA2121">"</span>)
<span style="color: #008000">os.exit</span>(<span style="color: #666666">-1</span>)
<span style="color: #008000; font-weight: bold">end</span>
<span style="color: #008000; font-weight: bold">local</span> count <span style="color: #666666">=</span> <span style="color: #008000">tonumber</span>(args[<span style="color: #666666">1</span>])
<span style="color: #008000">math.randomseed</span>(<span style="color: #008000">os.time</span>())
<span style="color: #008000; font-weight: bold">local</span> upper <span style="color: #666666">=</span> <span style="color: #008000">math.floor</span>(<span style="color: #666666">2^31</span> <span style="color: #666666">-</span> <span style="color: #666666">1</span>)
<span style="color: #008000">io.output</span>(<span style="color: #008000">io.open</span>(<span style="color: #BA2121">"vec_gen.out"</span>, <span style="color: #BA2121">"w"</span>))
<span style="color: #008000; font-weight: bold">for</span> i <span style="color: #666666">=</span> <span style="color: #666666">1</span>,count <span style="color: #008000; font-weight: bold">do</span>
<span style="color: #008000; font-weight: bold">local</span> r <span style="color: #666666">=</span> <span style="color: #008000">math.random</span>(<span style="color: #666666">0</span>, upper)
<span style="color: #008000">io.write</span>(r)
<span style="color: #008000">io.write</span>(<span style="color: #BA2121">"</span><span style="color: #BB6622; font-weight: bold">\n</span><span style="color: #BA2121">"</span>)
<span style="color: #008000; font-weight: bold">end</span>
<span style="color: #008000">io.close</span>()
<span style="color: #008000; font-weight: bold">end</span>
main(arg)
</pre></div>
<p>Rust:</p><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%">import io<span style="color: #666666">::</span>writer_util;
<span style="color: #008000; font-weight: bold">fn</span> main(args<span style="color: #666666">:</span> [<span style="color: #008000; font-weight: bold">str</span>]) {
<span style="color: #008000; font-weight: bold">if</span> vec<span style="color: #666666">::</span>len(args) <span style="color: #666666">!=</span> <span style="color: #666666">2</span><span style="color: #008000; font-weight: bold">u</span> {
<span style="color: #008000; font-weight: bold">let</span> usage <span style="color: #666666">=</span> #fmt(<span style="color: #BA2121">"Usage: %s <elem_count></span><span style="color: #BB6622; font-weight: bold">\n</span><span style="color: #BA2121">"</span>, args[<span style="color: #666666">0</span>]);
io<span style="color: #666666">::</span>stderr().write_str(usage);
os<span style="color: #666666">::</span>set_exit_status(<span style="color: #666666">-1</span>);
ret;
}
<span style="color: #008000; font-weight: bold">let</span> count <span style="color: #666666">=</span> option<span style="color: #666666">::</span>get(<span style="color: #008000; font-weight: bold">int</span><span style="color: #666666">::</span>from_str(args[<span style="color: #666666">1</span>]));
<span style="color: #008000; font-weight: bold">let</span> rng <span style="color: #666666">=</span> rand<span style="color: #666666">::</span>seeded_rng(rand<span style="color: #666666">::</span>seed());
<span style="color: #008000; font-weight: bold">let</span> fw <span style="color: #666666">=</span> result<span style="color: #666666">::</span>get(io<span style="color: #666666">::</span>buffered_file_writer(<span style="color: #BA2121">"vec_gen.out"</span>));
<span style="color: #008000; font-weight: bold">let</span> <span style="color: #008000; font-weight: bold">mut</span> i <span style="color: #666666">=</span> <span style="color: #666666">0</span>;
<span style="color: #008000; font-weight: bold">while</span> i <span style="color: #666666"><</span> count {
<span style="color: #008000; font-weight: bold">let</span> r <span style="color: #666666">=</span> rng.next() <span style="color: #666666">&</span> (<span style="color: #666666">0x7fffffff</span><span style="color: #008000; font-weight: bold">u</span> <span style="color: #008000; font-weight: bold">as</span> <span style="color: #008000; font-weight: bold">u32</span>);
fw.write_line(<span style="color: #008000; font-weight: bold">int</span><span style="color: #666666">::</span>to_str(r <span style="color: #008000; font-weight: bold">as</span> <span style="color: #008000; font-weight: bold">int</span>, <span style="color: #666666">10</span><span style="color: #008000; font-weight: bold">u</span>));
i <span style="color: #666666">+=</span> <span style="color: #666666">1</span>;
}
}
</pre></div>
<p>Go:</p><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><span style="color: #008000; font-weight: bold">package</span> main
<span style="color: #008000; font-weight: bold">import</span> (
<span style="color: #BA2121">"bufio"</span>
<span style="color: #BA2121">"fmt"</span>
<span style="color: #BA2121">"log"</span>
<span style="color: #BA2121">"math/rand"</span>
<span style="color: #BA2121">"os"</span>
<span style="color: #BA2121">"strconv"</span>
)
<span style="color: #008000; font-weight: bold">func</span> main() {
<span style="color: #008000; font-weight: bold">if</span> <span style="color: #008000">len</span>(os.Args) <span style="color: #666666">!=</span> <span style="color: #666666">2</span> {
fmt.Fprintf(os.Stderr, <span style="color: #BA2121">"usage: %s <elem_count>\n"</span>, os.Args[<span style="color: #666666">0</span>])
os.Exit(<span style="color: #666666">-1</span>)
}
count, err <span style="color: #666666">:=</span> strconv.Atoi(os.Args[<span style="color: #666666">1</span>])
<span style="color: #008000; font-weight: bold">if</span> err <span style="color: #666666">!=</span> <span style="color: #008000; font-weight: bold">nil</span> { <span style="color: #008000">panic</span>(err) }
file, err <span style="color: #666666">:=</span> os.Create(<span style="color: #BA2121">"vec_gen.out"</span>)
<span style="color: #008000; font-weight: bold">if</span> err <span style="color: #666666">!=</span> <span style="color: #008000; font-weight: bold">nil</span> { <span style="color: #008000">panic</span>(err) }
<span style="color: #008000; font-weight: bold">defer</span> file.Close()
bw <span style="color: #666666">:=</span> bufio.NewWriter(file)
<span style="color: #008000; font-weight: bold">defer</span> <span style="color: #008000; font-weight: bold">func</span>() {
err <span style="color: #666666">:=</span> bw.Flush()
<span style="color: #008000; font-weight: bold">if</span> err <span style="color: #666666">!=</span> <span style="color: #008000; font-weight: bold">nil</span> { log.Fatal(err) }
}()
<span style="color: #008000; font-weight: bold">for</span> i <span style="color: #666666">:=</span> <span style="color: #666666">0</span>; i < count; i<span style="color: #666666">++</span> {
r <span style="color: #666666">:=</span> rand.Int31()
fmt.Fprintf(bw, <span style="color: #BA2121">"%d\n"</span>, r)
}
}
</pre></div>
<p>Ruby:</p><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><span style="color: #008000; font-weight: bold">def</span> <span style="color: #0000FF">main</span>()
<span style="color: #008000; font-weight: bold">if</span> <span style="color: #880000">ARGV</span><span style="color: #666666">.</span>length <span style="color: #666666">!=</span> <span style="color: #666666">1</span> <span style="color: #008000; font-weight: bold">then</span>
<span style="color: #008000">warn</span> <span style="color: #BA2121">"Usage: </span><span style="color: #BB6688; font-weight: bold">#{</span><span style="color: #19177C">$0</span><span style="color: #BB6688; font-weight: bold">}</span><span style="color: #BA2121"> <elem_count>"</span>
<span style="color: #008000">exit</span> <span style="color: #666666">-1</span>
<span style="color: #008000; font-weight: bold">end</span>
count <span style="color: #666666">=</span> <span style="color: #880000">ARGV</span><span style="color: #666666">[0].</span>to_i
file <span style="color: #666666">=</span> <span style="color: #880000">File</span><span style="color: #666666">.</span>open <span style="color: #BA2121">"vec_gen.out"</span>, <span style="color: #BA2121">"w"</span>
upper <span style="color: #666666">=</span> <span style="color: #666666">2147483647</span>
<span style="color: #008000; font-weight: bold">for</span> i <span style="color: #008000; font-weight: bold">in</span> <span style="color: #666666">1.</span>.count <span style="color: #008000; font-weight: bold">do</span>
r <span style="color: #666666">=</span> <span style="color: #008000">rand</span>(upper)
file<span style="color: #666666">.</span>write r
file<span style="color: #666666">.</span>write <span style="color: #BA2121">"</span><span style="color: #BB6622; font-weight: bold">\n</span><span style="color: #BA2121">"</span>
<span style="color: #008000; font-weight: bold">end</span>
file<span style="color: #666666">.</span>close
<span style="color: #008000; font-weight: bold">end</span>
main()
</pre></div>
<a name="updates"></a><h3>Updates</h3><dl><dt>2012-06-03 2100</dt><dd><p>Reflect the Makefile switch to the OCaml native compiler;
I was previously using the bytecode compiler.</p></dd><dt>2012-06-04 1500</dt><dd><p>Add Ruby, because I seem to remember enough of it.</p></dd><dt>2012-06-04 1930</dt><dd><p>Update Rust numbers per <a href="http://blog.cdleary.com/2012/06/simple-selfish-and-unscientific-shootout/#comment-547477021">Graydon's comment</a>.
The code under test remained unchanged for the entry's inline results table.</p></dd></dl>Committers beware2012-05-18T17:00:00-07:00Chris Learyhttps://twitter.com/cdleary6e6f2f76-bf43-4312-8dad-9521be8140c1<a name="committers-beware"></a><div class="line-block"><div class="line">Toiling away with hand swept clocks</div><div class="line">Meticulously combed-through kilo-SLOCs</div><div class="line">More and more features borne to bear, but</div><div class="line">For all continents, a continent unaware.</div></div><div class="line-block"><div class="line">Streams of commits slake developer thirst</div><div class="line">Screams from sales pitches, ever averse</div><div class="line">Product with no need but a product indeed, as</div><div class="line">People with Real Problems want and bleed.</div></div><div class="line-block"><div class="line">Words on a page, referred to as "plan", but</div><div class="line">Equivocate: business, science fair, fighting the man?</div><div class="line">Wanton tech fails on bang per buck</div><div class="line">Without users, committer, your work doth suck.</div></div>Website Mania2012-05-06T18:15:00-07:00Chris Learyhttps://twitter.com/cdleary94d10d63-923b-4c8f-bbda-05d4918b9e02<a name="website-mania"></a><blockquote><p>By the 1840s, a "Railroad Mania" was raging, with stocks selling on
multiples of passenger miles, a precursor for multiples of page views that
Yahoo stock would trade on 150 years later. An inventor named Charles
Babbage complained that "the railroad mania withdrew from other pursuits
the most intellectual and skillful draftsmen." [...] Investors made money,
investors lost money, but in the best and worst of times, the railroads got
built, and people and goods were shuffled about.</p><p class="attribution">—<a href="http://www.amazon.com/How-Got-Here-Irreverent-Technology/dp/B000ENWIH6?tag=honetoasegf-20">How We Got Here, A Slightly Irreverent History of Technology and Markets</a></p></blockquote><p>I think that, these days, Babbage would be looking for the systems programmers.</p>Using C89 in 2012 isn't crazy2012-02-29T19:30:00-08:00Chris Learyhttps://twitter.com/cdleary963335ff-ba6a-4d42-b3ac-b819abcfd042<a name="using-c89-in-2012-isn-t-crazy"></a><p>The first group I worked with in industry wrote the compiler in C and made fun of C++ on a regular basis. The second group I worked with in industry wrote the compiler in C++ and made fun of C on occasion. Like most systems programmers I've met, they were a loveable, but <em>snarky</em> bunch!</p><p>In any case, I've seen life on both sides of the fence, and there's <strong>really simple reasoning that dictates what you choose</strong> from C89, GNU C, C99, C++98, or C++11 in the year 2012 AD:</p><ul><li><p><strong>Use the language that you're most productive in</strong>, given that you</p></li><li><p><strong>Use the language which is compatible with your target market.</strong></p></li></ul><p>If this sounds simple, you're lucky!</p><p>Life gets a little bit more interesting when the match is fuzzy: you <em>could</em> make a strategic gamble and (at least initially) ignore parts of your "maximal" target market to gain some productivity. If you're under the gun, that may be the right way to go.</p><p>But then again, keeping your options open is also important. The wider the target market the more people you can give an immediate "yes" to. I have to imagine that phone calls like this can be important:</p><blockquote><p>[A sunny afternoon in a South Bay office park. Just outside, a white Prius merges three lanes without activating a blinker. Suddenly, the phone rings.]</p><p>Nice to hear from you, Bigbucks McWindfall! What's that? 
You say you want my code to run as an exokernel on an in-house embedded platform with an in-house C89 toolchain? No problem! We'll send a guy to your office to compile our product and run tests tomorrow morning.</p></blockquote><p>Suffice it to say that there are legitimate considerations. Consider that GCC isn't <em>everywhere</em> (though I love how prevalent it is these days!) and it certainly doesn't generate the best code on every platform for every workload. Consider that MSVC can only compile C89 as "real" C (as opposed to a C++ subset). Consider that the folks out there who have custom toolchains probably have them because they can afford them.</p><p>There are benefits to taking a dependency on a lowest common denominator.</p>Paradox of the generalist2012-02-27T22:30:00-08:00Chris Learyhttps://twitter.com/cdlearyb0e1e2b4-e58d-4a2d-866a-699e0963fb43<a name="paradox-of-the-generalist"></a><p>Classic management advice is to build a republic: each team member specializes in what they're good at. It just makes sense.</p><p>You nurture existing talents in an attempt to ensure personal growth; simultaneously, you fill niches that need filling, constructively combine strengths, and orchestrate sufficient overlap in order to wind up with a functioning, durable, kick-ass machine of a team. A place for everyone, everyone in their place, and badassery ensues! (So the old saying goes...)</p><p>But what if, instead, you could simultaneously fork off N teams, one for every team member, and make that team member simultaneously responsible for <em>everything</em>? What would happen to the personal knowledge, growth rate, and impact of each member?</p><p>Let's take it one step further: imagine <em>you're</em> that team member. All of a sudden it sounds terrifying, right? If you don't know it, nobody does. If you don't do it, nobody will. If you don't research it, you'll have no idea what it's about. If you don't network, no contacts are made. 
If you don't ship it, you know it will never change the firm/industry/world.</p><p>So, you think like you've been trained to think: you disambiguate the possible results. What could happen? Maybe you'd crumble under the pressure. Maybe you wouldn't be able to find your calling because you're glossing over the details that make you an artisan. Maybe you'd look like a fool. Maybe you would ship totally uninteresting crap that's all been done before.</p><p>But, then again, maybe you would grow like you've never grown before, learn things that you never had the rational imperative to learn, talk to interesting people you would have never talked to, ship a product that moves an industry, and blow the fucking lid off of a whole can of worms.</p><p>And so we arrive at one tautological cliché that I actually agree with: you never know until you try. And, if you choose wisely, you'll probably have a damn good time doing it.</p><p>At the least, by definition, you'll learn something you couldn't have learned by specializing.</p>