<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Paul Smith</title>
    <link>https://pauladamsmith.com/</link>
    <description>
      Software engineer. Co-founder, Ad Hoc.
    </description>
    <language>en-us</language>
    <lastBuildDate>Wed, 18 Oct 2023 19:12:15 -0000</lastBuildDate>
    <item>
      <title>The 10 Year Anniversary of the HealthCare.gov Rescue</title>
      <link>https://pauladamsmith.com/blog/2023/10/the-10-year-anniversary-of-the-healthcare.gov-rescue.html</link>
      <guid>https://pauladamsmith.com/blog/2023/10/the-10-year-anniversary-of-the-healthcare.gov-rescue.html</guid>
      <pubDate>Wed, 18 Oct 2023 19:12:15 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p>Ten years ago today, on Friday, October 18, 2013, the effort to <a href="/blog/2014/03/fixing-healthcare.gov.html">fix
HealthCare.gov</a> began in earnest. At
7:15 A.M. Eastern time, a small group assembled next to the entrance to the West
Wing of the White House. The group included Todd Park, Brian Holcomb, Gabriel Burt,
Ryan Panchadsaram, Greg Gershman, and myself. Later in the day we were joined by
Mikey Dickerson via a long-running speakerphone call. Some of us were from
outside of government (Gabe, Brian, Mikey, and me), and the others had jobs in
government at the time (Todd, Ryan, and Greg). What we all had in common was
that we were experienced technologists, having been at startups or at large
established technology organizations.</p>
<p>The members of our group were selected by Todd, working with Greg and Ryan and
others behind the scenes to identify people who could help because they had that
kind of technology experience. HealthCare.gov, having launched days earlier on
October 1, 2013, wasn't working. From the vantage point of the country's top
political leadership, it was clear outside help was needed. Todd, the CTO
of the United States at the time, which is a position in the White House, was
tapped to help fix it. His plan was to provide reinforcements to complement the
team of government employees and contractors that had built HealthCare.gov and
were in the midst of operating it. We were to be a small team, very discreet.
Todd was our leader. It was already a high-pressure, stressful situation, so
insertion into that context meant blending in, not blowing things up. It was to
be a low-key mission of information gathering and assessment, not the cavalry
storming in. Todd told us the next 30 days would be critical. The goal was to
enroll 7 million people by March 31, 2014, the end of the period known as open
enrollment. When the media was eventually informed of our existence by the White
House, we were referred to as &quot;the tech surge&quot;.</p>
<p>When Todd called me two days prior on October 16 to ask me if I would join the
effort, he didn't have to explain the stakes. I understood what it meant for the
website to work. I immediately agreed, and put on hold what I was doing, which
happened to be raising money for a startup I had founded. I took Todd's call
while walking the grounds of the Palace of Fine Arts in San Francisco, having
met with VCs earlier in the day. I was living in Baltimore at the time with my
wife and toddler daughter. I flew back home right away, and before I knew it I
was taking the earliest morning train I could from Camden Yards to DC's Union
Station. I thought I had timed it right, but still wound up running across
Pennsylvania Avenue so as not to be late.</p>
<figure>
    <img src="/images/hc.gov-10-years/whitehouse.jpg"
         alt="Photo of the White House">
    <figcaption>Photo by me as I hustled. Metadata says taken at 7:11 A.M., so must have just made it</figcaption>
</figure>
<p>We couldn't have started any sooner even if we had wanted to. The federal
government had shut down on October 1, the same day HealthCare.gov had launched.
The shutdown prevented anyone from outside the main team working on
HealthCare.gov from coming in to help. So while days passed with the news
dominated by the twin stories of the shutdown and the slow-moving catastrophe of
the launch, a vacuum of information formed, as well as a surplus of speculation
and worry. The White House couldn't figure out what was wrong with it, and the
implications of it failing were troubling. The Affordable Care Act, the
signature domestic policy achievement of President Obama's tenure, had gone into
effect, and the website was to be the main vehicle for delivering the benefits
of the law to millions of people. If they didn't know what was wrong with
HealthCare.gov, other than that it was manifestly not working, plain for
everyone to see, and therefore might not be able to fix it, what would that
mean for the fate of health care reform? Fortunately, the shutdown ended on
October 17, which meant we could get to work and finally understand what was
going on.</p>
<p>Some of us already knew each other, but everyone was new to someone else. We
introduced ourselves, headed inside, and after breakfast in the Navy Mess,
headed upstairs to the Chief of Staff's office. Denis McDonough shook our hands
as we somewhat awkwardly stood in a line. He asked us directly, &quot;Can you fix
it?&quot; In our nervous energy, I remember some of us blurting out, &quot;Yes.&quot; We had
confidence, but we also were eager to dive in, learn as much as we could, and
get going.</p>
<p>A van was procured, along with a driver. We piled in and headed across the Mall
to the Hubert H. Humphrey Building, headquarters of the US Department of Health
&amp; Human Services (HHS). Entering the lobby, we passed Secretary Kathleen
Sebelius. She wasn't there for us, she was welcoming federal employees back to
work after the shutdown. Our meeting there was with Marilyn Tavenner and her
staff. Tavenner was the Administrator of the Centers for Medicare &amp; Medicaid
Services (CMS), the largest organization in HHS, and the owner of
HealthCare.gov.</p>
<figure>
    <img src="/images/hc.gov-10-years/visitor-badge.jpg"
         alt="Photo of my faded visitor badge to HHS">
    <figcaption>My thermal printed visitor badge photo is pretty faded after 10 years</figcaption>
</figure>
<p>Our meeting with Tavenner and her team yielded our first details of the system
that was HealthCare.gov. We learned how it was structured and what its main
functional components were. We heard first hand what CMS leadership knew about
what was wrong, or at least where in the system they could see that things were
not working. It was our first sense of the size and complexity of the site,
both in terms of functionality, but also in terms of the number of contractors,
sub-components, and business rules such as for eligibility. But remember, these
were not technical people - they were health policy experts and administrators.
Most of the things they were reporting to us were business metrics about the
site, descriptions of high-level performance. It was helpful to hear this
perspective, and indeed many of these metrics would drive our later work. But
this would not yet be the time to learn about the technical challenges the team
was facing.</p>
<p>Our morning continued back in the van, leaving DC for CMS headquarters in
Woodlawn, Maryland, just outside Baltimore, a 45-minute drive. Here we met with
the leadership of HealthCare.gov itself, a group of people including Michelle
Snyder, CMS's COO, Henry Chao, one of its main architects, and Dave Nelson, a
director with a telecom background, who was being elevated and would oversee
much of the rescue work from CMS's perspective. The sketchy picture of
HealthCare.gov we had was coming into sharper relief. We learned about
deployment challenges and bottlenecks, more about where specifically users were
getting &quot;stuck&quot; using the site, and we started to hear bits and pieces about the
particular technologies being used, including something called MarkLogic, an XML
database, which was new to us. We even started to get some details about the
deployment architecture and the types of servers involved. Again, the theme of
complexity stood out. But we were also still talking at a fairly high level. To
really understand what was wrong with HealthCare.gov, we'd have to move on.</p>
<p>The afternoon was spent a few miles down the road back toward DC in Columbia,
Maryland, at something called the XOC, or Exchange Operations Center. This was
to be the mission-control-style hub of operations for HealthCare.gov. We found a
room staffed with a few contractors and some CMS employees. (The XOC eventually
would be the site of so much activity during the rescue that a scheme to keep
people out was needed.) Here was where, away from the leadership-filled rooms,
we finally heard from technologists who were directly working on the site. What
we heard was troubling. There was a lack of confidence in making changes to the
source code. A complex code generation process governed much of the site and
produced huge volumes of code, requiring careful coordination between teams.
Testing was largely a manual process. Releases and deployment required lengthy
overnight downtimes and often failed, requiring equally lengthy rollbacks. The
core data layer lacked a schema. Provisioning hosting resources was slow and not
automated. Critically, there was a pervasive lack of monitoring of the site. No
APM, no bog-standard metrics about CPU, memory, disk, network, etc., that could
be viewed by anyone on the team. By this point, the looks we exchanged amongst
ourselves betrayed our fears that the situation was much worse than we initially
expected. I carried with me that day a Moleskine notebook that I furiously
scribbled in so as not to forget what I was hearing. You can see my panic
start to rise, reflected in my writing, as the day wore on.</p>
<figure>
    <img src="/images/hc.gov-10-years/no-metrics.jpg"
         alt="Photo of notebook writing that says 'No metrics!!'">
    <figcaption>Incredulous that they didn't have monitoring</figcaption>
</figure>
<p>With that dose of reality and the evening setting in, we collected ourselves
back in the van and set off for our final field trip of the day, to suburban
Virginia for the offices of CGI, the prime contractor of HealthCare.gov. In
Herndon, we were greeted by many people on their leadership team and the
technical teams who had built the site, even though it was getting on past
business hours at this point. This was as much an interview of us by them as it
was a chance for us to ask questions. We had a brief opportunity to ingratiate
ourselves and win their trust. We did that in part by showing our eagerness to
dig in on some of the things we had learned earlier in the day, proving our
engineering bona fides (specifically with high-traffic websites), and
emphasizing that we were not there for any other reason than to help. This was a
team on edge, under the gun and exhausted. We needed them to succeed, and they
weren't going to work with us if they thought we were there to cast blame.</p>
<p>We asked many questions, but they mostly boiled down to, what's wrong with the
site, and how do you know what's wrong? Show us where in the system this or that
component isn't performing the way you expected. And they and CMS mostly
couldn't do that. They had daily reporting and analytics that produced those
high level business metrics. But again there was that lack of monitoring of the
system itself, real-time under load. So we focused on that.</p>
<p>It was getting late. Could we throw a Hail Mary? Walk out of that building
leaving behind something of tangible value, something promising they could build
on? We said, there's lots of APM-style monitoring services, but we're familiar
with New Relic. Could we install it on a portion of the servers? Yes, we know
there's a complicated and fragile release process, but if we bypassed that and
just directly connected to some of the machines, we could install the agent on
them and be receiving metrics almost immediately. Glances were exchanged. A CMS
leader in the room made a call on their cell phone - we actually have some
New Relic licenses already, we can use those. There was also hesitancy about
whether this kind of extraordinary, out-of-the-norm request would be approved -
clearly, even during this period of turmoil, all the stakeholders stuck to the
regular release script. The CMS leader nodded their approval. A small group
assembled to marshal the change through. Many folks had stuck around, even
though it was nearing midnight. Then on a flatscreen in the conference room, we
pulled up the New Relic admin, and within moments, the &quot;red shard&quot; (the subset of
servers we had chosen for this test) was reporting in. And there it was - we
could see clearly the spike in request latency, even at this late hour, that
indicated a struggling website, along with error rate, requests per minute, and
other critical details that had basically been invisible to the team to that
point. Imagine a hospital ICU ward without an EKG monitor. Now that they knew
exactly in what ways it was bad, they could start to correlate them with the
business metrics and other aspects of the site that needed improving. They could
then prioritize the fixes and actions that would yield the biggest improvements.</p>
<figure>
    <img src="/images/hc.gov-10-years/rooseveltroom.jpg"
         alt="Photo of the Roosevelt Room in the White House">
    <figcaption>Photo by me. Taken at 1:50 A.M. on Saturday, October 19, 2013.</figcaption>
</figure>
<p>That would come later. For now, exhausted, well past midnight, we left Herndon
and rode the van back to DC. At roughly 2 A.M. we reconvened in the Roosevelt
Room in a completely silent White House for a quick debrief. As we reflected on
what we had just experienced, another thought was settling in over the group -
that this was obviously not over by any stretch, that none of us were going home
any time soon, that the challenge was much larger in scope than we had imagined,
that any notion we may have had at the outset of possibly just offering some
suggested fixes and moving on was in retrospect hilariously naive, that this was
all we were going to be doing for the foreseeable future until the site was
turned around. Indeed, we all managed to find a few hours of sleep in a nearby
hotel and then were right back at it in the morning, heading straight out to
Herndon.</p>
<p>This was just day one, a roughly 18-hour day, and it certainly wasn't the last
such marathon. Over the next two-and-a-half months until the end of December,
the tech surge expanded and took on new team members, experienced many remarkable
events and surprises, and ultimately, successfully helped turn HealthCare.gov
around. Millions enrolled in health care coverage that year, many for the
first time. I hope to tell more stories of how that happened over the next
few weeks and months.</p>
<iframe src="https://www.google.com/maps/d/u/0/embed?mid=1fMkJP6sSr7p8hCDUXwNbX6cnCsNQuqg&ehbc=2E312F&noprof=1" width="770" height="480"></iframe>

      ]]></content:encoded>
    </item>
    <item>
      <title>Oppenheimer</title>
      <link>https://pauladamsmith.com/blog/2023/07/oppenheimer.html</link>
      <guid>https://pauladamsmith.com/blog/2023/07/oppenheimer.html</guid>
      <pubDate>Tue, 25 Jul 2023 04:00:00 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p>I saw “<a href="https://letterboxd.com/film/oppenheimer-2023/">Oppenheimer</a>” this weekend at the <a href="https://musicboxtheatre.com">Music Box</a> in Chicago, in its theater capable of projecting 70mm film. The movie is a huge achievement, a complex piece of art that nonetheless tells an efficient story in spite of a 3 hour running time. Here are a few thoughts on the visual storytelling that was presented on screen.</p>
<p>The central question “Oppenheimer” asks is, to whom or what is J. Robert Oppenheimer bound? Director Christopher Nolan spares us a hoary answer like “the truth!”, but early on we are told it is to theory and mathematics, wherever they may lead. They first lead Oppenheimer out of the lab and into the arms of pre-war continental quantum physicists, who are forging a nascent field of inquiry that our hero immediately grasps and excels at. He becomes friends with physicist Isidor Rabi while in Germany, bonding over their shared New York Jewish backgrounds. We see Rabi nurture Oppenheimer several times, offering food from his pocket and quiet counsel, evincing an almost maternal protective quality. Oppenheimer, back in the States, forms allegiances with students and organizers of various labor movements, including the Communist Party, but weakly, always as or via a proxy (using the party as a channel to fund the Republicans of the Spanish Civil War; as a means to start his relationship with Jean Tatlock). Several scenes hint at deeper connections, but cut away before he does anything incriminating (“but that would be treason …”). Of course his “true” allegiance, or lack thereof, to the Communist Party and to the Soviet Union hangs over the balance of the post-test movie, which seems content to leave it unanswered, or perhaps, answered sufficiently by his other deeds, to which several characters give voice.</p>
<p>What of Oppenheimer’s non-professional bonds? He’s prepared to boil off his child like so many neutrons in a fission reaction, driving the colicky baby to a friend’s house in hopes of being relieved of parental duties. In spite of his affairs and scarce evidence of marital happiness, his relationship with his wife Kitty is shown to endure (“we’re adults, we’ve been through fire together”, their own form of fusion), surviving at least to his public image rehabilitation late in life. Frank Oppenheimer, kept to the periphery early on, becomes essential to the triumph at Los Alamos, reuniting the brothers on the same mesa where they forged a connection to the land, a feel for the weather of the desert. He pursues Jean Tatlock with flowers and is repelled; she later makes her own pursuit, reminding him of his off-hand oath, “you said you’d always answer” — a promise he’s now incapable of keeping, acted on by multiple forces much stronger than she.</p>
<p>Oppenheimer becomes an attractive force himself to build Los Alamos, cajoling scientists and convincing the US military, overcoming each group’s respective reservations, the former about the endeavor itself, the latter about him. He’s the nucleus of the most important thing that’s ever happened, to quote Gen. Groves, but we see him often distracted, gazing into the middle distance, drifting off to Chicago or San Francisco, giving misleading testimony to the army’s quietly menacing interrogator. Still, it’s Oppenheimer keeping the energetic particles of scientists around him from flying off or creating ruinous inter-personal explosions. They eke out just enough collaboration and luck to blow up the gadget before Potsdam — immediately, the military dissolves their bond to the man who secured their supremacy (“we got it from here”). We see Oppenheimer unmoored, isolated, radiating out his misgivings and the horror of his revelations.</p>
<p>In a pivotal scene, Oppenheimer dismisses a report of a nuclear chain reaction, stating that theory proves it can’t be so. When his neighbor colleague reproduces the experiment, he immediately forms a new theoretical understanding from it — the bond to pure theory is broken, and a new one connecting theory and practice is made. It takes him no time again to accelerate to the logical end of the implications, and this time, theory must wait for practice to catch up. When the new experiment is finished, first at Trinity and then at Hiroshima and Nagasaki, it’s no longer about what the physics demonstrates, but what it means for the notion of humanity and civilization: his revulsion at the reception of the bomb among his peers and the public is shown itself as a terrible blinding fire, one that now lives within him.</p>
<h2>Stray observations</h2>
<ul>
<li>“Oppenheimer” centers language throughout, and positions Oppenheimer as a language savant in technical ways, but deficient in others. He learns enough Dutch to teach physics in Europe. He reads Marx in the original German. He quotes Sanskrit to his lover. He’s also a translator, bringing the foreign language of quantum mechanics to the United States, and bridging the gap between academics and the military. He fails to learn the language of Washington, and his character is assassinated as a result. This bedevils Kitty, who can’t understand how a man can be so proficient in one domain and so passive when it comes to himself and his family.</li>
<li>There’s a rich symbolic history to the apple, and one figures prominently in early scenes. First as an impulsive attempt on his professor’s life. We see Oppenheimer stab the apple, which is a pre-quantum apple, Newton’s apple; the needle with which he injects the cyanide is a dagger to Newtonian physics, on which Einstein, whom we encounter multiple times including in despair in the final scene, inflicted the first mortal wound, and which quantum mechanics and the bomb finally killed. And then as the poison fruit of knowledge carried by Niels Bohr, who introduces him to quantum mechanics, which leads to our expulsion from Eden when the atomic weapon is used.</li>
<li>The circular badges the scientists wore at Los Alamos had labels like “K-16” and “C-43”, which I’m sure were for some organizational purpose the film doesn't explain (as far as I remember); they made me think of the scientists as personified isotopes from the periodic table of the elements.</li>
<li>The act III Strauss plot was less successful for me than the rest of the film. The revelation that he was humiliated and vindictive and then used the apparatus of official power to seek his revenge didn’t land in the way I think was intended, perhaps because, while certainly despicable, it’s neither shocking nor particularly novel, even discounting our recent experiences with vengeful politicians. Setting that aside, from a storytelling point of view, as an extended denouement after the wallop of the Trinity event, it has to work extra hard to sustain a clear narrative focus, and I felt the film suffered overall for it.</li>
<li>Kudos to Jennifer Lame, who edited the film, for making scene after scene of extensive dialogue so compelling and propulsive. And to Hoyte van Hoytema, the director of photography — it’s tonally gorgeous and lightly under-saturated in a way that serves the somber mood. What more can be said about 70mm film? It just glows. It’s very much worth seeing in a theater.</li>
</ul>

      ]]></content:encoded>
    </item>
    <item>
      <title>Fixing bufferbloat on your home network with OpenBSD 6.2 or newer</title>
      <link>https://pauladamsmith.com/blog/2018/07/fixing-bufferbloat-on-your-home-network-with-openbsd-6.2-or-newer.html</link>
      <guid>https://pauladamsmith.com/blog/2018/07/fixing-bufferbloat-on-your-home-network-with-openbsd-6.2-or-newer.html</guid>
      <pubDate>Wed, 04 Jul 2018 01:22:00 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p>My home network (which is also my work network) is a standard-issue Comcast cable hookup. In spite of a tolerable 120 megabits down, my experience of daily Internet use is regularly frustrating. Video streams and video chats drop in quality inexplicably. SSH sessions become laggy. Web pages fail to load quickly, and then seem to appear all at once. Even though I should have plenty of bandwidth, the feeling is often one of slowness, waiting, data struggling to get through the pipes.</p>
<p>The reason for this is a phenomenon called &quot;bufferbloat&quot;. I'm not going to explain it in detail; there are plenty of good resources to read about it, including the eponymous <a href="https://www.bufferbloat.net/projects/bloat/wiki/Introduction/">Bufferbloat.net</a>. Bufferbloat is the result of complex interactions between the software and hardware systems routing traffic around on the Internet. It causes higher latency in networks, even ones with plenty of bandwidth. In a nutshell, software queues in our routers are not letting certain packets through fast enough to ensure that things feel interactive and responsive. Pings, TCP ACKs, and SSH connections are all being held up behind a long line of packets that may not need to be delivered with the same urgency. There's enough bandwidth to process the queue; the trick is to do it more quickly and more fairly.</p>
<p>Fortunately, because bufferbloat is in part a function of how we configure our routers, it's within our ability to solve the problem. But first, we have to diagnose it, and establish a concrete baseline to improve from. The <a href="http://www.dslreports.com/speedtest/">speed test at dslreports.com</a> tests for bufferbloat in addition to download and upload speeds, so we'll use that tool to see how we're doing.</p>
<p>First, I run the speed test, and get the following results:</p>
<p><img src="/images/bufferbloat/before.png" alt="speed test results - before fixes" /></p>
<p>Here you can see the issue starkly: 120 Mbps down and 12 Mbps up yields an &quot;A+&quot; grade (debatable), but we get an &quot;F&quot; for bufferbloat.</p>
<p>We define bufferbloat here as the increase in latency of a standard ping while downloading or uploading a large file, compared to ping times when the link is otherwise quiescent.</p>
<p>In our case, idle latency averages 12ms, download bloat is about 660ms, and upload bloat is about 280ms.</p>
<p>The fix is to apply a queue management strategy to our router. Ordinarily, I'd be wary of this. In my experience, QoS administration tends to be fussy and full of unintended consequences. I always felt as if I had cast too broad a net, inadvertently degrading overall network performance to get slightly better results from one application. And I wasn't sure around what fixed-point I was optimizing. In this case, bufferbloat gives us the measurable target. Administration is made much easier by the appearance of a new algorithm that's easy to apply to network interfaces. It doesn't require much tuning, and you don't need to futz with individual ports or percentages.</p>
<p>Details vary widely by router operating system and administrative UIs. In our case, the router is running <a href="http://openbsd.org/">OpenBSD</a>. (And if yours isn't, why not? Get a <a href="https://www.pcengines.ch/apu2.htm">PC Engines board</a>, throw obsd on it, and you have an inexpensive solution with world-class security, efficiency, and performance, that's simple to operate and well-documented.) The OpenBSD way of being a router is through its <a href="https://www.openbsd.org/faq/pf/"><code>pf</code></a> system, which is analogous to Linux's iptables, but much more capable and efficient. Since <a href="https://www.openbsd.org/62.html">6.2</a>, <code>pf</code> has implemented something called &quot;FQ-CoDel&quot;, which is an algorithm for scheduling packets fairly and is designed to prevent bufferbloat. It is exposed via the <code>flows</code> option on a <code>queue</code> rule. In principle, all we need to do is add two rules, one to fix uplink bufferbloat and one to fix downlink. Let's see how this goes.</p>
<p>In our <code>/etc/pf.conf</code>, we first add a single line to handle the uplink. This will apply a FQ-CoDel queue to the network interface attached to our WAN link, or the cable modem in our case. The way to think about it is, FQ-CoDel is a strategy applied to outbound packets only, as they exit the interface, so even though the WAN interface is duplex up and down, in order to handle the downlink part we'll apply it to the network interface connected to our LAN, which we'll do next.</p>
<p>An important detail. In order for the queue algorithm to do its thing, it needs to know the bandwidth of the outbound link. According to Mike Belopuhov, the implementor of FQ-CoDel in OpenBSD, we need to <a href="https://www.reddit.com/r/openbsd/comments/75ps6h/fqcodel_and_pf/doca4uv/">specify 90-95% of the available bandwidth</a>. Fortunately, we've just measured it.</p>
<p>The line to add to <code>pf.conf</code> to fix bufferbloat on the uplink is (assuming <code>em0</code> for the WAN interface):</p>
<pre><code>queue outq on em0 flows 1024 bandwidth 11M max 11M qlimit 1024 default
</code></pre>
<p>A couple of notes. <code>outq</code> is a label we give, but it's an opaque string to <code>pf</code>. <code>11M</code> means 11 megabits per second (92% of the uplink bandwidth). <code>qlimit</code> is also specified explicitly, because its default value of 50 is too low for FQ-CoDel. The <code>default</code> keyword is required.</p>
<p>And that's it: we don't need to alter our filtering rules to assign packets to a queue: all outbound packets on this interface are assigned to our new queue.</p>
<p>Now let's reload <code>pf</code> with the config change, and re-run the speed test.</p>
<pre><code>$ doas pfctl -n -f /etc/pf.conf &amp;&amp; doas pfctl -f /etc/pf.conf
</code></pre>
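<p>Before re-running the test, it's worth a quick sanity check that the new queue is installed and seeing traffic. A minimal way to do that (the <code>-v</code> flag adds packet counters to the queue listing):</p>
<pre><code>$ doas pfctl -s queue -v
</code></pre>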
<p><img src="/images/bufferbloat/after-uplink.png" alt="speed test results - after uplink fix" /></p>
<p>Uplink latency under load is now down to 17ms on average, from 280ms. That's a mere 5ms worse than idle.</p>
<p>(I discount the apparent decrease in uplink bandwidth from this test result. In my experience, dslreports.com could vary by 10-15% in reported bandwidth run-to-run, but over time it converged on 12 Mbps.)</p>
<p>The downlink fix is nearly the same, we just adjust for the name of the interface (the LAN NIC is called <code>em1</code>) and for its 90-95% bandwidth upper bound, which is 110 Mbps.</p>
<pre><code>queue inq on em1 flows 1024 bandwidth 110M max 110M qlimit 1024 default
</code></pre>
<p>Reload, re-run:</p>
<p><img src="/images/bufferbloat/after-downlink.png" alt="speed test results - after downlink fix" /></p>
<p>Always nice to get an A. Downlink latency under load is now 24ms, from 660ms.</p>
<p>I haven't elided much; I think that's a pretty decent result for two lines of config. If you want to go further, there's a <code>quantum</code> knob to turn (baseline is your NIC's MTU, but look at what OpenWRT does for guidance), but that's about it.</p>
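<p>For illustration only, a tuned version of our uplink rule might look like the following. This is a sketch, not a measured recommendation; the 300-byte quantum mirrors what OpenWRT's SQM scripts use for slower links.</p>
<pre><code>queue outq on em0 flows 1024 bandwidth 11M max 11M qlimit 1024 quantum 300 default
</code></pre>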
<p>Post-fix, my observation is that things feel much snappier. Aside from the ping time improvements, I don't have other measurements to cite. But so far, FQ-CoDel seems to have fixed bufferbloat on my network and made for a substantially better experience.</p>

      ]]></content:encoded>
    </item>
    <item>
      <title>2016, my year in review</title>
      <link>https://pauladamsmith.com/blog/2017/01/2016-my-year-in-review.html</link>
      <guid>https://pauladamsmith.com/blog/2017/01/2016-my-year-in-review.html</guid>
      <pubDate>Mon, 02 Jan 2017 08:12:15 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <h2>Sewing</h2>
<p>After years of looking at the sewing machine in the closet and saying I should
learn how to use it, I did something about it this year. Michelle got me a class
at a local crafts store as a gift, and I made a pillow cover. It turned out
pretty good, and I enjoyed doing it, so I kept at it. I didn't have a ton of
time this year to devote to it, but by the end of the year I had made some
blankets for friends and <a href="http://www.instructables.com/id/How-to-make-a-Quillow-blanketpillow/">quillows</a> for the girls, a pair of pajama
pants for Michelle, and repaired the lining of a friend's handbag. I'd like to
keep at it, maybe tackling Halloween costumes next year, and doing projects with
Maxine, like adding circuits to fabric à la <a href="https://www.adafruit.com/flora">FLORA</a>.</p>
<p><img src="/images/2016-yir/IMG_1899-COLLAGE.jpg" alt="collage of sewing" /></p>
<h2>Running</h2>
<p>I've run for exercise before, but never with any consistency. This year I set
out to run at least 2 times a week, at least 20 minutes per run. Turns out, I
liked it a lot. So much so that I was running almost every weekday by
April. Unfortunately, my knee didn't like it so much, and in the middle of a run
on the Bloomingdale Trail, it suddenly seized up and I had to take an Uber
home. Luckily, I hadn't done any damage, and an orthopedist said I was just
overdoing it. I took about a month off and ramped back up slowly. By October, I
peaked for the year, doing about 10 miles per week. In November, I ran in <a href="https://www.strava.com/activities/780925903">my
first 10k, the Lincolnwood Turkey Trot</a>. All told for 2016, I logged 156
miles. For 2017, I'd like to attempt a half-marathon, and double my yearly
mileage. <a href="https://www.strava.com/athletes/14254908">Here's my Strava profile</a>.</p>
<p><img src="/images/2016-yir/IMG_2195-COLLAGE.jpg" alt="collage of running" /></p>
<h2>Ad Hoc</h2>
<p><a href="https://adhocteam.us/">The company</a> I started with Greg in 2014 began the year with 7 people and ended
with 41. The growth was thanks to winning our first contracts: we had been
around long enough as a company and had enough &quot;past performance&quot;, as they say
in the industry, to compete and be awarded tasks on our own, instead of only
working as subcontractors. The contracts were with <a href="https://www.cms.gov/">CMS</a>, to continue our
work on HealthCare.gov, and with the Department of Veterans Affairs, to help
build Vets.gov. We also earned spots on two highly sought-after contract
vehicles, the ADELE BPA with CMS, and FLASH with the Department of Homeland
Security, that will have opportunities for us to bid on down the road. It
was a great year for Ad Hoc, and I'm proud of what we've built: a great team,
and useful software that is delivering actual benefits and services to real
people. For example, this year, we took over the core shopping part of
HealthCare.gov, known internally as Plan Compare. During open enrollment so far,
which is still going on until the end of January, over 3.5 million households
have enrolled for plans using our software. HealthCare.gov also saw its <a href="http://www.nytimes.com/2016/12/21/us/health-exchange-enrollment-jumps-even-as-gop-pledges-repeal.html">biggest
single-day enrollment tally ever</a>, on December 15th. I'm also proud that we're
proving out the model of providing modern software engineering and design
services to the government that are efficient, work well, cost less to build and
operate, and are just better than the status quo. There's a lot of uncertainty
ahead for the programs that we're working on. There's not much we can do about
that, other than continue to do the work we have in front of us until it
changes, and look for additional opportunities in state, local, and maybe
outside of government.</p>
<p><img src="/images/2016-yir/IMG_2396-COLLAGE.jpg" alt="collage of Ad Hoc" /></p>
<h2>House</h2>
<p>Michelle and I bought <a href="https://evanston.house/">a house</a> in south Evanston in January, and
we've been renovating it since. It's an old Victorian-style home from the 1890s
with a good stone foundation and timber frame. We're doing extensive changes to
the interior, with a new layout and all new flooring, doors and windows, and
systems like electrical and HVAC. The exterior is mostly unchanged from a
framing perspective, but we're updating the siding, and we dormered out part of
the sloping roof so we could have a master bedroom on the third floor. We're
also converting the garage into a two-story garage/coach house combo: the plan
is to have an office on the second floor for Michelle and me. We had hoped, at
the beginning of the year, to be moved in by fall, but these things go the way
these things go. As of this writing, we're about a month away from being able to
pull up stakes here in Logan Square. We've been working with our friend and
architect David Burns, which has been great. We spent time together with him
talking about what we wanted, and he came up with a vision, drew up detailed
plans, and has been managing the overall construction process. We hired Conrad
Szajna of <a href="http://formedspace.com/">FormedSpace</a> to be our general contractor, and he's hired a great
team of subcontractors and tradespersons to do the work.</p>
<p><img src="/images/2016-yir/23575874262_535fed7a16_o-COLLAGE.jpg" alt="collage of house" /></p>
<h2>Family</h2>
<p>The best part of my year was spending time with my family. Maxine started
Kindergarten at Lincoln Elementary School in Evanston, and Veronica has grown
into a full-fledged toddler. We didn't do as much travel as we would have liked
this year, but we enjoyed biking together (we got a trailer this year for the
girls), exploring our fair city, and working on projects, like the homemade
arcade Maxine and I have been building together.</p>
<p><img src="/images/2016-yir/IMG_1572-COLLAGE.jpg" alt="collage of family" /></p>
<p>A few other things of note from the year:</p>
<ul>
<li>We participated in the Volkswagen settlement and chose to have them buy back
our Jetta TDi. Good riddance. We bought a new Mazda CX-9 as a replacement.</li>
<li>We volunteered for the Hillary Clinton campaign, including taking Maxine up to
Kenosha, WI, to knock on doors for GOTV on election day. Well.</li>
<li>I continue to feel grateful in so many different ways, for dear old friends
and new ones we made this year, for our families immediate and extended, for
our relative health and wealth, for our general dumb luck to be this
fortunate and safe, recognizing just how contingent, random, and unlikely that is.</li>
</ul>

      ]]></content:encoded>
    </item>
    <item>
      <title>Looking at your program’s structure in Go 1.7</title>
      <link>https://pauladamsmith.com/blog/2016/08/go-1.7-ssa.html</link>
      <guid>https://pauladamsmith.com/blog/2016/08/go-1.7-ssa.html</guid>
      <pubDate>Tue, 16 Aug 2016 06:42:20 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
<p>Go 1.7—<a href="https://golang.org/dl/">out today!</a>—features a new
<a href="https://docs.google.com/document/d/1szwabPJJc4J-igUZU4ZKprOrNRNJug2JPD8OYi3i1K0/edit">SSA-based compiler backend</a>. SSA is a method of describing
low-level operations like loads and stores that roughly map to machine
instructions, with the special difference that SSA acts as though it has an
infinite number of registers. This is not especially interesting on its own,
except that it enables a class of well-understood optimization passes that make
the resulting binary smaller in code size and faster. The new release of Go is
an indication the implementation is maturing and starting to take advantage of
techniques and practices adopted in the <a href="http://llvm.org/">wider world of compiler technology</a>.</p>
<p>In addition to the performance benefits of the new SSA-based backend, there is a
suite of new tools that allow a developer to interact with the SSA
machinery. One such tool outputs the intermediate SSA statements, optimization
passes, and resulting Go-flavored assembly. This is done by setting the
environment variable <code>GOSSAFUNC</code> to the name of a function to disassemble when using the
<code>go</code> tool, for example:</p>
<pre><code class="language-shell">$ GOSSAFUNC=main go build
</code></pre>
<p>This invocation will output to the terminal, but the more interesting artifact
is an HTML file, named <code>ssa.html</code>, written out to the current directory. Open
the file in your web browser and you’ll see something like:</p>
<p><img src="/images/gossa/ssa.html.png" alt="screenshot of SSA" /></p>
<p>What you are looking at is a table with many columns extending to the right,
each one except for the first and last representing an optimization pass over
the preceding SSA form. (I counted 37 separate passes.) The first column is
the compiler's initial, unoptimized SSA output, and the last column is the
Go-flavored assembly that will be turned into machine code for the final
compiled binary executable or shared library.</p>
<p><img src="/images/gossa/ssahtmlscroll.gif" alt="anim gif of scrolling through SSA" /></p>
<p>While this can look intimidating to the uninitiated, SSA is relatively simple by
design -- each line represents either a value being assigned the result of an
instruction (i.e., one of the infinite number of registers), or a label of a
&quot;basic block&quot; (a set of statements, a.k.a. the things between curly braces in
source code), or the exit of a basic block which jumps execution to a different
basic block (e.g., control flow like an if-statement or returning from a function
call).</p>
<p>For example:</p>
<pre><code>v4 = Const64 &lt;int&gt; [42]
</code></pre>
<p>Means assign the 64-bit integer constant value 42 to the register labeled <code>v4</code>.</p>
<pre><code>b5: ← b4
  v15 = Copy &lt;mem&gt; v14
  v16 = StaticCall &lt;mem&gt; {runtime.printnl} v15
Call v16 → b6
</code></pre>
<p>Means <code>b5</code> is the label for a basic block with two statements. It concludes with
an exit <code>Call</code> instruction, taking program execution to another basic block,
<code>b6</code>, when returning from the function call that produces the <code>v16</code> value.</p>
<p>The tokens like <code>Const64</code>, <code>Copy</code>, and <code>StaticCall</code> are analogous to assembly
instructions like <code>MOV</code> and <code>LEA</code>.</p>
<p>One special operation is <code>Phi</code>, or a &quot;Phi node&quot;. Notice that a Phi node takes
two arguments, which are two values. Also notice that a basic block with a Phi
node has two basic block labels next to its own label, unlike every other basic
block:</p>
<pre><code>b3: ← b1 b2
   v20 = Phi &lt;int&gt; v4 v6
   ...
</code></pre>
<p>This is an interesting construct and it relates to program control flow. A basic
block is defined by having a single entry and a single exit point, and having a
set of statements that execute sequentially (i.e., no branching logic) in
between. And &quot;SSA&quot; stands for &quot;<a href="https://en.wikipedia.org/wiki/Static_single_assignment_form">single static assignment</a>&quot;, which means
that each value is assigned one and only one time. But what do you do if you
have a reference to a variable that could have different values depending on
which branch of an <code>if</code> statement the program took? A Phi node is a way of
resolving this apparent contradiction. Since each branch of an <code>if</code> statement by
definition assigns to a unique value, a Phi node coalesces them into the final
value depending on which branch was actually taken. So you can think of it as
the run-time retrieval of a value based on some condition. This is why the block
has two dependencies at the top rather than just one.</p>
<p>Let’s write a silly program to motivate some examples:</p>
<pre><code class="language-go">package main

func main() {
	x := 5

	if 1 &lt; 0 {
		x = -42
	}

	println(x)
}
</code></pre>
<p>Let’s start with the initial basic block, <code>b1</code>:</p>
<pre><code>b1:
  v1 = InitMem &lt;mem&gt;
  v2 = SP &lt;uintptr&gt;
  v3 = SB &lt;uintptr&gt;
  v4 = Const64 &lt;int&gt; [5]
  v5 = ConstBool &lt;bool&gt; [false]
  v6 = Const64 &lt;int&gt; [-42]
  v11 = OffPtr &lt;*int64&gt; [0] v2
If v5 → b2 b3
</code></pre>
<p>After some program initialization, <code>v4</code> is the assignment of the constant 5 to the
local var <code>x</code> in our code. Go knows at compile-time that <code>1 &lt; 0</code> is always
false so it just assigns false to <code>v5</code>. <code>v6</code> is the assignment of -42 to <code>x</code>
that will happen during program execution.</p>
<p>At the end we have the basic block exit, <code>If v5 → b2 b3</code>. This tests the truth
value of <code>v5</code> to decide whether to jump program execution to either <code>b2</code> (if
true) or <code>b3</code> (if false). This is similar to the following chunk of assembly:</p>
<pre><code class="language-asm">    JNZ b2
b3:
  ...
b2:
  ...
</code></pre>
<p>One nice thing about the Go SSA HTML view is you can click on any token in the
SSA form and it will highlight the references to and from that element.</p>
<p>
    <img alt="clicking on SSA elements" src="/images/gossa/ssabblocks.gif" class="no-100-pc-width">
</p>
<p>We can see from the different colors how the control flow will go. You can
visually connect the blocks of code that will execute and the assignments,
function calls, and additional branching that will result.</p>
<p>Clicking on the Phi node and its dependencies, you can see where the
possible values come from in previous control flow.</p>
<p><img src="/images/gossa/phinodehl.png" alt="highlighted Phi node" /></p>
<p>Moving on, the function call that prints out the integer value is in the
following basic block:</p>
<pre><code>b4: ← b3
  v9 = Copy &lt;int&gt; v20
  v10 = Copy &lt;int64&gt; v9
  v12 = Copy &lt;mem&gt; v8
  v13 = Store &lt;mem&gt; [8] v11 v10 v12
  v14 = StaticCall &lt;mem&gt; {runtime.printint} [8] v13
Call v14 → b5
</code></pre>
<p>The <code>StaticCall</code> instruction invokes the function from the Go runtime that is
specialized to format integer values and print them to the terminal. One
interesting thing to note is that the preamble to the call sets some things up in
memory, the location of which is fed to the <code>printint</code> function. If you notice,
<code>v11</code> refers back to the value set in <code>b1</code>, which is a pointer offset from <code>v2</code>,
which was set from the stack pointer <code>SP</code> near the top of the program
initialization. This makes sense, because the generated assembly needs
concrete memory locations to address when invoking functions taking pointers.</p>
<p>There’s much more to investigate here, including the particular optimization
passes, and tracing how individual instructions make their way through to the
final assembly or are eliminated. But hopefully this has given you an
introduction into SSA and how it maps to constructs in your applications.</p>

      ]]></content:encoded>
    </item>
    <item>
      <title>Modifying a Go slice in-place during iteration</title>
      <link>https://pauladamsmith.com/blog/2016/07/go-modify-slice-iteration.html</link>
      <guid>https://pauladamsmith.com/blog/2016/07/go-modify-slice-iteration.html</guid>
      <pubDate>Tue, 26 Jul 2016 04:26:51 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p><strong>Update:</strong> See a better way of doing this below.</p>
<hr />
<p>I'll often have a slice that I want to filter down on, removing elements based on some test, and I would prefer to modify the slice in-place for whatever reason, either because I want to retain the reference to the original slice or I don't want to allocate a new slice as a destination for the desired values.</p>
<p>You might think that modifying a slice in-place during iteration should not be done, because while you can modify <em>elements</em> of the slice during iteration if they are pointers or if you index into the slice, changing the <em>slice itself</em> by removing elements during iteration would be dangerous.</p>
<p>Here's a straightforward way to accomplish it. The idea is that, when you encounter an element you want to remove from the slice, take the beginning portion of the slice that has values that have passed the test up to that point, and the remaining portion of the slice, i.e., after that element to the end, and copy them <em>over</em> the original slice. Then, assign a slice expression up to the number of values that passed the test to the original variable.</p>
<p>Here's an example. Let's say I have a slice of integers, and I only want to retain the even ones.</p>
<pre><code class="language-go">
var x = []int{90, 15, 81, 87, 47, 59, 81, 18, 25, 40, 56, 8}

i := 0
l := len(x)
for i &lt; l {
	if x[i] % 2 != 0 {
		x = append(x[:i], x[i+1:]...)
		l--
	} else {
		i++
	}
}
x = x[:i]
	
fmt.Println(x)
// [90 18 40 56 8]
</code></pre>
<p>The <code>i</code> variable is used to keep track of the number of even values found in the slice. When an element is odd, we create a temporary slice using <code>append</code> and two slice expressions on the original slice, skipping over the current element. The temporary smaller slice is copied over the existing, shifting down the remaining values. The <code>l</code> variable makes sure we make the right number of comparisons despite moving things around. It's important to note the memory location of the original slice is unchanged with the copy. No new heap allocations are performed, even with the temporary slice.</p>
<hr />
<p><strong>Update:</strong> A number of people, including here in comments and on <a href="https://www.reddit.com/r/golang/comments/4uoqr5/modifying_a_go_slice_inplace_while_iterating_over/">the golang reddit</a>, have pointed out that the method I outline here is pretty inefficient; it's doing a lot of extra work, due to the way I'm using <code>append</code>. A <em>much</em> better way to go about it is the following, which also happens to have already been pointed out in the <a href="https://github.com/golang/go/wiki/SliceTricks#filtering-without-allocating">official Go wiki</a>:</p>
<pre><code class="language-go">y := x[:0]
for _, n := range x {
    if n % 2 == 0 {
        y = append(y, n)
    }
}
</code></pre>
<p>This also has the benefit of being simpler and shorter. Use it instead!</p>
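<p>For completeness, a quick check of the result. Note that <code>y</code> reuses <code>x</code>'s backing array, so reassigning <code>x = y</code> is optional, but it keeps the original variable pointing at the filtered data:</p>
<pre><code class="language-go">x = y
fmt.Println(x)
// [90 18 40 56 8]
</code></pre>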

      ]]></content:encoded>
    </item>
    <item>
      <title>A simple way to limit the number of simultaneous clients of a Go net/http server</title>
      <link>https://pauladamsmith.com/blog/2016/04/max-clients-go-net-http.html</link>
      <guid>https://pauladamsmith.com/blog/2016/04/max-clients-go-net-http.html</guid>
      <pubDate>Wed, 13 Apr 2016 22:55:13 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
<p>This is a simple and easily generalizable way to put an upper bound on the
number of simultaneous clients to a Go <code>net/http</code> server or handler.</p>
<p>The idea is to use a counting semaphore, modeled with a buffered channel, to
cause new clients which arrive after the <code>n</code>th current client to queue,
where <code>n</code> is the size of the buffer.</p>
<p>Ideally, we wouldn't want to limit the amount of concurrency to our application,
but practically, there are limits on underlying resources, and forcing clients
to queue after a certain limit gives us control over that resource utilization.</p>
<p>Let's say we have a simple HTTP handler that requests access to some expensive
resource, like a database or complex computation:</p>
<pre><code class="language-go">package main

import (
    &quot;io&quot;
    &quot;log&quot;
    &quot;net/http&quot;
)    

func main() {
     http.Handle(&quot;/&quot;, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
         // getExpensiveResource is a stand-in for a call to a database or
         // other costly computation.
         res := getExpensiveResource()
         io.WriteString(w, res.String())
     }))

     log.Fatal(http.ListenAndServe(&quot;:8080&quot;, nil))
}
</code></pre>
<p>The handler can be requested by an unbounded number of clients, potentially
exhausting our resources.</p>
<p>Let's add a counting semaphore that will gate entry into the handler:</p>
<pre><code class="language-go">func main() {
     const maxClients = 10
     sema := make(chan struct{}, maxClients)

     http.Handle(&quot;/&quot;, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
         sema &lt;- struct{}{}
         defer func() { &lt;-sema }()

         res := getExpensiveResource()
         io.WriteString(w, res.String())
     }))
</code></pre>
<p>We make a channel of type <code>struct{}</code>, because we are only interested in the
send/receive semantics of the channel, not its value. The first statement of the
handler is a send on the channel, which will succeed up to <code>maxClients</code> number
of simultaneous requests. Think of the buffered channel as having empty slots,
and being able to send on it means that you can fill a slot and proceed. If
there are no empty slots (in other words, the length of the channel is equal to
the buffer size), then the send will block, and will have to wait to proceed
until a slot frees up. The next statement defers until after the handler has
returned or panicked, and frees a slot by receiving from the channel.</p>
<p>If we have more than one handler to limit access to, we can move the semaphore
into a middleware and wrap the original handler, leaving the body of it
unchanged:</p>
<pre><code class="language-go">package main

import (
    &quot;io&quot;
    &quot;log&quot;
    &quot;net/http&quot;
)    

func maxClients(h http.Handler, n int) http.Handler {
     sema := make(chan struct{}, n)

     return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
         sema &lt;- struct{}{}
         defer func() { &lt;-sema }()

         h.ServeHTTP(w, r)
     })
}

func main() {
     handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
         res := getExpensiveResource()
         io.WriteString(w, res.String())
     })

     http.Handle(&quot;/&quot;, maxClients(handler, 10))

     log.Fatal(http.ListenAndServe(&quot;:8080&quot;, nil))
}
</code></pre>
<p>Note that this implementation will cause clients beyond the maximum number to
queue without bound, until they hit the system limit of the <code>listen(2)</code> backlog.</p>
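<p>If unbounded queueing is undesirable, one variation (a sketch, not part of the original example) is to reject excess clients immediately instead of making them wait, using a non-blocking send via <code>select</code> with a <code>default</code> case:</p>
<pre><code class="language-go">func maxClients(h http.Handler, n int) http.Handler {
     sema := make(chan struct{}, n)

     return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
         select {
         case sema &lt;- struct{}{}:
             // acquired a slot; release it when the handler returns
             defer func() { &lt;-sema }()
             h.ServeHTTP(w, r)
         default:
             // all slots are full; fail fast rather than queue
             http.Error(w, &quot;too many clients&quot;, http.StatusServiceUnavailable)
         }
     })
}
</code></pre>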
<p>This pattern can be used to control the amount of concurrency to any resource,
not just <code>net/http</code> handlers.</p>
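<p>As a final sketch of that generalization, here is the same counting semaphore gating plain goroutines rather than HTTP handlers (the sleep is a stand-in for the expensive work):</p>
<pre><code class="language-go">package main

import (
    &quot;fmt&quot;
    &quot;sync&quot;
    &quot;time&quot;
)

func main() {
    const maxConcurrent = 3
    sema := make(chan struct{}, maxConcurrent)

    var wg sync.WaitGroup
    for i := 0; i &lt; 10; i++ {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            sema &lt;- struct{}{}        // acquire a slot (blocks while all are full)
            defer func() { &lt;-sema }() // release the slot
            fmt.Println(&quot;working on&quot;, i)
            time.Sleep(100 * time.Millisecond) // stand-in for expensive work
        }(i)
    }
    wg.Wait()
}
</code></pre>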

      ]]></content:encoded>
    </item>
    <item>
      <title>The Bloomingdale Trail</title>
      <link>https://pauladamsmith.com/blog/2015/06/bloomingdale_trail.html</link>
      <guid>https://pauladamsmith.com/blog/2015/06/bloomingdale_trail.html</guid>
      <pubDate>Fri, 05 Jun 2015 21:00:00 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p>There was a moment last Friday while I was on top of the soon-to-open
<a href="http://www.bloomingdaletrail.org/">Bloomingdale Trail</a> with a tour group when I had a strange feeling. We had
been walking for more than a mile to that point, 17 feet up above Chicago
streets, passing houses, factories, and alleyways in Logan Square. I paused to
consider the feeling, and I realized it was that I had been walking continuously
for half an hour through a Chicago neighborhood and not once had to contend with
an intersection or motor vehicle in all that time. Unless you live in the city
and walk or bike often, it's hard to convey how pleasantly odd that feeling was.
It's not something you can get from a typical park or trail. Parks are usually
compact open spaces, polygons boxed in by streets. Most other trails are at
grade level, so whatever flow or momentum you build up is periodically
interrupted by an intersection. The Bloomingdale Trail, however, is both apart
from and woven through the neighborhoods it is situated in. If you go the length
from the west trailhead to the east trailhead or vice versa, you'll have
travelled 2.7 miles -- a massive span across 4 Chicago neighborhoods -- in an
entirely human-mediated fashion. And yet you never feel as though you've taken
yourself out of the fabric of the city, as you might when going into a park.
Thanks to periodically spaced adjacent parks and access ramps, you can dip in
and out of the Trail as casually or deliberately as you choose. You gain both
a new vista on the city, and a deeper connection to the neighborhoods you've
always known. It's remarkable what a mere 17 feet of elevation can do to both
take you out of the city and give you greater access to it.</p>
<img class="img-responsive" alt="Photo of people walking on elevated Bloomingdale Trail and street below" src="/images/fbt_elevated.jpg">
<p>It is this embeddedness that I believe will ultimately make The Bloomingdale
Trail and the entire <a href="http://the606.org/">606</a> system of parks a success. It's not a jewel,
a thing to be admired, with its aesthetics upfront. It's a relentlessly practical
bit of new human-scale infrastructure in a vibrant residential area. It will
materially improve the lives of its neighbors each day by enabling them to be
active, to commute, to play, and to discover in a new and unique way. It's worth
remembering that the project was funded largely by federal transportation
dollars, earmarked for reducing traffic congestion and air pollution. People
will wonder what this thing is, and the answer will be in its daily use.</p>
<img class="img-responsive" alt="Photo of rainbow over The Bloomingdale Trail" src="/images/fbt_rainbow.jpg">
<p>I remember walking around the old Bloomingdale Line, a disused elevated railroad
embankment, in 2002 with a group of work colleagues. We would sometimes take our
lunch up top, ducking under a fence at Milwaukee and Leavitt to gain access. The
germ of the Friends of the Bloomingdale Trail was planted there; the non-profit
community organization officially formed a year later. The circumstances at the
time were fortunate: the development of the High Line in New York provided
a template and a healthy competitive jolt; the railroad company was looking to
rid themselves of their responsibilities to the line; the City wanted to tear
down the embankment and spanning viaducts, providing further impetus; and
crucially, the rights-of-way were all contiguous and owned by the City: there
would be no time-consuming negotiating with private owners to acquire the trail's
property, as there was in New York. From there we held <a href="https://www.flickr.com/gp/psmith/32DCKi">community meetings</a>, <a href="http://www.bloomingdaletrail.org/img/Trailcleanup01.jpg">trash
pick-up days</a>, <a href="https://www.flickr.com/gp/psmith/0qWMDh">festivals</a>, goofy but earnest <a href="http://www.bloomingdaletrail.org/img/valentines.jpg">Valentine's Day events</a>, <a href="http://www.bloomingdaletrail.org/archive/#fbt-walking-tour-notes">led tours</a>,
<a href="https://www.flickr.com/gp/psmith/92r243">pitched aldermen and city planners</a>, <a href="http://www.bloomingdaletrail.org/archive/#bloomingdale-trail-mural-project">documented the Trail as it existed</a>, <a href="https://www.flickr.com/gp/psmith/71GZ13">helped open a new neighborhood park next to the Trail</a>, printed
<a href="http://www.bloomingdaletrail.org/archive/#walk-bike-run-poster">posters</a> and <a href="http://www.bloomingdaletrail.org/archive/#fbt-brochure">brochures</a>, <a href="http://www.bloomingdaletrail.org/archive/#chicago-public-art-group-albany-whipple-workshop-flyer">hosted arts events</a>, let <a href="http://www.bloomingdaletrail.org/reframing-ruin/david-schalliol/">David Schalliol do his magic</a>, connected with <a href="https://www.cityofchicago.org/city/en/depts/dcd/supp_info/logan_square_openspaceplan.html">open space plans</a>, and
started a partnership with the <a href="http://www.tpl.org/">Trust for Public Land</a> and the City of
Chicago to design and build the Trail. In 2007 and 2008, we <a href="https://www.flickr.com/photos/psmith/sets/72157600029547338">convened neighbors
in a series of meetings and
surveys</a> to listen
to, capture, and synthesize the community's vision for the project. The product
of this effort, the <a href="http://www.bloomingdaletrail.org/archive/#community-visioning-update">Community Visioning Update</a>, was perhaps our most important
practical work as an organization: this document was incorporated into the
City's official request for proposals for design and construction. To the best
of our ability, we made sure the future Trail would be reflective of the
community it came from and would serve.</p>
<img class="img-responsive" alt="Photo of ramp down from The Bloomingdale Trail" src="/images/fbt_ramp.jpg">
<p>It's time now to celebrate the opening of the Trail and begin a new phase in the
life of FBT. The original goals of the organization were to:</p>
<ul>
<li>Preserve the elevated right of way</li>
<li>Beautify the public space</li>
<li>Create a new, mixed-use trail/linear park</li>
<li>Establish a broad coalition that supports the proposed park</li>
<li>Connect with neighborhood schools and institutions</li>
</ul>
<p>Our <a href="http://www.bloomingdaletrail.org/about/">new mission</a> is to be the community stewards of the Trail, and to
that end, we recently applied and have been approved to be a Chicago Park
District Advisory Council, or PAC. As befits our unusual new park, we're breaking
new ground as a PAC. We're unique in that our bylaws state there will be board
representation from each of the 4 neighborhoods, and from each of the constituent
park PACs (Julia de Burgos, Walsh, Churchill Field, and Kimball). Because no
other park covers as much ground, cuts through as many neighborhoods, and links
up as many adjacent smaller parks, governance and community organizing around
The Bloomingdale Trail will be a new experiment for all involved.</p>
<p>One last thought. There are very few good west-east routes in Chicago: most
transportation infrastructure radiates from and to the Loop. The Bloomingdale
Trail is a stroke across the spokes, and the physical, economic, and cultural
circulation it promotes will be fascinating to watch. But there are bigger
things at stake. Even before this new park was built, the Trail conspicuously
ended at the north branch of the Chicago River. (Now it ends at Ashland, that
street's bridge having been born-again over Western.) It's always been a dream
and a goal of FBT and the 606 partners to extend the Trail across the river in
a future phase. From there, on-street bicycle paths can be knit together,
ultimately arriving at the lakefront. However, there's an even bigger dream to
be dreamt. A few miles west of the western terminus of the Trail, the Illinois
Prairie Path has its eastern endpoint. The IPP carries you out due west 60 miles
past the outer suburbs. A network of rural trails beyond can be followed all the
way to Iowa. So while we celebrate the opening of Chicago's next great park
tomorrow, the notion of a bicycle trip that begins at the Mississippi River and
ends at Lake Michigan, on bike paths the entire span, should stay in the back of
our minds as a not-too-distant possibility.</p>
<img class="img-responsive" alt="Map of measurement from Mississippi River to Lake Michigan" src="/images/fbt_miss_river_lake_mich_map.png">
<p>Look up! It's The Bloomingdale Trail</p>
<img class="img-responsive" src="/images/fbt/320210120_f84ffca2ff_o.jpg">
<img class="img-responsive" src="/images/fbt/320221183_3866448949_o.jpg">
<img class="img-responsive" src="/images/fbt/3056808206_4ce94f3638_o.jpg">
<img class="img-responsive" src="/images/fbt/320213168_7aefe30df9_o.jpg">

      ]]></content:encoded>
    </item>
    <item>
      <title>Chicago wards &amp; precincts shapefiles in 2015</title>
      <link>https://pauladamsmith.com/blog/2015/02/chicago-wards-precincts-shapefiles.html</link>
      <guid>https://pauladamsmith.com/blog/2015/02/chicago-wards-precincts-shapefiles.html</guid>
      <pubDate>Sat, 28 Feb 2015 01:53:00 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p><strong>Update:</strong> On April 6, 2015, the City of Chicago updated its Data Portal with
the official <a href="https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Wards-2015-/sp34-6z76">wards</a> and <a href="https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Precincts-current-/uvpq-qeeq">precincts</a> shapefiles.</p>
<hr />
<p><strong>tl;dr:</strong> I tried to make a map of Chicago election results, I found only
out-of-date wards &amp; precincts shapefiles, I had to FOIA the up-to-date versions,
I got them, I republished them so anyone can download them, and finally made
that map.</p>
<p>Read on for the full saga.</p>
<hr />
<p>After <a href="http://elections.chicagotribune.com/results/">this week’s municipal general elections in Chicago</a>, I was looking
for detailed results in the mayor’s race, which didn’t end Tuesday night but is
<a href="http://www.reuters.com/article/2015/02/25/us-usa-politics-chicago-idUSKBN0LS1B420150225">headed for a run-off</a> between Mayor <a href="http://www.chicagotogether.org/">Rahm Emanuel</a> and
challenger Cook County Commissioner <a href="http://www.chicagoforchuy.com/">Chuy Garcia</a> on April 7.
Specifically, I wanted to see where in the city the support for each candidate
was, and at as granular a level as possible.</p>
<p>The <a href="http://www.chicagoelections.com/en/home.html">Chicago Board of Elections</a> posts vote tallies by precinct (50 wards
in Chicago, with on average 40 precincts per ward). Precincts are the smallest
unit of political geography—in Chicago, they are roughly a few square city
blocks each. Given the neighborhoody nature of Chicago and the block-by-block
affinities that exist (which leads politicians to produce <a href="http://www.our2ndward.org/">carefully sculpted
gerrymanders like the 2nd Ward</a> in order to corral voters into favorable
pens), a map showing the relative intensity of voting percentages per candidate
by precinct would be a good tool for aiding detailed understanding of this
election or any election, and a building block for many possible similar
analyses in the future.</p>
<p>So I set out to make such a map. My plan was to gather the vote totals per
precinct, shapefiles of the city ward and precinct boundaries, and join them
together using tools like <a href="http://d3js.org/">d3</a> to draw a choropleth or thematic map in a web
browser. This is a straightforward plan and is well-trod ground. However,
I naïvely assumed the official source material I gathered would be accurate and
up-to-date.</p>
<p>After scraping the vote totals from the BOE site[<a href="#fn1-2015-02-27"
id="fnr1-2015-02-27" class="fn">1</a>], I downloaded the wards and precincts
shapefiles from the <a href="https://data.cityofchicago.org/">City of Chicago’s Data Portal site</a>, which is
a service that hosts many different types of data, from building permits to
restaurant inspections. I did this by typing “wards” and “precincts” into the
search box and downloading from the results pages the links titled
“<a href="https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Wards/bhcv-wqkf">Boundaries - Wards</a>” and
“<a href="https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Ward-Precincts/sgsc-bb4n">Ward Precincts</a>”. There was
nothing to indicate that these files were out of date, nor anything else to
indicate that they were not the current, authoritative source of these data
sets.</p>
<p>I put together a first draft of the map and shared it with <a href="https://twitter.com/joegermuska">some</a>
<a href="http://www.chicagocarto.com/">colleagues</a> who are experts in mapping and Chicago data. They quickly
pointed out that the map appeared to be using the old wards and precincts.[<a
href="#fn2-2015-02-27" id="fnr2-2015-02-27" class="fn">2</a>] In 2012, the <a href="http://www.wbez.org/no-sidebar/approved-ward-map-95662">city
council approved a new set of ward boundaries</a>, redrawing the city’s
political map. They were to go into effect in 2015, and this week’s election,
which included all 50 aldermanic races, were to be contested on this new
geography. The conspicuously missing new 2nd Ward was the tip-off my map was
wrong.</p>
<p>I searched for the updated boundaries, but came up with only unofficial sources,
and only for wards at that. There was the WBEZ map from their <a href="http://www.wbez.org/no-sidebar/approved-ward-map-95662">original 2012
story</a>, and the Tribune had created <a href="http://media.apps.chicagotribune.com/ward-redistricting-2012/index.html">a side-by-side comparison of the old
and new wards</a>. But I couldn’t trust these for my own use, because of
their uncertain provenance. And without matching updated precincts, I couldn’t
join vote totals for use in a map in any case.</p>
<p>Taking a page from the <a href="http://www.derivativeworks.com/2013/02/on-everyblock-and-the-open-data-movement.html">people person at my old job</a>, I made a phone call
to the Board of Elections: maybe I could just ask for the data and they would
give it to me? I stated my request very plainly and without explanation of
motive, and was told to “hold please” a couple of times while I bounced between
departments. A few moments later, I heard “Districts and Boundaries” on the
line. Success! Here was, literally, the person who could help me, right then. Or
so I thought. I repeated my request, and without a moment’s hesitation, the
Districts and Boundaries voice said that I would need to contact the BOE’s
<a href="http://www.foia.gov/">FOIA</a> officer, and here was their email address.[<a href="#fn3-2015-02-27" id="fnr3-2015-02-27" class="fn">3</a>]</p>
<p>It was hard to tell how much of this was bluffing, as in, let’s see you actually
bother to make a FOIA request, but I went ahead and stubbornly wrote an email to
the FOIA officer anyway. I was under no illusions that my request would be
fulfilled quickly enough to make my post-election map still relevant.</p>
<p><img src="/images/foia-email-request.png" alt="Email request to FOIA officer" /></p>
<p>I then <a href="https://twitter.com/paulsmith/status/571024506560647168">took to Twitter</a> to register my displeasure for this state of affairs—we
just had a citywide election for our top local offices, operating on the
assumption of the new city council-vouched districts, and yet, despite nearly
a decade of the open data movement, despite official portaldom, the key base
layers of the political strata were still available only to the learned
monks—and moved on.</p>
<p>Lo, but was my request not answered but a few scant hours later! I can’t tell
you how surprised I was to see this in my inbox:</p>
<p><img src="/images/foia-email-response.png" alt="Email response from FOIA officer" /></p>
<p>I thanked the officer and downloaded the payload, which was a set of 50 folders,
each corresponding to a ward and containing a shapefile of that ward’s precincts
therein. I eyeballed the boundaries with <a href="http://www2.qgis.org/en/site/">QGIS</a> and was satisfied that
they appeared to be legit. (Again, the shape of the notorious 2nd Ward was the
main clue.)</p>
<p>In the absence of official publication, I was determined to at least not have
the next person who goes looking for wards and precincts to wind up in FOIA
land. As relatively pain-free as this episode was, the fact that I had to engage
with the FOIA plumbing in order to fulfill a minor data request is not good. And
there is every reason to think that a typical FOIA request will take orders of
magnitude longer to fulfill than my jackpot.</p>
<p>My approach was to self-publish the data, but to be clear about its source
and my methodology for any transformations. While I’d much prefer this
data appear on the Data Portal, I’d also prefer that our collective energies
not be wasted on pursuits such as these.</p>
<p>Regarding those transformations, I had a set of precincts, but I also wanted the
wards that derive from them (a ward is completely defined by its constituent
precincts). I imported the precincts into a PostgreSQL database with the
<a href="http://postgis.net/">PostGIS</a> extension. From there I created wards by grouping precincts
by their ward number, and unioning their geometries (i.e., merging a bunch of
small precinct polygons into one large ward polygon). Then I exported from the
database into various geospatial data formats—Shapefile, TopoJSON, GeoJSON, KML,
etc.</p>
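<p>For the curious, here is a minimal sketch of that dissolve step, driving
PostGIS from a small Go program; the connection string and the <code>precincts</code> and
<code>wards</code> table and column names are invented for illustration (my actual run used
the PostGIS tools directly):</p>
<pre><code class="language-go">package main

import (
    &quot;database/sql&quot;
    &quot;log&quot;

    _ &quot;github.com/lib/pq&quot; // PostgreSQL driver
)

func main() {
    db, err := sql.Open(&quot;postgres&quot;, &quot;dbname=chicago sslmode=disable&quot;)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Dissolve precincts into wards: group the precinct polygons by
    // ward number and merge each group into a single ward polygon.
    _, err = db.Exec(`
        CREATE TABLE wards AS
        SELECT ward, ST_Union(geom) AS geom
        FROM precincts
        GROUP BY ward`)
    if err != nil {
        log.Fatal(err)
    }
}
</code></pre>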
<p>I made these <a href="https://paulsmith.github.io/chicago_wards_and_precincts/"><strong>exports available for download by anyone</strong></a>, hosted on
<a href="https://github.com/paulsmith/chicago_wards_and_precincts">GitHub</a>.</p>
<p>I finally was able to make the map I wanted, at least, the first-order map,
a basic voter preference density map. I hope to build on this data
infrastructure with different overlays, result sets, future elections, and so
on.</p>
<p>You can <a href="http://bl.ocks.org/paulsmith/1564a99cc7b5d3f8e90c"><strong>view the map here</strong></a>; choose between mayoral candidates in the
drop-down selector to update the map with their vote percentages.</p>
<p>With several candidates, it can be useful to see them arrayed as <a href="http://en.wikipedia.org/wiki/Small_multiple">small
multiples</a> for easier comparison[<a href="#fn4-2015-02-27"
id="fnr4-2015-02-27" class="fn">4</a>]:</p>
<p><img src="/images/chi-2015-mayoral-small-multiples.png" alt="Side-by-side maps of Chicago mayoral election results" /></p>
<p>I’d like to see the left hand of the operators of the Chicago Data Portal talk
with the right hand of the Chicago Board of Elections, and simply take down the
pre-2015 ward and precinct boundaries (or better yet, rename them to something
that won’t be mistaken for the most recent version and leave them up for
historical research) and get the current shapefiles uploaded as soon as
possible. In the meantime, I hope that interested parties will avail themselves
of <a href="https://paulsmith.github.io/chicago_wards_and_precincts/">my hosted shapefiles</a>.</p>
<p>More generally I’d like for stakeholders in the world of government data to
reflect on the state of the open data movement, and consider examples such as
these as the tiny abrasions that impede all sorts of productivity, beyond my
modest map-making efforts. On one hand, we’ve made enormous progress; on the
other, we’re still fighting the same 10-year-old battles.</p>
<p>And to the FOIA officer at the BOE who responded so promptly, many thanks!</p>
<hr />
<ol class="footnotes">
    <li id="fn1-2015-02-27">
        It is 2015 and the third-largest U.S. city is still
        publishing official election results on a decade-old system that doesn’t lend
        itself to machine-readability without substantial friction, which violates #5 of
        the <a href="https://public.resource.org/8_principles.html">8 Principles of Open Government Data</a>.
        I wrote <a href="https://gist.githubusercontent.com/paulsmith/1564a99cc7b5d3f8e90c/raw/scrape.py">a
        Python program to extract the data</a> from the particular formatting of the BOE site.
        <a href="#fnr1-2015-02-27">↩</a>
    </li>
    <li id="fn2-2015-02-27">
        In my defense, while I’ve
        lived in Chicago for more than 10 years, I only recently moved back after
        a 5-year hiatus, so my map intuitions are a little stale.
        <a href="#fnr2-2015-02-27">↩</a>
    </li>
    <li id="fn3-2015-02-27">
        Thus arguably in violation of #1, #3, #4, and #6 of
        the <a href="https://public.resource.org/8_principles.html">8 Principles of Open Government Data</a>.
        <a href="#fnr3-2015-02-27">↩</a>
    </li>
    <li id="fn4-2015-02-27">
        For this I just screenshotted and collaged them in an image editor.
        <a href="#fnr4-2015-02-27">↩</a>
    </li>
</ol>

      ]]></content:encoded>
    </item>
    <item>
      <title>How to get started with the LLVM C API</title>
      <link>https://pauladamsmith.com/blog/2015/01/how-to-get-started-with-llvm-c-api.html</link>
      <guid>https://pauladamsmith.com/blog/2015/01/how-to-get-started-with-llvm-c-api.html</guid>
      <pubDate>Wed, 21 Jan 2015 01:53:00 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p>I enjoy making toy programming languages to better understand how compilers
(and, ultimately, the underlying machine) work and to experiment with techniques
that aren’t in my repertoire. <a href="http://llvm.org/">LLVM</a> is great because I can tinker, and
then wire it up as the backend to have it generate fast code that runs on most
platforms. If I just wanted to see my code execute, I could get away with
a simple hand-rolled interpreter, but having access to LLVM’s JIT, suite of
optimizations, and platform support is like having a superpower — your little
toy can perform impressively well. Plus, LLVM is the foundation of things like
<a href="https://github.com/kripken/emscripten/wiki">Emscripten</a> and <a href="http://www.rust-lang.org/">Rust</a>, so I like developing intuition about how new
technologies I’m interested in are implemented.</p>
<p>I’m going to show how to use the LLVM API to programmatically
construct a function that you can invoke like any other and have it execute
directly in the machine language of your platform.</p>
<p>In this example, I’m going to use <a href="http://llvm.org/docs/doxygen/html/group__LLVMC.html">the C API</a>, because it is
available in the LLVM distribution alongside the C++ API, and so is the simplest
way to get started. There are bindings to the LLVM API in other languages
— Python, OCaml, Go, Rust — but the concepts behind using LLVM to generate code
are the same across the wrapper APIs.</p>
<p>This example sort of skips to the middle phase of compiler construction. Assume
the frontend (lexer, parser, type-checker) has built an <a href="http://en.wikipedia.org/wiki/Abstract_syntax_tree">AST</a> and we’re now
walking it to emit the intermediate representation of the code for the backend
to take and optimize and spit out machine code.</p>
<p>In this case, we’ll just type out the straight-line procedural code for a simple
function that would normally be dynamically cobbled together in an AST walker
function, calling the LLVM API when it encounters certain nodes in the tree.</p>
<p>For the example, we’ll build a simple adder function, which takes two integers
as arguments and returns their sum, the equivalent of, in C:</p>
<pre><code class="language-c">int sum(int a, int b) {
    return a + b;
}
</code></pre>
<p>To be clear about what we are doing here: we are using LLVM to dynamically build
an in-memory representation of this function, using its API to set up things
like function entry and exit, return and parameter types, and the actual integer
add instruction. Once this in-memory representation is complete, we can instruct
LLVM to jump to it and execute it with arguments we supply, just as if it were
an executable we had compiled from a language like C.</p>
<p><a href="https://github.com/paulsmith/getting-started-llvm-c-api/blob/master/sum.c"><strong>Click here to view the final code.</strong></a></p>
<h2>Modules</h2>
<p>The first step is to create a module. A module is a collection of the global
variables, functions, external references, and other data in LLVM. Modules aren’t
quite like, say, modules in Python, in that they don’t provide separate
namespaces. But they are the top-level container for all things built in LLVM,
so we start by creating one.</p>
<pre><code class="language-c">LLVMModuleRef mod = LLVMModuleCreateWithName(&quot;my_module&quot;);
</code></pre>
<p>The string <code>&quot;my_module&quot;</code> passed to the module factory function is an identifier
of your choosing.</p>
<p>Note that as you’re navigating the <a href="http://llvm.org/docs/doxygen/html/group__LLVMC.html">LLVM C API documentation</a>, different
aspects are grouped together under different header includes. Most of what I’m
detailing here, such as modules and functions, is contained in the <code>Core.h</code>
header, but I’ll include others as we move along.</p>
<h2>Types</h2>
<p>Next, I create the <code>sum</code> function and add it to the module. A function consists of:</p>
<ul>
<li>its type (return type),</li>
<li>a vector of its parameter types, and</li>
<li>a set of basic blocks.</li>
</ul>
<p>I’ll get to basic blocks in a moment. First, we’ll handle the type and parameter
types of the function — its prototype, in C terms — and add it to the module.</p>
<pre><code class="language-c">LLVMTypeRef param_types[] = { LLVMInt32Type(), LLVMInt32Type() };
LLVMTypeRef ret_type = LLVMFunctionType(LLVMInt32Type(), param_types, 2, 0);
LLVMValueRef sum = LLVMAddFunction(mod, &quot;sum&quot;, ret_type);
</code></pre>
<p>LLVM types correspond to the types that are native to the platforms we’re
targeting, such as integers and floats of fixed bit width, pointers, structs,
and arrays. (There’s no platform-dependent <code>int</code> type like in C, where the actual
size of the integer, 32- or 64-bit, depends on the underlying machine
architecture.)</p>
<p>LLVM types have constructors, and follow the form &quot;LLVM<em>TYPE</em>Type()&quot;. In our
example, both the arguments passed to the sum function and the function’s type
itself are 32-bit integers, so we use <code>LLVMInt32Type()</code> for each.</p>
<p>The arguments to <code>LLVMFunctionType()</code> are, in order:</p>
<ol>
<li>the function’s type (return type),</li>
<li>the function’s parameter type vector (the arity of the function should match
the number of types in the array),</li>
<li>the function’s arity, or parameter count, and</li>
<li>a boolean indicating whether the function is variadic, i.e., accepts
a variable number of arguments.</li>
</ol>
<p>Notice that the function type constructor returns a type reference. This
reinforces the notion that what we did here is the LLVM equivalent of declaring
a function prototype in C.</p>
<p>The third line adds a function of this type to the module and gives it the
name <code>sum</code>. We get a value reference in return, which can be thought of as
a concrete location in the code (ultimately, memory) upon which to add the
function’s body, which we do below.</p>
<h2>Basic blocks</h2>
<p>The next step is to add a basic block to the function. Basic blocks are parts of
code that only have one entry and exit point - in other words, there is no other
way execution can go than by single stepping through a list of instructions. No
if/else, while, loops, or jumps of any kind. Basic blocks are the key to
modeling control flow and creating optimizations later on, so LLVM has
first-class support for adding these to our in-progress module.</p>
<pre><code class="language-c">LLVMBasicBlockRef entry = LLVMAppendBasicBlock(sum, &quot;entry&quot;);
</code></pre>
<p>Note the &quot;append&quot; in the name of the function: it’s helpful to think of what
we’re doing as growing a running tally of chunks of code, and so our basic block
is appended relative to the function we added to the module previously.</p>
<h2>Instruction builders</h2>
<p>This notion of a running tally fits with the instruction builder, which is how
we add instructions to our function’s one and only basic block.</p>
<pre><code class="language-c">LLVMBuilderRef builder = LLVMCreateBuilder();
LLVMPositionBuilderAtEnd(builder, entry);
</code></pre>
<p>Similar to appending the basic block to the function, we’re positioning the
builder to start writing instructions where we left off with the entry to the
basic block.</p>
<h3>LLVM IR</h3>
<p>Sidebar: LLVM’s main stock-in-trade is the LLVM intermediate representation, or
IR. I’ve seen it referred to as a midway point between assembly and C. The LLVM
IR is a very strictly defined language that is meant to facilitate the
optimizations and platform portability that LLVM is known for. If you look at
IR, you can see how individual instructions can be translated into the loads,
stores, and jumps of the ultimate assembly that will be generated. The IR has
three representations:</p>
<ul>
<li>as an in-memory set of objects, which is what we’re using in this example,</li>
<li>as a textual language like assembly,</li>
<li>as a string of bytes in a compact binary encoding, called bitcode.</li>
</ul>
<p>You may see clang or other tools emit LLVM IR as text or bitcode.</p>
<p>Back to our example. Now comes the crux of our function, the actual instructions
to add the two integers passed in as arguments and return them to the caller.</p>
<pre><code class="language-c">LLVMValueRef tmp = LLVMBuildAdd(builder, LLVMGetParam(sum, 0), LLVMGetParam(sum, 1), &quot;tmp&quot;);
LLVMBuildRet(builder, tmp);
</code></pre>
<p><code>LLVMBuildAdd()</code> takes a reference to the builder, the two integers to add, and
a name to give the result. (The name is required due to LLVM IR’s restriction
that all instructions produce intermediate results. This can further be
simplified or optimized away by LLVM later, but while generating IR, we follow
its strictures.) Since the numbers we wish to add are the arguments that were
supplied to the function by the caller, we can retrieve them in the form of the
function’s parameters using <code>LLVMGetParam()</code>: the second argument is the
index of the parameter we seek from the function.</p>
<p>We call <code>LLVMBuildRet()</code> to generate the return statement and arrange for the
temporary result of the add instruction to be the value returned.</p>
<h2>Analysis &amp; execution</h2>
<p>That concludes the instruction-building phase of creating our function; the
module is now complete. The next phase of the example is setting it up for
execution.</p>
<p>First, let’s verify the module. This will ensure that our module was correctly
built and will abort if we missed or mixed up any steps.</p>
<pre><code class="language-c">char *error = NULL;
LLVMVerifyModule(mod, LLVMAbortProcessAction, &amp;error);
LLVMDisposeMessage(error);
</code></pre>
<p>LLVM provides either a JIT or an interpreter to execute the IR we’ve built. It
will create a JIT if it can for the target platform, and fall back to an
interpreter otherwise. In any case, the thing that will run our code is called
the <em>execution engine</em>.</p>
<pre><code class="language-c">LLVMExecutionEngineRef engine;
error = NULL;
LLVMLinkInJIT();
LLVMInitializeNativeTarget();
if (LLVMCreateExecutionEngineForModule(&amp;engine, mod, &amp;error) != 0) {
    fprintf(stderr, &quot;failed to create execution engine\n&quot;);
    abort();
}
if (error) {
    fprintf(stderr, &quot;error: %s\n&quot;, error);
    LLVMDisposeMessage(error);
    exit(EXIT_FAILURE);
}
</code></pre>
<p>We could hard-code some integers to be summed, but it’s easy enough to have our
program receive them from the command line.</p>
<pre><code class="language-c">if (argc &lt; 3) {
    fprintf(stderr, &quot;usage: %s x y\n&quot;, argv[0]);
    exit(EXIT_FAILURE);
}
long long x = strtoll(argv[1], NULL, 10);
long long y = strtoll(argv[2], NULL, 10);
</code></pre>
<p>Now that we have two integers in the representation of our host language, we
need to transform them into the analogous representation in LLVM. LLVM provides
factory functions that convert values into the types we need to pass to our
function:</p>
<pre><code class="language-c">LLVMGenericValueRef args[] = {
    LLVMCreateGenericValueOfInt(LLVMInt32Type(), x, 0),
    LLVMCreateGenericValueOfInt(LLVMInt32Type(), y, 0)
};
</code></pre>
<p>Now for the moment of truth: we can call our (JIT’d) function!</p>
<pre><code class="language-c">LLVMGenericValueRef res = LLVMRunFunction(engine, sum, 2, args);
</code></pre>
<p>We have a result, but it’s still in LLVM-land. We recover it to a C type, the
reverse operation from above, and print the sum:</p>
<pre><code class="language-c">printf(&quot;%d\n&quot;, (int)LLVMGenericValueToInt(res, 0));
</code></pre>
<p>And there we have it. We’ve programmatically constructed a function from the
ground up, and had it run directly in machine code native to our platform. There
is much more to LLVM, including control flow (e.g., implementing if/else) and
optimization passes, but we’ve covered the basics that would be in any
LLVM-IR-to-code program.</p>
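<p>As a small taste of where that leads, here is an illustrative sketch (we
don’t build this in our example) of what an if/else looks like in the textual
IR: a hypothetical <code>max</code> function, whose comparison feeds a conditional branch
selecting between two basic blocks:</p>
<pre><code class="language-llvm">define i32 @max(i32 %a, i32 %b) {
entry:
  %cmp = icmp sgt i32 %a, %b            ; signed compare: a &gt; b
  br i1 %cmp, label %then, label %else  ; conditional branch ends the block
then:
  ret i32 %a
else:
  ret i32 %b
}
</code></pre>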
<h2>Compiling</h2>
<p>In order to compile our program, we need to reference the LLVM includes and link
its libraries. Even though we’ve written a C program, the linking step requires
the C++ linker. (LLVM is a C++ project, and the C API is a wrapper thereof.)</p>
<pre><code class="language-console">$ cc `llvm-config --cflags` -c sum.c
$ c++ `llvm-config --cxxflags --ldflags --libs core executionengine jit interpreter analysis native bitwriter --system-libs` sum.o -o sum
$ ./sum 42 99
141
</code></pre>
<h2>Bitcode</h2>
<p>One final thing. I mentioned previously that LLVM IR has three representations,
including bitcode. Once you have a completed module, you can emit bitcode and
write it out to a file.</p>
<pre><code class="language-c">if (LLVMWriteBitcodeToFile(mod, &quot;sum.bc&quot;) != 0) {
    fprintf(stderr, &quot;error writing bitcode to file, skipping\n&quot;);
}
</code></pre>
<p>From there, you can use tools to manipulate it, like <code>llvm-dis</code> to disassemble the
bitcode into the textual LLVM IR assembly language.</p>
<pre><code class="language-console">$ llvm-dis sum.bc
$ cat sum.ll
; ModuleID = 'sum.bc'
target datalayout = &quot;e-m:o-i64:64-f80:128-n8:16:32:64-S128&quot;

define i32 @sum(i32, i32) {
entry:
  %tmp = add i32 %0, %1
  ret i32 %tmp
}
</code></pre>
<h2>Source code of example</h2>
<p>Here is the complete source of the program from above:</p>
<pre><code class="language-c">/**
 * LLVM equivalent of:
 *
 * int sum(int a, int b) {
 *     return a + b;
 * }
 */

#include &lt;llvm-c/Core.h&gt;
#include &lt;llvm-c/ExecutionEngine.h&gt;
#include &lt;llvm-c/Target.h&gt;
#include &lt;llvm-c/Analysis.h&gt;
#include &lt;llvm-c/BitWriter.h&gt;

#include &lt;inttypes.h&gt;
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;

int main(int argc, char const *argv[]) {
    LLVMModuleRef mod = LLVMModuleCreateWithName(&quot;my_module&quot;);

    LLVMTypeRef param_types[] = { LLVMInt32Type(), LLVMInt32Type() };
    LLVMTypeRef ret_type = LLVMFunctionType(LLVMInt32Type(), param_types, 2, 0);
    LLVMValueRef sum = LLVMAddFunction(mod, &quot;sum&quot;, ret_type);

    LLVMBasicBlockRef entry = LLVMAppendBasicBlock(sum, &quot;entry&quot;);

    LLVMBuilderRef builder = LLVMCreateBuilder();
    LLVMPositionBuilderAtEnd(builder, entry);
    LLVMValueRef tmp = LLVMBuildAdd(builder, LLVMGetParam(sum, 0), LLVMGetParam(sum, 1), &quot;tmp&quot;);
    LLVMBuildRet(builder, tmp);

    char *error = NULL;
    LLVMVerifyModule(mod, LLVMAbortProcessAction, &amp;error);
    LLVMDisposeMessage(error);

    LLVMExecutionEngineRef engine;
    error = NULL;
    LLVMLinkInJIT();
    LLVMInitializeNativeTarget();
    if (LLVMCreateExecutionEngineForModule(&amp;engine, mod, &amp;error) != 0) {
        fprintf(stderr, &quot;failed to create execution engine\n&quot;);
        abort();
    }
    if (error) {
        fprintf(stderr, &quot;error: %s\n&quot;, error);
        LLVMDisposeMessage(error);
        exit(EXIT_FAILURE);
    }

    if (argc &lt; 3) {
        fprintf(stderr, &quot;usage: %s x y\n&quot;, argv[0]);
        exit(EXIT_FAILURE);
    }
    long long x = strtoll(argv[1], NULL, 10);
    long long y = strtoll(argv[2], NULL, 10);

    LLVMGenericValueRef args[] = {
        LLVMCreateGenericValueOfInt(LLVMInt32Type(), x, 0),
        LLVMCreateGenericValueOfInt(LLVMInt32Type(), y, 0)
    };
    LLVMGenericValueRef res = LLVMRunFunction(engine, sum, 2, args);
    printf(&quot;%d\n&quot;, (int)LLVMGenericValueToInt(res, 0));

    // Write out bitcode to file
    if (LLVMWriteBitcodeToFile(mod, &quot;sum.bc&quot;) != 0) {
        fprintf(stderr, &quot;error writing bitcode to file, skipping\n&quot;);
    }

    LLVMDisposeBuilder(builder);
    LLVMDisposeExecutionEngine(engine);
}
</code></pre>
<p>See the <a href="https://github.com/paulsmith/getting-started-llvm-c-api">GitHub repo</a> for the Makefile and details on how to build the example
on your machine.</p>

      ]]></content:encoded>
    </item>
    <item>
      <title>T-shirt retirement</title>
      <link>https://pauladamsmith.com/blog/2014/07/tshirt-retirement.html</link>
      <guid>https://pauladamsmith.com/blog/2014/07/tshirt-retirement.html</guid>
      <pubDate>Mon, 28 Jul 2014 05:37:53 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p>I said “smell you later” to some printed t-shirts in my possession today.</p>
<h3>Texas &quot;Tremodillo&quot;</h3>
<p><img src="/images/tshirts/IMG_20140726_124258.jpg" alt="Texas Tremodillo t-shirt front" /></p>
<p><img src="/images/tshirts/IMG_20140726_124310.jpg" alt="Texas Tremodillo t-shirt back" /></p>
<p>I stopped attending <a href="http://www.smcm.edu/">college</a> after my freshman year, in part to go be the
<a href="http://en.wikipedia.org/wiki/Guitar_technician">guitar tech</a> for a <a href="http://articles.baltimoresun.com/1998-04-02/entertainment/1998092019_1_play-guitar-foam-big-windshield">band</a>. This was 1996. Some friends of mine from
high school had formed it, and a major label had signed them to produce
a record. The band had me spend a week with César Díaz at his home and workshop
in Pennsylvania. César had been the tech for guitarists like Stevie Ray Vaughan
and Eric Clapton. Vaughan was, I think, César’s idol. He was a guitar player
himself, and even had his guitar set up like Stevie’s, with heavy strings and
high action. César had stopped touring, and made guitar amps and effects pedals
instead. He also would <a href="https://www.youtube.com/watch?v=VluwmN-GRAA">teach new guitar techs the secrets of the
profession</a>. The Texas Tremodillo was one of two pedals he made. It had
a <a href="https://www.youtube.com/watch?v=9yPmotQ2kKw">tremolo</a> effect, which is like a fast, regular wobble. We would spend
the day in his workshop, tinkering on an amp that had blown a capacitor, or on
a pedal that was buzzing. He was soft-spoken, reserved, and didn’t have a lot of
patience for others. At night we’d get dinner at a local Indian place, or eat
with his wife, whom I remember being kind, and their young son. One day we drove
in to New York, somewhere in the Lower East Side or Village, to peck around at
a used guitar show in an auditorium. We ran into <a href="http://en.wikipedia.org/wiki/Jimmy_Vivino">Jimmy Vivino</a> outside. He
was wearing a pork pie hat and we walked around with him, looking at
guitars—they were close friends. Later that night, Vivino and a bunch of other
guys showed up at César’s workshop to jam. They played “The Weight” and many
other classic rock songs. César soloed on a Stevie Ray Vaughan song. César was
kind to me in the end. He wasn’t too annoyed when I would later call him to ask
what to do about a failing pickup, or if I could use these tubes instead of
those in this amp. I stopped working for the band after a year or so, and forgot
about César, except when I would see the Tremodillo shirt in my closet. I rarely
wore it—it was too big, and I felt like I should preserve it somehow, which
I never made an attempt to do. Sometime later I read that he died in the early
2000s. He had been sick with liver failure not long after I visited him. At one
point he had a transplant, but he only lived another couple of years. RIP,
César.</p>
<h3>Vote, F*cker</h3>
<p><img src="/images/tshirts/IMG_20140726_124417.jpg" alt="Vote, F*cker t-shirt" /></p>
<p>This is a shirt Ben Helphand gave me after a trip he took to Oregon. This would
have been around 2004. The <a href="http://en.wikipedia.org/wiki/Bus_Project#.22Vote.2C_F.2Acker.22">Bus Project</a> had made the shirt. Ben and
I became friends after we went to Minnesota in 2002, volunteering on the senate
campaign of the late <a href="http://en.wikipedia.org/wiki/Paul_Wellstone#Death">Paul Wellstone</a>. After that, we would scheme up ways
to try to improve small-d democratic participation. The Bus Project was a big
part of the inspiration that led to the creation of the <a href="http://electioncalendar.net/">Election Day Advent
Calendar</a> in 2006. I’ll miss the Vote, F*cker shirt but many washes
rendered it too shrunken and I looked like a sausage in it.</p>
<h3>Pope Benedict’s Army</h3>
<p><img src="/images/tshirts/IMG_20140726_124533.jpg" alt="Pope Benedict’s Army t-shirt" /></p>
<p>In 2005, I played on a co-ed softball team with my then-girlfriend, now-wife
Michelle. We had been dating for a few months, and exploring a new group of
friends in common together, some of whom were on the team. A conclave had
elected Joseph Ratzinger pope on April 19. Three weeks later, the team
organizer, John Pick, wrote the potential players an email:</p>
<blockquote>
I pulled the trigger on the league and put it on my
credit card.<br>
<br>
Belmont and California<br>
6:15- 7:15<br>
Thurs nights<br>
starting June 2<br>
80 per person.<br>
<br>
we're called Pope Benedict's Army
</blockquote>
<p>IIRC, we did not do the pontiff proud. I think we might have won one game?</p>
<p>Molly Sircher, who was our ringer, conceived and produced the shirt. She had
played softball at DePaul, and was by far the best player on our team. I wanted
to hang on to the shirt, but the shape was a little boxy, and it had a musty
smell it picked up probably during that summer and never got rid of.</p>
<h3>PyCon US 2012</h3>
<p><img src="/images/tshirts/IMG_20140726_124615.jpg" alt="PyCon US 2012 t-shirt" /></p>
<p>In 2012, I gave a <a href="http://pyvideo.org/video/680/spatial-data-and-web-mapping-with-python">talk</a> at <a href="https://us.pycon.org/2012/">PyCon</a> for the first and so far only
time. The U.S. conference was in Santa Clara, CA, near San Jose. At the time,
I was <a href="/blog/2011/09/dnc.html">the deputy director of technology at the DNC</a>. I was eager to attend
tech conferences during my tenure there, to try to recruit software engineers to
work with me on the campaign. This trip was a bust on that score, and I felt
afterwards that my presentation had been lackluster. In retrospect, it was
a stressful time. I felt like we didn’t have enough engineering help at the DNC
to get through the campaign, and that it was difficult to attract any. I always
liked <a href="http://gazit.me/">Idan Gazit’s</a> official PyCon snake logo that year and thought it was
a handsome shirt. But it’s too small now, into the donation pile you go.</p>

      ]]></content:encoded>
    </item>
    <item>
      <title>quickserver</title>
      <link>https://pauladamsmith.com/blog/2014/06/quickserver.html</link>
      <guid>https://pauladamsmith.com/blog/2014/06/quickserver.html</guid>
      <pubDate>Fri, 20 Jun 2014 02:44:10 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
<p>Everyone knows <code>python -m SimpleHTTPServer</code> for starting a quick webserver in
a directory; it’s pretty awesome. It listens on port 8000 by default, or you can
give it an alternate port as a command-line argument. But if you’re like me and
have lots of server processes running at once, you often get conflicts where the
port is already in use, or you have to hunt and peck for a free one.</p>
<p>It’s much better to just let the OS assign an unused port to this quick
webserver process, since you don’t really care where it goes. You can do this by
passing 0 as the port argument, and that totally works: Python prints out the
port it started the HTTP server on. There’s just one problem that trips me up:
it prints out the new port number in such a way that you have to either mouse
over, select, and copy it, then open a new tab, type in “localhost” or
“0.0.0.0”, and paste it, or you have to eyeball it and type it into the new tab:
<pre><code>$ python -m SimpleHTTPServer 0
Serving HTTP on 0.0.0.0 port 61200 ...
</code></pre>
<p>See what I mean, you have to snag that 61200 somehow. I just want to start
a webserver and have it immediately open to that address in my browser! That
output should be clickable or hook into OS X’s <code>open</code>.</p>
<p>So <a href="https://github.com/paulsmith/quickserver/blob/master/quickserver">this shell script</a> does that.</p>
<pre><code>$ ./quickserver
Serving HTTP on 0.0.0.0 port 61209 ...
http://0.0.0.0:61209/
127.0.0.1 - - [19/Jun/2014 17:21:56] &quot;GET / HTTP/1.1&quot; 200 -
127.0.0.1 - - [19/Jun/2014 17:21:57] code 404, message File not found
127.0.0.1 - - [19/Jun/2014 17:21:57] &quot;GET /favicon.ico HTTP/1.1&quot; 404 -
</code></pre>
<p><img src="https://i.imgur.com/0eb9q9Q.png" alt="" /></p>
<p>Probably too small to deserve it’s own repo but I figured someone might want to
make it work on Ubuntu or whatever. <a href="https://github.com/paulsmith/quickserver">Here it is on GitHub</a>.</p>
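<p>Incidentally, the let-the-OS-pick-a-port trick isn’t Python-specific. Here’s
a rough sketch of the same idea in Go, just to show the mechanics: bind to port
0, read back the address the OS actually assigned, and print a pasteable URL
before serving the current directory:</p>
<pre><code class="language-go">package main

import (
    &quot;fmt&quot;
    &quot;log&quot;
    &quot;net&quot;
    &quot;net/http&quot;
)

func main() {
    // Port 0 tells the OS to pick any unused port.
    l, err := net.Listen(&quot;tcp&quot;, &quot;127.0.0.1:0&quot;)
    if err != nil {
        log.Fatal(err)
    }
    // l.Addr() contains the port that was actually assigned.
    fmt.Printf(&quot;http://%s/\n&quot;, l.Addr())
    // Serve the current directory, like SimpleHTTPServer does.
    log.Fatal(http.Serve(l, http.FileServer(http.Dir(&quot;.&quot;))))
}
</code></pre>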

      ]]></content:encoded>
    </item>
    <item>
      <title>Things of recent interest</title>
      <link>https://pauladamsmith.com/blog/2014/05/recent-interests.html</link>
      <guid>https://pauladamsmith.com/blog/2014/05/recent-interests.html</guid>
      <pubDate>Thu, 22 May 2014 01:23:10 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p>Here are a few things that have kept my interest lately:</p>
<ul>
<li>
<p><strong><a href="https://mollyrocket.com/861">Immediate-Mode Graphical User Interfaces</a></strong> Immediate-mode GUI is
a straightforward way of rendering a UI. It’s so simple, in fact, that I had to
watch the video twice and do a <a href="http://sol.gfxile.net/imgui/index.html">tutorial</a> to understand it. (I did the
tutorial using SDL on OS X, and then ported it to <a href="/p/imgui/">JavaScript and
canvas</a>. Incidentally, I also used the <a href="https://github.com/paulsmith/pauladamsmith.com/blob/master/p/imgui/Makefile">C preprocessor on my
JavaScript file</a>, following <a href="http://www.nongnu.org/espresso/js-cpp.html">this</a>, to get statically-generated
IDs for the widgets; it worked well.) The short explanation of immediate-mode
GUI is, in your render() function that’s called for each frame of your
application (à la requestAnimationFrame), you call functions that handle
everything needed to draw, handle events, change state, and trigger other
events, for your UI’s widgets. Your code looks something like <code>if (button(id, x, y)) buttonWasPressed();</code>, and that’s the entirety of rendering a button widget
to the screen and handling click events on it. (In most cases, the widget
functions return a boolean of whether the button was pressed, text field was
changed, etc.) There are no callbacks or separate bindings. You maintain a tiny
bit of global state that helps coordinate all the action. The upside is you have
total control over your UI’s appearance and behavior. The downside is, you have
to implement all of your UI’s appearance and behavior yourself. My feeling so
far is that it is not something you would do if you were just implementing
a typical UI in a web browser, because you have all the browser’s widgets
already at your disposal (not to mention HTML and CSS layout). You’d be
reinventing the wheel.  But it seems an ideal approach for a game UI (which is
where I believe the idea originated, in the game development world), on
platforms where you don’t already have a core UI or widget library available, in
a native mobile application where performance is paramount, or any kind of
custom application, even on the web, where you want or need complete control over the
UI, because, for instance, the supplied browser form elements don’t suffice. For
example, immediate-mode GUI would fit something like
<a href="http://soundslice.com">Soundslice</a>’s custom interface perfectly.
(<a href="http://jlongster.com/Removing-User-Interface-Complexity,-or-Why-React-is-Awesome">via</a>)</p>
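<p>(A minimal sketch of this immediate-mode pattern follows this list.)</p>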
</li>
<li>
<p><strong><a href="https://www.youtube.com/watch?v=XRYN2xt11Ek">Functional reactive programming</a></strong> This was an eye-opening talk
for me. FRP could show us the way out of the fly bottle of complicated,
callback-knotted async JavaScript UIs in the browser. The core idea is to treat
events not as isolated occurrences to be handled on a per-callback basis, but
instead as collections, and once you do that, you have the power of higher-order
functions like map, reduce, filter, and merge to describe complex behaviors as
sort of a pipeline of collection processing. If you imagine applying Python’s
list and generator comprehensions to browser events, you start to get the idea.
<a href="https://github.com/Reactive-Extensions/RxJS">RxJS</a> is the tool highlighted in the talk, but <a href="https://github.com/baconjs/bacon.js">bacon.js</a> also
seems to be a popular FRP library for JavaScript (haven’t tried it myself).
There’s also a <a href="https://jhusain.github.io/learnrx/">browser-based FRP tutorial</a> to work through.</p>
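<p>(A tiny channels-as-streams sketch of this idea also follows this list.)</p>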
</li>
<li>
<p><strong><a href="https://github.com/google/traceur-compiler">Traceur</a></strong> Programming FRP in JavaScript becomes a lot more
pleasant with the new anonymous function syntax (<code>(x) =&gt; x + 1</code> instead of
<code>function(x) { return x + 1; }</code>) coming in ECMAScript 6, or ES6. Traceur
compiles ES6 to JavaScript that will run in current browsers, so you can code
and get the benefit of the new syntax and other upcoming language features now.
I have it as a build step in a Makefile, alongside minification. Then
presumably, barring language-breaking changes, you’d be able to remove the build
step at some future date when ES6 has become widely adopted.</p>
</li>
<li>
<p><strong><a href="http://elm-lang.org/">Elm</a></strong> Elm is an entire language built around FRP which targets the
browser. It is a Haskell or OCaml-like language that compiles down to HTML, CSS,
and JavaScript. It seems to rely on the <a href="http://blog.jle.im/entry/inside-my-world-ode-to-functor-and-monad">functor</a> concept, which it
calls ‘lift’, to convert browser events into something its built-in higher-order
functions can process. It’s arguable that because it is a functional language
like Haskell, it’s more naturally suited for dealing with the sorts of
concurrency issues in UIs that libraries like RxJS were created to address in
JavaScript. I’m still just in playground mode with it.</p>
</li>
<li>
<p><strong><a href="http://confreaks.com/events/gophercon2014">GopherCon talks</a></strong> It says something about the Go community how
uniformly excellent and entertaining these talks are. Interesting and dense with
practical knowledge.</p>
</li>
</ul>
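<p>Here is the minimal immediate-mode sketch promised above, in Go for the sake
of illustration; the <code>uiState</code> struct and <code>button</code> function are invented names,
and a real version would also emit draw commands and use <code>id</code> to track the
hot/active widget:</p>
<pre><code class="language-go">package main

import &quot;fmt&quot;

// uiState is the tiny bit of global state an immediate-mode GUI keeps
// between widget calls: just this frame's input.
type uiState struct {
    mouseX, mouseY int
    mouseDown      bool
}

// button draws a widget and reports a click in a single call per frame;
// there are no callbacks and no retained widget objects.
func button(ui *uiState, id, x, y, w, h int) bool {
    hot := ui.mouseX &gt;= x &amp;&amp; ui.mouseX &lt; x+w &amp;&amp;
        ui.mouseY &gt;= y &amp;&amp; ui.mouseY &lt; y+h
    // (a real implementation would draw the button here)
    return hot &amp;&amp; ui.mouseDown
}

// render runs once per frame and rebuilds the whole UI each time.
func render(ui *uiState) {
    if button(ui, 1, 10, 10, 80, 24) {
        fmt.Println(&quot;button 1 pressed&quot;)
    }
}

func main() {
    ui := &amp;uiState{mouseX: 20, mouseY: 20, mouseDown: true}
    render(ui) // in a real program, this is called from the frame loop
}
</code></pre>
<p>And here is the channels-as-streams sketch for the FRP idea: events become
a collection you transform with higher-order functions like filter and map
(again, an illustrative toy, not how RxJS or bacon.js are implemented):</p>
<pre><code class="language-go">package main

import &quot;fmt&quot;

// filterInt and mapInt treat a stream of events (ints on a channel)
// as a collection to be transformed by higher-order functions.
func filterInt(in &lt;-chan int, pred func(int) bool) &lt;-chan int {
    out := make(chan int)
    go func() {
        defer close(out)
        for v := range in {
            if pred(v) {
                out &lt;- v
            }
        }
    }()
    return out
}

func mapInt(in &lt;-chan int, f func(int) int) &lt;-chan int {
    out := make(chan int)
    go func() {
        defer close(out)
        for v := range in {
            out &lt;- f(v)
        }
    }()
    return out
}

func main() {
    events := make(chan int)
    go func() {
        for i := 0; i &lt; 10; i++ {
            events &lt;- i
        }
        close(events)
    }()
    // a pipeline: keep the even &quot;events&quot;, then double them
    for v := range mapInt(filterInt(events, func(v int) bool { return v%2 == 0 }),
        func(v int) int { return v * 2 }) {
        fmt.Println(v)
    }
}
</code></pre>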
<p>I also recently tried to teach myself <a href="http://research.swtch.com/acme">Acme</a>. You can certainly glimpse
the power of a system like that. But ultimately I decided editing speed is more
important to me, and I’m pretty fast in Vim, so I abandoned the effort.</p>
<p><a href="http://coreos.com/blog/zero-downtime-frontend-deploys-vulcand/">CoreOS</a> seems like it could become pretty important.</p>
<p>Programming a computer, still a fun thing to do.</p>

      ]]></content:encoded>
    </item>
    <item>
      <title>Fixing healthcare.gov</title>
      <link>https://pauladamsmith.com/blog/2014/03/fixing-healthcare.gov.html</link>
      <guid>https://pauladamsmith.com/blog/2014/03/fixing-healthcare.gov.html</guid>
      <pubDate>Mon, 03 Mar 2014 01:04:00 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
<p>The <a href="http://time.com/10228/obamas-trauma-team/" title="Obama’s Trauma Team">story of how HealthCare.gov was fixed</a> is told by Steven
Brill in the cover story of the March 10, 2014 issue of TIME magazine.</p>
<p><a href="http://time.com/10228/obamas-trauma-team/"><img src="/images/time-cover.jpg" width="770"></a></p>
<p>I also appeared on All In with Chris Hayes on MSNBC on February 28,
2014 to <a href="http://www.msnbc.com/all-in/watch/the-nerds-who-saved-obamacare-175808579562" title="The nerds who saved Obamacare">talk about it and my time with the ad hoc team</a>.</p>
<p><a href="http://www.msnbc.com/all-in/watch/the-nerds-who-saved-obamacare-175808579562"><img src="/images/all-in.jpg" width="770"></a></p>

      ]]></content:encoded>
    </item>
    <item>
      <title>Helping healthcare.gov</title>
      <link>https://pauladamsmith.com/blog/2013/10/healthcare.gov.html</link>
      <guid>https://pauladamsmith.com/blog/2013/10/healthcare.gov.html</guid>
      <pubDate>Thu, 31 Oct 2013 22:00:00 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p>For the past 18 days, I have been part of the so-called “<a href="http://www.hhs.gov/digitalstrategy/blog/2013/10/more-on-the-tech-surge.html">tech surge</a>” that
is helping to fix <a href="https://healthcare.gov/">healthcare.gov</a>.</p>
<p>We have already <a href="http://www.hhs.gov/digitalstrategy/blog/2013/11/healthcare-gov-progress-update.html">improved performance and stability</a> of the site, and have
helped to establish better processes for getting things done. There is still a
lot of work to do to make the site the stable platform it needs to be.</p>
<p>There is much to talk about, but for now I am staying focused on the work ahead.</p>

      ]]></content:encoded>
    </item>
    <item>
      <title>healthcare.gov and ACA marketplace sites from the perspective of a software engineer</title>
      <link>https://pauladamsmith.com/blog/2013/10/healthcare.gov-from-programmers-perspective.html</link>
      <guid>https://pauladamsmith.com/blog/2013/10/healthcare.gov-from-programmers-perspective.html</guid>
      <pubDate>Fri, 04 Oct 2013 21:00:00 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p><em>Cross-posted at <a href="http://talkingpointsmemo.com/cafe/a-programmer-s-perspective-on-healthcare-gov-and-aca-marketplaces">Talking Points Memo</a></em></p>
<p>Full disclosure: my wife works at the Centers for Medicare and
Medicaid Services (and this post is entirely my views, not hers), I
worked on the president’s re-election campaign, and politically, I
wish to see the PPACA law in general and the new marketplaces
specifically succeed.</p>
<p>This has been an important week in the history of health care in the
United States and for technology professionals working in government
and on related services. Here are some thoughts on
<a href="https://healthcare.gov/">healthcare.gov</a> and the state-based
marketplace websites from my perspective as someone who has been
developing and deploying web-based software applications for many
years and who has experience with large systems and high-traffic
sites.</p>
<p>As I write this, there is a weird mixture of angst, elation,
anticipation, control-freakery, sympathetic embarrassment, hope, and
generalized anxiety about healthcare.gov and the state-based
marketplace sites among supporters of Obamacare and also among
left-leaning technologists. On the one hand, affordable health
insurance is now available to any American; on the other, availability
doesn’t necessarily mean you can get it, due to errors during the
sign-up process on healthcare.gov and the state-based marketplace
sites which have been widely reported. There is a sense that, while
this is primarily a technology problem to be fixed, the political
problem is larger and may risk the implementation and success of the
overall law—if enough people perceive the marketplace sites to be
broken, support for the law—already tenuous according to some
polls—will erode, and the law’s opponents’ argument that
implementation needs to be delayed or even defunded will be
persuasive.</p>
<p>It is natural for technologists to go into crisis mode and immediately
start triaging problems and brainstorming solutions. They are smart
and want to help and believe they can fix things. This is a totally
appropriate attitude, and their nervous feelings are valid. The people
implementing the marketplace sites have all the problems of developing
large-scale, integrated, enterprise software, plus delivering a
high-quality consumer experience. I think we should also have some
perspective on what’s happening, and I would caution against
panic. There are a number of things to bear in mind:</p>
<p><strong>Architecture.</strong> Caveat: I don’t have direct experience with the
marketplace sites, only second-hand knowledge about how they’re
implemented. That said, I know some details. The main thing to
understand is there is no one, single Obamacare site—there is
healthcare.gov, which is home to the federal marketplace and a portal
to the state-based marketplaces, and there are the 14 state-based
sites. The federal marketplace is for Americans whose states either
chose not to implement their own marketplace or whose site isn’t
ready yet.</p>
<p>The user interface, or frontend, of healthcare.gov is quite
interesting. Its design has been compared favorably with top
commercial sites. It was implemented using modern web development
techniques, working well across browsers and on mobile devices. We
used similar techniques on the president’s campaign: generate static
files from templates with Jekyll, serve them from behind a CDN
(Akamai, in the case of healthcare.gov). This gives you a very fast,
low-latency user experience that’s very durable in the face of
high-traffic loads. <a href="http://developmentseed.org/blog/new-healthcare-gov-is-open-and-cms-free/">Dave Cole has
written</a>
about the process by which the frontend was developed; it’s
fascinating to read if you have any experience with how government
sites have typically been built. And you’ll notice, no one has
complained about being unable to access the site itself: healthcare.gov
has been up continuously since October 1st. It’s submitting
forms back to the server that’s been the issue.</p>
<p>About the backend server: having a great frontend experience means
little if you can’t complete a transaction with the
service. (Although, not nothing—many important informational consumer
resources reside on the frontend and have been wholly unaffected by
the reported outages.) People may not realize that a major part of
PPACA was streamlining the rules surrounding Medicaid
eligibility. healthcare.gov thus serves as a portal, routing people to
the appropriate resource to help them get covered. This
means not only sending you to your state-based marketplace site if
your state has one, but directing you to Medicaid instead of the
marketplaces, if you are eligible, or determining that you meet
requirements for a subsidy on the marketplace. In order to do these
things, the system verifies your identity, income, and other personal
data with new and existing government databases. In other words, so
that it may route you to the correct entity that will be offering or
providing you health insurance, healthcare.gov looks up your
information online (i.e., during the course of a request-response
cycle with the site). The architecture of healthcare.gov is an example
of both the challenges of integration—different software services
working together—and of distributed systems—independent systems that
may or may not be available or meet certain service-level agreements or
standards.</p>
<p>An alternative to an online lookup of personal data or account
creation would be to store the request for later processing. This is
commonly referred to as queuing. It turns an online process into an
offline one: the system goes from being synchronous—waiting for a
response from another system after making a request to it—to
asynchronous—not waiting for the response and arranging to check the
result somehow later. This is not a trivial change, as people who have
implemented these systems will know. It requires a fairly fundamental
redesign of the flow of the software, the application of business
rules, and how certain operational details are carried out. However,
it is now a widely established pattern for system development. For
example, when you buy a ticket from an airline reservation site, and
wait for your credit card to be processed and the whole transaction to
complete, that is an example of a synchronous, or online, system
(internally, the system may very well be composed of asynchronous
services, but the frontend interface that the user interacts with
presents a synchronous experience). When you place an order with
Amazon, on the other hand, you receive a response almost immediately
(“thank you for your order!”). If there is a problem with your
order—your card is expired, or was declined—you later receive a
notification, usually an email, asking you to update your payment
info. That is an example of an asynchronous system. Why does this
matter? Asynchronous, distributed systems have components that are
de-coupled—if one fails, it doesn’t necessarily bring the rest down
with it. You have to design your system to be resilient to such
failures, but it enables you to do things such as quickly store the
contents of a form submission and acknowledge the user with a
thank-you message when the system that looks up personal data or
creates new accounts is down. This introduces operational complexity:
you must have a functioning queue system, you must have programs that
process the queue, they need to be monitored and errors have to be
handled appropriately (since there is no online user that can respond
to them), and notification systems like email that are out-of-band of
the website may need to be employed (in case you need to ask the user
to come back and provide more information).</p>
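<p>To make the distinction concrete, here is a minimal sketch (purely
illustrative, and not how healthcare.gov is actually built; the
function and queue names are invented) of the same form submission
handled both ways. An in-process queue stands in for what would be a
durable message broker in a real system:</p>
<pre><code>import queue
import threading

applications = queue.Queue()

def verify_identity(form):
    # stand-in for a call to an external verification service,
    # which may be slow or unavailable
    return True

def handle_submission_sync(form):
    # synchronous: the user waits on the external service
    if verify_identity(form):
        return "Your application has been processed."
    return "We could not verify your identity. Please try again."

def handle_submission_async(form):
    # asynchronous: store the request and acknowledge immediately,
    # even if the verification service is down right now
    applications.put(form)
    return "Thank you! We will notify you when your application is processed."

def worker():
    # offline process that drains the queue; errors must be handled
    # out-of-band (e.g., an email asking the user for more information)
    while True:
        form = applications.get()
        verify_identity(form)
        applications.task_done()

threading.Thread(target=worker, daemon=True).start()
</code></pre>
<p>The synchronous handler fails whenever the verification service
does; the asynchronous one keeps accepting submissions regardless, at
the cost of the queue, the worker, and the out-of-band notifications
described above.</p>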
<p>I don’t know to what extent healthcare.gov was designed with the
challenges of distributed systems in mind, but moving toward more
asynchronous data flows where possible will alleviate some of the poor
user experiences we’ve seen reported. It will also let the site keep
taking in a high volume of requests while the team independently works
to fix bugs in the transactional or informational data services.</p>
<p><strong>Errors, user experience, and expectations.</strong> In the reports about
problems users have experienced with healthcare.gov and the
state-based marketplace sites, we’ve seen screenshots and descriptions
of ugly error messages. The quality of the healthcare.gov frontend,
with its attractive design that’s more like a retail site than a
government site, has, I think, primed users for an overall experience
reflective of that design. They expect what’s under the hood
to be as good as the hood appears. Ugly error messages, and
disappointment at not being able to complete the sign-up process,
frustrate expectations that were set by the site itself, and by its
champions, myself included, who encouraged people to go to the site on
day 1.</p>
<p>The ugly error messages have for the most part been replaced with
friendlier views, and we know that the backend engineers are working
to fix the sign-up process. A way to handle expectations at this point
for site users might be to remind them, at the point of a system error
or maintenance page, that they have until December 15th to enroll for
coverage beginning January 1st, 2014, and until March 31st to enroll
for coverage in 2014. Another mechanism to reassure a frustrated user
who couldn’t sign up might be a simple form that collects email
addresses of people to be notified when the system is back online.</p>
<p><strong>Unprecedented environmental hostility and limited time.</strong> Ever since
PPACA was passed, I’ve heard griping about why it would take so long for
Obamacare to come online. In reality, given the scope of the changes
to the regulatory framework for health insurance markets, changes to
Medicaid eligibility, and the implementation of the federal and
state-based marketplaces, there was a huge amount of work to deliver a
major new social insurance program in such a short amount of
time. It’s natural that there would be bugs, and the president, HHS,
and CMS teams have said as much. Going back further, the law’s
opponents have prevented Congress from taking up many regulatory and
technical fixes to the law. And now of course the federal
government is shut down due in part to opposition to the law. While
little of this hostility is new information to implementers, it is
nonetheless remarkable what they were able to achieve in this
environment. A suspected denial-of-service attack on New York’s site
only compounds the outside forces set against this fledgling program.</p>
<p><strong>State-based marketplaces.</strong> It is a joke among Medicaid staff that
if you’ve seen one state’s Medicaid system, you’ve seen one state’s
Medicaid system. 14 states chose to implement their own
marketplace. While their sites will share some common services with
the federal marketplace, and some large contractors worked on multiple
sites, these are independently developed and administered sites with
their own architectures, infrastructure, designs, and staff.</p>
<p><strong>Time.</strong> My strong belief is that these early problems will be
largely forgotten very soon. People will get covered. People are
getting enrolled, now, despite the problems. It’s worth remembering
what happened during the implementation of Medicare Part D. There were
many of the same types of reports, from pharmacies that couldn’t
connect to government data services, to seniors who were temporarily
unable to receive their benefit. Do we think about those stories now
when we think about Part D? Of course not. Part D is as strong
and beloved a piece of the social safety net firmament as any other. So
it will be with Obamacare.</p>
<p>None of this is to excuse the problems healthcare.gov has had this
week. October 1st was a known deadline, and major sites have been
launched under hostile or constrained circumstances before. But I think
if we understand a bit more about everything involved, we might not be
so quick to condemn or dismiss out of hand.</p>
<p><em>Update: my original post incorrectly stated there were 24 state-based
marketplaces; there are 14.</em></p>

      ]]></content:encoded>
    </item>
    <item>
      <title>Public Good Software and me</title>
      <link>https://pauladamsmith.com/blog/2013/07/pgs.html</link>
      <guid>https://pauladamsmith.com/blog/2013/07/pgs.html</guid>
      <pubDate>Wed, 24 Jul 2013 00:00:00 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p>Software that helps civil society organizations—non-profits, NGOs,
charities—do their work should be better. It can be better. I want to help
make it better. That’s why <a href="http://www.chicagogrid.com/reviews/tech/obamas-tech-team-citys-geeks-in-residence/">I’ve started, along with two colleagues from the
2012 election</a>, a new company, called <a href="https://publicgoodsoftware.com/">Public Good Software</a>.</p>
<p>If you survey the kind of technology that <abbr title="civil society
organizations">CSOs</abbr> use to support their missions, it’s a sorry sight.
It’s full of complex interfaces and complicated experiences, thin layers over
old systems, aging and poorly-supported applications, and disconnected data.
Worse, the companies that develop and sell this software seem to have
stagnated—their websites often feel frozen in time from 10 years ago. There
isn’t a lot of innovation happening here.</p>
<p>This is frustrating. These organizations are increasingly counted on to
confront our most serious challenges, like hunger, climate change,
conservation, joblessness, homelessness, affordable housing, poverty, public
health, literacy and education, and yet the technology tools they need are not
keeping up with them. Why shouldn’t people who work at CSOs expect software
every bit as good and as powerful as what they use on their smartphones
every day?</p>
<p>The situation is not much better if you are a supporter of these
organizations. Let’s say you give $100 a year to your local public radio
station, volunteer regularly at a community garden, and write your
congressperson on behalf of an animal rights advocacy campaign. You should be
able to keep track of all you do, and if you choose, share it with your
community. You should be able to find new opportunities that you might not
have been aware of, based on the kinds of organizations you support.  You have
a civic profile, based on how you help others, that you should be able to
claim and control.</p>
<p>The first problem to tackle, and the one that PGS’s first product will help
solve, is the problem of disconnected data. It’s a fundamental problem
that impacts CSOs and their supporters. Information about donors is in one
database, volunteers in another, email subscribers in a third, then there’s
Facebook likers and Twitter followers and you don’t know if they’re in the
other databases … Think of Mint.com, the way that service in its early days
brought sanity to your financial life. We want to connect these disparate
databases in much the same way and provide CSOs with a new, high-level view of
their data, with more complete pictures of their supporters. We’ll do this
through the use of statistical models, summaries, and visualizations that let
CSOs track how they are doing on the goals they set for themselves. This will
become a platform on which, over time, we’ll create and add new products.</p>
<p>We aren’t setting out to reinvent the wheel. We’re not building YACRM (yet
another CRM). We’re not even aiming to replace the technology CSOs currently
use. We want to provide new tools and experiences that reflect the new needs
of these organizations and their supporters. And it will be great, modern
software: fast, a pleasure to use, designed and built for mobile devices, with
maps and geo data throughout, and ready for international users. This is what
CSOs and their supporters deserve.</p>
<p>We decided early on that we wanted to be aligned with our customers in a way
that was sustainable, that built trust, and held us as a company accountable
to ensure that a double-bottom-line isn’t just a convenience to be discarded
when the “real” pressure (i.e., financial) builds up. At the same time, we
knew that the best way to grow the company the way we believed it should be grown was
through traditional capital investment. That led us to become a <a href="http://www.ilga.gov/legislation/BillStatus.asp?DocNum=2897&amp;GAID=11&amp;DocTypeID=SB&amp;LegId=63455&amp;SessionID=84">benefit
corporation</a>. <a href="#fn" id="fnr">*</a> This is new legislation,
found in a dozen or so states, and we think we’re one of the first software
startups to go that route. Essentially what this means is that we are in all
other respects like a normal for-profit company (we are a C corp under the
hood), but that we have a social mission, stated right in our corporate
by-laws (ours is roughly “to return more capital to organizations that provide
a benefit to the public”), and there are two mechanisms ensuring that the
social mission is not discarded if it becomes inconvenient. One is that there
is a board-level position called the social benefit director, whose job is to
ensure that the company is sticking to the social mission. The other is that
our fiduciary responsibility to our shareholders does not override that social
mission. This is where the rubber meets the road—you won’t see PGS suddenly
pivot to sell software to the NRA to return a few more percentage points to
our investors.</p>
<p>All this comes at an interesting time for the public sector.  Executive
directors and supporters alike are demanding more accountability and better
ways of measuring success or failure. At the same time, demand for CSO
services is up, while capital—in the form of dollars and volunteer time—is
flat, or even declining slightly. There is a small but increasingly vocal
minority of development directors saying CSOs need to be less obsessed with
converting every dollar to program, and need to find new ways to expand and be
more effective. All this leads to an increasing need for better data and analysis,
and better tools—for fundraising, communications, volunteer mobilization—that
build on it. We think there is an enormous opportunity here.</p>
<p>So it will be fun. I’m the CTO. My co-founders <a href="http://jdkunesh.com/">Jason</a> and <a href="http://www.danratner.com/">Dan</a>
were director of UX and director of development, respectively, in the OFA 2012
technology department. We’ve also got two more OFA tech alums,
<a href="http://www.chrisgansen.com/">Chris</a> and <a href="http://www.aaronsalmon.com/">Aaron</a>, as part of the founding team. Our current
status: talking with potential investors, meeting with a handful of CSOs
who’ve agreed to pilot the software as we build it, and making prototypes and
getting our basic infrastructure running. We’re using <a href="http://golang.org/">Go</a> for our server
software, which is a fun language. Incidentally, it should go without saying
that we’re big believers in open source; most of what we develop will be
available under an open source license, and I’ll write more about that in
a later post. But I’ve already released some open source software that
was developed on PGS time, <a href="http://paulsmith.github.io/gogeos/">gogeos</a>, a small Go library for working
with geospatial data.  We’ll be hiring software engineers soon, so if any of
this sounds interesting to you, <a href="mailto:paul@publicgoodsoftware.com">drop me a line</a>.</p>
<p class="fn"><a id="fn">*</a> Not to be confused with the <a
href="http://www.bcorporation.net/">B Corp certification</a>, which is related
but is not a corporate structure. <a href="#fnr">↩</a></p>

      ]]></content:encoded>
    </item>
    <item>
      <title>Announcing gogeos, a spatial data library for Go</title>
      <link>https://pauladamsmith.com/blog/2013/06/gogeos.html</link>
      <guid>https://pauladamsmith.com/blog/2013/06/gogeos.html</guid>
      <pubDate>Wed, 12 Jun 2013 21:00:00 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p>I am announcing the initial release of <a href="http://paulsmith.github.io/gogeos/">gogeos</a>, a library for the Go
programming language. gogeos provides spatial data operations and geometric
algorithms. While it is a Go library, the hard work is done by the
<a href="http://geos.osgeo.org/">GEOS</a> C library.</p>
<p>The kinds of things you can do with gogeos include:</p>
<ul>
<li><strong>set-theoretic operations</strong>, such as computing the intersection, union, or
difference of two geometries,</li>
<li><strong>topological operations</strong>, such as computing buffers and convex hulls,</li>
<li><strong>binary predicates</strong>, such as whether two geometries intersect or are disjoint,</li>
<li><strong>validity checking</strong>, and</li>
<li><strong><a href="http://paulsmith.github.io/gogeos/#overview">much more</a></strong>.</li>
</ul>
<p>It also provides interoperability with other spatial data processing systems
like <a href="http://postgis.org/">PostGIS</a> by decoding and encoding geometries as Well-Known Text
(WKT) and Well-Known Binary (WKB).</p>
<p>I started working on gogeos because I looked at the landscape of GIS and
spatial data libraries for Go, and found it lacking. Binding to the GEOS
library with <a href="http://golang.org/cmd/cgo/">cgo</a> was a way to get started quickly. Relying on GEOS has
its drawbacks; for instance, it creates a large binary dependency, and cgo
doesn’t allow for cross-platform compiles.</p>
<p>In the long term, I would like to create a pure Go library that implements
the kind of functionality that GEOS and the <a href="http://www.vividsolutions.com/jts/main.htm">JTS</a> provide. That would allow for use
on platforms that don’t or can’t support C shared libraries, such as Google
App Engine, and make it easier for developers to get started working with it.</p>
<p>In the meantime, I hope that gogeos enables more developers who are working
with spatial data or GIS to get involved in the Go ecosystem.</p>
<p>gogeos is a <a href="https://github.com/paulsmith/gogeos">fully open-source project</a>, and I welcome contributors
and feedback.</p>
<p>—<a href="https://twitter.com/paulsmith">@paulsmith</a></p>

      ]]></content:encoded>
    </item>
    <item>
      <title>Democratic Party’s voter registration app is now free and open-source software</title>
      <link>https://pauladamsmith.com/blog/2013/01/dnc_voter_reg_foss.html</link>
      <guid>https://pauladamsmith.com/blog/2013/01/dnc_voter_reg_foss.html</guid>
      <pubDate>Tue, 29 Jan 2013 03:30:00 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p>We (the <a href="http://democrats.org/">DNC</a>) have <a href="https://github.com/democrats/voter-registration/issues/12#issuecomment-12804999">relicensed the Democratic Party’s voter registration
application</a> under a standard MIT license, and accompanied the source
code with an advisory notice regarding the use of the software. I wanted to
explain why we did this.</p>
<p>The Democratic Party initially released <a href="https://github.com/democrats/voter-registration">the source code to its online voter
registration app</a> late last summer, with the intent of making it
available for all the standard reasons people and organizations choose when
they open-source code: so that it can be improved, so that bugs can be fixed,
so others can take it and build further new applications on top of it.</p>
<p>However, it quickly became apparent that we had a problem with the open source
community. <a href="https://github.com/democrats/voter-registration/issues/12">The issue was with the license</a>. It contained a clause that
placed restrictions on its use. The reason this clause was included was to
address our concerns regarding the highly regulated and closely monitored
nature of voting and voter registration. We wanted to avoid a scenario where,
either inadvertently or through malice, someone set up a site based on the
code, and without following state and federal guidelines and rules, defrauded
or disenfranchised a voter. Now, regardless of our good intentions on this
matter, the fact that we had taken a standard open source license and amended
it with this restrictive clause meant that we did not pass “free and open
source” muster, with emphasis on the “free” as in “speech”.</p>
<p>We needed a solution that addressed both the problematic license and our
concerns regarding the good-faith use of the software that protected voters. A
member of the open source community, <a href="http://www.red-bean.com/kfogel/">Karl Fogel</a>, stepped forward
with a proposal: change the license to an unmodified standard
<a href="http://opensource.org/licenses/index.html">OSI</a>-approved license, and include along with the source code an
advisory document that outlines these legal concerns. The notice would not be
binding or otherwise modify the license and therefore the terms of use; however,
like any piece of open source software, people are “free” to use it illegally,
and free to suffer the consequences if they do. The important thing is to
remind users of their responsibility to act in accordance with the law,
especially when it comes to something as precious and besieged as our
franchise. We feel the combination of a standard FOSS license and a
non-binding advisory document expressing the intent of the copyright holder is
a way forward for political organizations to release potentially sensitive
source code while at the same time communicating the vital issues animating and
conditioning that release.</p>
<p>Now, some observers may not see this as remarkable. There was a bad license,
it’s been changed, what’s the fuss? I want to acknowledge the hard work across
the organization, from software engineers to lawyers, to find a way to give
back to the open source community and satisfy the concerns of both sides. There
are many reasons why organizations don’t release their software as open source.
We want to set an example, however small, that there are non-license ways to
state any reservations or guiding principles of your organization that
ordinarily would have prevented a release. Key among these is engaging with the
community. As we have learned time and again, good solutions often originate
through trust and dialogue.</p>

      ]]></content:encoded>
    </item>
    <item>
      <title>Lexing Oscar</title>
      <link>https://pauladamsmith.com/blog/2013/01/lexing-oscar.html</link>
      <guid>https://pauladamsmith.com/blog/2013/01/lexing-oscar.html</guid>
      <pubDate>Fri, 11 Jan 2013 10:01:00 -0000</pubDate>
      <author>paulsmith@pobox.com (Paul Smith)</author>
      <content:encoded><![CDATA[
        <p>For the past <em>n</em> years, I’ve built and hosted a web app that lets my film
buff friends and me compete by guessing who will win the Academy Awards by
voting for nominees in each category. I do a new one from scratch each time.
It’s a fun diversion, but it’s also a playground for me to try out new
skills picked up in the past year or new tools or techniques I’ve been wanting
to fool around with.</p>
<p>The first thing I need to do each time is get a list of that year’s nominees
in some machine-readable format. Being a lazy programmer, I’m not going to
type the 100+ nominees into a spreadsheet or text file, so I wind up
writing a short throwaway script to coax some list I’ve found online into the
form I need for importing. This sort of script is the meat-and-potatoes of the
workaday programmer, the ones you whip up in a few minutes as an intermediate
step in a larger task. Ordinarily, they’re hardly worth commenting on. They
have a vanishingly short half-life, since there is rarely any generality to be
derived from them: they only work on the exact input given.</p>
<p>This year, I wanted to try out a new way of getting the nominee list together.
Sure, for a small task like this, there’s no compelling reason not to go with
the same kind of quick throwaway script as before. But again, the point of the
Oscars app is to exercise new or different muscles.</p>
<p>My goal was to generate a representation of the list of nominees in a format
such as CSV suitable for importing into a database. I found a source list of
nominees, formatted as follows: the name of the category is on the first line,
then a list of nominees comes next, each requiring two lines, one being the
name of the film and the other a name or list of names associated with the
nomination, all followed by a blank line, then the subsequent category starts
on the next line and we repeat. I wanted to read in and parse text formatted
like this:</p>
<pre><code>Directing
Amour
Michael Haneke
Beasts Of The Southern Wild
Benh Zeitlin
Life Of Pi
Ang Lee
Lincoln
Steven Spielberg
Silver Linings Playbook
David O. Russell

Actor in a Leading Role
Lincoln
Daniel Day-Lewis
…
</code></pre>
<p>And convert it to this:</p>
<pre><code>Directing,Amour,Michael Haneke
Directing,Beasts Of The Southern Wild,Benh Zeitlin
Directing,Life Of Pi,Ang Lee
Directing,Lincoln,Steven Spielberg
Directing,Silver Linings Playbook,David O. Russell
Actor in a Leading Role,Lincoln,Daniel Day-Lewis
</code></pre>
<p>Normally, to scan and parse this type of input, I would write a program to
loop over each line of the input, with a number of global state variables,
keeping track of what tokens I was currently processing. In this case, I might
have global state variables indicating whether I was currently processing a
category and what the current film is, and I would have a set of if/elif/else
statements for tests of various combinations of those variables, including for
the contents of the current line (a blank line or EOF indicating the end of a
category).</p>
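<p>A sketch of what that traditional approach looks like for this input
(illustrative only; it is not the script I actually wrote):</p>
<pre><code>text = open('nominees.txt').read()  # hypothetical input file

# one big loop with state variables; every iteration must first
# reconstruct where we are from the flags before it can act
state = 'category'
category = None
film = None
rows = []
for line in text.splitlines():
    if state == 'category':
        category = line
        state = 'film'
    elif line == '':
        state = 'category'
    elif state == 'film':
        film = line
        state = 'names'
    elif state == 'names':
        rows.append((category, film, line))
        state = 'film'
</code></pre>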
<p>Each time through the loop, then, we get a line from the text and check to see
what state we’re in. While this approach is easy to get started with, it leads
to fragile code and requires a lot of mental bookkeeping. Worse, each time
through the loop, the state of where we are and what we just did is forgotten.
That accounts for the proliferation of state variables to be checked in order
to restore the state of the processing. Think about it: we are marching
sequentially through this text. Wouldn't it be nice if we could just
pick up where we left off after the last action?</p>
<p>My approach this time is inspired by <a href="http://www.youtube.com/watch?v=HxaD_trXwRE" title="Lexical Scanning in Go - Rob Pike">Rob Pike’s talk on lexical scanning</a>.
Instead of a loop where we get the next bit of text to examine and restore the
state of the processing by examining a number of state variables, we instead
have a loop where a function is called that returns the next function to be
called. In other words, a function is called which does a bit of processing of
the text, advancing the pointer or consuming from a stream, maybe emitting
some tokens, and then returns to the caller the function that should proceed
from where the returning function just left off. For instance, we just scanned
a category, which means we know we are ready to scan a film, so call the film
scan function. That next function can just carry on its processing without any
state-checking preliminaries. The loop of our system therefore is very
concise, just calling functions and getting the next one to call the
subsequent time around. Roughly:</p>
<pre><code>def run():
    state = start_state
    while state:
        state = state()
</code></pre>
<p>When we are done processing input, say, EOF is reached, the state function
currently executing can return <code>None</code> to the caller, which will end the while
loop and shut down the machine.</p>
<p>The advantage to the programmer is that instead of building up a complicated
switch of control to determine what state our machine is in, we simply write
functions that proceed naturally from the last state, and then hand off
control to the subsequent function. It’s clean and helps keep the complexity
of the system manageable. Any time you can reduce the number of control flow
statements and replace them with simple functions, it’s a win in my book.</p>
<p>So back to the Oscars. This year, I opened the <a href="http://cdn.media.oscar.abc.com/media/2013/pdf/2013/nominees.pdf">official nominee list</a> from
the Academy’s site, a PDF. I selected the text, copied and pasted it into a
text document. The only manual editing I did was to add a blank line between
each group of nominees by category, and I also joined lines in categories like
Music (Original Song) where the title of the song and the name of the composer
are split across multiple lines—these were quick changes that simplified the
scanning logic.</p>
<p>There are three state functions in my program, one for each of category, film,
and name (or list of names):</p>
<pre><code>def lex_category(lexer):
    lexer.emit(CATEGORY, title(getline()))
    return lex_film

def lex_film(lexer):
    line = getline()
    if line == '':
        lexer.emit(BLANK, '')
        return lex_category
    elif line is None: # EOF, shut down lex machine
        return None
    lexer.emit(FILM, title(line))
    return lex_names

def lex_names(lexer):
    lexer.emit(NAMES, title(getline()))
    return lex_film
</code></pre>
<p>(<code>title()</code> handles some odd case formatting in the source text by converting
strings to title case.)</p>
<p><code>lex_film</code> is the most complex, having to handle three possibilities: a
blank line, meaning we’re moving on to the next category; EOF, which
shuts down scanning; and the film itself. But in all cases we merely
return the next state function to be called (or <code>None</code>).</p>
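<p>For completeness, these functions assume a small harness along the
following lines (a sketch only; the actual script, linked below, differs
in its details): a <code>Lexer</code> that collects emitted tokens, a set of
token-type constants, and a <code>getline()</code> that returns the next line of
input, or <code>None</code> at EOF. The run loop from earlier would simply pass
the lexer along, i.e. <code>state = state(lexer)</code>.</p>
<pre><code>import sys

# token types
CATEGORY, FILM, NAMES, BLANK = 'CATEGORY', 'FILM', 'NAMES', 'BLANK'

class Lexer:
    def __init__(self):
        self.tokens = []

    def emit(self, token_type, value):
        self.tokens.append((token_type, value))

_lines = iter(sys.stdin.read().splitlines())

def getline():
    # next line of input, or None at EOF
    return next(_lines, None)
</code></pre>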
<p>Admittedly, this is more sophistication than normally appears in my yearly
nominee list parsing. But I have to say that I was able to write the program
in about the same amount of time, found it ran correctly the first time, and
it was actually kind of fun to do. And while this was a silly example, you can
start to see the power you can get from this approach when lexing different
kinds of input with more and more complex tokens. When you lift the flow of
control up a level and let your functions focus on the task at hand, the
result I think is a more elegant and more obviously correct program.</p>
<p>The script and input text are <a href="https://gist.github.com/4507999">here</a>, and the output list of nominees is
<a href="https://docs.google.com/spreadsheet/ccc?key=0AviXLd8uXec3dHRtenJGcUs5aTBXUEY4cWs2WHNpS3c#gid=0">here</a>.</p>

      ]]></content:encoded>
    </item>
    
  </channel>
</rss>