<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" xml:lang="en" xml:base="http://blog.ezyang.com/wp-atom.php">
	<title type="text">Inside 245s</title>
	<subtitle type="text">Existential Pontification and Generalized Abstract Digressions</subtitle>

	<updated>2010-09-06T13:00:27Z</updated>

	<link rel="alternate" type="text/html" href="http://blog.ezyang.com" />
	<id>http://blog.ezyang.com/feed/atom/</id>
	

	<generator uri="http://wordpress.org/" version="3.0.1">WordPress</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/ezyang" /><feedburner:info uri="ezyang" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Embracing Windows]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/4s9eBaAe_NY/" />
		<id>http://blog.ezyang.com/?p=2900</id>
		<!-- ezyang: omitted update time -->
		<published>2010-09-06T13:00:27Z</published>
		<category scheme="http://blog.ezyang.com" term="Meta" /><category scheme="http://blog.ezyang.com" term="Toolbox" />		<summary type="html"><![CDATA[Some things come round full circle. As a high schooler, I was a real Windows enthusiast. A budding programmer, I accumulated a complete development environment out of necessity, a mix of Cygwin, handwritten batch scripts, PuTTY, LogMeIn, a homegrown set of PHP build scripts and Notepad++. I was so devoted to the cause I even [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2010/09/embracing-window/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;&lt;em&gt;Some things come round full circle.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;As a high schooler, I was a real Windows enthusiast. A budding programmer, I accumulated a complete development environment out of necessity, a mix of Cygwin, handwritten batch scripts, PuTTY, LogMeIn, a homegrown set of PHP build scripts and Notepad++.  I was so devoted to the cause I even got a &lt;a class="reference external" href="http://repo.or.cz/w/git.git/commit/36ad53ffee6ed5b7c277cde660f526fd8ce3d68f"&gt;single patch into Git&lt;/a&gt;, for the purpose of making Git play nicely with plink on Windows. The setup worked, but it always felt like a patchwork of different components, all not quite seeing eye-to-eye with each other. When I discovered that Linux was able to offer me an unbelievably coherent development environment, I jumped ship and said goodbye to Windows.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Some things come round full circle.&lt;/em&gt; Windows has a way of coming back to you eventually. The &lt;a class="reference external" href="http://www.galois.com/technology/communications_security/cryptol"&gt;product I worked on over the summer&lt;/a&gt; at Galois had to support Windows, and I consequently devoted days of effort getting my changes to build properly on Windows. I then went on to &lt;a class="reference external" href="http://blog.ezyang.com/2010/08/interrupting-ghc/"&gt;hacking GHC&lt;/a&gt;, and Simon Marlow asked me to implement the equivalent feature in Windows.&lt;/p&gt;
&lt;p&gt;I’ve decided that I should stop shunning Microsoft Windows as the developer’s black sheep of the operating systems. Like it or not, Windows is here to stay; even if I never boot my laptop into Windows, as a developer it is good practice to think about and test my code on Windows. It might even be the case that Windows is a &lt;em&gt;perfectly reasonable&lt;/em&gt; underlying platform to develop on.&lt;/p&gt;
&lt;p&gt;There seem to be two reasons why developers might find targeting other platforms to be annoying:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;They don’t have access to a computer running that operating system, which makes debugging the problems extremely annoying—after all, this is why a reproduceable test-case is the gold standard of bug reporting. We should have easy to access and easy to use build servers setup to let people play in these different environments. This involves putting down some money to buy the appropriate licenses, which open-source authors might be reluctant to do: people at places with site licenses might be able to help by donating boxes for these people to play in (the same way companies and universities donate disk space and bandwidth for mirrors).&lt;/li&gt;
&lt;li&gt;They have to learn about another platform, with all of its intricacies and gotchas. On the one hand, this is annoying because “I already know how to do this in Unix, and now I have to spend N minutes to figure out how to do it on Windows, and spend another N minutes figuring out why it doesn’t work in some edge case.” On the other hand, learning a platform that does something you already know how to do can be kind of fun: you get to see different design decisions and develop multiple perspectives on the same problem, which I have found has always helped me out for problem solving.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There remain parts of Windows programming that I continue to have no interest in: for example, I find the vagaries of manifest files to be fairly uninteresting. But then again, I find packaging in Linux distributions to be uninteresting. Stop blaming Windows!&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/4s9eBaAe_NY" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2010/09/embracing-window/#comments" thr:count="1" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2010/09/embracing-window/feed/atom/" thr:count="1" />
		<thr:total>1</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2010/09/embracing-window/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Annotating slides]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/rm1u2NGbJ9A/" />
		<id>http://blog.ezyang.com/?p=2821</id>
		<!-- ezyang: omitted update time -->
		<published>2010-09-03T13:00:53Z</published>
		<category scheme="http://blog.ezyang.com" term="Toolbox" />		<summary type="html"><![CDATA[A little trick for your toolbox: after you’ve generated your slide deck and printed it out to PDF, you might want to annotate the slides with comments. These is a good idea for several reasons: If you’ve constructed your slides to be text light, they might be optimized for presentation but not for reading later [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2010/09/annotating-slides/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;A little trick for your toolbox: after you’ve generated your slide deck and printed it out to PDF, you might want to annotate the slides with comments. These is a good idea for several reasons:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;If you’ve constructed your slides to be text light, they might be optimized for presentation but not for reading later on. (“Huh, here is this diagram, I sure wish I knew what the presenter was saying at that point.”)&lt;/li&gt;
&lt;li&gt;Writing out a dialog to go along the slides is a nonvocal way of practicing your presentation!&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But how do you interleave the slide pages with your annotations? With the power of &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;enscript&lt;/span&gt;&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;pdftk&lt;/span&gt;&lt;/tt&gt;, you can do this entirely automatically, without even having to leave your terminal! Here’s the recipe.&lt;/p&gt;
&lt;ol class="arabic"&gt;
&lt;li&gt;&lt;p class="first"&gt;Create an “annotations” text file (we’ll refer to it as &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;annot.txt&lt;/span&gt;&lt;/tt&gt;).  This will contain your text commentary to accompany the slides.  Write the text explaining your first slide, and then insert a &lt;em&gt;form feed&lt;/em&gt; (&lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;^L&lt;/span&gt;&lt;/tt&gt;, you can do so by pressing &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;C-l&lt;/span&gt;&lt;/tt&gt; in vim (insert mode) or &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;C-q&lt;/span&gt; &lt;span class="pre"&gt;C-l&lt;/span&gt;&lt;/tt&gt; in emacs.) Write the text for your second slide. Rinse and repeat.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p class="first"&gt;We now want to render this into a PDF file, with the same dimensions as your slide deck.  Figure out what the size of your slides are in pixels, and then edit your &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;~/.enscriptrc&lt;/span&gt;&lt;/tt&gt; to contain the following line:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
Media: Slide width height llx lly urx ury
&lt;/pre&gt;
&lt;p&gt;where ll stands for lower left and ur stands for upper right: these four numbers denote the bounding box for the text. One possible combination for these might be:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
Media: Slide 576 432 18 17 558 415
&lt;/pre&gt;
&lt;p&gt;We can now invoke enscript to generate a nicely formatted PostScript file of our annotations in the right dimensions, with &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;enscript&lt;/span&gt; &lt;span class="pre"&gt;annot.txt&lt;/span&gt; &lt;span class="pre"&gt;-p&lt;/span&gt; &lt;span class="pre"&gt;annot.ps&lt;/span&gt; &lt;span class="pre"&gt;-M&lt;/span&gt; &lt;span class="pre"&gt;Slide&lt;/span&gt; &lt;span class="pre"&gt;-B&lt;/span&gt; &lt;span class="pre"&gt;-f&lt;/span&gt; &lt;span class="pre"&gt;Palatino-Roman14&lt;/span&gt;&lt;/tt&gt; (pick a different font, if you like.)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p class="first"&gt;Convert the resulting PostScript file into a PDF, with &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;ps2pdf&lt;/span&gt; &lt;span class="pre"&gt;annot.ps&lt;/span&gt;&lt;/tt&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p class="first"&gt;Now, with pdftk, we will split our annotations PDF and our slides PDF into individual pages, and then merge them back together into one PDF. We can use &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;burst&lt;/span&gt;&lt;/tt&gt; to output the pages, suggestively naming the output files so they interleave correctly:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
mkdir stage
pdftk slides.pdf burst output stage/%02da.pdf
pdftk annot.pdf burst output stage/%02db.pdf
&lt;/pre&gt;
&lt;p&gt;and then we join them back together:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
pdftk stage/*.pdf cat output annotated-slides.pdf
&lt;/pre&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Here’s the full script:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
#!/bin/sh
set -e
ANNOT=&amp;quot;$1&amp;quot;
SLIDES=&amp;quot;$2&amp;quot;
OUTPUT=&amp;quot;$3&amp;quot;
if [ -z &amp;quot;$3&amp;quot; ]
then
    echo &amp;quot;usage: $0 annot.txt slides.pdf output.pdf&amp;quot;
    exit 1
fi
TMPDIR=&amp;quot;$(mktemp -d)&amp;quot;
enscript &amp;quot;$ANNOT&amp;quot; -p &amp;quot;$ANNOT.ps&amp;quot; -M Slide -B -f Palatino-Roman14
ps2pdf &amp;quot;$ANNOT.ps&amp;quot; &amp;quot;$ANNOT.pdf&amp;quot;
pdftk &amp;quot;$SLIDES&amp;quot; burst output &amp;quot;$TMPDIR/%03da.pdf&amp;quot;
pdftk &amp;quot;$ANNOT.pdf&amp;quot; burst output &amp;quot;$TMPDIR/%03db.pdf&amp;quot;
pdftk &amp;quot;$TMPDIR&amp;quot;/*.pdf cat output &amp;quot;$OUTPUT&amp;quot;
rm -Rf &amp;quot;$TMPDIR&amp;quot;
&lt;/pre&gt;
&lt;p&gt;Don’t forget to define &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;Slide&lt;/span&gt;&lt;/tt&gt; in your &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;.enscriptrc&lt;/span&gt;&lt;/tt&gt;, and happy annotating!&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/rm1u2NGbJ9A" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2010/09/annotating-slides/#comments" thr:count="2" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2010/09/annotating-slides/feed/atom/" thr:count="2" />
		<thr:total>2</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2010/09/annotating-slides/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[My type signature overfloweth]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/k6hxELx6zDY/" />
		<id>http://blog.ezyang.com/?p=2844</id>
		<!-- ezyang: omitted update time -->
		<published>2010-09-01T13:00:35Z</published>
		<category scheme="http://blog.ezyang.com" term="Haskell" />		<summary type="html"><![CDATA[I’ve recently started researching the use of session types for practical coding, a thought that has been in the back of my mind ever since I was part of a team that built a networked collaborative text editor and spent a lot of time closely vetting the server and the client to ensure that they [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2010/09/my-type-signature-overfloweth/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;I’ve recently started researching the use of &lt;em&gt;session types&lt;/em&gt; for practical coding, a thought that has been in the back of my mind ever since I was part of a team that built a networked collaborative text editor and spent a lot of time closely vetting the server and the client to ensure that they had implemented the correct protocols. The essence of such protocols is often relatively simple, but can quickly become complicated in the presence of error flow (for example, resynchronizing after a disconnection). Error conditions also happen to be difficult to automatically test! Thus, static types seem like an attractive way of tackling this task.&lt;/p&gt;
&lt;p&gt;There are three implementations of session types in Haskell: &lt;a class="reference external" href="http://hackage.haskell.org/package/sessions"&gt;sessions&lt;/a&gt;, &lt;a class="reference external" href="http://hackage.haskell.org/package/full-sessions"&gt;full-sessions&lt;/a&gt; and &lt;a class="reference external" href="http://hackage.haskell.org/package/simple-sessions"&gt;simple-sessions&lt;/a&gt;. If you were feeling particularly naive, you might try going to the &lt;a class="reference external" href="http://hackage.haskell.org/packages/archive/sessions/2008.7.18/doc/html/Control-Concurrent-Session.html"&gt;Haddock page&lt;/a&gt; to get a feel for what the API looks like. Before you continue reading, please inspect that page.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;Done gouging your eyes out? Let’s proceed.&lt;/p&gt;
&lt;p&gt;In an interview in &lt;em&gt;Coders at Work&lt;/em&gt;, Simon Peyton Jones mentioned that one of the notable benefits of types is that it gives a concise, crisp description of what a function might do. That API is anything from concise and crisp, and it’s certainly not something that I could figure out just by looking at the corresponding function definition. Accordingly, one of the key selling points of current encodings of session types is that they do not break type inference: we give up on our user understanding what the gaggle of typeclasses means, and only expect to transfer one bit of information, “Do the protocols match?”&lt;/p&gt;
&lt;p&gt;This is not a problem that is fundamental to session types: any functionality that makes extensive use typeclasses can easily fall prey to these long type signatures. I have two (rather half-baked) thoughts on how this complexity might be rendered more nicely to the user, although not eliminated:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;A favorite pastime of type system hackers is a type-level encoding of naturals, using Peano numbers &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;Z&lt;/span&gt;&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;S&lt;/span&gt; &lt;span class="pre"&gt;a&lt;/span&gt;&lt;/tt&gt;, attached to something like &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;Vector&lt;/span&gt; &lt;span class="pre"&gt;(S&lt;/span&gt; &lt;span class="pre"&gt;(S&lt;/span&gt; &lt;span class="pre"&gt;Z))&lt;/span&gt;&lt;/tt&gt;. Vector is a type constructor of kind &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;*&lt;/span&gt; &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pre"&gt;*&lt;/span&gt;&lt;/tt&gt;. However, since there is only one primitive kind in Haskell, we could actually pass any type to Vector, say &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;Vector&lt;/span&gt; &lt;span class="pre"&gt;Int&lt;/span&gt;&lt;/tt&gt;, which would be a nonsensical. One way to prevent this from occurring is to declare our Peano numbers instances of a typeclass &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;Nat&lt;/span&gt;&lt;/tt&gt;, and then declare &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;Nat&lt;/span&gt; &lt;span class="pre"&gt;a&lt;/span&gt; &lt;span class="pre"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pre"&gt;Vector&lt;/span&gt; &lt;span class="pre"&gt;a&lt;/span&gt;&lt;/tt&gt;. But, since &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;a&lt;/span&gt;&lt;/tt&gt; is used precisely once in any such a statement, wouldn’t it be great if instead we could write &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;Vector&lt;/span&gt; &lt;span class="pre"&gt;::&lt;/span&gt; &lt;span class="pre"&gt;Nat&lt;/span&gt; &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pre"&gt;*&lt;/span&gt;&lt;/tt&gt;? If you need to specify type equality, you could imagine some sort of type pattern matching &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;concat&lt;/span&gt; &lt;span class="pre"&gt;::&lt;/span&gt; &lt;span class="pre"&gt;Vector&lt;/span&gt; &lt;span class="pre"&gt;a&lt;/span&gt; &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pre"&gt;Vector&lt;/span&gt; &lt;span class="pre"&gt;b&lt;/span&gt; &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pre"&gt;Vector&lt;/span&gt; &lt;span class="pre"&gt;c&lt;/span&gt; &lt;span class="pre"&gt;with&lt;/span&gt; &lt;span class="pre"&gt;c&lt;/span&gt; &lt;span class="pre"&gt;~&lt;/span&gt; &lt;span class="pre"&gt;a&lt;/span&gt; &lt;span class="pre"&gt;:+:&lt;/span&gt; &lt;span class="pre"&gt;b&lt;/span&gt;&lt;/tt&gt;. &lt;a class="reference external" href="http://byorgey.wordpress.com/2010/08/05/typed-type-level-programming-in-haskell-part-iv-collapsing-types-and-kinds/"&gt;Collapsing types and kinds&lt;/a&gt; is an interesting step in this direction.&lt;/li&gt;
&lt;li&gt;When mathematicians present proofs, they might explicitly specify “for all F such that F is a field...”, but more frequently, they’ll say something like, “in the following proof, assume the following variable naming conventions.” With this, they get to avoid having to repeatedly explicitly redeclare what all of their variable names mean. An analogous system for type variables would go a long way towards reducing long type signatures.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But actually, that has nothing to do with what I’m currently looking at.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;Here’s what I am looking at: session types suffer from another type signature explosion phenomenon: any function in the protocol contains, in its type, a complete specification of the entire protocol continuing from that point in time. As &lt;a class="reference external" href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.70.7370&amp;amp;rep=rep1&amp;amp;type=pdf"&gt;Neubauer and Thiemann admit (PDF)&lt;/a&gt;, the “session type corresponding to full SMTP is quite unreadable.” The two lines of inquiry I am pursuing are as follows:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Can building exception support into session types (currently an open problem) allow for much simpler session types by allowing most cases to elide the session types corresponding to error cases?&lt;/li&gt;
&lt;li&gt;Can we use &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;type&lt;/span&gt;&lt;/tt&gt; to permit a single global specification of the protocol, which individual functions then simply refer to? Do we need something a little more powerful?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;At this point, I’ve just been doing thinking and paper reading, but I hope to start hacking on code soon. I’d love to hear your thoughts though.&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/k6hxELx6zDY" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2010/09/my-type-signature-overfloweth/#comments" thr:count="7" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2010/09/my-type-signature-overfloweth/feed/atom/" thr:count="7" />
		<thr:total>7</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2010/09/my-type-signature-overfloweth/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Defining “Haskelly”]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/xu4IOPIp_1w/" />
		<id>http://blog.ezyang.com/?p=2825</id>
		<!-- ezyang: omitted update time -->
		<published>2010-08-30T13:00:05Z</published>
		<category scheme="http://blog.ezyang.com" term="Haskell" />		<summary type="html"><![CDATA[At risk of sounding like a broken record, the topic of this post also sprang from abcBridge. John Launchbury asked a question during my presentation that got me thinking about API design in Haskell. (By the way, the video for the talk is out! Unfortunately, the second half had to be cut out due to [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2010/08/defining-haskelly/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;At risk of sounding like a broken record, the topic of this post also sprang from &lt;a class="reference external" href="http://blog.ezyang.com/2010/08/galois-tech-talk-abcbridge-functional-interfaces-for-aigs-and-sat-solving/"&gt;abcBridge&lt;/a&gt;. John Launchbury asked a question during my presentation that got me thinking about API design in Haskell. (By the way, &lt;a class="reference external" href="http://vimeo.com/14432112"&gt;the video for the talk&lt;/a&gt; is out! Unfortunately, the second half had to be cut out due to technical difficulties, but you can still check out the slides.)&lt;/p&gt;
&lt;p&gt;His question was this:&lt;/p&gt;
&lt;blockquote&gt;
You’ve presented this in a very imperative style, where you’ve got this AIG structure in the ABC tool, and what you’ve really done is given me a nicely typed Haskell typed interface that allows you to go in a put a new gate or grab a structure, and I’m left wondering, what is the reason for needing this tight tie-in with what’s going on in that space? Here is a thought experiment: I could imagine myself having a purely functional data structure that is describing the data structure...and you end up with a functional description of what you want your graph to look like, and then you tell ABC to go and build the graph in one go.&lt;/blockquote&gt;
&lt;p&gt;I had claimed that abcBridge was a “functional API” for manipulating and-inverter graphs; perhaps I was lying! Is abcBridge—with its close correspondence to the underlying imperative code—truly “functional?” Or, if it’s not functional, does it at least have a “Haskelly” API? (What does it even mean for an API to be Haskelly?) Why does the purely functional interface seem morally better than the imperative interface? It’s a question of philosophical import as well as practical import—why do we prefer the functional interface which might require a more complex underlying implementation?&lt;/p&gt;
&lt;p&gt;My conjecture is that the faithfulness of an API to its host language is based on how much it utilizes particular features that a language makes easy to use. Haskell is frequently introduced as a “purely functional, lazy, strongly statically typed programming language.” Looking at each of these terms in turn (informally)...&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;em&gt;Purely functional&lt;/em&gt; indicates APIs that eschew destructive updates, instead opting for immutability and persistent data structures. Language features that make it easier to write in this style include the &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;final&lt;/span&gt;&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;const&lt;/span&gt;&lt;/tt&gt; keywords, algebraic data types, pattern-matching and a library of persistent data structures to write more persistent data structures with.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Lazy&lt;/em&gt; indicates APIs that utilize laziness to build custom control structures and generate infinite data structures. The poster child language feature for laziness is, well, lazy evaluation by default, but explicit laziness annotations in a strict language or even a convenient lambda abstraction encourages lazy style. (Python does not have a convenient lambda, which is why asynchronous frameworks like Twisted are so painful!)&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Strongly statically typed&lt;/em&gt; indicates APIs that encode invariants of all shapes and sizes into the static type system, so that programmer errors can be caught at compile time, not run time. The type system is the obvious language feature here, with its expressiveness defining what you can easily add to your system.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We associate programs that take advantage of these language features as “Haskelly” because Haskell makes it easy—both syntactically and conceptually—to use them! But at the same time, these are all (mostly) orthogonal language features, and for any given API you might write, you may opt to ditch some of these properties for others: maybe the feature just doesn’t matter in your problem domain, maybe the feature imposes an unacceptable performance penalty or is an insufficiently sealed abstraction.&lt;/p&gt;
&lt;p&gt;With abcBridge as our concrete example, here is how you might make such classifications in practice:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;The monadic interface for constructing networks is about as far from purely functional as you can get, which was an explicit design choice in the name of performance and control. (Fortunately, we can build a nicer API on top of this one—in fact, I did an experimental implementation of one.) However, when you’re dealing with fully constructed networks the API takes a purely functional style, doing copying and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;unsafePerformIO&lt;/span&gt;&lt;/tt&gt; behind the scenes to preserve this illusion.&lt;/li&gt;
&lt;li&gt;abcBridge does not directly use laziness: in particular the monadic code is very structured and doesn’t have a lot of flow control in it.&lt;/li&gt;
&lt;li&gt;The static type system is a huge part of abcBridge, since it is operating so close to the bare metal: it has two monads, with an intricate set of functions for running and converting the monads, and the low level FFI bindings make every attempt to enhance the existing C-based type system. Notice the interesting play between the types and a functional interface: if we had a purely functional interface, we probably could have ditched most of these complicated types! (Imperative code, it seems, needs stronger type system tricks.)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As far as pure Haskell libraries go, abcBridge is very un-Haskelly: I would certainly expect more from an equivalent library implemented in pure Haskell. But it is leaps and bounds better than the C library it sprang from. How far should one push the envelope? It is all about striking the right balance—that is why API design is an art.&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/xu4IOPIp_1w" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2010/08/defining-haskelly/#comments" thr:count="1" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2010/08/defining-haskelly/feed/atom/" thr:count="1" />
		<thr:total>1</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2010/08/defining-haskelly/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Interrupting GHC]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/PYbYvNxM7pc/" />
		<id>http://blog.ezyang.com/?p=2808</id>
		<!-- ezyang: omitted update time -->
		<published>2010-08-27T13:00:27Z</published>
		<category scheme="http://blog.ezyang.com" term="C" /><category scheme="http://blog.ezyang.com" term="GHC" /><category scheme="http://blog.ezyang.com" term="Hack" /><category scheme="http://blog.ezyang.com" term="Haskell" />		<summary type="html"><![CDATA[In my tech talk about abcBridge, one of the “unsolved” problems I had with making FFI code usable as ordinary Haskell code was interrupt handling. Here I describe an experimental solution involving a change to the GHC runtime system as suggested by Simon Marlow. The introductory section may be interesting to practitioners looking for working [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2010/08/interrupting-ghc/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;In my &lt;a class="reference external" href="http://blog.ezyang.com/2010/08/galois-tech-talk-abcbridge-functional-interfaces-for-aigs-and-sat-solving/"&gt;tech talk about abcBridge&lt;/a&gt;,
one of the “unsolved” problems I had with making FFI code usable as
ordinary Haskell code was interrupt handling. Here I describe an
experimental solution involving a change to the GHC runtime system as
suggested by &lt;a class="reference external" href="http://permalink.gmane.org/gmane.comp.lang.haskell.glasgow.user/18771"&gt;Simon Marlow&lt;/a&gt;.
The introductory section may be interesting to practitioners looking for working examples of code that catches signals; the later section is a proof of concept that I hope will turn into a fully fleshed out patch.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
&amp;gt; {-# LANGUAGE ForeignFunctionInterface #-}
&amp;gt; {-# LANGUAGE DeriveDataTypeable #-}
&amp;gt; {-# LANGUAGE ScopedTypeVariables #-}
&amp;gt;
&amp;gt; import qualified Control.Exception as E
&amp;gt;
&amp;gt; import Foreign.C.Types (CInt)
&amp;gt;
&amp;gt; import Control.Monad
&amp;gt; import Control.Concurrent (threadDelay, myThreadId, throwTo, forkIO)
&amp;gt; import Control.Concurrent.MVar (newEmptyMVar, putMVar, readMVar)
&amp;gt;
&amp;gt; import System.IO (hPutStrLn, stderr)
&amp;gt; import System.Posix.Signals (installHandler, sigINT, Handler(..))
&lt;/pre&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;In many interactive applications (especially for REPLs), you would like
to be able to catch when a user hits &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;^C&lt;/span&gt;&lt;/tt&gt; and terminate just the
current computation, not the entire program.  &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;fooHs&lt;/span&gt;&lt;/tt&gt; is some function
that may take a long time to run (in this case, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;fooHs&lt;/span&gt;&lt;/tt&gt; never
terminates).&lt;/p&gt;
&lt;pre class="literal-block"&gt;
&amp;gt; fooHs :: Int -&amp;gt; IO Int
&amp;gt; fooHs n = do
&amp;gt;     putStrLn $ &amp;quot;Arf HS &amp;quot; ++ show n
&amp;gt;     threadDelay 1000000
&amp;gt;     fooHs n
&lt;/pre&gt;
&lt;p&gt;By default, GHC generates an asynchronous exception which we can catch
using the normal exception handling facilities to say “don’t exit yet”:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
&amp;gt; reallySimpleInterruptible :: a -&amp;gt; IO a -&amp;gt; IO a
&amp;gt; reallySimpleInterruptible defaultVal m = do
&amp;gt;     let useDefault action =
&amp;gt;             E.catch action
&amp;gt;                 (\(e :: E.AsyncException) -&amp;gt;
&amp;gt;                     return $ case e of
&amp;gt;                         E.UserInterrupt -&amp;gt; defaultVal
&amp;gt;                         _ -&amp;gt; E.throw e
&amp;gt;                         )
&amp;gt;     useDefault m
&amp;gt;
&amp;gt; reallySimpleMain = do
&amp;gt;     r &amp;lt;- reallySimpleInterruptible 42 (fooHs 1)
&amp;gt;     putStrLn $ &amp;quot;Finished with &amp;quot; ++ show r
&lt;/pre&gt;
&lt;p&gt;Sometimes, you don’t want an exception generated at all and would like
to deliberate on the signal as soon as it arrives.  You might be
in some critical section of the program that should not be interrupted!
In such a case, you can install a signal handler with &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;installHandler&lt;/span&gt;&lt;/tt&gt;
from &lt;a class="reference external" href="http://www.haskell.org/ghc/docs/6.12-latest/html/libraries/unix-2.4.0.1/System-Posix-Signals.html"&gt;System.Posix.Signals&lt;/a&gt;.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
&amp;gt; installIntHandler :: Handler -&amp;gt; IO Handler
&amp;gt; installIntHandler h = installHandler sigINT h Nothing
&lt;/pre&gt;
&lt;p&gt;Care should be taken to make sure you restore the original signal handler
when you’re done.&lt;/p&gt;
&lt;p&gt;If you do decide you want to generate an exception from inside a signal
handler, a little care must be taken: if we try to do just a simple
throw, our exception will seemingly vanish into the void!  This is
because the interrupt handler is run on a different thread, and we have
to use &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;throwTo&lt;/span&gt;&lt;/tt&gt; from &lt;a class="reference external" href="http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Control-Concurrent.html"&gt;Control.Concurrent&lt;/a&gt;
to ensure our exception is sent to the right thread.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
&amp;gt; simpleInterruptible :: a -&amp;gt; IO a -&amp;gt; IO a
&amp;gt; simpleInterruptible defaultVal m = do
&amp;gt;     tid &amp;lt;- myThreadId
&amp;gt;     let install = installIntHandler (Catch ctrlc)
&amp;gt;         ctrlc = do
&amp;gt;             -- This runs in a different thread!
&amp;gt;             hPutStrLn stderr &amp;quot;Caught signal&amp;quot;
&amp;gt;             E.throwTo tid E.UserInterrupt
&amp;gt;         cleanup oldHandler = installIntHandler oldHandler &amp;gt;&amp;gt; return ()
&amp;gt;         useDefault action =
&amp;gt;             E.catch action
&amp;gt;                 (\(e :: E.AsyncException) -&amp;gt;
&amp;gt;                     return $ case e of
&amp;gt;                         E.UserInterrupt -&amp;gt; defaultVal
&amp;gt;                         _ -&amp;gt; E.throw e
&amp;gt;                         )
&amp;gt;     useDefault . E.bracket install cleanup $ const m
&amp;gt;
&amp;gt; simpleMain = do
&amp;gt;     r &amp;lt;- simpleInterruptible 42 (fooHs 1)
&amp;gt;     putStrLn $ &amp;quot;Finished with &amp;quot; ++ show r
&lt;/pre&gt;
&lt;p&gt;This code works fine for pure Haskell work.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;However, our question is whether or not we can interrupt Haskell threads
that are inside the FFI, not just pure Haskell code.  That is, we’d
like to replace &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;fooHs&lt;/span&gt;&lt;/tt&gt; with:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
&amp;gt; foreign import ccall &amp;quot;foo.h&amp;quot; foo :: CInt -&amp;gt; IO ()
&lt;/pre&gt;
&lt;p&gt;where &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;foo.h&lt;/span&gt;&lt;/tt&gt; contains:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
void foo(int);
&lt;/pre&gt;
&lt;p&gt;and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;foo.c&lt;/span&gt;&lt;/tt&gt; contains:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;quot;foo.h&amp;quot;

void foo(int d) {
    while (1) {
        printf(&amp;quot;Arf C %d!\n&amp;quot;, d);
        sleep(1);
    }
}
&lt;/pre&gt;
&lt;p&gt;In real practice, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;foo&lt;/span&gt;&lt;/tt&gt; will be some highly optimized function written
in C that may take a long time to run.  We also can’t kill functions willy nilly: we should be able to forcibly terminate it at any
time without corrupting some global state.&lt;/p&gt;
&lt;p&gt;If we try our existing &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;interruptible&lt;/span&gt;&lt;/tt&gt; functions, we find they don’t
work:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;reallySimpleInterruptible&lt;/span&gt;&lt;/tt&gt; registers the SIGINT, but the
foreign call continues.  On the second SIGINT, the program
terminates.  This is the &lt;a class="reference external" href="http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Signals"&gt;default behavior of the runtime system&lt;/a&gt;:
the RTS will attempt to gracefully abort the computation, but has
no way of killing an FFI call, and forcibly terminates the program
when the second SIGINT arrives.&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;simpleInterruptible&lt;/span&gt;&lt;/tt&gt; fares even worse: without the “exit on
the second signal” behavior, we find that we can’t kill the
program by pressing &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;^C&lt;/span&gt;&lt;/tt&gt;!  The thread that requested the FFI
call is ignoring our exceptions.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;&lt;em&gt;Nota bene.&lt;/em&gt; Please let the author know of any factual inaccuracies in
this section.&lt;/p&gt;
&lt;p&gt;Time to dive into the runtime system!  The code that manages
asynchronous exception lives in &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;RaiseAsync.c&lt;/span&gt;&lt;/tt&gt; in the &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;rts&lt;/span&gt;&lt;/tt&gt;
directory.  In particular, there is the function:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
nat throwToMsg (Capability *cap, MessageThrowTo *msg)
&lt;/pre&gt;
&lt;p&gt;Which is called when a thread invokes &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;throwTo&lt;/span&gt;&lt;/tt&gt; to create an
exception in another thread.&lt;/p&gt;
&lt;p&gt;It’s instructive to first look at what happens when there
is no funny business going along, that is, when the thread
is not blocked:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
case NotBlocked:
{
    if ((target-&amp;gt;flags &amp;amp; TSO_BLOCKEX) == 0) {
        // It's on our run queue and not blocking exceptions
        raiseAsync(cap, target, msg-&amp;gt;exception, rtsFalse, NULL);
        return THROWTO_SUCCESS;
    } else {
        blockedThrowTo(cap,target,msg);
        return THROWTO_BLOCKED;
    }
}
&lt;/pre&gt;
&lt;p&gt;If the thread is running normally, we use &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;raiseAsync&lt;/span&gt;&lt;/tt&gt; to raise the
exception and we’re done!  However, the thread may have called &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;block&lt;/span&gt;&lt;/tt&gt;
(from &lt;a class="reference external" href="http://haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Control-Exception.html#v%3Ablock"&gt;Control.Exception&lt;/a&gt;),
in which case we add the exception to the target’s blocked exceptions
queue, and wait for the target to become unblocked.&lt;/p&gt;
&lt;p&gt;Another state that a Haskell thread can be in is this:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
case BlockedOnCCall:
case BlockedOnCCall_NoUnblockExc:
{
    blockedThrowTo(cap,target,msg);
    return THROWTO_BLOCKED;
}
&lt;/pre&gt;
&lt;p&gt;The runtime system waits for the thread to stop being blocked on the FFI
call before delivering the exception—it will get there eventually!  But
if the FFI call takes a long time, this will be too late.  We could
replace this call with &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;raiseAsync&lt;/span&gt;&lt;/tt&gt;, but what we find is that, while
the exception gets raised and the Haskell thread resumes normal
execution, the &lt;em&gt;FFI computation continues&lt;/em&gt;!&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;If this seems mysterious, it’s useful to review how &lt;a class="reference external" href="http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Scheduler"&gt;the multithreaded
scheduler&lt;/a&gt; in
the GHC runtime system works.  Haskell threads are light-weight, and
don’t have a one-to-one corresponding with OS threads. Instead, Haskell
threads, represented with a TSO (thread-state object), are scheduled on
a smaller number of OS threads, abstracted in the RTS as Tasks.  Each OS
thread is associated with a CPU core, abstracted in the RTS as a
Capability.&lt;/p&gt;
&lt;p&gt;At the very start of execution, the number of OS threads is the same as
the number of virtual cores (as specified by the &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;-N&lt;/span&gt;&lt;/tt&gt; RTS option): in
terms of Haskell code, we gain parallelism by having multiple
capabilities, &lt;em&gt;not&lt;/em&gt; multiple tasks!  A capability can only belong to one
task at a time.  However, if a task blocks on the operating system,
it may give up it’s capability to another task, which can continue
running Haskell code, thus we frequently refer to these tasks as worker
threads.&lt;/p&gt;
&lt;p&gt;A Task (OS thread) does work by executing InCalls requested by a TSO
(Haskell thread) in the run queue, scheduling them in a round-robin
fashion. During the course of this execution, it may run across an FFI
call.  The behavior here diverges depending on whether or not the FFI
call is safe or unsafe.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;If the call is unsafe, we just make the call, without
relinquishing the capability!  This means no other Haskell code
can run this virtual core, which is bad news if the FFI call
takes a long time or blocks, but if it’s really fast, we don’t
have to give up the capability only to snatch it back again.&lt;/li&gt;
&lt;li&gt;If the call is safe, we release the capability (allowing other
Haskell threads to proceed), and the Haskell thread is
suspended as waiting on a foreign call.  The current OS thread
then goes and runs the FFI call.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Thus, if we attempt to directly wake up the original Haskell thread
by throwing it an exception, it will end up getting scheduled on
a &lt;em&gt;different&lt;/em&gt; OS thread (while the original thread continues running
the FFI call!)&lt;/p&gt;
&lt;p&gt;The trick is to kill the OS thread that is running the FFI call.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
   case BlockedOnCCall:
   case BlockedOnCCall_NoUnblockExc:
   {
#ifdef THREADED_RTS
       Task *task = NULL;
       if (!target-&amp;gt;bound) {
           // walk all_tasks to find the correct worker thread
           for (task = all_tasks; task != NULL; task = task-&amp;gt;all_link) {
               if (task-&amp;gt;incall-&amp;gt;suspended_tso == target) {
                   break;
               }
           }
           if (task != NULL) {
               raiseAsync(cap, target, msg-&amp;gt;exception, rtsFalse, NULL);
               pthread_cancel(task-&amp;gt;id);
               task-&amp;gt;cap = NULL;
               task-&amp;gt;stopped = rtsTrue;
               return THROWTO_SUCCESS;
           }
       }
#endif
       blockedThrowTo(cap,target,msg);
       return THROWTO_BLOCKED;
   }
&lt;/pre&gt;
&lt;p&gt;Which OS thread is it, anyhow?  It couldn’t possibly be thread
attempting to throw the exception and it doesn’t have anything to do
with the suspended Haskell thread, which is waiting to be woken up but
doesn’t know what it’s waiting to be woken up from.  However, the task
running the FFI call knows which Haskell thread is waiting on it, so we
can just walk the list of all tasks looking for the one that matches up
with the target of our exception.  Once we find it, we kill the thread
with fire (&lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;pthread_cancel&lt;/span&gt;&lt;/tt&gt;) and wakeup the orignating Haskell thread
with an exception.&lt;/p&gt;
&lt;p&gt;There is one subtlety that Marlow pointed out: we do not want to
destroy bound threads, because they may contain thread local state.
Worker threads are identical and thus expendable, but bound threads
cannot be treated so lightly.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;We’ve been a bit mean: we haven’t given the library a chance to clean
up when it got interrupted.  Fortunately, the library can use
&lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;pthread_setcancelstate&lt;/span&gt;&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;pthread_setcanceltype&lt;/span&gt;&lt;/tt&gt;, to give it
a chance to cleanup before exiting.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;It turns out that even with the RTS patch, we still aren’t quite
able to interrupt FFI calls.  If we add in an explicit new Haskell
thread, hwoever, things work:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
&amp;gt; interruptible :: a -&amp;gt; IO a -&amp;gt; IO a
&amp;gt; interruptible defaultVal m = do
&amp;gt;     mresult &amp;lt;- newEmptyMVar -- transfer exception to caller
&amp;gt;     mtid    &amp;lt;- newEmptyMVar
&amp;gt;     let install = installIntHandler (Catch ctrlc)
&amp;gt;         cleanup oldHandler = installIntHandler oldHandler &amp;gt;&amp;gt; return ()
&amp;gt;         ctrlc = do
&amp;gt;             hPutStrLn stderr &amp;quot;Caught signal&amp;quot;
&amp;gt;             tid &amp;lt;- readMVar mtid
&amp;gt;             throwTo tid E.UserInterrupt
&amp;gt;         bracket = reportBracket . E.bracket install cleanup . const
&amp;gt;         reportBracket action = do
&amp;gt;             putMVar mresult =&amp;lt;&amp;lt; E.catches (liftM Right action)
&amp;gt;                 [ E.Handler (\(e :: E.AsyncException) -&amp;gt;
&amp;gt;                     return $ case e of
&amp;gt;                         E.UserInterrupt -&amp;gt; Right defaultVal
&amp;gt;                         _ -&amp;gt; Left (E.toException e)
&amp;gt;                     )
&amp;gt;                 , E.Handler (\(e :: E.SomeException) -&amp;gt; return (Left e))
&amp;gt;                 ]
&amp;gt;     putMVar mtid =&amp;lt;&amp;lt; forkIO (bracket m)
&amp;gt;     either E.throw return =&amp;lt;&amp;lt; readMVar mresult -- one write only
&amp;gt;
&amp;gt; main = main' 3
&amp;gt;
&amp;gt; main' 0 = putStrLn &amp;quot;Quitting&amp;quot;
&amp;gt; main' n = do
&amp;gt;     interruptible () $ do
&amp;gt;         (r :: Either E.AsyncException ()) &amp;lt;- E.try $ foo n
&amp;gt;         putStrLn $ &amp;quot;Thread &amp;quot; ++ show n ++ &amp;quot; was able to catch exception&amp;quot;
&amp;gt;     main' (pred n)
&lt;/pre&gt;
&lt;p&gt;The output of this literate Haskell file, when compiled with
&lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;-threaded&lt;/span&gt;&lt;/tt&gt; on the patched RTS is as follows:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
Arf C 3!
Arf C 3!
^CCaught signal
Thread 3 was able to catch exception
Arf C 2!
Arf C 2!
Arf C 2!
^CCaught signal
Thread 2 was able to catch exception
Arf C 1!
Arf C 1!
^CCaught signal
Thread 1 was able to catch exception
Quitting
&lt;/pre&gt;
&lt;p&gt;Proof of concept accomplished!  Now to make it work on Windows...&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/PYbYvNxM7pc" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2010/08/interrupting-ghc/#comments" thr:count="4" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2010/08/interrupting-ghc/feed/atom/" thr:count="4" />
		<thr:total>4</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2010/08/interrupting-ghc/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[abcBridge: Functional interfaces for AIGs and SAT solving]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/a6JbwzQ2te8/" />
		<id>http://blog.ezyang.com/?p=2800</id>
		<!-- ezyang: omitted update time -->
		<published>2010-08-25T13:00:48Z</published>
		<category scheme="http://blog.ezyang.com" term="Galois Tech Talk" />		<summary type="html"><![CDATA[Yesterday I gave a Galois Tech Talk about abcBridge, a set of bindings in Haskell for ABC that I built over the summer as part of my internship. There should be a video soon, but until then, you can download my annotated slides. The software’s not public yet, but hopefully it will be soon.]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2010/08/galois-tech-talk-abcbridge-functional-interfaces-for-aigs-and-sat-solving/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;Yesterday I gave a &lt;a class="reference external" href="http://www.galois.com/blog/2010/08/19/tech-talk-abcbridge-functional-interfaces-for-aigs-and-sat-solving/"&gt;Galois Tech Talk&lt;/a&gt; about abcBridge, a set of bindings in Haskell for &lt;a class="reference external" href="http://www.eecs.berkeley.edu/~alanmi/abc/"&gt;ABC&lt;/a&gt; that I built over the summer as part of my internship.&lt;/p&gt;
&lt;p&gt;There should be a video soon, but until then, you can &lt;a class="reference external" href="http://web.mit.edu/~ezyang/Public/galois.pdf"&gt;download my annotated slides&lt;/a&gt;. The software’s not public yet, but hopefully it will be soon.&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/a6JbwzQ2te8" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2010/08/galois-tech-talk-abcbridge-functional-interfaces-for-aigs-and-sat-solving/#comments" thr:count="0" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2010/08/galois-tech-talk-abcbridge-functional-interfaces-for-aigs-and-sat-solving/feed/atom/" thr:count="0" />
		<thr:total>0</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2010/08/galois-tech-talk-abcbridge-functional-interfaces-for-aigs-and-sat-solving/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Type Kata: Distinguishing different data with the same underlying representation]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/ov1svSY6Jvg/" />
		<id>http://blog.ezyang.com/?p=2766</id>
		<!-- ezyang: omitted update time -->
		<published>2010-08-23T13:00:36Z</published>
		<category scheme="http://blog.ezyang.com" term="Haskell" />		<summary type="html"><![CDATA[Punning is the lowest form of humor. And an endless source of bugs. The imperative. In programming, semantically different data may have the same representation (type). Use of this data requires manually keeping track of what the extra information about the data that may be in a variable. This is dangerous when the alternative interpretation [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2010/08/type-kata-newtypes/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;div class="center container"&gt;
&lt;em&gt;Punning is the lowest form of humor. And an endless source of bugs.&lt;/em&gt;&lt;/div&gt;
&lt;p&gt;&lt;em&gt;The imperative.&lt;/em&gt; In programming, semantically different data may have the same representation (type). Use of this data requires manually keeping track of what the extra information about the data that may be in a variable. This is dangerous when the alternative interpretation is right &lt;em&gt;most&lt;/em&gt; of the time; programmers who do not fully understand all of the extra conditions are lulled into a sense of security and may write code that seems to work, but actually has subtle bugs. Here are some real world examples where it is particularly easy to confuse semantics.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Variables and literals.&lt;/em&gt; The following is a space efficient representation of boolean variables (&lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;x,&lt;/span&gt; &lt;span class="pre"&gt;y,&lt;/span&gt; &lt;span class="pre"&gt;z&lt;/span&gt;&lt;/tt&gt;) and boolean literals (&lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;x&lt;/span&gt;&lt;/tt&gt; or &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;not&lt;/span&gt; &lt;span class="pre"&gt;x&lt;/span&gt;&lt;/tt&gt;). Boolean variables are simply counted up from zero, but boolean literals are shifted left and least significant bit is used to store complement information.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
int Gia_Var2Lit( int Var, int fCompl )  { return Var + Var + fCompl; }
int Gia_Lit2Var( int Lit )              { return Lit &amp;gt;&amp;gt; 1;           }
&lt;/pre&gt;
&lt;p&gt;Consider, then, the following function:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
int Gia_ManHashMux( Gia_Man_t * p, int iCtrl, int iData1, int iData0 )
&lt;/pre&gt;
&lt;p&gt;It is not immediately obvious whether or not the &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;iCtrl&lt;/span&gt;&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;iData1&lt;/span&gt;&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;iData0&lt;/span&gt;&lt;/tt&gt; arguments correspond to literals or variables: only an understanding of what this function does (it makes no sense to disallow muxes with complemented inputs) or an inspection of the function body is able to resolve the question for certain (the body calls &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;Gia_LitNot&lt;/span&gt;&lt;/tt&gt;). Fortunately, due to the shift misinterpreting a literal as a variable (or vice versa) will usually result in a spectacular error. (Source: &lt;a class="reference external" href="http://www.eecs.berkeley.edu/~alanmi/abc/"&gt;ABC&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Pointer bits.&lt;/em&gt; It is well known that the lower two bits of a pointer are usually unused: on a 32-bit system, 32-bit integers are the finest granularity of alignment, which force any reasonable memory address to be divisible by four. Space efficient representations may use these two extra bits to store extra information but need to mask out the bits when dereferencing the pointer. Building on our previous example, consider a pointer representation of variables and literals: if a vanilla pointer indicates a variable, then we can use the lowest bit to indicate whether or not the variable is complemented or not, to achieve a literal representation.&lt;/p&gt;
&lt;p&gt;Consider the following function:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
Gia_Obj_t *  Gia_ObjFanin0( Gia_Obj_t * pObj );
&lt;/pre&gt;
&lt;p&gt;where &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;iDiff0&lt;/span&gt;&lt;/tt&gt; is an &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;int&lt;/span&gt;&lt;/tt&gt; field in the &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;Gia_Obj_t&lt;/span&gt;&lt;/tt&gt; struct. It is not clear whether or not the input pointer or the output pointer may be complemented or not. In fact, the input pointer must not be complemented and the output pointer will never be complemented.&lt;/p&gt;
&lt;p&gt;Misinterpreting the output pointer as possibly complemented may seem harmless at first: all that happens is the lower two bits are masked out, which is a no-op on a normal pointer. However, it is actually a critical logic bug: it assumes that the returned pointer’s LSB says anything about whether or not the fanin was complemented, when in fact the returned bit will always be zero. (Source: &lt;a class="reference external" href="http://www.eecs.berkeley.edu/~alanmi/abc/"&gt;ABC&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Physical and virtual memory.&lt;/em&gt; One of the steps on the road to building an operating system is memory management. When implementing this, a key distinction is the difference between physical memory (what actually is on the hardware) and virtual memory (which your MMU translates from). The following code comes from a toy operating system skeleton that students build upon:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
/* This macro takes a kernel virtual address -- an address that points above
 * KERNBASE, where the machine's maximum 256MB of physical memory is mapped --
 * and returns the corresponding physical address.  It panics if you pass it a
 * non-kernel virtual address.
 */
#define PADDR(kva)                                          \
({                                                          \
        physaddr_t __m_kva = (physaddr_t) (kva);            \
        if (__m_kva &amp;lt; KERNBASE)                                     \
                panic(&amp;quot;PADDR called with invalid kva %08lx&amp;quot;, __m_kva);\
        __m_kva - KERNBASE;                                 \
})

/* This macro takes a physical address and returns the corresponding kernel
 * virtual address.  It panics if you pass an invalid physical address. */
#define KADDR(pa)                                           \
({                                                          \
        physaddr_t __m_pa = (pa);                           \
        uint32_t __m_ppn = PPN(__m_pa);                             \
        if (__m_ppn &amp;gt;= npage)                                       \
                panic(&amp;quot;KADDR called with invalid pa %08lx&amp;quot;, __m_pa);\
        (void*) (__m_pa + KERNBASE);                                \
})
&lt;/pre&gt;
&lt;p&gt;Note that though the code distinguishes with a type synonym &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;uintptr_t&lt;/span&gt;&lt;/tt&gt; (virtual addresses) from &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;physaddr_t&lt;/span&gt;&lt;/tt&gt; (physical addresses), the compiler will not stop the student from mixing the two up. (Source: &lt;a class="reference external" href="http://pdos.csail.mit.edu/6.828/2009/overview.html"&gt;JOS&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;&lt;em&gt;String encoding.&lt;/em&gt; Given an arbitrary sequence of bytes, there is no canonical interpretation of what the bytes are supposed to mean in human language. A decoder determines what the bytes probably mean (from out-of-band data like HTTP headers, or in-band data like meta tags) and then converts a byte stream into a more structured internal memory representation (in the case of Java, UTF-16). However, in many cases, the original byte sequence was the most efficient representation of the data: consider the space-difference between UTF-8 and UCS-32 for Latin text. This encourages developers to use native bytestrings to pass data around (PHP’s string type is just a bytestring), but has caused &lt;a class="reference external" href="http://en.wikipedia.org/wiki/Mojibake"&gt;endless headaches&lt;/a&gt; if the appropriate encoding is not also kept track of. This is further exacerbated by the existence of Unicode normalization forms, which preclude meaningful equality checks between Unicode strings that may not be in the same normalization form (or may be completely un-normalized).&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Endianness.&lt;/em&gt; Given four bytes corresponding to a 32-bit integer, there is no canonical “number” value that you may assign to the bytes: what number you get out is dependent on the endianness of your system. The sequence of bytes &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;0A&lt;/span&gt; &lt;span class="pre"&gt;0B&lt;/span&gt; &lt;span class="pre"&gt;0C&lt;/span&gt; &lt;span class="pre"&gt;0D&lt;/span&gt;&lt;/tt&gt; may be interpreted as &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;0x0A0B0C0D&lt;/span&gt;&lt;/tt&gt; (big endian) or &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;0x0D0C0B0A&lt;/span&gt;&lt;/tt&gt; (little endian).&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Data validation.&lt;/em&gt; Given a data structure representing a human, with fields such as “Real name”, “Email address” and “Phone number”, there are two distinct interpretations that you may have of the data: the data is trusted to be correct and may be used to directly perform an operation such as send an email, or the data is unvalidated and cannot be trusted until it is processed. The programmer must remember what status the data has, or force a particular representation to never contain unvalidated data. “Taint” is a language feature that dynamically tracks the validated/unvalidated status of this data.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;The kata.&lt;/em&gt; Whenever a data structure (whether simple or complex) could be interpreted multiple ways, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;newtype&lt;/span&gt;&lt;/tt&gt; it once for each interpretation.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
newtype GiaLit = GiaLit { unGiaLit :: CInt }
newtype GiaVar = GiaVar { unGiaVar :: CInt }

-- accessor functions omitted for brevity; they should be included

newtype CoGia_Obj_t = CoGia_Obj_t (Gia_Obj_t)

newtype PhysAddr a = PhysAddr (Ptr a)
newtype VirtualAddr a = VirtualAddr (Ptr a)

newtype RawBytestring = RawBytestring ByteString
-- where e is some Encoding
newtype EncodedBytestring e = EncodedBytestring ByteString
-- where n is some Normalization
newtype UTF8Bytestring n = UTF8Bytestring ByteString
type Text = UTF8Bytestring NFC

-- where e is some endianness
newtype EndianByteStream e = EndianByteStream ByteString

newtype Tainted c = Tainted c
newtype Clean c = Clean c
&lt;/pre&gt;
&lt;p&gt;Identifying when data may have multiple interpretations may not be immediately obvious. If you are dealing with underlying representations you did not create, look carefully at variable naming and functions that appear to interconvert between the same type. If you are designing a high-performance data structure, identify &lt;em&gt;your&lt;/em&gt; primitive data types (which are distinct from &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;int&lt;/span&gt;&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;char&lt;/span&gt;&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bool&lt;/span&gt;&lt;/tt&gt;, the primitives of a general purpose programming language.) Multiple interpretations can creep in over time as new features are added to code: be willing to refactor (possibly breaking API compatibility) or speculatively newtype important user-visible data.&lt;/p&gt;
&lt;p&gt;A common complaint about newtypes is the wrapping and unwrapping of the type. While some of this is a necessary evil, it should not be ordinarily necessary for end-users to wrap and unwrap the newtypes: the internal representation should stay hidden! (This is a closely related but orthogonal property that newtypes help enforce.) Try not to export newtype constructors; instead, export smart constructors and destructors that do runtime sanity checks and are prefixed with &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;unsafe&lt;/span&gt;&lt;/tt&gt;.&lt;/p&gt;
&lt;p&gt;When an underlying value is wrapped in the newtype, you are indicating to the compiler that you believe that the value has a meaningful interpretation under that newtype: do your homework when you wrap something! Conversely, you should assume that an incoming newtype has the appropriate invariants (it’s a valid UTF-8 string, its least significant bit is zero, etc.) implied by that newtype: let the static type checker do that work for you! Newtypes have no runtime overhead: they are strictly checked at compile time.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Applicability.&lt;/em&gt; A newtype is no substitute for an appropriate data structure: don’t attempt to do DOM transformations over a bytestring of HTML. Newtypes can be useful even when there is only one interpretation of the underlying representation—however, the immediate benefit derives primarily from encapsulation. However, newtypes are &lt;em&gt;essential&lt;/em&gt; when there are multiple interpretations of a representation: don’t leave home without them!&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/ov1svSY6Jvg" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2010/08/type-kata-newtypes/#comments" thr:count="5" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2010/08/type-kata-newtypes/feed/atom/" thr:count="5" />
		<thr:total>5</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2010/08/type-kata-newtypes/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Day in the life of a Galois intern]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/HT9UedjGx8o/" />
		<id>http://blog.ezyang.com/?p=2712</id>
		<!-- ezyang: omitted update time -->
		<published>2010-08-20T13:00:11Z</published>
		<category scheme="http://blog.ezyang.com" term="Personal" />		<summary type="html"><![CDATA[Vrrmm! Vrrmm! Vrrmm! It's 9:00AM, and the cell phone next to my pillow is vibrating ominously. I rise and dismiss the alarm before it starts ringing in earnest and peek out the window of my room. Portland summer is a fickle thing: the weather of the first month of my internship was marked by mist [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2010/08/day-in-the-life-of-a-galois-intern/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;div class="center container"&gt;
&lt;em&gt;Vrrmm! Vrrmm! Vrrmm!&lt;/em&gt;&lt;/div&gt;
&lt;p&gt;It's 9:00AM, and the cell phone next to my pillow is vibrating ominously. I rise and dismiss the alarm before it starts ringing in earnest and peek out the window of my room.&lt;/p&gt;
&lt;p&gt;Portland summer is a fickle thing: the weather of the first month of my internship was marked by mist and rain (a phenomenon, Don tells me, which is highly unusual for Portland), while the weather of the second month was a sleepy gray in the mornings. &amp;quot;Is it summer yet?&amp;quot; was the topic of &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;#galois&lt;/span&gt;&lt;/tt&gt; for most of July. But in the heart of August, summer has finally arrived, and the sun greets my gaze. Shorts and a T-shirt, no sweater necessary! I silently go &amp;quot;Yes!&amp;quot;&lt;/p&gt;
&lt;p&gt;I finish getting dressed, say goodbye to Pixie, the white cat who is curled up in my desk chair, skip breakfast, grab my bike, and head off in the direction of downtown Portland. (Warning: The rest of this post is similarly free of any sort of technical details (except at the very end)! Also, I'm a pretty terrible photographer with no idea how to post-process images.)&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;&lt;em&gt;Commuting.&lt;/em&gt; I bike to work every day. The ride is about 30 minutes.&lt;/p&gt;
&lt;div class="center container"&gt;
&lt;iframe width="425" height="350" frameborder="0" scrolling="no" marginheight="0" marginwidth="0" src="http://maps.google.com/maps?f=d&amp;amp;source=s_d&amp;amp;saddr=NE+Fremont+St&amp;amp;daddr=N+Fremont+St+to:N+Interstate+Ave%2FNE+Occident+Ave+to:Bikepath+to:45.520857,-122.673855+to:Galois+Inc,+Portland,+OR&amp;amp;hl=en&amp;amp;geocode=FeQCtwId7lSw-A%3BFeQCtwIdXDyw-A%3BFV62tgId6kSw-A%3BFR2etgIdZjaw-A%3B%3BFayXtgIdahWw-CHXJ17HlXiFNilfuhA-BAqVVDHPrxoxeoakMA&amp;amp;mra=dvme&amp;amp;mrcr=0&amp;amp;mrsp=4&amp;amp;sz=16&amp;amp;via=1,2,3,4&amp;amp;dirflg=b&amp;amp;sll=45.521158,-122.673533&amp;amp;sspn=0.008765,0.015879&amp;amp;ie=UTF8&amp;amp;ll=45.53389,-122.663727&amp;amp;spn=0.042086,0.072956&amp;amp;z=13&amp;amp;output=embed"&gt;&lt;/iframe&gt;&lt;/div&gt;
&lt;p&gt;Along the way, I cross the Willamette:&lt;/p&gt;
&lt;img alt="/img/galois/bridge1.png" src="/img/galois/bridge1.png" /&gt;
&lt;img alt="/img/galois/walkway.png" src="/img/galois/walkway.png" /&gt;
&lt;p&gt;This is a double-decker bridge, and the bottom platform lifts up when boats need to pass through. Which, unfortunately, is occasionally during my morning commute, in which case I have to bike further down the Riverside Esplanade.&lt;/p&gt;
&lt;img alt="/img/galois/farfaraway.png" src="/img/galois/farfaraway.png" /&gt;
&lt;p&gt;At the end of my journey, I am greeted with the familiar face of the &lt;a class="reference external" href="http://en.wikipedia.org/wiki/Commonwealth_Building_(Portland,_Oregon)"&gt;Commonwealth Building&lt;/a&gt;.&lt;/p&gt;
&lt;img alt="/img/galois/commonwealth.png" src="/img/galois/commonwealth.png" /&gt;
&lt;p&gt;It's a pretty famous building as far as buildings go: it was one of the first glass skyscrapers ever built. I hear architecture students from universities around the city come by to look at the building.&lt;/p&gt;
&lt;img alt="/img/galois/elevator.png" src="/img/galois/elevator.png" /&gt;
&lt;p&gt;Galois is on the third floor.&lt;/p&gt;
&lt;img alt="/img/galois/entrance.png" src="/img/galois/entrance.png" /&gt;
&lt;p&gt;I park my bike in one of the handy bikeracks in the office:&lt;/p&gt;
&lt;img alt="/img/galois/bikerack.png" src="/img/galois/bikerack.png" /&gt;
&lt;p&gt;and head off to my desk. (Crazy deskmate included. :-)&lt;/p&gt;
&lt;img alt="/img/galois/desk.png" src="/img/galois/desk.png" /&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;&lt;em&gt;Office.&lt;/em&gt; Now that we're at Galois, perhaps it's time for a quick tour of the office. The Galois office is a single floor, with various rooms of note. Of ever-present importance is the kitchen:&lt;/p&gt;
&lt;img alt="/img/galois/kitchen1.png" src="/img/galois/kitchen1.png" /&gt;
&lt;p&gt;from which coffee can be acquired (Portlanders are very serious about their coffee! It makes me almost wish that I was a coffee-drinker):&lt;/p&gt;
&lt;img alt="/img/galois/coffee.png" src="/img/galois/coffee.png" /&gt;
&lt;p&gt;The kitchen is where the all-hands meeting takes place (Galois is small enough that you can fit all of the company's employees in a single room—only Ksplice, the startup I interned at, has also earned this distinction). One of the really good reflections on Galois’ culture is the practice of &lt;em&gt;appreciations&lt;/em&gt;, during which Galwegians (our name for Galois employees) appreciate one another for things that happened during the week.&lt;/p&gt;
&lt;p&gt;There is a small library (a nice quiet place to chill out if some particularly hard thinking is merited):&lt;/p&gt;
&lt;img alt="/img/galois/library.png" src="/img/galois/library.png" /&gt;
&lt;p&gt;A conference room:&lt;/p&gt;
&lt;img alt="/img/galois/conference.png" src="/img/galois/conference.png" /&gt;
&lt;p&gt;And even a little room where you can take a nap!&lt;/p&gt;
&lt;img alt="/img/galois/sleep.png" src="/img/galois/sleep.png" /&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;By twelve, we Galwegians are hungry, so we head out to get lunch.&lt;/p&gt;
&lt;img alt="/img/galois/food-carts1.png" src="/img/galois/food-carts1.png" /&gt;
&lt;p&gt;There is one tremendous advantage to being in downtown Portland: the &lt;a class="reference external" href="http://www.foodcartsportland.com/"&gt;food carts&lt;/a&gt;. I've never seen anything quite like it: blocks literally have fleets of carts lined up to serve you, whatever style of food you like.&lt;/p&gt;
&lt;img alt="/img/galois/food-carts2.png" src="/img/galois/food-carts2.png" /&gt;
&lt;p&gt;Portland is also famously vegan friendly. You can get Vegan Bacon Cheeseburgers! (They are quite delicious, speaking as a carnivore.)&lt;/p&gt;
&lt;img alt="/img/galois/vegan-bacon.png" src="/img/galois/vegan-bacon.png" /&gt;
&lt;p&gt;Or a fruit smoothie.&lt;/p&gt;
&lt;img alt="/img/galois/smoothies.png" src="/img/galois/smoothies.png" /&gt;
&lt;p&gt;After we get our food, it's back to the office to chow down.&lt;/p&gt;
&lt;img alt="/img/galois/lunch2.png" src="/img/galois/lunch2.png" /&gt;
&lt;img alt="/img/galois/lunch3.png" src="/img/galois/lunch3.png" /&gt;
&lt;p&gt;Our chief scientist and the engineer who sits across from me are having a post-lunch game of ping pong! (I've played a few rounds: they are quite good—back spin, top spin, it’s more than I can keep track of!)&lt;/p&gt;
&lt;img alt="/img/galois/pingpong2.png" src="/img/galois/pingpong2.png" /&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;&lt;em&gt;Offer building.&lt;/em&gt; On Tuesdays, instead of converging on the kitchen, many of converge to the conference room: it’s the MOB lunch!&lt;/p&gt;
&lt;img alt="/img/galois/mob2.png" src="/img/galois/mob2.png" /&gt;
&lt;p&gt;MOB stands for “Merged Offer Building”, though the name itself has a nice flavor: “The MOB makes you an offer you can’t refuse.” Unlike traditional product companies, in which you have an engineering department which makes a product and then a sales department who finds clients and convinces them they want to buy your product, at Galois, for many contracts the engineers &lt;em&gt;are&lt;/em&gt; the salespeople: they are the ones responsible for writing the proposal we submit for funding. The MOB meeting coordinates all of the various offer building efforts—though it’s had no direct bearing to my internship, sitting in on MOB lunches has been a fascinating peek into the world of SBIRs, procurements, EC&amp;amp;A and many, many more acronyms.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;&lt;em&gt;tl;dr&lt;/em&gt; Interning at Galois this summer has been a blast, and I’m very sorry that there is only one week left. I’ll miss all of you! ♥&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Postscript.&lt;/em&gt; After a summer of Tech Talk writeups, I'll be giving a &lt;a class="reference external" href="http://www.galois.com/blog/2010/08/19/tech-talk-abcbridge-functional-interfaces-for-aigs-and-sat-solving/"&gt;Galois Tech Talk&lt;/a&gt; myself, this coming Tuesday! It will get into the nitty gritty of abcBridge, the Haskell library I built over the summer. If you're in the area, come check it out!&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/HT9UedjGx8o" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2010/08/day-in-the-life-of-a-galois-intern/#comments" thr:count="5" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2010/08/day-in-the-life-of-a-galois-intern/feed/atom/" thr:count="5" />
		<thr:total>5</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2010/08/day-in-the-life-of-a-galois-intern/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Type kata: Controlled sharing of references]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/9Bt-w-ozd1g/" />
		<id>http://blog.ezyang.com/?p=2693</id>
		<!-- ezyang: omitted update time -->
		<published>2010-08-18T13:00:51Z</published>
		<category scheme="http://blog.ezyang.com" term="Haskell" />		<summary type="html"><![CDATA[The imperative. Mutable data structures with many children frequently force any given child to be associated with one given parent data structure: class DOMNode { private DOMDocument $ownerDocument; // ... public void appendNode(DOMNode n) { if (n.ownerDocument != this.ownerDocument) { throw DOMException(&#34;Cannot append node that &#34; &#34;does not belong to this document&#34;); } // ... [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2010/08/type-kata-controlled-sharing-of-references/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;&lt;em&gt;The imperative.&lt;/em&gt; Mutable data structures with many children frequently force any given child to be associated with one given parent data structure:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
class DOMNode {
  private DOMDocument $ownerDocument;
  // ...
  public void appendNode(DOMNode n) {
    if (n.ownerDocument != this.ownerDocument) {
      throw DOMException(&amp;quot;Cannot append node that &amp;quot;
        &amp;quot;does not belong to this document&amp;quot;);
    }
    // ...
  }
}
&lt;/pre&gt;
&lt;p&gt;Client code must be careful not to mix up children that belong to different owners. An object can be copied from one owner to another via a special function.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
class DOMDocument {
  public DOMNode importNode(DOMNode node) {
    // ...
  }
}
&lt;/pre&gt;
&lt;p&gt;Sometimes, a function of this style can only be called in special circumstances. If a mutable data structure is copied, and you would like to reference to a child in the new structure but you only have a reference to its original, an implementation may let you forward such a pointer, but only if the destination structure was the most recent copy.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
class DOMNode {
  private DOMNode $copy;
}
&lt;/pre&gt;
&lt;p&gt;&lt;em&gt;The kata.&lt;/em&gt; Phantom types in the style of the &lt;a class="reference external" href="http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Control-Monad-ST.html"&gt;ST monad&lt;/a&gt; permit statically enforced separation of children from different monadic owners.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
{-# LANGUAGE Rank2Types #-}
-- s is the phantom type
newtype DOM s a = ...
newtype Node s = ...
runDom :: (forall s. DOM s ()) -&amp;gt; Document
getNodeById :: Id -&amp;gt; DOM s (Node s)
deleteNode :: Node s -&amp;gt; DOM s ()

-- Does not typecheck, the second runDom uses a fresh
-- phantom variable which does not match node's
runDom $ do
  node &amp;lt;- getNodeById &amp;quot;myNode&amp;quot;
  let alternateDocument = runDom $ do
    deleteNode node
&lt;/pre&gt;
&lt;p&gt;To permit a value of any monad to be used in another monad, implement a function that is polymorphic in both phantom types:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
importNode :: Node s -&amp;gt; DOM s2 (Node s2)
setRoot :: Node s -&amp;gt; DOM s ()

-- This now typechecks
runDom $ do
  node &amp;lt;- getNodeById &amp;quot;myNode&amp;quot;
  let alternateDocument = runDom $ do
    node' &amp;lt;- importNode node
    setRoot node'
&lt;/pre&gt;
&lt;p&gt;The function will probably be monadic, because the implementation will need to know what owner the &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;Node&lt;/span&gt;&lt;/tt&gt; is being converted to.&lt;/p&gt;
&lt;p&gt;To only permit translation under certain circumstances, use a type constructor (you can get these using empty data declarations) on the phantom type:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
{-# LANGUAGE EmptyDataDecls #-}
data Dup n
getNewNode :: Node s -&amp;gt; DOM (Dup s) (Node (Dup s))
dupDom :: DOM s () -&amp;gt; DOM s (DOM (Dup s) ())

-- This typechecks, and does not recopy the original node
runDom $ do
  node &amp;lt;- getNodeById &amp;quot;myNode&amp;quot;
  dupDom $ do
    node' &amp;lt;- getNewNode node
    ...
&lt;/pre&gt;
&lt;p&gt;&lt;em&gt;Applicability.&lt;/em&gt; Practitioners of Haskell are encouraged to implement and use pure data structures, where sharing renders this careful book-keeping of ownership unnecessary. Nevertheless, this technique can be useful when you are interfacing via the FFI with a library that requires these invariants.&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/9Bt-w-ozd1g" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2010/08/type-kata-controlled-sharing-of-references/#comments" thr:count="4" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2010/08/type-kata-controlled-sharing-of-references/feed/atom/" thr:count="4" />
		<thr:total>4</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2010/08/type-kata-controlled-sharing-of-references/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[What high school Algebra quizzes and NP-complete problems have in common]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/rjFKU3PpnnM/" />
		<id>http://blog.ezyang.com/?p=2467</id>
		<!-- ezyang: omitted update time -->
		<published>2010-08-16T13:00:03Z</published>
		<category scheme="http://blog.ezyang.com" term="Computer Science" /><category scheme="http://blog.ezyang.com" term="Personal" />		<summary type="html"><![CDATA[What I did for my summer internship at Galois World of algebra quizzes. As a high schooler, I was using concepts from computer science long before I even knew what computer science was. I can recall taking a math quiz—calculators banned—facing a difficult task: the multiplication of large numbers. I was (and still am) very [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2010/08/what-high-school-algebra-quizzes-and-np-complete-problems-have-in-common/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;div class="center container"&gt;
&lt;em&gt;What I did for my summer internship at Galois&lt;/em&gt;&lt;/div&gt;
&lt;p&gt;&lt;em&gt;World of algebra quizzes.&lt;/em&gt; As a high schooler, I was using concepts from computer science long before I even knew what computer science was. I can recall taking a math quiz—calculators banned—facing a difficult task: the multiplication of large numbers. I was (and still am) very sloppy when it came to pencil-and-paper arithmetic—if I didn’t check my answers, I would invariably lose points because of “stupid mistakes.” Fortunately, I knew the following trick: if I summed together the digits of my factors (re-summing if the result was ten or more), the product of these two numbers should match the sum of the digits of the result. If not, I knew I had the wrong answer. It wasn’t until much later that I discovered that this was a very rudimentary form of the checksum.&lt;/p&gt;
&lt;p&gt;In fact, most of the tricks I rediscovered were motivated by a simple academic need: Was my answer correct or not? Indeed, while I didn’t know it at the time, this question would become the &lt;em&gt;fundamental basis&lt;/em&gt; for my internship at Galois this summer.&lt;/p&gt;
&lt;p&gt;At about the time I started learning algebra, I began to notice that my tricks for checking arithmetic had become insufficient. If a teacher asked me to calculate the expanded form of the polynomial &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;(x&lt;/span&gt; &lt;span class="pre"&gt;+&lt;/span&gt; &lt;span class="pre"&gt;2)(x&lt;/span&gt; &lt;span class="pre"&gt;-&lt;/span&gt; &lt;span class="pre"&gt;3)(x&lt;/span&gt; &lt;span class="pre"&gt;-&lt;/span&gt; &lt;span class="pre"&gt;5)&lt;/span&gt;&lt;/tt&gt;, I had to carry out multiple arithmetic steps before I arrived at an answer. Checking each step was tedious and prone to error—I knew too well that I would probably be blind to errors in the work I had just written. I wanted a different way to check that my answer was correct.&lt;/p&gt;
&lt;p&gt;Eventually, I realized that all I had to do was pick a value of &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;x&lt;/span&gt;&lt;/tt&gt; and substitute it into the original question and the answer &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;x³&lt;/span&gt; &lt;span class="pre"&gt;-&lt;/span&gt; &lt;span class="pre"&gt;6x²&lt;/span&gt; &lt;span class="pre"&gt;-&lt;/span&gt; &lt;span class="pre"&gt;x&lt;/span&gt; &lt;span class="pre"&gt;+&lt;/span&gt; &lt;span class="pre"&gt;30&lt;/span&gt;&lt;/tt&gt;. If the values matched, I would be fairly confident in my answer. I also realized that if I picked a number like &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;x&lt;/span&gt; &lt;span class="pre"&gt;=&lt;/span&gt; &lt;span class="pre"&gt;-2&lt;/span&gt;&lt;/tt&gt;, I wouldn’t even have to calculate the value of the original problem: the answer was obviously zero! I had “invented” unit testing, and at the hand of this technique, many symbolic expressions bent to my pencil. (I independently learned about unit testing as a teething programmer, but since a PHP programmer never codes very much math, I never made the connection.)&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;&lt;em&gt;World of practical software testing.&lt;/em&gt; Here, we pass from the world of algebra quizzes to the world of software testing. The expressions being tested are more complicated than &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;x³&lt;/span&gt; &lt;span class="pre"&gt;-&lt;/span&gt; &lt;span class="pre"&gt;6x²&lt;/span&gt; &lt;span class="pre"&gt;-&lt;/span&gt; &lt;span class="pre"&gt;x&lt;/span&gt; &lt;span class="pre"&gt;+&lt;/span&gt; &lt;span class="pre"&gt;30&lt;/span&gt;&lt;/tt&gt;, but most people still adopt the strategy of the high school me: they hand pick a few inputs to test that will give them reasonable confidence that their new implementation is correct. How does one know that the output of the program is the correct one? For many simple programs, the functionality being tested is simple enough that the tester mentally “knows” what the correct result is, and write it down manually—akin to picking inputs like &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;x&lt;/span&gt; &lt;span class="pre"&gt;=&lt;/span&gt; &lt;span class="pre"&gt;-2&lt;/span&gt;&lt;/tt&gt; that are particularly easy for a human to infer the answer to. For more complex programs, a tester may use a reference implementation to figure out what the expected behavior is supposed to be.&lt;/p&gt;
&lt;p&gt;Testing like this can only show the presence of bugs, not the absence of them. But, as many software companies have discovered, this is good enough! If the programmer misses an important test case and a bug report comes in, he fixes the bug and adds a regression test to deal with that buggy input. So, as pragmatists, we have settled for this state of affairs: manual case-by-case testing (which hopefully is automated). &lt;em&gt;The state of the art of conventional software testing is fundamentally the same as how a high-schooler checks his answers on an algebra quiz.&lt;/em&gt; Anything better lies beyond the dragons of theoretical computer science research.&lt;/p&gt;
&lt;blockquote&gt;
&lt;em&gt;Aside.&lt;/em&gt; As anyone who has written automated tests before can attest, automated tests are characterized by two primary chores: getting your code to be automatically testable in the first place (much easier if it’s arithmetic than if it’s a kernel driver) and coming up with interesting situations to test your code in. For the latter, it turns out that while humans can come up with decent edge-cases, they’re really &lt;em&gt;bad&lt;/em&gt; at coming up with random test-cases. Thus, some extremely practical high-tech testing techniques involve having a computer generate random inputs. &lt;a class="reference external" href="http://en.wikipedia.org/wiki/Fuzz_testing"&gt;Fuzz testing&lt;/a&gt; and &lt;a class="reference external" href="http://en.wikipedia.org/wiki/QuickCheck"&gt;QuickCheck&lt;/a&gt; style testing are both characterized by this methodology, though fuzz testing prides itself in nonsensical inputs, while QuickCheck tries hard to generate sensible inputs.&lt;/blockquote&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;&lt;em&gt;World of theoretical computer science.&lt;/em&gt; The teacher grading your algebra quiz doesn’t do something so simple as pick a few random numbers, substitute them into your answer, and see if she gets the right answer. Instead, she compares your answer (the program itself) against the one she has in the answer key (a reference implementation), and marks you correct if she is able to judge that the answers are the same. If you phrase your answer in terms of Fermat’s last theorem, she’ll mark you off for being cheeky.&lt;/p&gt;
&lt;p&gt;The reference implementation may be wrong (bug in the answer key), but in this case it’s our best metric for whether or not a program is “correct.” Since we’ve wandered into the land of theoretical computer science, we might ask this question to the &lt;a class="reference external" href="http://tvtropes.org/pmwiki/pmwiki.php/Main/LiteralGenie"&gt;Literal Genie&lt;/a&gt;: &lt;em&gt;Is it possible, in general, to determine if two programs are equivalent?&lt;/em&gt; The Literal Genie responds, “No!” The question is undecidable: there is no algorithm that can answer this question for all inputs. If you could determine if two programs were equivalent, you could solve the halting problem (the canonical example of an unsolvable problem): just check if a program was equivalent to an infinitely looping one.&lt;/p&gt;
&lt;p&gt;While the working theoretician may tame uncountably huge infinities on a regular basis, for a working programmer, the quantities handled on a regular basis are very much finite—the size of their machine integer, the amount of memory on their system, the amount of time a program is allowed to run. When you deal with infinity, all sorts of strange results appear. For example, &lt;a class="reference external" href="http://en.wikipedia.org/wiki/Rice's_theorem"&gt;Rice’s theorem&lt;/a&gt; states that figuring out whether or not a program has &lt;em&gt;any&lt;/em&gt; non-trivial property (that is, there exists some program that has the property and some program that doesn’t) is undecidable! If we impose some reasonable constraints, such as “the program terminates in polynomial time for all inputs”, the answer to this question is yes! But can we do so in a way that is better than testing that the programs do the same thing on every input?&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;&lt;em&gt;World of more practical computer science.&lt;/em&gt; We’ve relinquished enough theoretical purity to make our question interesting again for software engineers, but it is still very difficult for the programmer to prove to himself that the algorithm is equivalent to his reference implementation. In contrast, it's easy for a user to show that the algorithm is wrong: all they have to do is give the programmer an input for which his implementation and the reference implementation disagree.&lt;/p&gt;
&lt;p&gt;Computer scientists have a name for this situation: problems for which you can verify their solutions (in this case, more of an anti-solution: a counter-example) in polynomial time are NP. Even if both programs run in constant time, as a combinational logic circuit might (to simulate such a circuit, we only need to propagate the inputs through as many gates as they are in the circuit: there is no dependence on the input), it still takes exponential time to brute-force an equivalence check. Every time we &lt;em&gt;add&lt;/em&gt; another bit to the input, we &lt;em&gt;double&lt;/em&gt; the amount of possible inputs to check.&lt;/p&gt;
&lt;p&gt;In fact, the question of circuit non-equivalence is NP-complete. We’ve been talking about program equivalence, but we can also talk about &lt;em&gt;problem&lt;/em&gt; equivalence, for which you can translate one problem (graph coloring) into another one (traveling salesman). In the seventies, computer scientists spent a lot of time proving that a lot of problems that required “brute force” were actually all the same problem. Stephen Cook introduced the idea that there were problems that were NP-complete: problems in NP for which we could translate all other problems in NP into. The most famous example of an NP-complete problem is SAT, in which given a logical formula with boolean variables, you ask whether or not there is a satisfying assignment of variables, variables that will cause this formula to be true.&lt;/p&gt;
&lt;p&gt;To show that circuit non-equivalence is NP-complete, we need to show that it is in NP (which we’ve done already) and show that we can translate some other NP-complete problem into this problem. This is quite easy to do with SAT: write a program that takes the boolean variables of SAT as inputs and outputs the result of the logical formula and then see if it’s equivalent to a program that always returns &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;false&lt;/span&gt;&lt;/tt&gt;.&lt;/p&gt;
&lt;p&gt;The other direction is only slightly less trivial, but important practically speaking: if we can reduce our problem into an instance of SAT, I can chuck it a highly optimized SAT solver. A satisfiability problem is isomorphic to a logic circuit that outputs a single bit. We can translate a circuit equivalence problem into SAT by combining the circuits into what is called a “miter”: we combine the inputs of the two original logic circuits into a single set that feeds into both circuits, and then test the corresponding output bits between the two circuits for equality (XOR), ORing the entire result together. The resulting circuit outputs 0 if the outputs were the same between the two circuits (all of the XORs returned 0), and outputs 1 if there is a mismatch.&lt;/p&gt;
&lt;p&gt;“Great,” you may be thinking, “but I’m a programmer, not a hardware designer. Most of my programs can’t be expressed just in terms of logic gates!” That is true: to encode state, you also need latches, and input/output needs to be simulated with special input and output “ports”. However, there are many important problems that are purely combinational: the shining example of which is cryptography, which protects your money, employs a lot of complicated math and is ruthlessly optimized.&lt;/p&gt;
&lt;p&gt;But there still is one standing complaint: even if my programs are just logic circuits, I wouldn’t want to write them in terms of ANDs, ORs and NOTs. That just seems painful!&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;Enter &lt;a class="reference external" href="http://www.galois.com/technology/communications_security/cryptol"&gt;Cryptol&lt;/a&gt;, the project that I am working on at Galois. Cryptol bills itself as follows:&lt;/p&gt;
&lt;blockquote&gt;
Cryptol is a language for writing specifications for cryptographic algorithms. It is also a tool set for producing high-assurance, efficient implementations in VHDL, C, and Haskell. The Cryptol tools include the ability to equivalence check the reference specification against an implementation, whether or not it was compiled from the specifications.&lt;/blockquote&gt;
&lt;p&gt;But what really makes it notable, in my humble intern opinion, is the fact that it can take programs written in programming languages like C, VHDL or Cryptol and convert them into logic circuits, or, as we call them, “formal models”, which you can chuck at a SAT solver which will do something more sensible than brute-force all possible inputs. At one point, I thought to myself, “It’s a wonder that Cryptol even works at all!” But it does, and remarkably well for its problem domain of cryptographic algorithms. The state of the art in conventional software testing is manually written tests that can only show the presence of bugs in an implementation; &lt;em&gt;the state of the art in Cryptol is a fully automatic test that gives assurance that an implementation has no bugs.&lt;/em&gt; (Of course, Cryptol could be buggy, but such is the life of high assurance.)&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;SAT solvers are perhaps one of the most under-utilized high-tech tools that a programmer has at their fingertips. An industrial strength SAT solver can solve most NP-complete problems in time for lunch, and there are many, many problems in NP with wide-ranging practical applications. However, the usual roadblocks to using a SAT solver include:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;No easy way to translate your problem into SAT and then run it on one of the highly optimized solvers, which are frequently poorly documented, library-unfriendly projects in academia,&lt;/li&gt;
&lt;li&gt;Generating friendly error messages when your SAT solver passes or fails (depending on what is an “error”), and&lt;/li&gt;
&lt;li&gt;Convincing your team that, no really, you want a SAT solver (instead of building &lt;a class="reference external" href="http://algebraicthunk.net/~dburrows/blog/entry/package-management-sudoku/"&gt;your own, probably not-as-efficient implementation&lt;/a&gt;.)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;My primary project was addressing issue one, in Haskell, by building a set of bindings for &lt;a class="reference external" href="http://www.eecs.berkeley.edu/~alanmi/abc/"&gt;ABC, a System for Sequential Synthesis and Verification&lt;/a&gt; called &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;abcBridge&lt;/span&gt;&lt;/tt&gt;. One might observe that Haskell already has a number of SAT solving libraries: ABC is notable because it employs an alternative formulation of SAT in the form of And-Inverter Graphs (NAND gates are capable of simulating all boolean logic) as well as some novel technology for handling AIGs such as fraiging, which is a high-level strategy that looks for functionally equivalent subsets of your circuits.&lt;/p&gt;
&lt;p&gt;The project itself has been a lot of fun: since I was building this library from scratch, I had a lot of flexibility with API decisions, but at the same time got my hands into the Cryptol codebase, which I needed to integrate my bindings with. With any luck, we’ll be releasing the code as open source at the end of my internship. But I’m going to miss a lot more than my project when my internship ends in two weeks. I hope to follow up with a non-technical post about my internship. Stay tuned!&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Post factum.&lt;/em&gt; Hey, this is my hundredth post. Sweet!&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/rjFKU3PpnnM" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2010/08/what-high-school-algebra-quizzes-and-np-complete-problems-have-in-common/#comments" thr:count="3" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2010/08/what-high-school-algebra-quizzes-and-np-complete-problems-have-in-common/feed/atom/" thr:count="3" />
		<thr:total>3</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2010/08/what-high-school-algebra-quizzes-and-np-complete-problems-have-in-common/</feedburner:origLink></entry>
	</feed>
