<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" xml:lang="en" xml:base="http://blog.ezyang.com/wp-atom.php">
	<title type="text">Inside 233</title>
	<subtitle type="text">Existential Pontification and Generalized Abstract Digressions</subtitle>

	<updated>2012-01-28T13:30:59Z</updated>

	<link rel="alternate" type="text/html" href="http://blog.ezyang.com" />
	<id>http://blog.ezyang.com/feed/atom/</id>
	

	<generator uri="http://wordpress.org/" version="3.3.1">WordPress</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/ezyang" /><feedburner:info uri="ezyang" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[POPL]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/Bh8toscpgtg/" />
		<id>http://blog.ezyang.com/?p=6442</id>
		<!-- ezyang: omitted update time -->
		<published>2012-01-28T13:30:59Z</published>
		<category scheme="http://blog.ezyang.com" term="Computer Science" />		<summary type="html"><![CDATA[Last night, I returned from my very first POPL, very exhausted, and very satisfied. It was great putting faces to names, chatting with potential PhD supervisors (both from the US and in the UK), and reveling in the atmosphere. Highlights from my files: Tony Hoare, on being awarded the ACM SIGPLAN Programming Languages Achievement Award, [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2012/01/popl/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;Last night, I returned from my very first POPL, very exhausted, and very satisfied. It was great putting faces to names, chatting with potential PhD supervisors (both from the US and in the UK), and reveling in the atmosphere.&lt;/p&gt;
&lt;p&gt;Highlights from my files:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Tony Hoare, on being awarded the ACM SIGPLAN Programming Languages Achievement Award, even though he has received so many other awards, “...it makes me feel a little guilty. I didn’t ask for it!”&lt;/li&gt;
&lt;li&gt;Hoare: “I &lt;em&gt;like&lt;/em&gt; mistakes; if I find it’s my fault, I can rectify it. If it’s someone else’s fault, there’s not much you can do.”&lt;/li&gt;
&lt;li&gt;Hoare: “Simplicity is &lt;em&gt;not&lt;/em&gt; an engineering goal.”&lt;/li&gt;
&lt;li&gt;Tony had just described how he retracted an accepted paper because the reasoning on it was intricate, and he didn’t think it presented program proving in a good light. The interviewer complimented him for his principles, saying that if he had a paper accepted which he didn’t think was up to snuff, he probably wouldn’t have the courage to retract it. It was “so brave.” To which Tony replies, “Well, I wish you could!” [laughter] “Unfortunately, the pressure to publish has increased. We feel obliged to publish every year, and the quality of the average paper is not improved.”&lt;/li&gt;
&lt;li&gt;One recurring theme: Tony Hoare mentioned that proofs and tests were not rivals, really they were just the same thing... just different. In the “Run your Research” talk, these theme came up again, where this time the emphasis was executable papers (the “run” in “Run your Research”).&lt;/li&gt;
&lt;li&gt;“Don’t just &lt;em&gt;support&lt;/em&gt; local reasoning, &lt;em&gt;demand&lt;/em&gt; it!” (Brute force proofs not allowed!)&lt;/li&gt;
&lt;li&gt;On JavaScript: “null is sort of object-y, while undefined is sort of primitive-y.”&lt;/li&gt;
&lt;li&gt;The next set come from the invited speaker from the computer networks community. “During 9/11, on average, the Internet was more stable. The reason for this was the NetOps &lt;em&gt;went home&lt;/em&gt;.” (on the causes of network outages in practice.)&lt;/li&gt;
&lt;li&gt;“Cisco routers have twenty million lines of code: there’s actually an Ada interpreter in there, as well as a Lisp interpreter.”&lt;/li&gt;
&lt;li&gt;“It used to be acceptable to go down for 100ms... now we have video games.”&lt;/li&gt;
&lt;li&gt;Speaker: “I am the walrus.” (Goo goo g' joob.)&lt;/li&gt;
&lt;li&gt;“I believe we do not reuse theorems. We reuse proof methods, but not the actual theorems. When we write papers, we create very shallow models, and we don’t build on previous work. It’s OK. It’s the design. It doesn’t matter too much. The SML standard was distributed with a bug report, with 100+ mistakes in the original definition. Doesn’t detract from its impact.”&lt;/li&gt;
&lt;li&gt;“Put on the algebraic goggles.”&lt;/li&gt;
&lt;li&gt;“The Navy couldn’t install it [a system for detecting when classified words were being transmitted on insecure channels], because doing so would be admitting there was a mistake.”&lt;/li&gt;
&lt;li&gt;“Move to something really modern, like Scheme!” (It’s like investing one trillion, and moving from the thirteenth century to the fourteenth.)&lt;/li&gt;
&lt;li&gt;Strother Moore (co-creator of ACL2): “After retirement, I’ll work more on ACL2, and then I’ll die.”&lt;/li&gt;
&lt;li&gt;“What’s the best way to make sure your C program is conforming?” “Write deterministic code.” [laughter]&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/Bh8toscpgtg" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2012/01/popl/#comments" thr:count="6" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2012/01/popl/feed/atom/" thr:count="6" />
		<thr:total>6</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2012/01/popl/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Modelling IO: MonadIO and beyond]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/SNXxLRkNNx0/" />
		<id>http://blog.ezyang.com/?p=6348</id>
		<!-- ezyang: omitted update time -->
		<published>2012-01-24T18:31:06Z</published>
		<category scheme="http://blog.ezyang.com" term="Haskell" />		<summary type="html"><![CDATA[The MonadIO problem is, at the surface, a simple one: we would like to take some function signature that contains IO, and replace all instances of IO with some other IO-backed monad m. The MonadIO typeclass itself allows us to transform a value of form IO a to m a (and, by composition, any function [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2012/01/modelling-io/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;The MonadIO problem is, at the surface, a simple one: we would like to take some function signature that contains &lt;tt class="docutils literal"&gt;IO&lt;/tt&gt;, and replace all instances of &lt;tt class="docutils literal"&gt;IO&lt;/tt&gt; with some other IO-backed monad &lt;tt class="docutils literal"&gt;m&lt;/tt&gt;. The MonadIO typeclass itself allows us to transform a value of form &lt;tt class="docutils literal"&gt;IO a&lt;/tt&gt; to &lt;tt class="docutils literal"&gt;m a&lt;/tt&gt; (and, by composition, any function with an &lt;tt class="docutils literal"&gt;IO a&lt;/tt&gt; as the result). This interface is uncontroversial and quite flexible; it’s been in the bootstrap libraries ever since it was &lt;a class="reference external" href="https://github.com/ghc/packages-base/commit/7f1f4e7a695c402ddd3a1dc2cc7114e649a78ebc"&gt;created in 2001&lt;/a&gt; (originally in base, though it migrated to transformers later). However, it was soon discovered that when there were many functions with forms like &lt;tt class="docutils literal"&gt;IO a &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; IO a&lt;/tt&gt;, which we wanted to convert into &lt;tt class="docutils literal"&gt;m a &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; m a&lt;/tt&gt;; MonadIO had no provision for handling arguments in the &lt;em&gt;negative&lt;/em&gt; position of functions. This was particularly troublesome in the case of exception handling, where these higher-order functions were &lt;em&gt;primitive&lt;/em&gt;. Thus, the community began searching for a new type class which captured more of IO.&lt;/p&gt;
&lt;p&gt;While the semantics of lift were well understood (by the transformer laws), it wasn’t clear what a more powerful mechanism looked like. So, early attacks at the problem took the approach of picking a few distinguished functions which we wanted, placing them in a typeclass, and manually implementing lifted versions of them.  This lead to the development of the already existing &lt;tt class="docutils literal"&gt;MonadError&lt;/tt&gt; class into a more specialized &lt;tt class="docutils literal"&gt;MonadCatchIO&lt;/tt&gt; class. However, Anders Kaseorg realized that there was a common pattern to the implementation of the lifted versions of these functions, which he factored out into the &lt;tt class="docutils literal"&gt;MonadMorphIO&lt;/tt&gt; class. This approach was refined into the &lt;tt class="docutils literal"&gt;MonadPeelIO&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;MonadTransControlIO&lt;/tt&gt; typeclasses. However, only &lt;tt class="docutils literal"&gt;MonadError&lt;/tt&gt; was in the core, and it had failed to achieve widespread acceptance due to some fundamental problems.&lt;/p&gt;
&lt;p&gt;I believe it is important and desirable for the community of library writers to converge on one of these type classes, for the primary reason that it is important for them to implement exception handling properly, a task which is impossible to do if you want to export an interface that requires only &lt;tt class="docutils literal"&gt;MonadIO&lt;/tt&gt;. I fully expected monad-control to be the “winner”, being the end at a long lineage of type classes. However, I think it would be more accurate to describe &lt;tt class="docutils literal"&gt;MonadError&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;MonadCatchIO&lt;/tt&gt; as one school of thought, and &lt;tt class="docutils literal"&gt;MonadMorphIO&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;MOnadPeelIO&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;MonadTransControlIO&lt;/tt&gt; as another.&lt;/p&gt;
&lt;p&gt;In this blog post, I’d like to examine and contrast these two schools of thought. A type class is an interface: it defines operations that some object supports, as well as laws that this object abides by.  The utility in a type class is both in its generality (the ability to support multiple implementations with one interface) as well as its precision (the restriction on permissible implementations by &lt;em&gt;laws&lt;/em&gt;, making it easier to reason about code that uses an interface). This is the essential tension: and these two schools have very different conclusions about how it should be resolved.&lt;/p&gt;
&lt;div class="section" id="modelling-exceptions"&gt;
&lt;h3&gt;Modelling exceptions&lt;/h3&gt;
&lt;p&gt;This general technique can be described as picking a few functions to generalize in a type class. Since a type class with less functions is preferable to one with more (for generality reasons), &lt;tt class="docutils literal"&gt;MonadError&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;MonadCatchIO&lt;/tt&gt; have a very particular emphasis on exceptions:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
class (Monad m) =&amp;gt; MonadError e m | m -&amp;gt; e where
  throwError :: e -&amp;gt; m a
  catchError :: m a -&amp;gt; (e -&amp;gt; m a) -&amp;gt; m a

class MonadIO m =&amp;gt; MonadCatchIO m where
  catch   :: Exception e =&amp;gt; m a -&amp;gt; (e -&amp;gt; m a) -&amp;gt; m a
  block   :: m a -&amp;gt; m a
  unblock :: m a -&amp;gt; m a
&lt;/pre&gt;
&lt;p&gt;Unfortunately, these functions are marred by some problems:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;MonadError encapsulates an abstract notion of errors which does not necessarily include asynchronous exceptions. That is to say, &lt;tt class="docutils literal"&gt;catchError undefined h&lt;/tt&gt; will not necessarily run the exception handler &lt;tt class="docutils literal"&gt;h&lt;/tt&gt;.&lt;/li&gt;
&lt;li&gt;MonadError is inadequate for robust handling of asynchronous exceptions, because it does not contain an interface for &lt;tt class="docutils literal"&gt;mask&lt;/tt&gt;; this makes it difficult to write bracketing functions robustly.&lt;/li&gt;
&lt;li&gt;MonadCatchIO explicitly only handles asynchronous exceptions, which means any pure error handling is not handled by it. This is the “finalizers are sometimes skipped” problem.&lt;/li&gt;
&lt;li&gt;MonadCatchIO, via the &lt;tt class="docutils literal"&gt;MonadIO&lt;/tt&gt; constraint, requires the API to support lifting arbitrary IO actions to the monad (whereas a monad designer may create a restricted IO backed monad, limiting what IO actions the user has access to.)&lt;/li&gt;
&lt;li&gt;MonadCatchIO exports the outdated &lt;tt class="docutils literal"&gt;block&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;unblock&lt;/tt&gt; function, while modern code should use &lt;tt class="docutils literal"&gt;mask&lt;/tt&gt; instead.&lt;/li&gt;
&lt;li&gt;MonadCatchIO exports an instance for the &lt;tt class="docutils literal"&gt;ContT&lt;/tt&gt; transformer. However, continuations and exceptions are &lt;a class="reference external" href="http://hpaste.org/56921"&gt;known to have nontrivial interactions&lt;/a&gt; which require extra care to handle properly.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In some sense, &lt;tt class="docutils literal"&gt;MonadError&lt;/tt&gt; is a non-sequitur, because it isn’t tied to IO at all; perfectly valid instances of it exist for non-IO backed monads as well. &lt;tt class="docutils literal"&gt;MonadCatchIO&lt;/tt&gt; is closer; the latter three points are not fatal ones could be easily accounted for:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
class MonadException m where
  throwM  :: Exception e =&amp;gt; e -&amp;gt; m a
  catch   :: Exception e =&amp;gt; m a -&amp;gt; (e -&amp;gt; m a) -&amp;gt; m a
  mask    :: ((forall a. m a -&amp;gt; m a) -&amp;gt; m b) -&amp;gt; m b
&lt;/pre&gt;
&lt;p&gt;(With a removal of the &lt;tt class="docutils literal"&gt;ContT&lt;/tt&gt; instance.) However, the “finalizers are sometimes skipped” problem is a bit more problematic. In effect, it is the fact that there may exist zeros which a given instance of &lt;tt class="docutils literal"&gt;MonadCatchIO&lt;/tt&gt; may not know about. It has been argued that &lt;a class="reference external" href="http://www.haskell.org/pipermail/haskell-cafe/2010-October/085079.html"&gt;these zeros are none of MonadCatchIO’s business&lt;/a&gt;; one inference you might draw from this is that if you have short-circuiting which you would like to respect finalizers installed using &lt;tt class="docutils literal"&gt;MonadException&lt;/tt&gt;, it should be implemented using asynchronous exceptions. In other words, &lt;tt class="docutils literal"&gt;ErrorT&lt;/tt&gt; is a bad idea.&lt;/p&gt;
&lt;p&gt;However, there is another perspective you can take: &lt;tt class="docutils literal"&gt;MonadException&lt;/tt&gt; is not tied just to asynchronous exceptions, but any zero-like value which obeys the same laws that exceptions obey. The semantics of these exceptions are described in the paper &lt;a class="reference external" href="http://community.haskell.org/~simonmar/papers/async.pdf"&gt;Asynchronous Exceptions in Haskell&lt;/a&gt;. They specify exactly the interaction of masking, throw and catch, as well as how interrupts can be introduced by other threads. In this view, whether or not this behavior is prescribed by the RTS or by passing pure values around is an implementation detail: as long as an instance is written properly, zeros will be properly handled. This also means that it is no longer acceptable to provide a &lt;tt class="docutils literal"&gt;MonadException&lt;/tt&gt; instance for &lt;tt class="docutils literal"&gt;ErrorT e&lt;/tt&gt;, unless we also have an underlying &lt;tt class="docutils literal"&gt;MonadException&lt;/tt&gt; for the inner monad: we can’t forget about exceptions on the lower layers!&lt;/p&gt;
&lt;p&gt;There is one last problem with this approach: once the primitives have been selected, huge swaths of the standard library have to be redefined by “copy pasting” their definitions but having them refer to the generalized versions. This is a significant practical hurdle for implementing a library based on this principle: it’s simply not enough to tack a &lt;tt class="docutils literal"&gt;liftIO&lt;/tt&gt; to the beginning of a function!&lt;/p&gt;
&lt;p&gt;I think an emphasis on the semantics of the defined type class will be critical for the future of this lineage of typeclasses; this is an emphasis that hasn’t really existed in the past. From this perspective, we define with our typeclass not only a way to access otherwise inaccessible functions in IO, but also how these functions should behave. We are, in effect, modeling a subset of IO. I think Conal Elliott &lt;a class="reference external" href="http://conal.net/blog/posts/notions-of-purity-in-haskell"&gt;would be proud&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
There is a &lt;a class="reference external" href="http://comments.gmane.org/gmane.comp.lang.haskell.cafe/93834"&gt;lively debate&lt;/a&gt; going on about extensions to the original semantics of asynchronous exceptions, allowing for the notion of “recoverable” and “unrecoverable” errors. (It’s nearer to the end of the thread.)&lt;/blockquote&gt;
&lt;/div&gt;
&lt;div class="section" id="threading-pure-effects"&gt;
&lt;h3&gt;Threading pure effects&lt;/h3&gt;
&lt;p&gt;This technique can be described as generalizing the a common implementation technique which was used to implement many of the original functions in &lt;tt class="docutils literal"&gt;MonadCatchIO&lt;/tt&gt;. These are a rather odd set of signatures:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
class Monad m =&amp;gt; MonadMorphIO m where
  morphIO :: (forall b. (m a -&amp;gt; IO b) -&amp;gt; IO b) -&amp;gt; m a

class MonadIO m =&amp;gt; MonadPeelIO m where
  peelIO :: m (m a -&amp;gt; IO (m a))

class MonadBase b m =&amp;gt; MonadBaseControl b m | m -&amp;gt; b where
  data StM m :: * -&amp;gt; *
  liftBaseWith :: (RunInBase m b -&amp;gt; b a) -&amp;gt; m a
  restoreM :: StM m a → m a
type RunInBase m b = forall a. m a -&amp;gt; b (StM m a)
&lt;/pre&gt;
&lt;p&gt;The key intuition behind these typeclasses is that they utilize &lt;em&gt;polymorphism&lt;/em&gt; in the IO function that is being lifted in order to &lt;em&gt;thread the pure effects&lt;/em&gt; of the monad stack on top of IO. You can see this as the universal quantification in &lt;tt class="docutils literal"&gt;morphIO&lt;/tt&gt;, the return type of &lt;tt class="docutils literal"&gt;peelIO&lt;/tt&gt; (which is &lt;tt class="docutils literal"&gt;IO (m a)&lt;/tt&gt;, not &lt;tt class="docutils literal"&gt;IO a&lt;/tt&gt;), and the &lt;tt class="docutils literal"&gt;StM&lt;/tt&gt; associated type in &lt;tt class="docutils literal"&gt;MonadBaseControl&lt;/tt&gt;. For example, &lt;tt class="docutils literal"&gt;Int &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; StateT s IO a&lt;/tt&gt;, is equivalent to the type &lt;tt class="docutils literal"&gt;Int &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; s &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; IO (s, a)&lt;/tt&gt;. We can partially apply this function with the current state to get &lt;tt class="docutils literal"&gt;Int &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; IO (s, a)&lt;/tt&gt;; it should be clear then that as long as the IO function we’re lifting lets us smuggle out arbitrary values, we can smuggle out our updated state and reincorporate it when the lifted function finishes. The set of functions which are amenable to this technique are precisely the ones for which this threaded is possible.&lt;/p&gt;
&lt;p&gt;As I described in &lt;a class="reference external" href="http://blog.ezyang.com/2012/01/monadbasecontrol-is-unsound/"&gt;this post&lt;/a&gt;, this means that you won’t be able to get any transformer stack effects if they aren’t returned by the function. So perhaps a better word for MonadBaseControl is not that it is unsound (although it does admit strange behavior) but that it is incomplete: it cannot lift all IO functions to a form where the base monad effects and the transformer effects always occur in lockstep.&lt;/p&gt;
&lt;p&gt;This has some interesting implications. For example, this forgetfulness is in fact precisely the reason why a lifted bracketing function will always run no matter if there are other zeros: &lt;tt class="docutils literal"&gt;finally&lt;/tt&gt; by definition is only aware of asynchronous exceptions. This makes monad-control lifted functions very explicitly only handling asynchronous exceptions: a lifted &lt;tt class="docutils literal"&gt;catch&lt;/tt&gt; function will not catch an ErrorT zero. However, if you manually implement &lt;tt class="docutils literal"&gt;finally&lt;/tt&gt; using lifted versions of the more primitive functions, finalizers may be dropped.&lt;/p&gt;
&lt;p&gt;It also suggests an alternate implementation strategy for monad-control: rather than thread the state through the return type of a function, it could instead be embedded in a hidden IORef, and read out at the end of the computation. In effect, we would like to &lt;em&gt;embed&lt;/em&gt; the semantics of the pure monad transformer stack inside IO. Some care must be taken in the &lt;tt class="docutils literal"&gt;forkIO&lt;/tt&gt; case, however: the IORefs need to also be duplicated appropriately, in order to maintain thread locality, or MVars used instead, in order to allow coherent non-local communication.&lt;/p&gt;
&lt;p&gt;It is well known that MonadBaseControl does not admit a reasonable instance for ContT. Mikhail Vorozhtsov has argued that this is too restrictive. The difficulty is that while unbounded continuations do not play nice with exceptions, limited use of continuation passing style can be combined with exceptions in a sensible way. Unfortunately, monad-control makes no provision for this case: the function it asks a user to implement is too powerful. It seems the typeclasses explicitly modeling a subset of IO are, in some sense, more general! It also highlights the fact that these type classes are first and foremost driven by an abstraction of a common implementation pattern, rather than any sort of semantics.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="conclusion"&gt;
&lt;h3&gt;Conclusion&lt;/h3&gt;
&lt;p&gt;I hope this essay has made clear why I think of MonadBaseControl as an implementation strategy, and not as a reasonable &lt;em&gt;interface&lt;/em&gt; to program against. MonadException is a more reasonable interface, which has a semantics, but faces significant implementation hurdles.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/SNXxLRkNNx0" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2012/01/modelling-io/#comments" thr:count="6" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2012/01/modelling-io/feed/atom/" thr:count="6" />
		<thr:total>6</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2012/01/modelling-io/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[monad-control is tricky]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/AlRNI77vf8s/" />
		<id>http://blog.ezyang.com/?p=6397</id>
		<!-- ezyang: omitted update time -->
		<published>2012-01-23T17:39:00Z</published>
		<category scheme="http://blog.ezyang.com" term="Haskell" />		<summary type="html"><![CDATA[Editor's note. I've toned down some of the rhetoric in this post. The original title was &#34;monad-control is unsound&#34;. MonadBaseControl and MonadTransControl, from the monad-control package, specify an appealing way to automatically lift functions in IO that take &#34;callbacks&#34; to arbitrary monad stacks based on IO. Their appeal comes from the fact that they seem [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2012/01/monadbasecontrol-is-unsound/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;&lt;em&gt;Editor's note.&lt;/em&gt; I've toned down some of the rhetoric in this post. The original title was &amp;quot;monad-control is unsound&amp;quot;.&lt;/p&gt;
&lt;p&gt;MonadBaseControl and MonadTransControl, from the &lt;a class="reference external" href="http://hackage.haskell.org/package/monad-control"&gt;monad-control&lt;/a&gt; package, specify an appealing way to automatically lift functions in IO that take &amp;quot;callbacks&amp;quot; to arbitrary monad stacks based on IO. Their appeal comes from the fact that they seem to offer a more general mechanism than the alternative: picking some functions, lifting them, and then manually reimplementing generic versions of all the functions built on top of them.&lt;/p&gt;
&lt;p&gt;Unfortunately, monad-control has rather surprising behavior for many functions you might lift.&lt;/p&gt;
&lt;p&gt;For example, it doesn't work on functions which invoke the callback multiple times:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.Trans.Control
import Control.Monad.State

double :: IO a -&amp;gt; IO a
double m = m &amp;gt;&amp;gt; m

doubleG :: MonadBaseControl IO m =&amp;gt; m a -&amp;gt; m a
doubleG = liftBaseOp_ double

incState :: MonadState Int m =&amp;gt; m ()
incState = get &amp;gt;&amp;gt;= \x -&amp;gt; put (x + 1)

main = execStateT (doubleG (incState)) 0 &amp;gt;&amp;gt;= print
&lt;/pre&gt;
&lt;p&gt;The result is &lt;tt class="docutils literal"&gt;1&lt;/tt&gt;, rather than &lt;tt class="docutils literal"&gt;2&lt;/tt&gt; that we would expect. If you are unconvinced, suppose that the signature of double was &lt;tt class="docutils literal"&gt;Identity a &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; Identity a&lt;/tt&gt;, e.g. &lt;tt class="docutils literal"&gt;a &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; a&lt;/tt&gt;. There is only one possible implementation of this signature: &lt;tt class="docutils literal"&gt;id&lt;/tt&gt;. It should be obvious what happens, in this case.&lt;/p&gt;
&lt;p&gt;If you look closely at the types involved in MonadBaseControl, the reason behind this should become obvious: we rely on the polymorphism of a function we would like to lift in order to pass &lt;tt class="docutils literal"&gt;StM m&lt;/tt&gt; around, which is the encapsulated “state” of the monad transformers. If this return value is discarded by &lt;tt class="docutils literal"&gt;IO&lt;/tt&gt;, as it is in our function &lt;tt class="docutils literal"&gt;double&lt;/tt&gt;, there is no way to recover that state. (This is even alluded to in the &lt;tt class="docutils literal"&gt;liftBaseDiscard&lt;/tt&gt; function!)&lt;/p&gt;
&lt;p&gt;My conclusion is that, while monad-control may be a convenient implementation mechanism for lifted versions of functions, the functions it exports suffer from serious semantic incoherency. End-users, take heed!&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Postscript.&lt;/em&gt; A similar injunction holds for the previous versions of MonadBaseControl/MonadTransControl, which went by the names MonadPeel and MonadMorphIO.&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/AlRNI77vf8s" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2012/01/monadbasecontrol-is-unsound/#comments" thr:count="3" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2012/01/monadbasecontrol-is-unsound/feed/atom/" thr:count="3" />
		<thr:total>3</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2012/01/monadbasecontrol-is-unsound/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Mystery Hunt and the Scientific Endeavour]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/avdBstWPWXw/" />
		<id>http://blog.ezyang.com/?p=6381</id>
		<!-- ezyang: omitted update time -->
		<published>2012-01-16T21:12:52Z</published>
		<category scheme="http://blog.ezyang.com" term="Personal" /><category scheme="http://blog.ezyang.com" term="Philosophy" />		<summary type="html"><![CDATA[It can be hard to understand the appeal of spending three days, without sleep, solving what some have called “the hardest recreational puzzles in the world,”; but over this weekend, hundreds of people converged on the MIT campus to do just that, as part of MIT Mystery Hunt. To celebrate the finding of the coin, [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2012/01/mystery-hunt-and-the-scientific-endeavour/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;It can be hard to understand the appeal of spending three days, without sleep, solving what some have called “&lt;a class="reference external" href="http://www.thisamericanlife.org/radio-archives/episode/326/quiz-show?act=2"&gt;the hardest recreational puzzles in the world,&lt;/a&gt;”; but over this weekend, hundreds of people converged on the MIT campus to do just that, as part of &lt;a class="reference external" href="http://web.mit.edu/puzzle/www/"&gt;MIT Mystery Hunt&lt;/a&gt;. To celebrate the finding of the coin, I'd like to share this little essay that I found in my files, which compares Mystery Hunt and the scientific endeavour. (If you are not familiar with Mystery Hunt, I recommend listening to the linked &lt;em&gt;This American Life&lt;/em&gt; program.)&lt;/p&gt;
&lt;p&gt;Thomas Kuhn, in his famous book &lt;em&gt;The Structure of Scientific Revolutions&lt;/em&gt;, states that “normal science” is “puzzle solving”: what he means is that the every day endeavors of scientists are the addressing of small, tractable problems, these problems are “puzzles” not “grand mysteries of the universe.” Kuhn goes on to describe what is involved with normal science: generation of facts, increasing the fit between theory and observation, and paradigm articulation. We will see that, as one might expect, these activities are part of “normal” puzzle solving. But (perhaps more unexpectedly) Popperian falsification and Kuhnian revolutions also have something to say about this situation. There are limits to the analogy of puzzles to science, the perhaps most striking of which is that a puzzle has a single, definite solution. But because it is not possible to call up the puzzle writers midway through a puzzle and ask them, “Am I on the right track?” (you are only allowed to phone in the final answer) the intermediate steps still are somewhat informative of the practice of science. In this context, I argue that Popper assumes a microscopic view of scientific progress, whereas Kuhn assumes a macroscopic view of scientific progress.&lt;/p&gt;
&lt;p&gt;What is a Mystery Hunt puzzle? This is a question that, at first glance, may seem to defy answering: puzzles can vary from a crossword to an album of images of birds to a single audio file of seemingly random electronic pitches. The answer is always a  phrase, perhaps “KEPLERS THIRD LAW” or “BOOTS,” but a puzzle, like a scientific problem, doesn't come with instructions for how to solve it. Thus, every Mystery Hunt puzzle starts off in Kuhn's pre-science stage: without any theory about how the puzzle works, puzzlers roll up their sleeves and start collecting miscellaneous facts that may be relevant to the puzzle at hand. If there are pictures of birds, one starts off identifying what the birds are; if there are short video clips of people waving flags, one starts off decoding the semaphore messages. This is a stage that doesn't require very much insight: there is an obvious thing to do. Some of the information collected may be irrelevant (just as the Linnaean classification of species was broadly modified in light of modern information about observable characteristics of plants and animals), but all-in-all this information forms a useful basis for theory formation. But while Popper doesn’t have much to say about data collection, Kuhn’s concept of the theory-ladenness of observation is important. The theory-ladenness of observation states that it is impossible to make an observation without some preconceptions and theories about how the world works. For example, a list of images of birds may suggest that each bird needs to be identified, but in the process of this identification it may be realized that the images corresponded directly to watercolor engravings from Audubon's birds of America (in which case the new question is, which plates?) Even during pre-science, small theories are continually being invented and falsified.&lt;/p&gt;
&lt;p&gt;Once the data has been collected, a theory about how all the data is to be put together must be formed: this step is called “answer extraction”. In regular science, forming the right theory is something that can take many, many years; for a puzzle, the process is accelerated by a collection of pre-existing theories that an experienced puzzler may attempt (e.g. each item has a numbering which corresponds to a letter) as well as hints that a puzzle writer may place in a puzzle's flavor text and title (e.g. “Song of birds” refers to “tweeting” refers to “Twitter”, which means the team should take an extracted phrase to mean a Twitter account.) Naïve inductivism suggests that unprejudiced observation of the data should lead to a theory which explains all of the information present. However, puzzles are specifically constructed to resist this sort of straightforward observation: instead, what more frequently happens is akin to Popper's falsificationism, where theories are invented out of sheer creativity (or perhaps historical knowledge of previous puzzles) and then they are tested against the puzzle. To refer once again to the bird puzzle, one proposed theory may be that the first letter of the scientific names of the birds spells a word in the English language. When the scientific names are gathered and the first letters found not to form a word, the theory is falsified, and we go and look for something else to do. This makes the Popperian view highly individualistic: while only some people may come up with the “good ideas”, anyone can falsify a theory by going out and doing the necessary calculation. Sophisticated falsification allows for the fact that someone might go out and do the calculation incorrectly (thus sending the rest of the puzzlers on a wild goose hunt until someone comes back to the original idea and realizes it actually does work.) However, it only explains the scientific endeavor at very high resolution: it explains the process for a single theory; and we'll see that Kuhn's paradigms help broaden our perspective on the overall puzzle solving endeavor.&lt;/p&gt;
&lt;p&gt;Kuhn states that normal science organizes itself around paradigms, which are characterized by some fundamental laws (Maxwell's equations, Newton's laws) as well conventions for how these laws are to be used to solve problems. Unlike a theory, a paradigm cannot be “falsified”, at least in the Popperian sense: a paradigm can accommodate abnormalities, which may resolve themselves after further investigation. The difference is one of scope. So, in a puzzle, while we might have a straightforwardly falsifiable theory “the first letters form a word,” a more complicated, thematic idea such as “the puzzle is Dr. Who themed” is closer to a paradigm, in that the precise mechanism by which you get answer from “Dr. Who” is unspecified, to be resolved by “normal puzzling.” The paradigm is vague, but it has “the right idea”, and with sufficient effort, the details can be worked out. Which paradigm is to be worked on is a social process: if a puzzler comes in freshly to a puzzle and finds that a group of people is already working within one paradigm, he is more likely to follow along those lines. Of course, if this group is stuck, they may call in someone from the outside precisely to think &lt;em&gt;outside&lt;/em&gt; of the paradigm. In this manner, revolutions, as Kuhn describes them, occur. When a paradigm is failing to make progress (e.g. failing to admit an answer), the puzzlers will continue to attempt to apply it until a convincing alternative paradigm comes along, at which point they may all jump to this new method. The revolution is compressed, but it still carries many common traits: including the lone puzzler who still thinks the old method will admit an answer with “just a little more work.” (Sometimes they're right!)&lt;/p&gt;
&lt;p&gt;We see a striking correspondence between the activities of a scientist working in a lab, and the activities of a scientist working on a Mystery Hunt puzzle. If you'll allow me to indulge in a little psychoanalysis, I think this similarity is part of the reason why Mystery Hunt is so appealing to people with a science, technology, engineering and maths background: in your every day life, you are faced with the most vicious puzzles known to man, as they are by definition the ones that have resisted any attempt to resolution. Months can go by without any progress. In Mystery Hunt, you are once again faced with some of the most vicious puzzles known to man. But there's a difference. You see, in Mystery Hunt, there is an answer. And you can find it.&lt;/p&gt;
&lt;p&gt;Happy puzzling!&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/avdBstWPWXw" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2012/01/mystery-hunt-and-the-scientific-endeavour/#comments" thr:count="3" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2012/01/mystery-hunt-and-the-scientific-endeavour/feed/atom/" thr:count="3" />
		<thr:total>3</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2012/01/mystery-hunt-and-the-scientific-endeavour/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Problem Set: The Codensity Transformation]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/SrbeBDehrxU/" />
		<id>http://blog.ezyang.com/?p=6365</id>
		<!-- ezyang: omitted update time -->
		<published>2012-01-07T08:00:20Z</published>
		<category scheme="http://blog.ezyang.com" term="Haskell" />		<summary type="html"><![CDATA[Have you ever wondered how the codensity transformation, a surprisingly general trick for speeding up the execution of certain types of monads, worked, but never could understand the paper or Edward Kmett's blog posts on the subject? Look no further: below is a problem set for learning how this transformation works. The idea behind these [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2012/01/problem-set-the-codensity-transformation/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;Have you ever wondered how the &lt;em&gt;codensity transformation&lt;/em&gt;, a surprisingly
general trick for speeding up the execution of certain types of monads, worked,
but never could understand the paper or Edward Kmett's blog posts on the subject?&lt;/p&gt;
&lt;p&gt;Look no further: below is a &lt;em&gt;problem set&lt;/em&gt; for learning how this transformation works.&lt;/p&gt;
&lt;p&gt;The idea behind these exercises is to get you comfortable with the
types involved in the codensity transformation, achieved by using the
types to guide yourself to the only possible implementation.  We warm
up with the classic concrete instance for leafy trees, and then
generalize over all free monads (don't worry if you don't know what
that is: we'll define it and give some warmup exercises).&lt;/p&gt;
&lt;p&gt;Experience writing in continuation-passing style may be useful,
although in practice this amounts to &amp;quot;listen to the types!&amp;quot;&lt;/p&gt;
&lt;p&gt;Solutions and more commentary may be found in Janis Voigtlander's
paper &amp;quot;&lt;a class="reference external" href="http://www.iai.uni-bonn.de/~jv/mpc08.pdf"&gt;Asymptotic Improvement of Computations over Free Monads.&lt;/a&gt;&amp;quot;&lt;/p&gt;
&lt;p&gt;To read more, see Edward Kmett's excellent article which further
generalizes this concept:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;a class="reference external" href="http://comonad.com/reader/2011/free-monads-for-less/"&gt;http://comonad.com/reader/2011/free-monads-for-less/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="http://comonad.com/reader/2011/free-monads-for-less-2/"&gt;http://comonad.com/reader/2011/free-monads-for-less-2/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="http://comonad.com/reader/2011/free-monads-for-less-3/"&gt;http://comonad.com/reader/2011/free-monads-for-less-3/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If there is a demand, I can add a hints section for the exercises.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
{-# LANGUAGE Rank2Types, MultiParamTypeClasses, FlexibleInstances #-}
import Prelude hiding (abs)

_EXERCISE_ = undefined

-----------------------------------------------------------------------------
-- Warmup: Hughes lists
-----------------------------------------------------------------------------

-- Experienced Haskellers should feel free to skip this section.

-- We first consider the problem of left-associative list append.  In
-- order to see the difficulty, we will hand-evaluate a lazy language.
-- For the sake of being as mechanical as possible, here are the
-- operational semantics, where e1, e2 are expressions and x is a
-- variable, and e1[e2/x] is replace all instances of x in e1 with e2.
--
--        e1 ==&amp;gt; e1'
--   ---------------------
--     e1 e2 ==&amp;gt; e1' e2
--
--   (\x -&amp;gt; e1[x]) e2 ==&amp;gt; e1[e2/x]
--
-- For reference, the definition of append is as follows:
--
--      a ++ b = foldr (:) b a
--
-- Assume that, on forcing a saturated foldr, its third argument is
-- forced, as follows:
--
--                e1 ==&amp;gt; e1'
--    -----------------------------------
--      foldr f e2 e1 ==&amp;gt; foldr f e2 e1'
--
--  foldr f e2 (x:xs) ==&amp;gt; f x (foldr f e2 xs)
--
-- Hand evaluate this implementation by forcing the head constructor,
-- assuming 'as' is not null:

listsample as bs cs = (as ++ bs) ++ cs

-- Solution:
--
--        (as ++ bs) ++ cs
--      = foldr (:) cs (as ++ bs)
--      = foldr (:) cs (foldr (:) bs as)
--      = foldr (:) cs (foldr (:) bs (a:as'))
--      = foldr (:) cs (a : foldr (:) b as')
--      = a : foldr (:) cs (foldr (:) bs as')
--
-- Convince yourself that this takes linear time per append, and that
-- processing each element of the resulting tail of the list will also
-- take linear time.

-- We now present Hughes lists:

type Hughes a = [a] -&amp;gt; [a]

listrep :: Hughes a -&amp;gt; [a]
listrep = _EXERCISE_

append :: Hughes a -&amp;gt; Hughes a -&amp;gt; Hughes a
append = _EXERCISE_

-- Now, hand evaluate your implementation on this sample, assuming all
-- arguments are saturated.

listsample' a b c = listrep (append (append a b) c)

-- Solution:
--
--        listrep (append (append a b) c)
--      = (\l -&amp;gt; l []) (append (append a b) c)
--      = (append (append a b) c) []
--      = (\z -&amp;gt; (append a b) (c z)) []
--      = (append a b) (c [])
--      = (\z -&amp;gt; a (b z)) (c [])
--      = a (b (c []))
--
-- Convince yourself that the result requires only constant time per
-- element, assuming a, b and c are of the form (\z -&amp;gt; a1:a2:...:z).
-- Notice the left-associativity has been converted into
-- right-associative function application.

-- The codensity transformation operates on similar principles.  This
-- ends the warmup.

-----------------------------------------------------------------------------
-- Case for leafy trees
-----------------------------------------------------------------------------

-- Some simple definitions of trees

data Tree a = Leaf a | Node (Tree a) (Tree a)

-- Here is the obvious monad definition for trees, where each leaf
-- is substituted with a new tree.

instance Monad Tree where
    return = Leaf
    Leaf a   &amp;gt;&amp;gt;= f = f a
    Node l r &amp;gt;&amp;gt;= f = Node (l &amp;gt;&amp;gt;= f) (r &amp;gt;&amp;gt;= f)

-- You should convince yourself of the performance problem with this
-- code by considering what happens if you force it to normal form.

sample = (Leaf 0 &amp;gt;&amp;gt;= f) &amp;gt;&amp;gt;= f
    where f n = Node (Leaf (n + 1)) (Leaf (n + 1))

-- Let's fix this problem.  Now abstract over the /leaves/ of the tree

newtype CTree a = CTree { unCTree :: forall r. (a -&amp;gt; Tree r) -&amp;gt; Tree r }

-- Please write functions which witness the isomorphism between the
-- abstract and concrete versions of trees.

treerep :: Tree a -&amp;gt; CTree a
treerep = _EXERCISE_

treeabs :: CTree a -&amp;gt; Tree a
treeabs = _EXERCISE_

-- How do you construct a node in the case of the abstract version?
-- It is trivial for concrete trees.

class Monad m =&amp;gt; TreeLike m where
    node :: m a -&amp;gt; m a -&amp;gt; m a
    leaf :: a -&amp;gt; m a
    leaf = return

instance TreeLike Tree where
    node = Node

instance TreeLike CTree where
    node = _EXERCISE_

-- As they are isomorphic, the monad instance carries over too.  Don't
-- use rep/abs in your implementation.

instance Monad CTree where
    return = _EXERCISE_
    (&amp;gt;&amp;gt;=)  = _EXERCISE_ -- try explicitly writing out the types of the arguments

-- We now gain efficiency by operating on the /abstracted/ version as
-- opposed to the ordinary one.

treeimprove :: (forall m. TreeLike m =&amp;gt; m a) -&amp;gt; Tree a
treeimprove m = treeabs m

-- You should convince yourself of the efficiency of this code.
-- Remember that expressions inside lambda abstraction don't evaluate
-- until the lambda is applied.

Tample' = treeabs ((leaf 0 &amp;gt;&amp;gt;= f) &amp;gt;&amp;gt;= f)
    where f n = node (leaf (n + 1)) (leaf (n + 1))

-----------------------------------------------------------------------------
-- General case
-----------------------------------------------------------------------------

-- Basic properties about free monads

data Free f a = Return a | Wrap (f (Free f a))
instance Functor f =&amp;gt; Monad (Free f) where
    return = _EXERCISE_
    (&amp;gt;&amp;gt;=)  = _EXERCISE_ -- tricky!

-- Leafy trees are a special case, with F as the functor. Please write
-- functions which witness this isomorphism.

data F a = N a a

freeFToTree :: Free F a -&amp;gt; Tree a
freeFToTree = _EXERCISE_

treeToFreeF :: Tree a -&amp;gt; Free F a
treeToFreeF = _EXERCISE_

-- We now define an abstract version of arbitrary monads, analogous to
-- abstracted trees.  Witness an isomorphism.

newtype C m a = C { unC :: forall r. (a -&amp;gt; m r) -&amp;gt; m r }

rep :: Monad m =&amp;gt; m a -&amp;gt; C m a
rep = _EXERCISE_

abs :: Monad m =&amp;gt; C m a -&amp;gt; m a
abs = _EXERCISE_

-- Implement the monad instance from scratch, without rep/abs.

instance Monad (C m) where
    return = _EXERCISE_
    (&amp;gt;&amp;gt;=)  = _EXERCISE_ -- also tricky; if you get stuck, look at the
                        -- implementation for CTrees

-- By analogy of TreeLike for free monads, this typeclass allows
-- the construction of non-Return values.

class (Functor f, Monad m) =&amp;gt; FreeLike f m where
    wrap :: f (m a) -&amp;gt; m a

instance Functor f =&amp;gt; FreeLike f (Free f) where
    wrap = Wrap

instance FreeLike f m =&amp;gt; FreeLike f (C m) where
    -- Toughest one of the bunch. Remember that you have 'wrap' available for the
    -- inner type as well as functor and monad instances.
    wrap = _EXERCISE_

-- And for our fruits, we now have a fully abstract improver!

improve :: Functor f =&amp;gt; (forall m. FreeLike f m =&amp;gt; m a) -&amp;gt; Free f a
improve m = abs m

-- Bonus: Why is the universal quantification over 'r' needed?   What if
-- we wrote C r m a = ...?  Try copypasting your definitions for that
-- case.
&lt;/pre&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/SrbeBDehrxU" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2012/01/problem-set-the-codensity-transformation/#comments" thr:count="7" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2012/01/problem-set-the-codensity-transformation/feed/atom/" thr:count="7" />
		<thr:total>7</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2012/01/problem-set-the-codensity-transformation/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Why iteratees are hard to understand]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/98zPbVemgE0/" />
		<id>http://blog.ezyang.com/?p=6321</id>
		<!-- ezyang: omitted update time -->
		<published>2012-01-04T13:00:26Z</published>
		<category scheme="http://blog.ezyang.com" term="Haskell" />		<summary type="html"><![CDATA[There are two primary reasons why the low-level implementations of iteratees, enumerators and enumeratees tend to be hard to understand: purely functional implementation and inversion of control. The strangeness of these features is further exacerbated by the fact that users are encouraged to think of iteratees as sinks, enumerators as sources, and enumeratees as transformers. [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2012/01/why-iteratees-are-hard-to-understand/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;There are two primary reasons why the low-level implementations of
iteratees, enumerators and enumeratees tend to be hard to understand:
&lt;em&gt;purely functional implementation&lt;/em&gt; and &lt;em&gt;inversion of control&lt;/em&gt;.
The strangeness of these features is further exacerbated by
the fact that users are encouraged to think of iteratees as sinks,
enumerators as sources, and enumeratees as transformers. This intuition works
well for clients of iteratee libraries but confuses people interested in
digging into the internals.&lt;/p&gt;
&lt;p&gt;In this article, I’d like to explain the strangeness imposed by the
&lt;em&gt;purely functional implementation&lt;/em&gt; by comparing it to an implementation
you might see in a traditional, &lt;em&gt;imperative&lt;/em&gt;, object-oriented language.
We’ll see that concepts which are obvious and easy in an imperative
setting are less-obvious but only slightly harder in a purely functional
setting.&lt;/p&gt;
&lt;div class="section" id="types"&gt;
&lt;h3&gt;Types&lt;/h3&gt;
&lt;p&gt;&lt;em&gt;The following discussion uses nomenclature from the enumerator
library, since at the time of the writing it seems to be the most
popular implementation of iteratees currently in use.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The fundamental entity behind an iteratee is the &lt;tt class="docutils literal"&gt;Step&lt;/tt&gt;. The usual
intuition is that is represents the “state” of an iteratee, which is
either done or waiting for more input. But we’ve cautioned against
excessive reliance on metaphors, so let’s look at the types instead:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
data Step a b = Continue (Stream a -&amp;gt; m (Step a b)) | Yield b
type Iteratee     a b =                  m (Step a b)
type Enumerator   a b = Step a b -&amp;gt;      m (Step a b)
type Enumeratee o a b = Step a b -&amp;gt; Step o (Step a b)
&lt;/pre&gt;
&lt;p&gt;I have made some extremely important simplifications from the enumerator
library, most of important of which is explicitly writing out the
&lt;tt class="docutils literal"&gt;Step&lt;/tt&gt; data type where we would have seen an &lt;tt class="docutils literal"&gt;Iteratee&lt;/tt&gt; instead and
making &lt;tt class="docutils literal"&gt;Enumeratee&lt;/tt&gt; a pure function.  The goal of the next three
sections is to explain what each of these type signatures means; we’ll
do this by analogy to the imperative equivalents of iteratees.  The
imperative programs should feel intuitive to most programmers, and the
hope is that the pure encoding should only be a hop away from there.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="step-iteratee"&gt;
&lt;h3&gt;Step/Iteratee&lt;/h3&gt;
&lt;p&gt;We would like to design an object that is either waiting for input or
finished with some result.  The following might be a proposed
interface:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
interface Iteratee&amp;lt;A,B&amp;gt; {
  void put(Stream&amp;lt;A&amp;gt;);
  Maybe&amp;lt;B&amp;gt; result();
}
&lt;/pre&gt;
&lt;p&gt;This implementation critically relies on the identity of an object of
type &lt;tt class="docutils literal"&gt;Iteratee&lt;/tt&gt;, which maintains this identity across arbitrary calls
to &lt;tt class="docutils literal"&gt;put&lt;/tt&gt;.  For our purposes, we need to translate &lt;tt class="docutils literal"&gt;put :: IORef s &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt;
Stream a &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; IO ()&lt;/tt&gt; (first argument is the Iteratee) into a purely
functional interface.  Fortunately, it’s not too difficult to see how to
do this if we understand how the &lt;tt class="docutils literal"&gt;State&lt;/tt&gt; monad works:  we replace the
old type with &lt;tt class="docutils literal"&gt;put :: s &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; Stream a &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; s&lt;/tt&gt;, which takes the original
state of the iteratee (&lt;tt class="docutils literal"&gt;s = Step a b&lt;/tt&gt;) and some input, and transforms it
into a new state.  The final definition &lt;tt class="docutils literal"&gt;put :: Step a b &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; Stream a &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt;
m (Step a b)&lt;/tt&gt; also accomodates the fact that an iteratee may have some
other side-effects when it receives data, but we are under no compulsion
to use this monad instance; if we set it to the identity monad our
iteratee has no side effects (&lt;tt class="docutils literal"&gt;StateT&lt;/tt&gt; may be the more apt term here).
In fact, this is precisely the accessor for the field in the
&lt;tt class="docutils literal"&gt;Continue&lt;/tt&gt; constructor.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="enumerator"&gt;
&lt;h3&gt;Enumerator&lt;/h3&gt;
&lt;p&gt;We would like to design an object that takes an iteratee and feeds it
input. It’s pretty simple, just a function which mutates its input:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
void Enumerator(Iteratee&amp;lt;A,B&amp;gt;);
&lt;/pre&gt;
&lt;p&gt;What does the type of an enumerator have to say on the matter?&lt;/p&gt;
&lt;pre class="literal-block"&gt;
type Enumerator a b = Step a b -&amp;gt; m (Step a b)
&lt;/pre&gt;
&lt;p&gt;If we interpret this as a state transition function, it’s clear that an
enumerator is a function that &lt;em&gt;transforms&lt;/em&gt; an iteratee from one state to
another, much like the &lt;tt class="docutils literal"&gt;put&lt;/tt&gt;.  Unlike the &lt;tt class="docutils literal"&gt;put&lt;/tt&gt;, however, the
enumerator takes no input from a stream and may possibly cause multiple
state transitions: it’s a big step, with all of the intermediate states
hidden from view.&lt;/p&gt;
&lt;p&gt;The nature of this transformation is not specified, but a common
interpretation is that the enumerator repeatedly feeds an input to the
continuation in step. An execution might unfold to something like this:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
-- s :: Step a b
-- x0, x1 :: Stream a
case s of
    Yield r -&amp;gt; return (Yield r)
    Continue k -&amp;gt; do
        s' &amp;lt;- k x0
        case s' of
            Yield r -&amp;gt; return (Yield r)
            Continue k -&amp;gt; do
                s'' &amp;lt;- k x1
                return s''
&lt;/pre&gt;
&lt;p&gt;Notice that our type signature is not:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
type Enumerator a b = Step a b -&amp;gt; m ()
&lt;/pre&gt;
&lt;p&gt;as the imperative API might suggest.  Such a function would manage to
run the iteratee (and trigger any of its attendant side effects), but
we’d lose the return result of the iteratee.  This slight modification
wouldn’t do either:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
type Enumerator a b = Step a b -&amp;gt; m (Maybe b)
&lt;/pre&gt;
&lt;p&gt;The problem here is that if the enumerator didn’t actually manage
to finish running the iteratee, we’ve lost the end state of the
iteratee (it was never returned!)  This means you can’t concatenate
enumerators together.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It should now be clear why I have unfolded all of the &lt;tt class="docutils literal"&gt;Iteratee&lt;/tt&gt;
definitions: in the &lt;tt class="docutils literal"&gt;enumerator&lt;/tt&gt; library, the simple correspondence
between enumerators and side-effectful state transformers
is obscured by an unfortunate type signature:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
type Enumerator a b = Step a b -&amp;gt; Iteratee a b
&lt;/pre&gt;
&lt;p&gt;Oleg’s original treatment is much clearer on this matter, as he defines
the steps themselves to &lt;em&gt;be&lt;/em&gt; the iteratees.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;!-- Passing handle mechanism doesn't work.  Some enumerator needs to
know about /all/ iteratees --&gt;
&lt;/div&gt;
&lt;div class="section" id="enumeratee"&gt;
&lt;h3&gt;Enumeratee&lt;/h3&gt;
&lt;p&gt;At last, we are now prepared to tackle the most complicated structure,
the enumeratee.  Our imperative hat tells us a class like this might
work:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
interface Enumeratee&amp;lt;O,A,B&amp;gt; implements Iteratee&amp;lt;O,B&amp;gt; {
    Enumeratee(Iteratee&amp;lt;A,B&amp;gt;);
    bool done();
    // inherited from Iteratee&amp;lt;O,B&amp;gt;
    void put(Stream&amp;lt;O&amp;gt;);
    Maybe&amp;lt;B&amp;gt; result();
}
&lt;/pre&gt;
&lt;p&gt;Like our original &lt;tt class="docutils literal"&gt;Iteratee&lt;/tt&gt; class, it sports a &lt;tt class="docutils literal"&gt;put&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;result&lt;/tt&gt;
operation, but upon construction it wraps another &lt;tt class="docutils literal"&gt;Iteratee&lt;/tt&gt;: in this
sense it is an &lt;em&gt;adapter&lt;/em&gt; from elements of type &lt;tt class="docutils literal"&gt;O&lt;/tt&gt; to elements
of type &lt;tt class="docutils literal"&gt;A&lt;/tt&gt;.  A call to the outer &lt;tt class="docutils literal"&gt;put&lt;/tt&gt; with an object of
type &lt;tt class="docutils literal"&gt;O&lt;/tt&gt; may result in zero, one or many calls to put with an object
of type &lt;tt class="docutils literal"&gt;A&lt;/tt&gt; on the inside &lt;tt class="docutils literal"&gt;Iteratee&lt;/tt&gt;; the call to &lt;tt class="docutils literal"&gt;result&lt;/tt&gt; is
simply passed through.  An &lt;tt class="docutils literal"&gt;Enumeratee&lt;/tt&gt; may also decide that it is
“done”, that is, it will never call &lt;tt class="docutils literal"&gt;put&lt;/tt&gt; on the inner iteratee again;
the &lt;tt class="docutils literal"&gt;done&lt;/tt&gt; method may be useful for testing for this case.&lt;/p&gt;
&lt;p&gt;Before we move on to the types, it’s worth reflecting what stateful
objects are involved in this imperative formulation: they are the outer
&lt;tt class="docutils literal"&gt;Enumeratee&lt;/tt&gt; and the inner &lt;tt class="docutils literal"&gt;Iteratee&lt;/tt&gt;.  We need to maintain &lt;em&gt;two&lt;/em&gt;,
not &lt;em&gt;one&lt;/em&gt; states.  The imperative formulation naturally manages these
for us (after all, we still have access to the inner iteratee even as
the enumeratee is running), but we’ll have to manually arrange for this
is in the purely functional implementation.&lt;/p&gt;
&lt;p&gt;Here is the type for &lt;tt class="docutils literal"&gt;Enumeratee&lt;/tt&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
type Enumeratee o a b = Step a b -&amp;gt; Step o (Step a b)
&lt;/pre&gt;
&lt;p&gt;It’s easy to see why the first argument is &lt;tt class="docutils literal"&gt;Step a b&lt;/tt&gt;; this is the
inner iteratee that we are wrapping around.  It’s less easy to see
why &lt;tt class="docutils literal"&gt;Step o (Step a b)&lt;/tt&gt; is the correct return type.  Since our
imperative interface results in an object which implements the
&lt;tt class="docutils literal"&gt;Iteratee&amp;lt;O,B&amp;gt;&lt;/tt&gt; interface, we might be tempted to write this
signature instead:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
type Enumeratee o a b = Step a b -&amp;gt; Step o b
&lt;/pre&gt;
&lt;p&gt;But remember; we need to keep track of two states! We have the outer
state, but what of the inner one?  In a situation similar reminiscent of
our alternate universe &lt;tt class="docutils literal"&gt;Enumerator&lt;/tt&gt; earlier, the state of the inner
iteratee is lost forever.  Perhaps this is not a big deal if this
enumeratee is intended to be used for the rest of the input (i.e.
&lt;tt class="docutils literal"&gt;done&lt;/tt&gt; always returns false), but it is quite important if we need to
stop using the &lt;tt class="docutils literal"&gt;Enumeratee&lt;/tt&gt; and then continue operating on the stream
&lt;tt class="docutils literal"&gt;Step a b&lt;/tt&gt;.&lt;/p&gt;
&lt;p&gt;By the design of iteratees, we can only get a result out of an iteratee
once it finishes.  This forces us to return the state in the second
parameter, giving us the final type:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
type Enumeratee o a b = Step a b -&amp;gt; Step o (Step a b)
&lt;/pre&gt;
&lt;p&gt;“But wait!” you might say, “If the iteratee only returns a result at the
very end, doesn’t this mean that the inner iteratee only gets updated
at the end?”  By the power of &lt;em&gt;inversion of control&lt;/em&gt;, however, this is
not the case: as the enumeratee receives values and updates its own
state, it also executes and updates the internal iteratee.  The
intermediate inner states exist; they are simply not accessible to us.
(This is in contrast to the imperative version, for which we can peek at
the inner iteratee!)&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Another good question is “Why does the &lt;tt class="docutils literal"&gt;enumerator&lt;/tt&gt; library have
an extra monad snuck in &lt;tt class="docutils literal"&gt;Enumeratee&lt;/tt&gt;?”, i.e.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
Step a b -&amp;gt; m (Step o (Step a b))
&lt;/pre&gt;
&lt;p&gt;My understanding is that the monad is unnecessary, but may be useful
if your &lt;tt class="docutils literal"&gt;Enumeratee&lt;/tt&gt; needs to be able to perform a side-effect
prior to receiving any input, e.g. for initialization.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/div&gt;
&lt;div class="section" id="conclusion"&gt;
&lt;h3&gt;Conclusion&lt;/h3&gt;
&lt;p&gt;Unfortunately, I can’t claim very much novelty here: all of these topics
are covered in &lt;a class="reference external" href="http://okmij.org/ftp/Haskell/Iteratee/IterateeIO-talk-notes.pdf"&gt;Oleg’s notes&lt;/a&gt;. I hope, however, that this
presentation with reference to the imperative analogues of iteratees
makes the choice of types clearer.&lt;/p&gt;
&lt;p&gt;There are some important implications of using this pure encoding,
similar to the differences between using IORefs and using the state
monad:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Iteratees can be forked and run on different threads while
preserving isolation of local state, and&lt;/li&gt;
&lt;li&gt;Old copies of the iteratee state can be kept around, and resumed
later as a form of backtracking (swapping a bad input for a
newer one).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These assurances would not be possible in the case of simple mutable
references. There is one important caveat, however, which is that while
the pure component of an iteratee is easily reversed, we cannot take
back any destructive side-effects performed in the monad.  In the case
of forking, this means any side-effects must be atomic; in the case of
backtracking, we must be able to rollback side-effects.  As far as I can
tell, the art of writing iteratees that take advantage of this style is
not well studied but, in my opinion, well worth investigating.  I’ll
close by noting that one of the theses behind the new conduits is that
purity is not important for supporting most stream processing. In my
opinion, the jury is still out.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/98zPbVemgE0" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2012/01/why-iteratees-are-hard-to-understand/#comments" thr:count="8" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2012/01/why-iteratees-are-hard-to-understand/feed/atom/" thr:count="8" />
		<thr:total>8</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2012/01/why-iteratees-are-hard-to-understand/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Bugs and Battleships]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/7KoT8lbmAAI/" />
		<id>http://blog.ezyang.com/?p=6147</id>
		<!-- ezyang: omitted update time -->
		<published>2011-12-19T16:04:51Z</published>
		<category scheme="http://blog.ezyang.com" term="Computer Science" />		<summary type="html"><![CDATA[Do you remember your first computer program? When you had finished writing it, what was the first thing you did? You did the simplest possible test: you ran it. As programs increase in size, so do the amount of possible tests. It’s worth considering which tests we actually end up running: imagine the children’s game [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2011/12/bugs-and-battleships/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/testing/battleship.png" src="/img/testing/battleship.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Do you remember your first computer program?  When you had finished writing it, what was the first thing you did?  You did the simplest possible test: you ran it.&lt;/p&gt;
&lt;p&gt;As programs increase in size, so do the amount of possible tests. It’s worth considering which tests we actually end up running: imagine the children’s game Battleship, where the ocean is the space of all possible program executions, the battleships are the bugs that you are looking for, and each individual missile you fire is a test you run (white if the test passes, red if the test fails.) You don’t have infinite missiles, so you have to decide where you are going to send them.&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/testing/toy-programs.png" src="/img/testing/toy-programs.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;In the case of “your first computer program,” the answer seems pretty obvious: there’s only one way to run the program, only a few cases to test.&lt;/p&gt;
&lt;p&gt;But this fantasy is quickly blown away by an encounter with real software. Even if your program has no inputs, hardware, operating system, development environment, and other environmental factors immediately increase the space of tests. Add explicit inputs and nondeterminism to the application, and you’re looking at the difference between a swimming pool and an ocean.&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/testing/real-programs.png" src="/img/testing/real-programs.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;How do we decide what to test? What is our strategy—where do we send more missiles, where do we send less? Different testing strategies result in different distributions of tests on the space of all possible executions. Even though we may not be &lt;em&gt;thinking&lt;/em&gt; about the distribution of test cases when we write up tests or run the whole system in an integration test, different test strategies result in different coverage.&lt;/p&gt;
&lt;p&gt;For example, you might decide not to do any tests, and rely on your users to give you bug reports. The result is that you will end up with high coverage in &lt;em&gt;frequently used areas&lt;/em&gt; of your application, and much less coverage in the rarely used areas. In some sense, this is an optimal strategy when you have a large user base willing to tolerate failure—though anyone who has run into bugs using software in unusual circumstances might disagree!&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/testing/field-testing.png" src="/img/testing/field-testing.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;There is a different idea behind regression testing, where you add an automatic test for any bug that occurred in the past. Instead of focusing coverage on frequently used area, a regression test suite will end up concentrated on “tricky” areas of the application, the areas where the most bugs have been found in the past. The hypothesis behind this strategy is that regions of code that historically had bugs are more likely to have bugs in the future.&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/testing/regression-testing.png" src="/img/testing/regression-testing.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;You might even have some a priori hypotheses about where bugs in applications occur; maybe you think that boundary cases in the application are most likely to have bugs. Then you might reasonable focus your testing efforts on those areas on the outset.&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/testing/boundary-testing.png" src="/img/testing/boundary-testing.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Other testing strategies might focus specifically on the distribution of tests. This is especially important when you are concerned about &lt;em&gt;worst-case&lt;/em&gt; behavior (e.g. security vulnerabilities) as opposed to average-case behavior (ordinary bugs.) Fuzz testing, for example, involves randomly spattering the test space without any regard to such things as usage frequency: the result is that you get a lot more distribution on areas that are rarely used and don’t have many discovered bugs.&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/testing/fuzz-testing.png" src="/img/testing/fuzz-testing.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;You might notice, however, that while fuzz testing changes the distribution of tests, it doesn’t give any &lt;em&gt;guarantees.&lt;/em&gt; In order to guarantee that there aren’t any bugs, you’d have to test every single input, which in modern software engineering practice is impossible. Actually, there is a very neat piece of technology called the &lt;em&gt;model checker&lt;/em&gt;, designed specifically with all manner of tricks for speed to do this kind of exhaustive testing. For limited state spaces, anyway—there are also more recent research projects (e.g. Alloy) which perform this exhaustive testing, but only up to a certain depth.&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/testing/model-checking.png" src="/img/testing/model-checking.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Model checkers are “dumb” in some sense, in that they don’t really understand what the program is trying to do. Another approach we might take is to take advantage of the fact that we know how our program works, in order to pick a few, very carefully designed test inputs, which “generalize” to cover the entire test space. (We’ll make this more precise shortly.)&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/testing/equivalence-partitioning.png" src="/img/testing/equivalence-partitioning.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The diagram above is a bit misleading, however: test-cases rarely generalize that readily. One might even say that the ability to generalize behavior of specific tests to the behavior of the program is precisely what distinguishes a good program from a bad one. A bad program is filled with many, many different cases, all of which must be tested individually in order to achieve assurance. A good program is economical in its cases, it tries to be as complex as the problem it tries to solve, and no more.&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/testing/bad-good-software.png" src="/img/testing/bad-good-software.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;What does it mean to say that a test-case generalizes? My personal belief is that chunks of the test input space which are said to be equivalent to each other correspond to a single case, part of a larger mathematical proof, which can be argued in a self-contained fashion. When you decompose a complicated program into parts in order to explain what it does, each of those parts should correspond to an equivalence partition of the program.&lt;/p&gt;
&lt;p&gt;The corollary of this belief is that &lt;em&gt;good programs are easy to prove correct.&lt;/em&gt;&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/testing/certified-programs.png" src="/img/testing/certified-programs.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This is a long way from “running the program to see if it works.” But I do think this is a necessary transition for any software engineer interested in making correct and reliable software (regardless of whether or not they use any of the academic tools like model checkers and theorem provers which take advantage of this way of thinking.) At the end of the day, you will still need to write tests. But if you understand the underlying theory behind the distributions of tests you are constructing, you will be much more effective.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Postscript.&lt;/em&gt; The relationship between type checking and testing is frequently misunderstood. I think this diagram sums up the relationship well:&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/testing/type-checking.png" src="/img/testing/type-checking.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Types eliminate certain regions of bugs and fail to affect others. The idea behind dependent types is to increase these borders until they cover all of the space, but the benefits are very tangible even if you only manage to manage a subset of the test space.&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="http://i.creativecommons.org/l/by-sa/3.0/88x31.png" src="http://i.creativecommons.org/l/by-sa/3.0/88x31.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This work is licensed under a &lt;a class="reference external" href="http://creativecommons.org/licenses/by-sa/3.0/"&gt;Creative Commons Attribution-ShareAlike 3.0 Unported License&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/7KoT8lbmAAI" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2011/12/bugs-and-battleships/#comments" thr:count="17" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2011/12/bugs-and-battleships/feed/atom/" thr:count="17" />
		<thr:total>17</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2011/12/bugs-and-battleships/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[How to build i686 glibc on Ubuntu]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/fag7J4rHRAE/" />
		<id>http://blog.ezyang.com/?p=6289</id>
		<!-- ezyang: omitted update time -->
		<published>2011-12-18T23:03:44Z</published>
		<category scheme="http://blog.ezyang.com" term="Toolbox" />		<summary type="html"><![CDATA[An “easy”, two-step process: Apply this patch for i686. (Why they haven't fixed this in the trunk, I have no idea.) Configure with CFLAGS=&#34;-U_FORTIFY_SOURCE -fno-stack-protector -O2&#34; (this disables fortify source and stack protection which Ubuntu enables by default but interferes with glibc. You need to keep optimizations on, because glibc won't build without it.) You’ll [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2011/12/how-to-build-i686-glibc-on-ubuntu/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;An “easy”, two-step process:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;&lt;a class="reference external" href="http://www.eglibc.org/archives/patches/msg00073.html"&gt;Apply this patch for i686&lt;/a&gt;. (Why they haven't fixed this in the trunk, I have no idea.)&lt;/li&gt;
&lt;li&gt;Configure with &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;CFLAGS=&amp;quot;-U_FORTIFY_SOURCE&lt;/span&gt; &lt;span class="pre"&gt;-fno-stack-protector&lt;/span&gt; &lt;span class="pre"&gt;-O2&amp;quot;&lt;/span&gt;&lt;/tt&gt; (this disables fortify source and stack protection which Ubuntu enables by default but interferes with glibc. You need to keep optimizations on, because glibc won't build without it.) You’ll need to do the usual extra dance of creating a separate build directory and specifying a prefix.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hope this helps someone else. In case you were wondering why I was building glibc, it's because I was reporting these two bugs in iconv:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;a class="reference external" href="http://sources.redhat.com/bugzilla/show_bug.cgi?id=13517"&gt;Bug 13517: iconv generates spurious warnings even with //IGNORE&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="http://sources.redhat.com/bugzilla/show_bug.cgi?id=13518"&gt;Bug 13518: iconv truncates input with //IGNORE&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/fag7J4rHRAE" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2011/12/how-to-build-i686-glibc-on-ubuntu/#comments" thr:count="0" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2011/12/how-to-build-i686-glibc-on-ubuntu/feed/atom/" thr:count="0" />
		<thr:total>0</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2011/12/how-to-build-i686-glibc-on-ubuntu/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Interactive Demo of Zero-Knowledge Proofs]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/FdamMolYvD8/" />
		<id>http://blog.ezyang.com/?p=6283</id>
		<!-- ezyang: omitted update time -->
		<published>2011-12-17T18:56:39Z</published>
		<category scheme="http://blog.ezyang.com" term="Computer Science" />		<summary type="html"><![CDATA[For the final project in our theoretical computer science and philosophy class taught by Scott Aaronson, Karen Sittig and I decided to create an interactive demonstration of zero-knowledge proofs. (Sorry, the picture below is not clickable.) For the actually interactive demonstration, click here: http://web.mit.edu/~ezyang/Public/graph/svg.html (you will need a recent version of Firefox or Chrome, since [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2011/12/interactive-demo-of-zero-knowledge-proofs/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;For the final project in our &lt;a class="reference external" href="http://stellar.mit.edu/S/course/6/fa11/6.893/index.html"&gt;theoretical computer science and philosophy class&lt;/a&gt; taught by &lt;a class="reference external" href="http://www.scottaaronson.com/blog/"&gt;Scott Aaronson&lt;/a&gt;, &lt;a class="reference external" href="https://plus.google.com/104657582733825681275"&gt;Karen Sittig&lt;/a&gt; and I decided to create an interactive demonstration of zero-knowledge proofs. (Sorry, the picture below is not clickable.)&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/interactive-zk.png" src="/img/interactive-zk.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;For the &lt;em&gt;actually&lt;/em&gt; interactive demonstration, click here: &lt;a class="reference external" href="http://web.mit.edu/~ezyang/Public/graph/svg.html"&gt;http://web.mit.edu/~ezyang/Public/graph/svg.html&lt;/a&gt; (you will need a recent version of Firefox or Chrome, since we did our rendering with SVG.)&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/FdamMolYvD8" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2011/12/interactive-demo-of-zero-knowledge-proofs/#comments" thr:count="3" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2011/12/interactive-demo-of-zero-knowledge-proofs/feed/atom/" thr:count="3" />
		<thr:total>3</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2011/12/interactive-demo-of-zero-knowledge-proofs/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Accessing lazy structures from C]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/nDqGr_tTtI8/" />
		<id>http://blog.ezyang.com/?p=6276</id>
		<!-- ezyang: omitted update time -->
		<published>2011-12-15T22:18:27Z</published>
		<category scheme="http://blog.ezyang.com" term="Haskell" />		<summary type="html"><![CDATA[Someone recently asked on haskell-beginners how to access an lazy (and potentially infinite) data structure in C. I failed to find some example code on how to do this, so I wrote some myself. May this help you in your C calling Haskell endeavours! The main file Main.hs: {-# LANGUAGE ForeignFunctionInterface #-} import Foreign.C.Types import [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2011/12/accessing-lazy-structures-from/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;Someone &lt;a class="reference external" href="http://comments.gmane.org/gmane.comp.lang.haskell.beginners/9109"&gt;recently asked on haskell-beginners&lt;/a&gt; how to access an lazy (and potentially infinite) data structure in C. I failed to find some example code on how to do this, so I wrote some myself. May this help you in your C calling Haskell endeavours!&lt;/p&gt;
&lt;p&gt;The main file &lt;tt class="docutils literal"&gt;Main.hs&lt;/tt&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types
import Foreign.StablePtr
import Control.Monad

lazy :: [CInt]
lazy = [1..]

main = do
    pLazy &amp;lt;- newStablePtr lazy
    test pLazy -- we let C deallocate the stable pointer with cfree

chead = liftM head . deRefStablePtr
ctail = newStablePtr . tail &amp;lt;=&amp;lt; deRefStablePtr
cfree = freeStablePtr

foreign import ccall test :: StablePtr [CInt] -&amp;gt; IO ()
foreign export ccall chead :: StablePtr [CInt] -&amp;gt; IO CInt
foreign export ccall ctail :: StablePtr [CInt] -&amp;gt; IO (StablePtr [CInt])
foreign export ccall cfree :: StablePtr a -&amp;gt; IO ()
&lt;/pre&gt;
&lt;p&gt;The C file &lt;tt class="docutils literal"&gt;export.c&lt;/tt&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
#include &amp;lt;HsFFI.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;quot;Main_stub.h&amp;quot;

void test(HsStablePtr l1) {
    int x = chead(l1);
    printf(&amp;quot;first = %d\n&amp;quot;, x);
    HsStablePtr l2 = ctail(l1);
    int y = chead(l2);
    printf(&amp;quot;second = %d\n&amp;quot;, y);
    cfree(l2);
    cfree(l1);
}
&lt;/pre&gt;
&lt;p&gt;And a simple Cabal file to build it all:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
Name:                export
Version:             0.1
Cabal-version:       &amp;gt;=1.2
Build-type:          Simple

Executable export
  Main-is:             Main.hs
  Build-depends:       base
  C-sources:           export.c
&lt;/pre&gt;
&lt;p&gt;Happy hacking!&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/nDqGr_tTtI8" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2011/12/accessing-lazy-structures-from/#comments" thr:count="0" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2011/12/accessing-lazy-structures-from/feed/atom/" thr:count="0" />
		<thr:total>0</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2011/12/accessing-lazy-structures-from/</feedburner:origLink></entry>
	</feed>

