<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" xml:lang="en-US" xml:base="http://blog.ezyang.com/wp-atom.php">
	<title type="text">Inside 206-105</title>
	<subtitle type="text">Existential Pontification and Generalized Abstract Digressions</subtitle>

	<updated>2013-05-22T04:22:48Z</updated>

	<link rel="alternate" type="text/html" href="http://blog.ezyang.com" />
	<id>http://blog.ezyang.com/feed/atom/</id>
	

	<generator uri="http://wordpress.org/" version="3.5.1">WordPress</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/ezyang" /><feedburner:info uri="ezyang" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Anatomy of an MVar operation]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/nmmQFXNcDMg/" />
		<id>http://blog.ezyang.com/?p=8282</id>
		<!-- ezyang: omitted update time -->
		<published>2013-05-20T00:00:37Z</published>
		<category scheme="http://blog.ezyang.com" term="GHC" />		<summary type="html"><![CDATA[Adam Belay (of Dune fame) was recently wondering why Haskell’s MVars are so slow. “Slow?” I thought, “aren’t Haskell’s MVars supposed to be really fast?” So I did some digging around how MVars worked, to see if I could explain. Let’s consider the operation of the function takeMVar in Control.Concurrent.MVar. This function is very simple, [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2013/05/anatomy-of-an-mvar-operation/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;Adam Belay (of &lt;a class="reference external" href="http://dune.scs.stanford.edu/"&gt;Dune&lt;/a&gt; fame) was recently wondering why Haskell’s MVars are so slow. “Slow?” I thought, “aren’t Haskell’s MVars supposed to be really fast?” So I did some digging around how MVars worked, to see if I could explain.&lt;/p&gt;
&lt;p&gt;Let’s consider the operation of the function &lt;tt class="docutils literal"&gt;takeMVar&lt;/tt&gt; in &lt;a class="reference external" href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Concurrent-MVar.html#v:takeMVar"&gt;Control.Concurrent.MVar&lt;/a&gt;.  This function is very simple, it unpacks &lt;tt class="docutils literal"&gt;MVar&lt;/tt&gt; to get the underlying &lt;tt class="docutils literal"&gt;MVar#&lt;/tt&gt; primitive value, and then calls the primop &lt;tt class="docutils literal"&gt;takeMVar#&lt;/tt&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
takeMVar :: MVar a -&amp;gt; IO a
takeMVar (MVar mvar#) = IO $ \ s# -&amp;gt; takeMVar# mvar# s#
&lt;/pre&gt;
&lt;p&gt;&lt;a class="reference external" href="http://hackage.haskell.org/trac/ghc/wiki/Commentary/PrimOps"&gt;Primops&lt;/a&gt; result in the invocation of &lt;tt class="docutils literal"&gt;stg_takeMVarzh&lt;/tt&gt; in &lt;tt class="docutils literal"&gt;PrimOps.cmm&lt;/tt&gt;, which is where the magic happens. For simplicity, we consider only the &lt;em&gt;multithreaded&lt;/em&gt; case.&lt;/p&gt;
&lt;p&gt;The first step is to &lt;strong&gt;lock the closure&lt;/strong&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
(&amp;quot;ptr&amp;quot; info) = ccall lockClosure(mvar &amp;quot;ptr&amp;quot;);
&lt;/pre&gt;
&lt;p&gt;Objects on the GHC heap have an &lt;em&gt;info table header&lt;/em&gt; which indicates what kind of object they are, by pointing to the relevant info table for the object.  These headers are &lt;em&gt;also&lt;/em&gt; used for synchronization: since they are word-sized, they can be atomically swapped for other values. &lt;tt class="docutils literal"&gt;lockClosure&lt;/tt&gt; is in fact a spin-lock on the info table header:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
EXTERN_INLINE StgInfoTable *lockClosure(StgClosure *p)
{
    StgWord info;
    do {
        nat i = 0;
        do {
            info = xchg((P_)(void *)&amp;amp;p-&amp;gt;header.info, (W_)&amp;amp;stg_WHITEHOLE_info);
            if (info != (W_)&amp;amp;stg_WHITEHOLE_info) return (StgInfoTable *)info;
        } while (++i &amp;lt; SPIN_COUNT);
        yieldThread();
    } while (1);
}
&lt;/pre&gt;
&lt;p&gt;&lt;tt class="docutils literal"&gt;lockClosure&lt;/tt&gt; is used for some other objects, namely thread state objects (&lt;tt class="docutils literal"&gt;stg_TSO_info&lt;/tt&gt;, via &lt;tt class="docutils literal"&gt;lockTSO&lt;/tt&gt;) and thread messages i.e. exceptions (&lt;tt class="docutils literal"&gt;stg_MSG_THROWTO_info&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;stg_MSG_NULL_info&lt;/tt&gt;).&lt;/p&gt;
&lt;p&gt;The next step is to &lt;strong&gt;apply a GC write barrier on the MVar&lt;/strong&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
if (info == stg_MVAR_CLEAN_info) {
    ccall dirty_MVAR(BaseReg &amp;quot;ptr&amp;quot;, mvar &amp;quot;ptr&amp;quot;);
}
&lt;/pre&gt;
&lt;p&gt;As I’ve &lt;a class="reference external" href="http://blog.ezyang.com/2013/01/the-ghc-scheduler/"&gt;written before&lt;/a&gt;, as the MVar is a mutable object, it can be mutated to point to objects in generation 0; thus, when a mutation happens, it has to be added to the root set via the mutable list. Since mutable is per capability, this boils down into a bunch of pointer modifications, and does not require any synchronizations. Note that we will need to add the MVar to the mutable list, &lt;em&gt;even&lt;/em&gt; if we end up blocking on it, because the MVar is a retainer of the &lt;em&gt;thread&lt;/em&gt; (TSO) which is blocked on it! (However, I suspect in some cases we can get away with not doing this.)&lt;/p&gt;
&lt;p&gt;Next, we case split depending on whether or not the MVar is full or empty.  If the MVar is empty, we need to &lt;strong&gt;block the thread until the MVar is full&lt;/strong&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
/* If the MVar is empty, put ourselves on its blocking queue,
 * and wait until we're woken up.
 */
if (StgMVar_value(mvar) == stg_END_TSO_QUEUE_closure) {

    // We want to put the heap check down here in the slow path,
    // but be careful to unlock the closure before returning to
    // the RTS if the check fails.
    ALLOC_PRIM_WITH_CUSTOM_FAILURE
        (SIZEOF_StgMVarTSOQueue,
         unlockClosure(mvar, stg_MVAR_DIRTY_info);
         GC_PRIM_P(stg_takeMVarzh, mvar));

    q = Hp - SIZEOF_StgMVarTSOQueue + WDS(1);

    SET_HDR(q, stg_MVAR_TSO_QUEUE_info, CCS_SYSTEM);
    StgMVarTSOQueue_link(q) = END_TSO_QUEUE;
    StgMVarTSOQueue_tso(q)  = CurrentTSO;

    if (StgMVar_head(mvar) == stg_END_TSO_QUEUE_closure) {
        StgMVar_head(mvar) = q;
    } else {
        StgMVarTSOQueue_link(StgMVar_tail(mvar)) = q;
        ccall recordClosureMutated(MyCapability() &amp;quot;ptr&amp;quot;,
                                         StgMVar_tail(mvar));
    }
    StgTSO__link(CurrentTSO)       = q;
    StgTSO_block_info(CurrentTSO)  = mvar;
    StgTSO_why_blocked(CurrentTSO) = BlockedOnMVar::I16;
    StgMVar_tail(mvar)             = q;

    jump stg_block_takemvar(mvar);
}
&lt;/pre&gt;
&lt;p&gt;A useful thing to know when decoding C-- primop code is that &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;StgTSO_block_info(...)&lt;/span&gt;&lt;/tt&gt; and its kin are how we spell field access on objects. C-- doesn’t know anything about C struct layout, and so these “functions” are actually macros generated by &lt;tt class="docutils literal"&gt;utils/deriveConstants&lt;/tt&gt;. Blocking a thread consists of three steps:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;We have to add the thread to the blocked queue attached to the MVar (that’s why blocking on an MVar mutates the MVar!) This involves performing a heap allocation for the linked list node as well as mutating the tail of the old linked list.&lt;/li&gt;
&lt;li&gt;We have to mark the thread as blocked (the &lt;tt class="docutils literal"&gt;StgTSO&lt;/tt&gt; modifications).&lt;/li&gt;
&lt;li&gt;We need to setup a stack frame for the thread so that when the thread wakes up, it performs the correct action (the invocation to &lt;tt class="docutils literal"&gt;stg_block_takemvar&lt;/tt&gt;). This invocation is also responsible for unlocking the closure. While the machinery here is pretty intricate, it’s not really in scope for this blog post.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;If the MVar is full, then we can go ahead and &lt;strong&gt;take the value from the MVar.&lt;/strong&gt;&lt;/p&gt;
&lt;pre class="literal-block"&gt;
/* we got the value... */
val = StgMVar_value(mvar);
&lt;/pre&gt;
&lt;p&gt;But that’s not all. If there are other blocked &lt;tt class="docutils literal"&gt;putMVars&lt;/tt&gt; on the MVar (remember, when a thread attempts to put an MVar that is already full, it blocks until the MVar empties out), then we should immediately unblock one of these threads so that the MVar can always be left in a full state:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
    q = StgMVar_head(mvar);
loop:
    if (q == stg_END_TSO_QUEUE_closure) {
        /* No further putMVars, MVar is now empty */
        StgMVar_value(mvar) = stg_END_TSO_QUEUE_closure;
        unlockClosure(mvar, stg_MVAR_DIRTY_info);
        return (val);
    }
    if (StgHeader_info(q) == stg_IND_info ||
        StgHeader_info(q) == stg_MSG_NULL_info) {
        q = StgInd_indirectee(q);
        goto loop;
    }
&lt;/pre&gt;
&lt;p&gt;There is one interesting thing about the code that checks for blocked threads, and that is the check for &lt;em&gt;indirectees&lt;/em&gt; (&lt;tt class="docutils literal"&gt;stg_IND_info&lt;/tt&gt;). Under what circumstances would a queue object be stubbed out with an indirection? As it turns out, this occurs when we &lt;em&gt;delete&lt;/em&gt; an item from the linked list. This is quite nice, because on a singly-linked list, we don't have an easy way to delete items unless we also have a pointer to the previous item. With this scheme, we just overwrite out the current item with an indirection, to be cleaned up next GC. (This, by the way, is why we can't just chain up the TSOs directly, without the extra linked list nodes. [1])&lt;/p&gt;
&lt;p&gt;When we find some other threads, we immediately run them, so that the MVar never becomes empty:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
// There are putMVar(s) waiting... wake up the first thread on the queue

tso = StgMVarTSOQueue_tso(q);
StgMVar_head(mvar) = StgMVarTSOQueue_link(q);
if (StgMVar_head(mvar) == stg_END_TSO_QUEUE_closure) {
    StgMVar_tail(mvar) = stg_END_TSO_QUEUE_closure;
}

ASSERT(StgTSO_why_blocked(tso) == BlockedOnMVar::I16); // note: I16 means this is a 16-bit integer
ASSERT(StgTSO_block_info(tso) == mvar);

// actually perform the putMVar for the thread that we just woke up
W_ stack;
stack = StgTSO_stackobj(tso);
PerformPut(stack, StgMVar_value(mvar));
&lt;/pre&gt;
&lt;p&gt;There is one detail here: &lt;tt class="docutils literal"&gt;PerformPut&lt;/tt&gt; doesn’t actually run the thread, it just looks at the thread’s stack to figure out what it was &lt;em&gt;going&lt;/em&gt; to put. Once the MVar is put, we then wake up the thread, so it can go on its merry way:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
// indicate that the MVar operation has now completed.
StgTSO__link(tso) = stg_END_TSO_QUEUE_closure;

// no need to mark the TSO dirty, we have only written END_TSO_QUEUE.

ccall tryWakeupThread(MyCapability() &amp;quot;ptr&amp;quot;, tso);

unlockClosure(mvar, stg_MVAR_DIRTY_info);
return (val);
&lt;/pre&gt;
&lt;p&gt;To sum up, when you &lt;tt class="docutils literal"&gt;takeMVar&lt;/tt&gt;, you pay the costs of:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;one spinlock,&lt;/li&gt;
&lt;li&gt;on order of several dozen memory operations (write barriers, queue twiddling), and&lt;/li&gt;
&lt;li&gt;when the MVar is empty, a (small) heap allocation and stack write.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Adam and I puzzled about this a bit, and then realized the reason why the number of cycles was so large: our numbers are for &lt;em&gt;roundtrips&lt;/em&gt;, and even with such lightweight synchronization (and lack of syscalls), you still have to go through the scheduler when all is said and done, and that blows up the number of cycles.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;[1] It wasn’t always this way, see:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
commit f4692220c7cbdadaa633f50eb2b30b59edb30183
Author: Simon Marlow &amp;lt;marlowsd&amp;#64;gmail.com&amp;gt;
Date:   Thu Apr 1 09:16:05 2010 +0000

    Change the representation of the MVar blocked queue

    The list of threads blocked on an MVar is now represented as a list of
    separately allocated objects rather than being linked through the TSOs
    themselves.  This lets us remove a TSO from the list in O(1) time
    rather than O(n) time, by marking the list object.  Removing this
    linear component fixes some pathalogical performance cases where many
    threads were blocked on an MVar and became unreachable simultaneously
    (nofib/smp/threads007), or when sending an asynchronous exception to a
    TSO in a long list of thread blocked on an MVar.

    MVar performance has actually improved by a few percent as a result of
    this change, slightly to my surprise.

    This is the final cleanup in the sequence, which let me remove the old
    way of waking up threads (unblockOne(), MSG_WAKEUP) in favour of the
    new way (tryWakeupThread and MSG_TRY_WAKEUP, which is idempotent).  It
    is now the case that only the Capability that owns a TSO may modify
    its state (well, almost), and this simplifies various things.  More of
    the RTS is based on message-passing between Capabilities now.
&lt;/pre&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/nmmQFXNcDMg" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2013/05/anatomy-of-an-mvar-operation/#comments" thr:count="0" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2013/05/anatomy-of-an-mvar-operation/feed/atom/" thr:count="0" />
		<thr:total>0</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2013/05/anatomy-of-an-mvar-operation/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[HotOS &#8220;Unconference&#8221; report:Verifying Systems]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/2-L88v99rEY/" />
		<id>http://blog.ezyang.com/?p=8415</id>
		<!-- ezyang: omitted update time -->
		<published>2013-05-14T04:58:40Z</published>
		<category scheme="http://blog.ezyang.com" term="Computer Science" />		<summary type="html"><![CDATA[Ariel Rabkin has some code he'd like to verify, and at this year’s HotOS he appealed to participants of one “unconference” (informal breakout sessions to discuss various topics) to help him figure out what was really going on as far as formal verification went. He had three questions: &#34;What can we verify? What is impossible [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2013/05/hotos-unconference-reportverifying-systems/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;&lt;a class="reference external" href="http://www.eecs.berkeley.edu/~asrabkin/"&gt;Ariel Rabkin&lt;/a&gt; has some code he'd like to verify, and at this year’s HotOS he appealed to participants of one “unconference” (informal breakout sessions to discuss various topics) to help him figure out what was really going on as far as formal verification went.&lt;/p&gt;
&lt;p&gt;He had three questions: &amp;quot;What can we verify? What is impossible to verify? How can we tell that the verification is correct?&amp;quot; They seemed like impossibly large questions, so we drilled in a little bit, and found that Ariel had the very understandable question, &amp;quot;Say I have some C++ code implementing a subtle network protocol, and I'd like to prove that the protocol doesn't deadlock; how do I do that?&amp;quot;&lt;/p&gt;
&lt;p&gt;I wish the formal verification community had a good answer to a question like this, but unfortunately we don't. The largest verification projects include things like verified kernels, which are written in fully specified subsets of C; which assume the translation performed by the compiler is correct, formalize C in a theorem prover, and then verify there. This is the &amp;quot;principled approach&amp;quot;. It's just not feasible to take C or C++ in its entirety and try to formalize it; it's too complicated and too ill-specified. The easiest thing to do is formalize a small fragment of your algorithm and then make a hand-wavy argument that your implementation is adequate.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://users.soe.ucsc.edu/~abadi/home.html"&gt;Martin Abadi&lt;/a&gt; remarked that before you embark on a verification project, you have to figure out where you'll get the most bang for your buck. Most of the time, a formalization won't get you &amp;quot;full correctness&amp;quot;; the &amp;quot;electrons may be faulty&amp;quot;, as the case may be. But even a flawed verification forces you to state your assumptions explicitly, which is a good thing.&lt;/p&gt;
&lt;p&gt;We then circled around to the subject of, well, what can be verified. Until the 90s, the formal verification community limited itself to only complete and sound analyses—and failed. The relaxation of this restriction lead to a renaissance of formal verification work. We talked about who was using formal verification, and the usual suspects showed up: safety critical software, cache coherence protocols (but one participant remarked that this was only a flash in the pan, as far as applications goes—he asserted that these companies are likely not going to use these methods any longer in the future), etc. Safety critical software is also likely to use coprocessors (since hardware failure is a very real issue), but &lt;a class="reference external" href="http://www.cse.unsw.edu.au/~gernot/"&gt;Gernot Heiser&lt;/a&gt; noted that these folks are trying to get away from physical separation: it is expensive in terms of expense, weight and energy. Luckily, the costs of verification, as he recounted, are within a factor of two of normal industrial assurance, and half the cost of military assurance (though, he cautioned that this was for a specific project, and for a specific size of code.) He also remarked that as far as changes to code requiring changes to the proofs, the changes in the proofs seemed to be linear in the complexity (conceptual or implementation-wise) of the change, which is a good sign!&lt;/p&gt;
&lt;p&gt;Well, supposing that you decide that you actually want to verify your software, how do you go about doing it? Unfortunately, it takes a completely different set of skills to build verified software versus normal software. Everyone agreed, &amp;quot;Yes, you need to hire a formal methods guy&amp;quot; if you're going to make any progress. But that's just not enough. The formal methods guy has to talk to the systems guy. Heiser recounted a very good experience hiring a formal methods person who was able to communicate with the other systems researchers working on the project; without this line of communication, he said, the project likely would have failed. And he mentioned another project, which had three times as much funding, but didn't accomplish nearly as much their team had. (Names not mentioned to protect the guilty.)&lt;/p&gt;
&lt;p&gt;In the end, it seemed that we didn’t manage to give Ari a quite satisfactory answer. As one participant said, “You’ll probably learn the most by just sitting down and trying to formalize the thing you are interested in.” This is probably true, though I fear most will be scared off by the realization of how much work it actually takes to formalize software.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;Hey guys, I’m &lt;a class="reference external" href="http://ezyang.tumblr.com"&gt;liveblogging HotOS at my research Tumblr&lt;/a&gt;. The posts there are likely to be more fragmented than this, but if people are interested in any particular topics I can inflate them into full posts.&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/2-L88v99rEY" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2013/05/hotos-unconference-reportverifying-systems/#comments" thr:count="2" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2013/05/hotos-unconference-reportverifying-systems/feed/atom/" thr:count="2" />
		<thr:total>2</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2013/05/hotos-unconference-reportverifying-systems/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Category theory for loop optimizations]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/BHnNhQ8hD8s/" />
		<id>http://blog.ezyang.com/?p=8378</id>
		<!-- ezyang: omitted update time -->
		<published>2013-05-12T04:25:40Z</published>
		<category scheme="http://blog.ezyang.com" term="Compilers" />		<summary type="html"><![CDATA[Christopher de Sa and I have been working on a category theoretic approach to optimizing MapReduce-like pipelines. Actually, we didn’t start with any category theory—we were originally trying to impose some structure on some of the existing loop optimizations that the Delite compiler performed, and along the way, we rediscovered the rich relationship between category [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2013/05/category-theory-for-loop-optimizations/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;Christopher de Sa and I have been working on a category theoretic approach to optimizing MapReduce-like pipelines. Actually, we didn’t start with any category theory—we were originally trying to impose some structure on some of the existing loop optimizations that the &lt;a class="reference external" href="http://stanford-ppl.github.io/Delite/"&gt;Delite compiler&lt;/a&gt; performed, and along the way, we rediscovered the rich relationship between category theory and loop optimization.&lt;/p&gt;
&lt;p&gt;On the one hand, I think the approach is pretty cool; but on the other hand, there’s a lot of prior work in the area, and it’s tough to figure out where one stands on the research landscape.  As John Mitchell remarked to me when I was discussing the idea with him, “Loop optimization, can’t you just solve it using a table lookup?” We draw a lot of inspiration from existing work, especially the &lt;em&gt;program calculation&lt;/em&gt; literature pioneered by Bird, Meertens, Malcom, Meijer and others in the early 90s. The purpose of this blog post is to air out some of the ideas we’ve worked out and get some feedback from you, gentle reader.&lt;/p&gt;
&lt;p&gt;There are a few ways to think about what we are trying to do:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;We would like to &lt;em&gt;implement&lt;/em&gt; a calculational-based optimizer, targeting a real project (Delite) where the application of loop optimizations can have drastic impacts on the performance of a task (other systems which have had similar goals include &lt;a class="reference external" href="http://takeichi.ipl-lab.org/yicho/doc/Yicho.html"&gt;Yicho&lt;/a&gt;, &lt;a class="reference external" href="http://lecture.ecc.u-tokyo.ac.jp/~uonoue/hylo/"&gt;HYLO&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;We want to venture where theorists do not normally tread. For example, there are many “boring” functors (e.g. arrays) which have important performance properties. While they may be isomorphic to an appropriately defined algebraic data type, we argue that in a calculational optimizer, we want to &lt;em&gt;distinguish&lt;/em&gt; between these different representations.  Similarly, many functions which are not natural transformations &lt;em&gt;per se&lt;/em&gt; can be made to be natural transformations by way of partial application. For example, &lt;tt class="docutils literal"&gt;filter p xs&lt;/tt&gt; is a natural transformation when &lt;tt class="docutils literal"&gt;map p xs&lt;/tt&gt; is incorporated as part of the definition of the function (the resulting function can be applied on any list, not just the original &lt;tt class="docutils literal"&gt;xs&lt;/tt&gt;). The resulting natural transformation is &lt;em&gt;ugly&lt;/em&gt; but &lt;em&gt;useful&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;For stock optimizers (e.g. Haskell), some calculational optimizations can be supported by the use of &lt;em&gt;rewrite rules&lt;/em&gt;. While rewrite rules are a very powerful mechanism, they can only describe “always on” optimizations; e.g. for deforestation, one always wants to eliminate as many intermediate data structures as possible. In many of the applications we want to optimize, the best performance can only be achieved by &lt;em&gt;adding&lt;/em&gt; intermediate data structures: now we have a space of possible programs and rewrite rules are woefully inadequate for specifying &lt;em&gt;which&lt;/em&gt; program is the best. What we’d like to do is use category theory to give an account for rewrite rules &lt;em&gt;with structure&lt;/em&gt;, and use domain specific knowledge to pick the best programs.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I’d like to illustrate some of these ideas by way of an example. Here is some sample code, written in Delite, which calculates an iteration of (1-dimensional) k-means clustering:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
(0 :: numClusters, *) { j =&amp;gt;
  val weightedPoints = sumRowsIf(0,m){i =&amp;gt; c(i) == j}{i =&amp;gt; x(i)};
  val points = c.count(_ == j);
  val d = if (points == 0) 1 else points
  weightedPoints / d
}
&lt;/pre&gt;
&lt;p&gt;You can read it as follows: we are computing a result array containing the position of each cluster, and the outermost block is looping over the clusters by index variable &lt;tt class="docutils literal"&gt;j&lt;/tt&gt;. To compute the position of a cluster, we have to get all of the points in &lt;tt class="docutils literal"&gt;x&lt;/tt&gt; which were assigned to cluster &lt;tt class="docutils literal"&gt;j&lt;/tt&gt; (that’s the &lt;tt class="docutils literal"&gt;c(i) == j&lt;/tt&gt; condition) and sum them together, finally dividing by the sum by the number of points in the cluster to get the true location.&lt;/p&gt;
&lt;p&gt;The big problem with this code is that it iterates over the entire dataset &lt;em&gt;numClusters&lt;/em&gt; times, when we’d like to only ever do one iteration.  The optimized version which does just that looks like this:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
val allWP = hashreduce(0,m)(i =&amp;gt; c(i), i =&amp;gt; x(i), _ + _)
val allP = hashreduce(0,m)(i =&amp;gt; c(i), i =&amp;gt; 1, _ + _)
(0::numClusters, *) { j =&amp;gt;
    val weightedPoints = allWP(j);
    val points = allP(j);
    val d = if (points == 0) 1 else points
    return weightedpoints / d
}
&lt;/pre&gt;
&lt;p&gt;That is to say, we have to precompute the weighted points and the point count (note the two hashreduces can and should be fused together) before generating the new coordinates for each of the clusters: generating &lt;em&gt;more&lt;/em&gt; intermediate data structures is a win, in this case.&lt;/p&gt;
&lt;p&gt;Let us now calculate our way to the optimized version of the program.  First, however, we have to define some functors:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;D_i[X]&lt;/tt&gt; is an array of &lt;tt class="docutils literal"&gt;X&lt;/tt&gt; of a size specified by &lt;tt class="docutils literal"&gt;i&lt;/tt&gt; (concretely, we’ll use &lt;tt class="docutils literal"&gt;D_i&lt;/tt&gt; for arrays of size &lt;tt class="docutils literal"&gt;numPoints&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;D_j&lt;/tt&gt; for arrays of size &lt;tt class="docutils literal"&gt;numClusters&lt;/tt&gt;). This family of functors is also known as the &lt;a class="reference external" href="http://en.wikipedia.org/wiki/Diagonal_functor"&gt;diagonal functor&lt;/a&gt;, generalized for arbitrary size products. We also will rely on the fact that &lt;tt class="docutils literal"&gt;D&lt;/tt&gt; is &lt;a class="reference external" href="http://stackoverflow.com/questions/12963733/writing-cojoin-or-cobind-for-n-dimensional-grid-type"&gt;representable&lt;/a&gt;, that is to say &lt;tt class="docutils literal"&gt;D_i[X] = Loc_D_i &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; X&lt;/tt&gt; for some type &lt;tt class="docutils literal"&gt;Loc_D_i&lt;/tt&gt; (in this case, it is the index set &lt;tt class="docutils literal"&gt;{0 .. i}&lt;/tt&gt;.&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;List[X]&lt;/tt&gt; is a standard list of &lt;tt class="docutils literal"&gt;X&lt;/tt&gt;. It is the initial algebra for the functor &lt;tt class="docutils literal"&gt;F[R] = 1 + X * R&lt;/tt&gt;. Any &lt;tt class="docutils literal"&gt;D_i&lt;/tt&gt; can be embedded in &lt;tt class="docutils literal"&gt;List&lt;/tt&gt;; we will do such conversions implicitly (note that the reverse is not true.)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There are a number of functions, which we will describe below:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;tabulate&lt;/tt&gt; witnesses one direction of the isomorphism between &lt;tt class="docutils literal"&gt;Loc_D_i &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; X&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;D_i[X]&lt;/tt&gt;, since &lt;tt class="docutils literal"&gt;D_i&lt;/tt&gt; is representable. The other direction is &lt;tt class="docutils literal"&gt;index&lt;/tt&gt;, which takes &lt;tt class="docutils literal"&gt;D_i[X]&lt;/tt&gt; and a &lt;tt class="docutils literal"&gt;Loc_D_i&lt;/tt&gt; and returns an &lt;tt class="docutils literal"&gt;X&lt;/tt&gt;.&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;fold&lt;/tt&gt; is the unique function determined by the initial algebra on &lt;tt class="docutils literal"&gt;List&lt;/tt&gt;. Additionally, suppose that we have a function &lt;tt class="docutils literal"&gt;*&lt;/tt&gt; which combines two algebras by taking their cartesian product,&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;bucket&lt;/tt&gt; is a natural transformation which takes a &lt;tt class="docutils literal"&gt;D_i[X]&lt;/tt&gt; and buckets it into &lt;tt class="docutils literal"&gt;D_j[List[X]]&lt;/tt&gt; based on some function which assigns elements in &lt;tt class="docutils literal"&gt;D_i&lt;/tt&gt; to slots in &lt;tt class="docutils literal"&gt;D_j&lt;/tt&gt;. This is an example of a natural transformation that is not a natural transformation until it is partially applied: if we compute &lt;tt class="docutils literal"&gt;D_i[Loc_D_j]&lt;/tt&gt;, then we can create a natural transformation that doesn’t ever look at &lt;tt class="docutils literal"&gt;X&lt;/tt&gt;; it simply “knows” where each slot of &lt;tt class="docutils literal"&gt;D_i&lt;/tt&gt; needs to go in the resulting structure.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Let us now rewrite the loop in more functional terms:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
tabulate (\j -&amp;gt;
  let weightedPoints = fold plus . filter (\i -&amp;gt; c[i] == j) $ x
      points = fold inc . filter (\i -&amp;gt; c[i] == j) $ x
  in divide (weightedPoints, points)
)
&lt;/pre&gt;
&lt;p&gt;(Where &lt;tt class="docutils literal"&gt;divide&lt;/tt&gt; is just a function which divides its arguments but checks that the divisor is not zero.) Eliminating some common sub-expressions and fusing the two folds together, we get:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
tabulate (\j -&amp;gt; divide . fold (plus * inc) . filter (\i -&amp;gt; c[i] == j) $ x)
&lt;/pre&gt;
&lt;p&gt;At this point, it is still not at all clear that there are any rewrites we can carry out: the &lt;tt class="docutils literal"&gt;filter&lt;/tt&gt; is causing problems for us. However, because filter is testing on equality, we can rewrite it in a different way:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
tabulate (\j -&amp;gt; divide . fold (plus * inc) . index j . bucket c $ x)
&lt;/pre&gt;
&lt;p&gt;What is happening here? Rather than directly filtering for just items in cluster &lt;tt class="docutils literal"&gt;j&lt;/tt&gt;, we can instead view this as &lt;em&gt;bucketing&lt;/em&gt; &lt;tt class="docutils literal"&gt;x&lt;/tt&gt; on &lt;tt class="docutils literal"&gt;c&lt;/tt&gt; and then indexing out the single bucket we care about. This shift in perspective is key to the whole optimization.&lt;/p&gt;
&lt;p&gt;Now we can apply the fundamental rule of natural transformations. Let &lt;tt class="docutils literal"&gt;phi = index j&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;f = divide . fold (plus * inc)&lt;/tt&gt;, then we can push &lt;tt class="docutils literal"&gt;f&lt;/tt&gt; to the other side of &lt;tt class="docutils literal"&gt;phi&lt;/tt&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
tabulate (\j -&amp;gt; index j . fmap (divide . fold (plus * inc)) . bucket c $ x)
&lt;/pre&gt;
&lt;p&gt;Now we can eliminate &lt;tt class="docutils literal"&gt;tabulate&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;index&lt;/tt&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
fmap (divide . fold (plus * inc)) . bucket c $ x
&lt;/pre&gt;
&lt;p&gt;Finally, because we know how to efficiently implement &lt;tt class="docutils literal"&gt;fmap (fold f) . bucket c&lt;/tt&gt; (as a &lt;tt class="docutils literal"&gt;hashreduce&lt;/tt&gt;), we split up the &lt;tt class="docutils literal"&gt;fmap&lt;/tt&gt; and join the fold and bucket:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
fmap divide . hashreduce (plus * inc) c $ x
&lt;/pre&gt;
&lt;p&gt;And we have achieved our fully optimized program.&lt;/p&gt;
&lt;p&gt;All of this is research in progress, and there are lots of open questions which we have not resolved. Still, I hope this post has given you a flavor of the approach we are advocating. I am quite curious in your comments, from “That’s cool!” to “This was all done 20 years ago by X system.” Have at it!&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/BHnNhQ8hD8s" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2013/05/category-theory-for-loop-optimizations/#comments" thr:count="3" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2013/05/category-theory-for-loop-optimizations/feed/atom/" thr:count="3" />
		<thr:total>3</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2013/05/category-theory-for-loop-optimizations/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[The Difference between Recursion &amp; Induction]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/_U33n9IZwZ0/" />
		<id>http://blog.ezyang.com/?p=7437</id>
		<!-- ezyang: omitted update time -->
		<published>2013-04-27T07:30:17Z</published>
		<category scheme="http://blog.ezyang.com" term="Math" />		<summary type="html"><![CDATA[Recursion and induction are closely related. When you were first taught recursion in an introductory computer science class, you were probably told to use induction to prove that your recursive algorithm was correct. (For the purposes of this post, let us exclude hairy recursive functions like the one in the Collatz conjecture which do not [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2013/04/the-difference-between-recursion-induction/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;Recursion and induction are closely related. When you were first taught recursion in an introductory computer science class, you were probably told to use induction to prove that your recursive algorithm was correct. (For the purposes of this post, let us exclude hairy recursive functions like the one in the &lt;a class="reference external" href="http://en.wikipedia.org/wiki/Collatz_conjecture"&gt;Collatz conjecture&lt;/a&gt; which do not obviously terminate.) Induction suspiciously resembles recursion: the similarity comes from the fact that the inductive hypothesis looks a bit like the result of a “recursive call” to the theorem you are proving. If an ordinary recursive computation returns plain old values, you might wonder if an “induction computation” returns proof terms (which, by the Curry-Howard correspondence, could be thought of as a value).&lt;/p&gt;
&lt;p&gt;As it turns out, however, when you look at recursion and induction categorically, they are not equivalent! Intuitively, the difference lies in the fact that when you are performing induction, the data type you are performing induction over (e.g. the numbers) appears at the &lt;em&gt;type&lt;/em&gt; level, not the term level.  In the words of a category theorist, both recursion and induction have associated initial algebras, but the carrier sets and endofunctors are different.  In this blog post, I hope to elucidate precisely what the difference between recursion and induction is.  Unfortunately, I need to assume &lt;em&gt;some&lt;/em&gt; familiarity with initial algebras: if you don’t know what the relationship between a fold and an initial algebra is, check out this &lt;a class="reference external" href="http://blog.ezyang.com/2012/10/duality-for-haskellers/"&gt;derivation of lists in initial algebra form&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;When dealing with generalized abstract nonsense, the most important first step is to use a concrete example! So let us go with the simplest nontrivial data type one can gin up: the natural numbers (our examples are written in both Coq and Haskell, when possible):&lt;/p&gt;
&lt;pre class="literal-block"&gt;
Inductive nat : Set := (* defined in standard library *)
  | 0 : nat
  | S : nat -&amp;gt; nat.

data Nat = Z | S Nat
&lt;/pre&gt;
&lt;p&gt;Natural numbers are a pretty good example: even the &lt;a class="reference external" href="http://en.wikipedia.org/wiki/F-algebra"&gt;Wikipedia article on F-algebras&lt;/a&gt; uses them. To recap, an F-algebra (or sometimes simply “algebra”) has three components: an (endo)functor &lt;tt class="docutils literal"&gt;f&lt;/tt&gt;, a type &lt;tt class="docutils literal"&gt;a&lt;/tt&gt; and a reduction function &lt;tt class="docutils literal"&gt;f a &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; a&lt;/tt&gt;. For simple recursion over natural numbers, we need to define a functor &lt;tt class="docutils literal"&gt;NatF&lt;/tt&gt; which “generates” the natural numbers; then our type &lt;tt class="docutils literal"&gt;a&lt;/tt&gt; is &lt;tt class="docutils literal"&gt;Nat&lt;/tt&gt; and the reduction function is type &lt;tt class="docutils literal"&gt;NatF Nat &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; Nat&lt;/tt&gt;.  The functor is defined as follows:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
Inductive NatF (x : Set) : Set :=
  | F0 : NatF x.
  | FS : x -&amp;gt; NatF x.

data NatF x = FZ | FS x
&lt;/pre&gt;
&lt;p&gt;Essentially, take the original definition but replace any recursive occurrence of the type with a polymorphic variable. As an exercise, show that &lt;tt class="docutils literal"&gt;NatF Nat &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; Nat&lt;/tt&gt; exists: it is the (co)product of &lt;tt class="docutils literal"&gt;() &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; Nat&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;Nat &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; Nat&lt;/tt&gt;.  The initiality of this algebra implies that any function of type &lt;tt class="docutils literal"&gt;NatF x &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; x&lt;/tt&gt; (for some arbitrary type &lt;tt class="docutils literal"&gt;x&lt;/tt&gt;) can be used in a fold &lt;tt class="docutils literal"&gt;Nat &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; x&lt;/tt&gt;: this fold is the homomorphism from the initial algebra (&lt;tt class="docutils literal"&gt;NatF Nat &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; Nat&lt;/tt&gt;) to another algebra (&lt;tt class="docutils literal"&gt;NatF x &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; x&lt;/tt&gt;). The take-away point is that the initial algebra of natural numbers consists of an endofunctor over &lt;strong&gt;sets&lt;/strong&gt;.&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/nat-recursion.png" src="/img/nat-recursion.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Let’s look at the F-algebra for induction now.  As a first try, let’s try to use the same F-algebra and see if an appropriate homomorphism exists with the “type of induction”. (We can’t write this in Haskell, so now the examples will be Coq only.)  Suppose we are trying to prove some proposition &lt;tt class="docutils literal"&gt;P : nat &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; Prop&lt;/tt&gt; holds for all natural numbers; then the type of the final proof term must be &lt;tt class="docutils literal"&gt;forall n : nat, P n&lt;/tt&gt;.  We can now write out the morphism of the algebra: &lt;tt class="docutils literal"&gt;NatF (forall n : nat, P n) &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; forall n : nat, P n&lt;/tt&gt;.  But this “inductive principle” is both nonsense and not true:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
Hint Constructors nat NatF.
Goal ~ (forall (P : nat -&amp;gt; Prop), (NatF (forall n : nat, P n) -&amp;gt; forall n : nat, P n)).
  intro H; specialize (H (fun n =&amp;gt; False)); auto.
Qed.
&lt;/pre&gt;
&lt;p&gt;(Side note: you might say that this proof fails because I’ve provided a predicate which is false over all natural numbers. But induction still “works” even when the predicate you’re trying to prove is false: you should fail when trying to provide the base case or inductive hypothesis!)&lt;/p&gt;
&lt;p&gt;We step back and now wonder, “So, what’s the right algebra?” It should be pretty clear that our endofunctor is wrong. Fortunately, we can get a clue for what the right endofunctor might be by inspecting the type the induction principle for natural numbers:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
(* Check nat_ind. *)
nat_ind : forall P : nat -&amp;gt; Prop,
  P 0 -&amp;gt; (forall n : nat, P n -&amp;gt; P (S n)) -&amp;gt; forall n : nat, P n
&lt;/pre&gt;
&lt;p&gt;&lt;tt class="docutils literal"&gt;P 0&lt;/tt&gt; is the type of the base case, and &lt;tt class="docutils literal"&gt;forall n : nat, P n &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; P (S n)&lt;/tt&gt; is the type of the inductive case. In much the same way that we defined &lt;tt class="docutils literal"&gt;NatF nat &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; nat&lt;/tt&gt; for natural numbers, which was the combination of &lt;tt class="docutils literal"&gt;zero : unit &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; nat&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;succ : nat &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; nat&lt;/tt&gt;, we need to define a single function which combines the base case and the inductive case.  This seems tough: the result types are not the same. But dependent types come to the rescue: the type we are looking for is:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
fun (P : nat -&amp;gt; Prop) =&amp;gt; forall n : nat, match n with 0 =&amp;gt; True | S n' =&amp;gt; P n' end -&amp;gt; P n
&lt;/pre&gt;
&lt;p&gt;You can read this type as follows:  I will give you a proof object of type &lt;tt class="docutils literal"&gt;P n&lt;/tt&gt; for any &lt;tt class="docutils literal"&gt;n&lt;/tt&gt;.  If &lt;tt class="docutils literal"&gt;n&lt;/tt&gt; is 0, I will give you this proof object with no further help (&lt;tt class="docutils literal"&gt;True &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; P 0&lt;/tt&gt;). However, if &lt;tt class="docutils literal"&gt;n&lt;/tt&gt; is &lt;tt class="docutils literal"&gt;S n'&lt;/tt&gt;, I will require you to furnish me with &lt;tt class="docutils literal"&gt;P n'&lt;/tt&gt; (&lt;tt class="docutils literal"&gt;P n' &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; P (S n')&lt;/tt&gt;).&lt;/p&gt;
&lt;p&gt;We’re getting close. If this is the morphism of an initial algebra, then the functor &lt;tt class="docutils literal"&gt;IndF&lt;/tt&gt; must be:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
fun (P : nat -&amp;gt; Prop) =&amp;gt; forall n : nat, match n with 0 =&amp;gt; True | S n' =&amp;gt; P n' end
&lt;/pre&gt;
&lt;p&gt;What category is this a functor over?  Unfortunately, neither this post nor my brain has the space to give a rigorous treatment, but roughly the category can be thought of as nat-indexed propositions. Objects of this category are of the form &lt;tt class="docutils literal"&gt;forall n : nat, P n&lt;/tt&gt;, morphisms of the category are of the form &lt;tt class="docutils literal"&gt;forall n : nat, P n &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; P' n&lt;/tt&gt;. [1] As an exercise, show that identity and composition exist and obey the appropriate laws.&lt;/p&gt;
&lt;p&gt;Something amazing is about to happen. We have defined our functor, and we are now in search of the initial algebra. As was the case for natural numbers, the initial algebra is defined by the least fixed point over the functor:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
Fixpoint P (n : nat) : Prop :=
  match n with 0 =&amp;gt; True | S n' =&amp;gt; P n' end.
&lt;/pre&gt;
&lt;p&gt;But this is just &lt;tt class="docutils literal"&gt;True&lt;/tt&gt;!&lt;/p&gt;
&lt;pre class="literal-block"&gt;
Hint Unfold P.
Goal forall n, P n = True.
  induction n; auto.
Qed.
&lt;/pre&gt;
&lt;p&gt;Drawing out our diagram:&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/nat-ind.png" src="/img/nat-ind.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The algebras of our category (downward arrows) correspond to inductive arguments. Because our morphisms take the form of &lt;tt class="docutils literal"&gt;forall n, P n &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; P' n&lt;/tt&gt;, one cannot trivially conclude &lt;tt class="docutils literal"&gt;forall n, P' n&lt;/tt&gt; simply given &lt;tt class="docutils literal"&gt;forall n, P n&lt;/tt&gt;; however, the presence of the initial algebra means that &lt;tt class="docutils literal"&gt;True &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; forall n, P n&lt;/tt&gt; whenever we have an algebra &lt;tt class="docutils literal"&gt;forall n, IndF n &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; P n&lt;/tt&gt;. Stunning! (As a side note, Lambek’s lemma states that &lt;tt class="docutils literal"&gt;Mu P&lt;/tt&gt; is isomorphic to &lt;tt class="docutils literal"&gt;P (Mu P)&lt;/tt&gt;, so the initial algebra is in fact &lt;em&gt;really really&lt;/em&gt; trivial.)&lt;/p&gt;
&lt;p&gt;In conclusion:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;strong&gt;Recursion&lt;/strong&gt; over the natural numbers involves F-algebras with the functor &lt;tt class="docutils literal"&gt;unit + X&lt;/tt&gt; over the category of &lt;strong&gt;Sets&lt;/strong&gt;. The least fixed point of this functor is the natural numbers, and the morphism induced by the initial algebra corresponds to a &lt;em&gt;fold&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Induction&lt;/strong&gt; over the natural numbers involves F-algebras with the functor &lt;tt class="docutils literal"&gt;fun n =&amp;gt; match n with 0 =&amp;gt; True | S n' =&amp;gt; P n'&lt;/tt&gt; over the category of nat-indexed propositions. The least fixed point of this functor is &lt;tt class="docutils literal"&gt;True&lt;/tt&gt;, and the morphism induced by the initial algebra &lt;em&gt;establishes the truth of the proposition being inductively proven&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So, the next time someone tells asks you what the difference between induction and recursion is, tell them: &lt;em&gt;Induction is just the unique homomorphism induced by an initial algebra over indexed propositions, what’s the problem?&lt;/em&gt;&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;Acknowledgements go to Conor McBride, who explained this shindig to me over ICFP. I promised to blog about it, but forgot, and ended up having to rederive it all over again.&lt;/p&gt;
&lt;p&gt;[1] Another plausible formulation of morphisms goes &lt;tt class="docutils literal"&gt;(forall n : nat, P n) &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; (forall n : nat, P' n)&lt;/tt&gt;. However, morphisms in this category are too &lt;em&gt;strong&lt;/em&gt;: they require you to go and prove the result for all &lt;em&gt;n&lt;/em&gt;... which you would do with induction, which misses the point. Plus, this category is a subcategory of the ordinary category of propositions.&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/_U33n9IZwZ0" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2013/04/the-difference-between-recursion-induction/#comments" thr:count="3" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2013/04/the-difference-between-recursion-induction/feed/atom/" thr:count="3" />
		<thr:total>3</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2013/04/the-difference-between-recursion-induction/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Kindle is not good for textbooks]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/0NJ1FoIIiRY/" />
		<id>http://blog.ezyang.com/?p=8279</id>
		<!-- ezyang: omitted update time -->
		<published>2013-04-15T23:52:20Z</published>
		<category scheme="http://blog.ezyang.com" term="Toolbox" />		<summary type="html"><![CDATA[Having attempted to read a few textbooks on my Kindle, I have solemnly concluded that the Kindle is in fact a terrible device for reading textbooks. The fundamental problem is that, due to technological limitations, the Kindle is optimized for sequential reading. This can be seen in many aspects: Flipping a page in the Kindle [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2013/04/kindle-is-not-good-for-textbooks/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;Having attempted to read a few textbooks on my Kindle, I have solemnly concluded that the Kindle is in fact a terrible device for reading textbooks.  The fundamental problem is that, due to technological limitations, the Kindle is optimized for &lt;em&gt;sequential&lt;/em&gt; reading.  This can be seen in many aspects:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Flipping a page in the Kindle is not instantaneous (I don't have a good setup to time how long the screen refresh takes, but there is definitely a perceptible lag before when you swipe, and when the Kindle successfully redraws the screen—and it’s even worse if you try to flip backwards).&lt;/li&gt;
&lt;li&gt;Rapidly flipping through pages in order to scan for a visual feature compounds the delay problem.&lt;/li&gt;
&lt;li&gt;There is no way to take the “finger” approach to random access (i.e. wedge your finger between two pages to rapidly switch between them); jumping between bookmarks requires &lt;em&gt;four&lt;/em&gt; presses with the current Kindle interface!&lt;/li&gt;
&lt;li&gt;The screen size of the Kindle is dramatically smaller than that of an average textbook, which reduces the amount of information content that can be placed on one screen and further exacerbates slow page turns.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A textbook cannot be read as a light novel.  So, while the Kindle offers the tantalizing possibility of carrying a stack of textbooks with you everywhere, in fact, you’re better off getting the actual dead tree version if you’re planning on doing some serious studying from it.  That is not to say textbook ebooks are not useful; in fact, having a searchable textbook on your laptop is seriously awesome—but this is when you’re using the textbook as a reference material, and &lt;em&gt;not&lt;/em&gt; when you’re trying to actually learn the material.&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/0NJ1FoIIiRY" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2013/04/kindle-is-not-good-for-textbooks/#comments" thr:count="7" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2013/04/kindle-is-not-good-for-textbooks/feed/atom/" thr:count="7" />
		<thr:total>7</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2013/04/kindle-is-not-good-for-textbooks/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[A Zerocoin puzzle]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/jabXxayRegE/" />
		<id>http://blog.ezyang.com/?p=8270</id>
		<!-- ezyang: omitted update time -->
		<published>2013-04-11T22:54:02Z</published>
		<category scheme="http://blog.ezyang.com" term="Cryptography" /><category scheme="http://blog.ezyang.com" term="Math" />		<summary type="html"><![CDATA[I very rarely post linkspam, but given that I’ve written on the subject of anonymizing Bitcoins in the past, this link seems relevant: Zerocoin: making Bitcoin anonymous. Their essential innovation is to have a continuously operating mixing pool built into the block chain itself; they pull this off using zero-knowledge proofs. Nifty! Here is a [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2013/04/a-zerocoin-puzzle/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;I very rarely post linkspam, but given that I’ve written on the subject of &lt;a class="reference external" href="http://blog.ezyang.com/2012/07/secure-multiparty-bitcoin-anonymization/"&gt;anonymizing Bitcoins&lt;/a&gt; in the past, this link seems relevant: &lt;a class="reference external" href="http://blog.cryptographyengineering.com/2013/04/zerocoin-making-bitcoin-anonymous.html"&gt;Zerocoin: making Bitcoin anonymous&lt;/a&gt;. Their essential innovation is to have a &lt;em&gt;continuously operating&lt;/em&gt; mixing pool built into the block chain itself; they pull this off using zero-knowledge proofs. Nifty!&lt;/p&gt;
&lt;p&gt;Here is a puzzle for the readers of this blog.  Suppose that I am a user who wants to anonymize some Bitcoins, and I am willing to wait expected time &lt;em&gt;N&lt;/em&gt; before redeeming my Zerocoins. What is the correct probability distribution for me to pick my wait time from?  Furthermore, suppose a population of Zerocoin participants, all of which are using this probability distribution. Furthermore, suppose that each participant has some utility function trading off anonymity and expected wait time (feel free to make assumptions that make the analysis easy). Is this population in Nash equilibrium?&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/jabXxayRegE" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2013/04/a-zerocoin-puzzle/#comments" thr:count="2" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2013/04/a-zerocoin-puzzle/feed/atom/" thr:count="2" />
		<thr:total>2</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2013/04/a-zerocoin-puzzle/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[A classical logic fairy tale]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/Li9N0EpBepI/" />
		<id>http://blog.ezyang.com/?p=8264</id>
		<!-- ezyang: omitted update time -->
		<published>2013-04-07T21:52:08Z</published>
		<category scheme="http://blog.ezyang.com" term="Logic" />		<summary type="html"><![CDATA[(Selinger) Here is a fairy tale: The evil king calls the poor shepherd and gives him these orders. “You must bring me the philosophers stone, or you have to find a way to turn the philosopher’s stone to gold. If you don’t, your head will be taken off tomorrow!” What can the poor shepherd do [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2013/04/a-classical-logic-fairy-tale/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;!--  --&gt;
&lt;blockquote&gt;
(Selinger) Here is a fairy tale: The evil king calls the poor shepherd and gives him these orders. “You must bring me the philosophers stone, or you have to find a way to turn the philosopher’s stone to gold. If you don’t, your head will be taken off tomorrow!” What can the poor shepherd do to save his life?&lt;/blockquote&gt;
&lt;p&gt;Hat tip to &lt;a class="reference external" href="http://www.cs.cmu.edu/~cmartens/"&gt;Chris&lt;/a&gt; for originally telling me a different variant of this story. Unfortunately, this quote from &lt;em&gt;Lectures on the Curry-Howard Isomorphism&lt;/em&gt; was the only reference I could find. What should the shepherd do? Is there something a little odd about this story?&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/Li9N0EpBepI" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2013/04/a-classical-logic-fairy-tale/#comments" thr:count="21" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2013/04/a-classical-logic-fairy-tale/feed/atom/" thr:count="21" />
		<thr:total>21</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2013/04/a-classical-logic-fairy-tale/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[NDSEG]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/QMS0ilWA5Tw/" />
		<id>http://blog.ezyang.com/?p=8248</id>
		<!-- ezyang: omitted update time -->
		<published>2013-04-06T02:40:31Z</published>
		<category scheme="http://blog.ezyang.com" term="Academia" />		<summary type="html"><![CDATA[Humbly presented for your consideration: Exhibit A, an NDSEG essay that did not get accepted; Exhibit B, an NDSEG essay that did get accepted. It’s pretty cool what making a statement more focused can do. (See also Philip Guo’s page on the topic.)]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2013/04/ndseg/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;Humbly presented for your consideration: &lt;a class="reference external" href="http://web.mit.edu/~ezyang/Public/ndseg-goals.pdf"&gt;Exhibit A&lt;/a&gt;, an NDSEG essay that did not get accepted; &lt;a class="reference external" href="http://web.mit.edu/~ezyang/Public/ndseg2012-4.pdf"&gt;Exhibit B&lt;/a&gt;, an NDSEG essay that did get accepted. It’s pretty cool what making a statement more focused can do. (See also &lt;a class="reference external" href="http://www.pgbovine.net/fellowship-tips.htm"&gt;Philip Guo’s page on the topic.&lt;/a&gt;)&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/QMS0ilWA5Tw" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2013/04/ndseg/#comments" thr:count="2" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2013/04/ndseg/feed/atom/" thr:count="2" />
		<thr:total>2</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2013/04/ndseg/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Resource limits for Haskell]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/tNo7j4O86mU/" />
		<id>http://blog.ezyang.com/?p=8239</id>
		<!-- ezyang: omitted update time -->
		<published>2013-04-02T20:36:40Z</published>
		<category scheme="http://blog.ezyang.com" term="Haskell" />		<summary type="html"><![CDATA[Last week, I made my very first submission to ICFP! The topic? An old flame of mine: how to bound space usage of Haskell programs. We describe the first iteration of a resource limits system for Haskell, taking advantage of the key observation that resource limits share semantics and implementation strategy with profiling. We pay [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2013/04/resource-limits-for-haskell/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;Last week, I made my very first submission to ICFP! The topic? An old flame of mine: how to bound space usage of Haskell programs.&lt;/p&gt;
&lt;blockquote&gt;
We describe the first iteration of a resource limits system for
Haskell, taking advantage of the key observation that resource limits share semantics and implementation strategy with profiling. We
pay special attention to the problem of limiting resident memory
usage: we describe a simple implementation technique for carrying
out incremental heap censuses and describe a novel information-flow control solution for handling forcible resource reclamation.
This system is implemented as a set of patches to GHC.&lt;/blockquote&gt;
&lt;p&gt;You can get &lt;a class="reference external" href="http://ezyang.com/papers/ezyang13-rlimits.pdf"&gt;a copy of the submission here.&lt;/a&gt; I've reproduced below the background section on how profiling Haskell works; if this tickles your fancy, check out the rest of the paper!&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;Profiling in Haskell is performed by charging the costs of computation
to the “current cost center.”  A &lt;em&gt;cost center&lt;/em&gt; is an abstract,
programmer-specified entity to which costs can be charged; only one is active
per thread at any given time, and the &lt;em&gt;cost semantics&lt;/em&gt; determines how the current cost center changes as the program
executes.  For example, the &lt;tt class="docutils literal"&gt;scc cc e&lt;/tt&gt; expression (set-cost-center)
modifies the current cost center during evaluation of &lt;tt class="docutils literal"&gt;e&lt;/tt&gt; to be
&lt;tt class="docutils literal"&gt;cc&lt;/tt&gt;.  Cost centers are defined statically at compile time.&lt;/p&gt;
&lt;p&gt;A cost semantics for Haskell was defined by Sansom et al. (1995) Previously,
there had not been a formal account for how to attribute costs in the
presence of lazy evaluation and higher-order functions; this paper
resolved these questions. The two insights of their paper were the
following:  first, they articulated that cost attribution should be
independent of evaluation order. For the sake of understandability,
whether a thunk is evaluated immediately or later should not affect who
is charged for it. Secondly, they observed that there are two ways of
attributing costs for functions, in direct parallel to the difference
between lexical scoping and dynamic scoping.&lt;/p&gt;
&lt;p&gt;The principle of order-independent cost-attribution can be seen by this
program:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
f x = scc &amp;quot;f&amp;quot; (Just (x * x))
g x = let Just y = f x in scc &amp;quot;g&amp;quot; y
&lt;/pre&gt;
&lt;p&gt;When &lt;tt class="docutils literal"&gt;g 4&lt;/tt&gt; is invoked, who is charged the cost of evaluating
&lt;tt class="docutils literal"&gt;x * x&lt;/tt&gt;?  With strict evaluation, it is easy to see that &lt;tt class="docutils literal"&gt;f&lt;/tt&gt;
should be charged, since &lt;tt class="docutils literal"&gt;x * x&lt;/tt&gt; is evaluated immediately inside
the &lt;tt class="docutils literal"&gt;scc&lt;/tt&gt; expression.  Order-independence dictates, then, that
even if the execution of &lt;tt class="docutils literal"&gt;x * x&lt;/tt&gt; is deferred to the inside of
&lt;tt class="docutils literal"&gt;scc &amp;quot;g&amp;quot; y&lt;/tt&gt;, the cost should &lt;em&gt;still&lt;/em&gt; be attributed to &lt;tt class="docutils literal"&gt;f&lt;/tt&gt;.
In general, &lt;tt class="docutils literal"&gt;scc &amp;quot;f&amp;quot; x&lt;/tt&gt; on a variable &lt;tt class="docutils literal"&gt;x&lt;/tt&gt; is a no-op.  In order
to implement such a scheme, the current cost-center at the time of
the allocation of the thunk must be recorded with the thunk and restored
when the thunk is forced.&lt;/p&gt;
&lt;p&gt;The difference between lexical scoping and dynamic scoping for function
cost attribution can be seen in this example:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
f = scc &amp;quot;f&amp;quot; (\x -&amp;gt; x * x)
g = \x -&amp;gt; scc &amp;quot;g&amp;quot; (x * x)
&lt;/pre&gt;
&lt;p&gt;What is the difference between these two functions?  We are in a
situation analogous to the choice for thunks: should the current
cost-center be saved along with the closure, and restored upon
invocation of the function? If the answer is yes, we are using lexical
scoping and the functions are equivalent; if the answer is no, we are
using dynamic scoping and the &lt;tt class="docutils literal"&gt;scc&lt;/tt&gt; in &lt;tt class="docutils literal"&gt;f&lt;/tt&gt; is a no-op.
The choice GHC has currently adopted for &lt;tt class="docutils literal"&gt;scc&lt;/tt&gt; is dynamic scoping.&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/tNo7j4O86mU" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2013/04/resource-limits-for-haskell/#comments" thr:count="8" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2013/04/resource-limits-for-haskell/feed/atom/" thr:count="8" />
		<thr:total>8</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2013/04/resource-limits-for-haskell/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[The single export pattern]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/ul4qargshjs/" />
		<id>http://blog.ezyang.com/?p=8233</id>
		<!-- ezyang: omitted update time -->
		<published>2013-04-01T00:39:41Z</published>
		<category scheme="http://blog.ezyang.com" term="Programming Languages" />		<summary type="html"><![CDATA[From the files of the ECMAScript TC39 proceedings Single export refers to a design pattern where a module identifier is overloaded to also represent a function or type inside the module. As far as I can tell, the term “single export” is not particularly widely used outside the ECMAScript TC39 committee; however, the idea shows [...]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2013/03/the-single-export-pattern/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;div class="center container"&gt;
&lt;em&gt;From the files of the ECMAScript TC39 proceedings&lt;/em&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Single export&lt;/strong&gt; refers to a design pattern where a module identifier is overloaded to also represent a function or type inside the module.  As far as I can tell, the term “single export” is not particularly widely used outside the ECMAScript TC39 committee; however, the idea shows up in other contexts, so I’m hoping to popularize this particular name (since names are powerful).&lt;/p&gt;
&lt;p&gt;The basic idea is very simple. In JavaScript, a module is frequently represented as an object:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
var sayHello = require('./sayhello.js');
sayHello.run();
&lt;/pre&gt;
&lt;p&gt;The methods of &lt;tt class="docutils literal"&gt;sayHello&lt;/tt&gt; are the functions exported by the module. But what about &lt;tt class="docutils literal"&gt;sayHello&lt;/tt&gt; itself? Because functions are objects too, we could imagine that &lt;tt class="docutils literal"&gt;sayHello&lt;/tt&gt; was a function as well, and thus:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
sayHello()
&lt;/pre&gt;
&lt;p&gt;would be a valid fragment of code, perhaps equivalent to &lt;tt class="docutils literal"&gt;sayHello.run()&lt;/tt&gt;. Only one symbol can be exported this way, but in many modules, there is an obvious choice (think of jQuery’s &lt;tt class="docutils literal"&gt;$&lt;/tt&gt; object, etc).&lt;/p&gt;
&lt;p&gt;This pattern is also commonly employed in Haskell, by taking advantage of the fact that types and modules live in different namespaces:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
import qualified Data.Map as Map
import Data.Map (Map)
&lt;/pre&gt;
&lt;p&gt;&lt;tt class="docutils literal"&gt;Map&lt;/tt&gt; is now overloaded to be both a type and a module.&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/ul4qargshjs" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://blog.ezyang.com/2013/03/the-single-export-pattern/#comments" thr:count="2" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2013/03/the-single-export-pattern/feed/atom/" thr:count="2" />
		<thr:total>2</thr:total>
	<feedburner:origLink>http://blog.ezyang.com/2013/03/the-single-export-pattern/</feedburner:origLink></entry>
	</feed>
