<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Conal Elliott</title>
	<atom:link href="http://conal.net/blog/feed" rel="self" type="application/rss+xml" />
	<link>http://conal.net/blog</link>
	<description>Inspirations &#38; experiments, mainly about denotative/functional programming in Haskell</description>
	<lastBuildDate>Thu, 25 Jul 2019 18:15:11 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=4.1.17</generator>
	<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2F&amp;language=en_US&amp;category=text&amp;title=Conal+Elliott&amp;description=Inspirations+%26amp%3B+experiments%2C+mainly+about+denotative%2Ffunctional+programming+in+Haskell&amp;tags=blog" type="text/html" />
	<item>
		<title>Circuits as a bicartesian closed category</title>
		<link>http://conal.net/blog/posts/circuits-as-a-bicartesian-closed-category</link>
		<comments>http://conal.net/blog/posts/circuits-as-a-bicartesian-closed-category#comments</comments>
		<pubDate>Mon, 16 Sep 2013 22:52:16 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=547</guid>
		<description><![CDATA[My previous few posts have been about cartesian closed categories (CCCs). In From Haskell to hardware via cartesian closed categories, I gave a brief motivation: typed lambda expressions and the CCC vocabulary are equally expressive, but have different strengths: In Haskell, the CCC vocabulary is overloadable and so can be interpreted more flexibly than lambda [&#8230;]]]></description>
				<content:encoded><![CDATA[<p><!-- references --></p>

<p><!-- teaser --></p>

<p>My <a href="http://conal.net/blog/posts/haskell-to-hardware-via-cccs/" title="blog post">previous</a> <a href="http://conal.net/blog/posts/overloading-lambda/" title="blog post">few</a> <a href="http://conal.net/blog/posts/optimizing-cccs/" title="blog post">posts</a> have been about cartesian closed categories (CCCs). In <a href="http://conal.net/blog/posts/haskell-to-hardware-via-cccs/" title="blog post"><em>From Haskell to hardware via cartesian closed categories</em></a>, I gave a brief motivation: typed lambda expressions and the CCC vocabulary are equally expressive, but have different strengths:</p>

<ul>
<li>In Haskell, the CCC vocabulary is overloadable and so can be interpreted more flexibly than lambda and application.</li>
<li>Lambda expressions are friendlier for human programmers to write and read.</li>
</ul>

<p>By automatically translating lambda expressions to CCC form (as in <a href="http://conal.net/blog/posts/overloading-lambda/" title="blog post"><em>Overloading lambda</em></a>), I hope to get the best of both options.</p>

<p>An interpretation I’m especially keen on—and the one that inspired this series of posts—is circuits, as described in this post.</p>

<p><span id="more-547"></span></p>

<p><strong>Edits:</strong></p>

<ul>
<li>2013-09-17: “defined for all products with categories” ⇒ “defined for all categories with products”. Thanks to Tom Ellis.</li>
<li>2013-09-17: Clarified first CCC/lambda contrast above: “In Haskell, the CCC vocabulary is overloadable and so can be interpreted more flexibly than lambda and application.” Thanks to Darryl McAdams.</li>
</ul>

<h3 id="cccs">CCCs</h3>

<p>First a reminder about CCCs, taken from <a href="http://conal.net/blog/posts/haskell-to-hardware-via-cccs/" title="blog post"><em>From Haskell to hardware via cartesian closed categories</em></a>:</p>

<blockquote>

<p>You may have heard of “cartesian closed categories” (CCCs). CCC is an abstraction having a small vocabulary with associated laws:</p>
<ul>
<li>The “category” part means we have a notion of “morphisms” (or “arrows”) each having a domain and codomain “object”. There is an identity morphism for and associative composition operator. If this description of morphisms and objects sounds like functions and types (or sets), it’s because functions and types are one example, with <code>id</code> and <code>(∘)</code>.</li>
<li>The “cartesian” part means that we have products, with projection functions and an operator to combine two functions into a pair-producing function. For Haskell functions, these operations are <code>fst</code>, <code>snd</code> and <code>(△)</code>. (The latter is called “<code>(&amp;&amp;&amp;)</code>” in <code>Control.Arrow</code>.)</li>
<li>The “closed” part means that we have a way to represent morphisms as objects, referred to as “exponentials”. The corresponding operations are <code>curry</code>, <code>uncurry</code>, and <code>apply</code>. Since Haskell is a higher-order language, these exponential objects are simply (first class) functions.</li>
</ul>

</blockquote>

<p>As mentioned in <a href="http://conal.net/blog/posts/overloading-lambda/" title="blog post"><em>Overloading lambda</em></a>, I also want coproducts (corresponding to sum types in Haskell), extending CCCs to <em>bicartesian</em> closed categories, or “biCCCs”.</p>

<p>Normally, I’d formalize a notion like (bi)CCC with a small collection of type classes (e.g., as in Edward Kmett’s <a href="http://hackage.haskell.org/package/categories">categories package</a>). Due to a technical problem with associated constraints (to be explored in a future post), I’ve so far been unable to find a satisfactory such formulation. Instead, I’ll convert the biCCC term representation given in the post <a href="http://conal.net/blog/posts/overloading-lambda/" title="blog post"><em>Overloading lambda</em></a>.</p>

<h4 id="circuits">Circuits</h4>

<p>How might we think of circuits? A simple-sounding idea is that circuits are directed graphs of components (logic gates, adders, flip-flops, etc.), in which the graph edges represent wires. Each component has some input pins and some output pins, and each wire connects an output pin of some component to an input pin of another component.</p>

<p>On closer examination, some questions arise:</p>

<ul>
<li>How to identify intended inputs and outputs?</li>
<li>How to ensure that the graphs are fully connected, apart from the intended external inputs and outputs?</li>
<li>How to ensure that input pins are driven by at most one output pin, while allowing output pins to drive any number of input pins?</li>
<li>How to sequentially compose graphs, matching up and consuming free outputs and inputs?</li>
</ul>

<p>Note that similar questions arose in the design of programming languages. In functional languages (or even semi-functional ones like Fortran, ML, and <a href="http://conal.net/blog/posts/the-c-language-is-purely-functional/" title="blog post">Haskell+IO</a>), we answer the connectivity/composition questions by nesting function applications. Overall inputs are identified syntactically as function parameters, while output corresponds to the body of the function.</p>

<p>We can adapt this technique to the construction of circuits as follows. Instead of building graph fragments directly and then adding edges/wires to connect those fragments, let’s build recipes that <em>consume</em> output pins, build a graph fragment, and indicate the output pins of that fragment. A circuit (generator) is thus a function that takes some output pins as arguments, instantiates a collection of components, and yields some output pins. The passed-in output pins come from other component instantiations, <em>or</em> are the top-level external inputs to a circuit. The number and arrangement of the pins consumed and yielded vary and so will appear as type parameters. Since distinct pins are generated as needed, a circuit will also consume part of a given supply of pins, passing on the remainder for further component construction.</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">type</span> <span class="dt">CircuitG</span> b <span class="fu">=</span> <span class="dt">PinSupply</span> <span class="ot">→</span> (<span class="dt">PinSupply</span>, [<span class="dt">Comp</span>], b) <span class="co">-- first try</span></code></pre>

<p>Note that the input type is absent, because it can show up as part of a function type: <code>a → CircuitG b</code>. This factoring is typical in monadic formulations.</p>

<h5 id="a-circuit-monad">A circuit monad</h5>

<p>Of course, we’ve seen this pattern before, in writer and state monads. Moreover, the writer will want to append component lists, so for efficiency, we’ll replace <code>[a]</code> with an append-friendly data type, namely sequences represented as finger trees (<a href="http://hackage.haskell.org/packages/archive/containers/latest/doc/html/Data-Sequence.html#t:Seq" title="Hackage documentation"><code>Seq</code></a> from <a href="http://hackage.haskell.org/packages/archive/containers/latest/doc/html/Data-Sequence.html" title="Hackage documentation"><code>Data.Sequence</code></a>).</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">type</span> <span class="dt">CircuitM</span> <span class="fu">=</span> <span class="dt">WriterT</span> (<span class="dt">Seq</span> <span class="dt">Comp</span>) (<span class="dt">State</span> <span class="dt">PinSupply</span>)</code></pre>

<p>One very simple operation is generation of a single pin (producing no components).</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">newPin <span class="ot">∷</span> <span class="dt">CircuitM</span> <span class="dt">Pin</span>
newPin <span class="fu">=</span> <span class="kw">do</span> { (p<span class="fu">:</span>ps&#39;) <span class="ot">←</span> get ; put ps&#39; ; <span class="fu">return</span> p }</code></pre>

<p>In fact, this definition has a considerably more general type, because it doesn’t use the <code>WriterT</code> aspect of <code>CircuitM</code>. The <code>get</code> and <code>put</code> operations come from the <a href="http://hackage.haskell.org/packages/archive/mtl/latest/doc/html/Control-Monad-State-Class.html#t:MonadState" title="Hackage documentation"><code>MonadState</code></a> in the <a href="http://hackage.haskell.org/package/mtl-2.1.2" title="Hackage documentation"><code>mtl</code></a> package, so we can use any <code>MonadState</code> instance with <code>PinSupply</code> as the state. For convenience, define a constraint abbreviation:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">type</span> <span class="dt">MonadPins</span> <span class="fu">=</span> <span class="dt">MonadState</span> <span class="dt">PinSupply</span></code></pre>

<p>and use the more general type:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">newPin <span class="ot">∷</span> <span class="dt">MonadPins</span> m <span class="ot">⇒</span> m <span class="dt">Pin</span></code></pre>

<p>We’ll need this extra generality later.</p>

<p>I know I promised you a category rather than a monad. We’ll get there.</p>
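<p>Before moving on, here is how the two effects combine, as a standalone sketch (no packages; <code>Comp</code> is reduced to a <code>String</code>, <code>Seq</code> to a list, and the monad is hand-rolled rather than built from <code>WriterT</code> and <code>State</code> — the names mirror the post’s but the details are simplified):</p>

```haskell
-- Hand-rolled analogue of WriterT (Seq Comp) (State PinSupply):
-- consume pins from a supply while logging instantiated components.
type Comp = String       -- stand-in for the post's Comp
type PinSupply = [Int]   -- stand-in for the post's [Pin]

newtype CircuitM a =
  CircuitM { runCircuitM :: PinSupply -> (a, PinSupply, [Comp]) }

instance Functor CircuitM where
  fmap f (CircuitM g) =
    CircuitM (\s -> let (a, s', w) = g s in (f a, s', w))

instance Applicative CircuitM where
  pure a = CircuitM (\s -> (a, s, []))
  CircuitM gf <*> CircuitM ga =
    CircuitM (\s -> let (f, s' , w1) = gf s
                        (a, s'', w2) = ga s'
                    in (f a, s'', w1 ++ w2))

instance Monad CircuitM where
  CircuitM ga >>= k =
    CircuitM (\s -> let (a, s' , w1) = ga s
                        (b, s'', w2) = runCircuitM (k a) s'
                    in (b, s'', w1 ++ w2))

-- Consume one pin from the supply (the state effect).
newPin :: CircuitM Int
newPin = CircuitM (\(p : ps') -> (p, ps', []))

-- Log one component (the writer effect).
emit :: Comp -> CircuitM ()
emit c = CircuitM (\s -> ((), s, [c]))

-- Instantiate an "and" gate: one fresh output pin, one logged component.
andGate :: Int -> Int -> CircuitM Int
andGate a b = do { o <- newPin; emit ("and " ++ show (a, b, o)); return o }
```

<p>Running <code>runCircuitM (andGate 0 1) [2 ..]</code> yields output pin <code>2</code>, the remaining supply, and the single logged component, mirroring how the real <code>CircuitM</code> threads its pin supply and accumulates a <code>Seq Comp</code>.</p>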

<p>Each pin will represent a channel to convey one bit of information, but varying with time, i.e., a signal. The values conveyed on these wires will not be available until the circuit is realized in hardware and run. While constructing the graph/circuit, we’ll only need a way of distinguishing the pins and generating new ones. Given these simple requirements, we’ll represent pins simply as integers, but <code>newtype</code>-wrapped for type-safety:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">newtype</span> <span class="dt">Pin</span> <span class="fu">=</span> <span class="dt">Pin</span> <span class="dt">Int</span> <span class="kw">deriving</span> (<span class="kw">Eq</span>,<span class="kw">Ord</span>,<span class="kw">Show</span>,<span class="kw">Enum</span>)
<span class="kw">type</span> <span class="dt">PinSupply</span> <span class="fu">=</span> [<span class="dt">Pin</span>]</code></pre>

<p>Each circuit component is an instance of an underlying primitive and has three characteristics:</p>

<ul>
<li>the underlying “primitive”, which determines the functionality and interface (type of information in and out),</li>
<li>the pins carrying information into the instance (and coming from the outputs of other components), and</li>
<li>the pins carrying information out of the instance.</li>
</ul>

<p>Components can have different interface types, but we’ll have to combine them all into a single collection, so we’ll use an existential type:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">Comp</span> <span class="fu">=</span> <span class="ot">∀</span> a b<span class="fu">.</span> <span class="dt">IsSource2</span> a b <span class="ot">⇒</span> <span class="dt">Comp</span> (<span class="dt">Prim</span> a b) a b

<span class="kw">deriving</span> <span class="kw">instance</span> <span class="kw">Show</span> <span class="dt">Comp</span></code></pre>

<p>For now, a primitive will be identified simply by a name:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">newtype</span> <span class="dt">Prim</span> a b <span class="fu">=</span> <span class="dt">Prim</span> <span class="dt">String</span>

<span class="kw">instance</span> <span class="kw">Show</span> (<span class="dt">Prim</span> a b) <span class="kw">where</span> <span class="fu">show</span> (<span class="dt">Prim</span> str) <span class="fu">=</span> str</code></pre>

<p>The <code>IsSource2</code> constraint is an abbreviation for <code>IsSource</code> constraints on the domain and range types:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">type</span> <span class="dt">IsSource2</span> a b <span class="fu">=</span> (<span class="dt">IsSource</span> a, <span class="dt">IsSource</span> b)</code></pre>

<p>Sources will be structures of pins. We’ll need to flatten them into sequences, generate them for the outputs of a new instance, and determine the number of pins from the type alone (i.e., without evaluation):</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">class</span> <span class="kw">Show</span> a <span class="ot">⇒</span> <span class="dt">IsSource</span> a <span class="kw">where</span>
  toPins    <span class="ot">∷</span> a <span class="ot">→</span> <span class="dt">Seq</span> <span class="dt">Pin</span>
  genSource <span class="ot">∷</span> <span class="dt">MonadPins</span> m <span class="ot">⇒</span> m a
  numPins   <span class="ot">∷</span> a <span class="ot">→</span> <span class="dt">Int</span></code></pre>

<p>Instances of <code>IsSource</code> are straightforward to define. For instance,</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">IsSource</span> () <span class="kw">where</span>
  toPins () <span class="fu">=</span> ∅
  genSource <span class="fu">=</span> <span class="fu">return</span> ()
  numPins _ <span class="fu">=</span> <span class="dv">0</span>

<span class="kw">instance</span> <span class="dt">IsSource</span> <span class="dt">Pin</span> <span class="kw">where</span>
  toPins p  <span class="fu">=</span> singleton p
  genSource <span class="fu">=</span> newPin
  numPins _ <span class="fu">=</span> <span class="dv">1</span>

<span class="kw">instance</span> <span class="dt">IsSource2</span> a b <span class="ot">⇒</span> <span class="dt">IsSource</span> (a × b) <span class="kw">where</span>
  toPins (sa,sb) <span class="fu">=</span> toPins sa ⊕ toPins sb
  genSource      <span class="fu">=</span> liftM2 (,) genSource genSource
  numPins <span class="fu">~</span>(a,b) <span class="fu">=</span> numPins a <span class="fu">+</span> numPins b</code></pre>

<p>Note that we’re taking care never to evaluate the argument to <code>numPins</code>, which will be <code>⊥</code> in practice.</p>
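<p>This laziness can be checked directly. In the following standalone sketch (a simplified fragment of <code>IsSource</code>, keeping only <code>numPins</code>), the argument is <code>undefined</code> and the computation still succeeds, because the irrefutable pattern <code>~(a,b)</code> never forces it:</p>

```haskell
newtype Pin = Pin Int deriving Show

-- Simplified fragment of the post's IsSource class: just the
-- type-directed pin count.
class IsSource a where
  numPins :: a -> Int

instance IsSource () where
  numPins _ = 0

instance IsSource Pin where
  numPins _ = 1

instance (IsSource a, IsSource b) => IsSource (a, b) where
  numPins ~(a, b) = numPins a + numPins b

-- The argument is never evaluated, so passing ⊥ is fine:
pairCount :: Int
pairCount = numPins (undefined :: (Pin, (Pin, Pin)))  -- 3
```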

<h5 id="a-circuit-category">A circuit category</h5>

<p>I promised you a circuit category but gave you a monad. There’s a standard construction to turn monads into categories, namely <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Arrow.html#v:Kleisli" title="Hackage documentation"><code>Kleisli</code></a> from <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Category.html" title="Hackage documentation"><code>Control.Category</code></a>, so you might think we could simply define</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">type</span> a ⇴ b <span class="fu">=</span> <span class="dt">Kleisli</span> <span class="dt">CircuitM</span> a b  <span class="co">-- first try</span></code></pre>

<p>What I don’t like about this definition is that it requires parameter types like <code>Pin</code> and <code>Pin × Pin</code>, which expose aspects of the implementation. I’d prefer to use <code>Bool</code> and <code>Bool × Bool</code> instead, to reflect the conceptual types of information <em>flowing through</em> circuits. Moreover, I want to generate computations parametrized over the underlying category (and indeed generate these category-generic computations automatically from Haskell source). Explicit mention of representation notions like <code>Pin</code> would thwart this genericity, restricting to circuits.</p>

<p>To get type parameters like <code>Bool</code> and <code>Bool × Bool</code>, we’ll have to convert value types to pin types. Type families give us this ability:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">type</span> family <span class="dt">Pins</span> a</code></pre>

<p>Now we can say that circuits can pass a <code>Bool</code> value by means of a single pin:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">type</span> <span class="kw">instance</span> <span class="dt">Pins</span> <span class="dt">Bool</span> <span class="fu">=</span> <span class="dt">Pin</span></code></pre>

<p>We can pass the unit with no pins at all:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">type</span> <span class="kw">instance</span> <span class="dt">Pins</span> () <span class="fu">=</span> ()</code></pre>

<p>The pins for <code>a × b</code> include pins for <code>a</code> and pins for <code>b</code>:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">type</span> <span class="kw">instance</span> <span class="dt">Pins</span> (a × b) <span class="fu">=</span> <span class="dt">Pins</span> a × <span class="dt">Pins</span> b</code></pre>

<p>Sum types are trickier. We’ll get there in a bit.</p>

<p>Now we can define our improved circuit category:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">newtype</span> a ⇴ b <span class="fu">=</span> <span class="dt">C</span> (<span class="dt">Kleisli</span> <span class="dt">CircuitM</span> (<span class="dt">Pins</span> a) (<span class="dt">Pins</span> b))</code></pre>

<h5 id="sum-types">Sum types</h5>

<p>As we saw above, the <code>Pins</code> type family distributes over <code>()</code> and pairing. The same is true for every fixed-shape type, i.e., every type in which all values have the same representation shape, including <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi></mrow></math>-tuples, length-typed vectors, and depth-typed perfect leaf trees.</p>

<p>The canonical example of a type whose elements can vary in shape is sums, represented in Haskell as the <code>Either</code> algebraic data type, for instance <code>Either Bool (Bool,Bool)</code>, which I’ll write instead as <code>Bool + Bool × Bool</code>. Can <code>Pins</code> distribute over <code>+</code>, i.e., can we define</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">type</span> <span class="kw">instance</span> <span class="dt">Pins</span> (a <span class="fu">+</span> b) <span class="fu">=</span> <span class="dt">Pins</span> a <span class="fu">+</span> <span class="dt">Pins</span> b  <span class="co">-- ??</span></code></pre>

<p>We cannot use this definition, because it implies that we must choose a shape <em>statically</em>, i.e., when constructing the circuit. The data may, however, change shape <em>dynamically</em>, so no one static choice suffices.</p>

<p>I’ll give a solution, which seems to work out okay. However, it lacks the elegance and inevitability that I always look for, so if you have other ideas, please leave suggestions in comments on this post.</p>

<p>The idea is that we’ll use enough pins for the larger of the two representations. Since the two <code>Pins</code> representations (<code>Pins a</code> vs <code>Pins b</code>) can be arbitrarily different, flatten them into a common shape, namely a sequence. To distinguish the two summands, throw in an additional bit/pin:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">data</span> a <span class="fu">:++</span> b <span class="fu">=</span> <span class="dt">UP</span> { sumPins <span class="ot">∷</span> <span class="dt">Seq</span> <span class="dt">Pin</span>, sumFlag <span class="ot">∷</span> <span class="dt">Pin</span> }

<span class="kw">type</span> <span class="kw">instance</span> <span class="dt">Pins</span> (a <span class="fu">+</span> b) <span class="fu">=</span> <span class="dt">Pins</span> a <span class="fu">:++</span> <span class="dt">Pins</span> b</code></pre>

<p>Now we’ll want to define an <code>IsSource</code> instance. Recall the class definition:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">class</span> <span class="kw">Show</span> a <span class="ot">⇒</span> <span class="dt">IsSource</span> a <span class="kw">where</span>
  toPins    <span class="ot">∷</span> a <span class="ot">→</span> <span class="dt">Seq</span> <span class="dt">Pin</span>
  genSource <span class="ot">∷</span> <span class="dt">MonadPins</span> m <span class="ot">⇒</span> m a
  numPins   <span class="ot">∷</span> a <span class="ot">→</span> <span class="dt">Int</span></code></pre>

<p>It’s easy to generate a sequence of pins:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">IsSource2</span> a b <span class="ot">⇒</span> <span class="dt">IsSource</span> (a <span class="fu">:++</span> b) <span class="kw">where</span>
  toPins (<span class="dt">UP</span> ps f) <span class="fu">=</span> ps ⊕ singleton f</code></pre>

<p>The number of pins in <code>a :++ b</code> is the maximum number of pins in <code>a</code> or <code>b</code>, plus one for the flag bit:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  numPins _ <span class="fu">=</span> (numPins (⊥ <span class="ot">∷</span> a) <span class="ot">`max`</span> numPins (⊥ <span class="ot">∷</span> b)) <span class="fu">+</span> <span class="dv">1</span></code></pre>

<p>To generate an <code>a :++ b</code>, generate this many pins, using one for <code>sumFlag</code> and the rest for <code>sumPins</code>:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  genSource <span class="fu">=</span>
    liftM2 <span class="dt">UP</span> (Seq.replicateM (numPins (⊥ <span class="ot">∷</span> (a <span class="fu">:++</span> b)) <span class="fu">-</span> <span class="dv">1</span>) newPin)
              newPin</code></pre>

<p>where the <a href="http://hackage.haskell.org/packages/archive/containers/latest/doc/html/Data-Sequence.html#v:replicateM" title="Hackage documentation"><code>Seq.replicateM</code></a> function comes from <a href="http://hackage.haskell.org/packages/archive/containers/latest/doc/html/Data-Sequence.html" title="Hackage documentation"><code>Data.Sequence</code></a>:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">replicateM <span class="ot">∷</span> <span class="kw">Monad</span> m <span class="ot">⇒</span> <span class="dt">Int</span> <span class="ot">→</span> m a <span class="ot">→</span> m (<span class="dt">Seq</span> a)</code></pre>

<p>This <code>genSource</code> definition is one motivation for the <code>numPins</code> method. Another is coming up in the next section.</p>
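<p>A quick, standalone sanity check of this pin arithmetic (simplified <code>numPins</code> instances plus the sum representation; <code>⊥</code> written as <code>undefined</code>):</p>

```haskell
{-# LANGUAGE TypeOperators, ScopedTypeVariables #-}

newtype Pin = Pin Int deriving Show

-- Sum source: flattened pins for the larger summand, plus a flag pin.
data a :++ b = UP { sumPins :: [Pin], sumFlag :: Pin }

class IsSource a where
  numPins :: a -> Int

instance IsSource Pin where
  numPins _ = 1

instance (IsSource a, IsSource b) => IsSource (a, b) where
  numPins ~(a, b) = numPins a + numPins b

instance (IsSource a, IsSource b) => IsSource (a :++ b) where
  -- max of the two sides, plus one for the flag
  numPins _ = (numPins (undefined :: a) `max` numPins (undefined :: b)) + 1

-- Pins (Bool + Bool × Bool), i.e. Pin :++ (Pin, Pin): max 1 2 + 1 = 3
sumCount :: Int
sumCount = numPins (undefined :: Pin :++ (Pin, Pin))
```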

<h3 id="categorical-operations">Categorical operations</h3>

<p>I’m working toward a representation of circuits that is both simple and able to implement the standard collection of operations for a cartesian closed category, plus coproducts (i.e., a <a href="http://ncatlab.org/nlab/show/bicartesian+closed+category" title="nLab wiki page">bicartesian closed category</a>, IIUC). Here, I’ll show how to implement these operations, which are also mentioned in my recent post <a href="http://conal.net/blog/posts/overloading-lambda/" title="blog post"><em>Overloading lambda</em></a>.</p>

<h4 id="category-operations">Category operations</h4>

<p>A category has an identity and sequential composition. As defined in <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Category.html" title="Hackage documentation"><code>Control.Category</code></a>,</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">class</span> <span class="dt">Category</span> k <span class="kw">where</span>
  <span class="fu">id</span>  <span class="ot">∷</span> a <span class="ot">`k`</span> a
  (∘) <span class="ot">∷</span> (b <span class="ot">`k`</span> c) <span class="ot">→</span> (a <span class="ot">`k`</span> b) <span class="ot">→</span> (a <span class="ot">`k`</span> c)</code></pre>

<p>The required laws are that <code>id</code> is both left- and right-identity for <code>(∘)</code> and that <code>(∘)</code> is associative.</p>

<p>Recall that our circuit category <code>(⇴)</code> is <em>almost</em> the same as <code>Kleisli CircuitM</code>, where <code>CircuitM</code> is a monad (defined via standard monadic building blocks). Thus we <em>almost</em> have for free that <code>(⇴)</code> is a category, but we still need a little bit of work.</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">newtype</span> a ⇴ b <span class="fu">=</span> <span class="dt">C</span> (<span class="dt">Kleisli</span> <span class="dt">CircuitM</span> (<span class="dt">Pins</span> a) (<span class="dt">Pins</span> b))</code></pre>

<p>Since this representation wraps <code>Kleisli CircuitM</code>, which is already a category, we need only do a little more unwrapping and wrapping:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">Category</span> (⇴) <span class="kw">where</span>
  <span class="fu">id</span>  <span class="fu">=</span> <span class="dt">C</span> <span class="fu">id</span>
  <span class="dt">C</span> g ∘ <span class="dt">C</span> f <span class="fu">=</span> <span class="dt">C</span> (g ∘ f)</code></pre>

<p>The category laws for <code>(⇴)</code> follow easily. For instance,</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="fu">id</span> ∘ <span class="dt">C</span> f ≡ <span class="dt">C</span> <span class="fu">id</span> ∘ <span class="dt">C</span> f ≡ <span class="dt">C</span> (<span class="fu">id</span> ∘ f) ≡ <span class="dt">C</span> f</code></pre>

<p>I’ll leave the other two (right-identity and associativity) as a simple exercise.</p>

<p>There’s an idiom I like to use for definitions such as the <code>Category</code> instance above, to automate the unwrapping and wrapping:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">Category</span> (⇴) <span class="kw">where</span>
  <span class="fu">id</span>  <span class="fu">=</span> <span class="dt">C</span> <span class="fu">id</span>
  (∘) <span class="fu">=</span> inC2 (∘)</code></pre>

<p>where</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">inC  <span class="fu">=</span>   <span class="dt">C</span> ↜ unC
inC2 <span class="fu">=</span> inC ↜ unC</code></pre>

<p>The <code>(↜)</code> operator here adds post- and pre-processing:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(h ↜ f) g <span class="fu">=</span> h ∘ g ∘ f</code></pre>

<h4 id="product-operations">Product operations</h4>

<p>Next, let’s add product types and a minimal set of associated operations. One simple formulation:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">class</span> <span class="dt">Category</span> k <span class="ot">⇒</span> <span class="dt">ProductCat</span> k <span class="kw">where</span>
  exl <span class="ot">∷</span> (a × b) <span class="ot">`k`</span> a
  exr <span class="ot">∷</span> (a × b) <span class="ot">`k`</span> b
  (△) <span class="ot">∷</span> (a <span class="ot">`k`</span> c) <span class="ot">→</span> (a <span class="ot">`k`</span> d) <span class="ot">→</span> (a <span class="ot">`k`</span> (c × d))</code></pre>

<p>If you’ve used <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Arrow.html" title="Hackage documentation"><code>Control.Arrow</code></a>, you’ll recognize <code>(△)</code> as “<code>(&amp;&amp;&amp;)</code>”. The <code>exl</code> and <code>exr</code> methods generalize <code>fst</code> and <code>snd</code>. There are other operations from <code>Arrow</code> methods that can be defined in terms of these primitives, including <code>first</code>, <code>second</code>, and <code>(×)</code> (called “<code>(***)</code>” in <code>Control.Arrow</code>):</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(×) <span class="ot">∷</span> <span class="dt">ProductCat</span> k <span class="ot">⇒</span> (a <span class="ot">`k`</span> c) <span class="ot">→</span> (b <span class="ot">`k`</span> d) <span class="ot">→</span> (a × b <span class="ot">`k`</span> c × d)
f × g <span class="fu">=</span> f ∘ exl △ g ∘ exr

first <span class="ot">∷</span> <span class="dt">ProductCat</span> k <span class="ot">⇒</span> (a <span class="ot">`k`</span> c) <span class="ot">→</span> ((a × b) <span class="ot">`k`</span> (c × b))
first f <span class="fu">=</span> f × <span class="fu">id</span>

second <span class="ot">∷</span> <span class="dt">ProductCat</span> k <span class="ot">⇒</span> (b <span class="ot">`k`</span> d) <span class="ot">→</span> ((a × b) <span class="ot">`k`</span> (a × d))
second g <span class="fu">=</span> <span class="fu">id</span> × g</code></pre>

<p>Notably missing is the <code>Arrow</code> class’s <code>arr</code> method, which converts an arbitrary Haskell function into an arrow. If I could implement <code>arr</code>, I’d have my Haskell-to-circuit compiler. I took the names “<code>exl</code>”, “<code>exr</code>”, and “<code>(△)</code>” (pronounced “fork”) from Jeremy Gibbons’s delightful paper <a href="http://www.cs.ox.ac.uk/jeremy.gibbons/publications/acmmpc-calcfp.pdf" title="Paper by Jeremy Gibbons"><em>Calculating Functional Programs</em></a>.</p>
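<p>The product vocabulary is easy to try out at the familiar function instance. A small standalone sketch, with ASCII stand-ins (<code>fork</code> for <code>(△)</code>, <code>cross</code> for <code>(×)</code>; <code>exl</code>/<code>exr</code> are just <code>fst</code>/<code>snd</code>):</p>

```haskell
-- ProductCat operations specialized to plain functions.
fork :: (a -> c) -> (a -> d) -> (a -> (c, d))
fork f g a = (f a, g a)

-- f × g = f ∘ exl △ g ∘ exr
cross :: (a -> c) -> (b -> d) -> ((a, b) -> (c, d))
cross f g = fork (f . fst) (g . snd)

first' :: (a -> c) -> ((a, b) -> (c, b))
first' f = cross f id

second' :: (b -> d) -> ((a, b) -> (a, d))
second' g = cross id g

-- e.g. cross (+ 1) not (3, True) == (4, False)
```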

<p>Again, it’s easy to define a <code>ProductCat</code> instance for <code>(⇴)</code> using the <code>ProductCat</code> instance for the underlying <code>Kleisli CircuitM</code> (which exists because <code>CircuitM</code> is a monad):</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">ProductCat</span> (⇴) <span class="kw">where</span>
  exl <span class="fu">=</span> <span class="dt">C</span> exl
  exr <span class="fu">=</span> <span class="dt">C</span> exr
  (△) <span class="fu">=</span> inC2 (△)</code></pre>

<p>There is a subtlety in type-checking this instance definition. In the <code>exl</code> definition, the RHS <code>exl</code> above has type</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="dt">Kleisli</span> <span class="dt">CircuitM</span> (<span class="dt">Pins</span> a × <span class="dt">Pins</span> b) (<span class="dt">Pins</span> a)</code></pre>

<p>but the <code>exl</code> definition requires type</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="dt">Kleisli</span> <span class="dt">CircuitM</span> (<span class="dt">Pins</span> (a × b)) (<span class="dt">Pins</span> a)</code></pre>

<p>Fortunately, these two types are equivalent, thanks to the <code>Pins</code> instance for products given above:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">type</span> <span class="kw">instance</span> <span class="dt">Pins</span> (a × b) <span class="fu">=</span> <span class="dt">Pins</span> a × <span class="dt">Pins</span> b</code></pre>

<p>Again, the class law proofs are straightforward.</p>

<p>The product laws are given in <a href="http://www.cs.ox.ac.uk/jeremy.gibbons/publications/acmmpc-calcfp.pdf" title="Paper by Jeremy Gibbons"><em>Calculating Functional Programs</em></a> (p 155) and again are straightforward to verify. For instance,</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">exl ∘ (u △ v) ≡ u</code></pre>

<p>Proof:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  exl ∘ (<span class="dt">C</span> f △ <span class="dt">C</span> g)
≡ <span class="dt">C</span> exl ∘ <span class="dt">C</span> (f △ g)
≡ <span class="dt">C</span> (exl ∘ (f △ g))
≡ <span class="dt">C</span> f</code></pre>
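<p>The same law is easy to check concretely in the function category, where <code>exl</code> is <code>fst</code> and <code>(△)</code> is <code>(&amp;&amp;&amp;)</code> (a pointwise sanity check, separate from the circuit code):</p>

```haskell
import Control.Arrow ((&&&))

-- exl ∘ (u △ v) ≡ u, instantiated at functions, for one arbitrary
-- choice of u and v:
lawHolds :: Int -> Bool
lawHolds x = (fst . (u &&& v)) x == u x
  where
    u = (* 2)
    v = show
```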

<h4 id="coproduct-operations">Coproduct operations</h4>

<p>The coproduct/sum operations are exactly the duals of the product operations. The method signatures thus result from those of <code>ProductCat</code> by inverting the category arrows and replacing products by coproducts:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">class</span> <span class="dt">Category</span> k <span class="ot">⇒</span> <span class="dt">CoproductCat</span> k <span class="kw">where</span>
  inl <span class="ot">∷</span> a <span class="ot">`k`</span> (a <span class="fu">+</span> b)
  inr <span class="ot">∷</span> b <span class="ot">`k`</span> (a <span class="fu">+</span> b)
  (▽) <span class="ot">∷</span> (a <span class="ot">`k`</span> c) <span class="ot">→</span> (b <span class="ot">`k`</span> c) <span class="ot">→</span> ((a <span class="fu">+</span> b) <span class="ot">`k`</span>  c)</code></pre>

<p>The coproduct laws are also exactly dual to the product laws, i.e., the operations are replaced by their counterparts, and the compositions are reversed. For instance,</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">exl ∘ (u △ v) ≡ u</code></pre>

<p>becomes</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(u ▽ v) ∘ inl ≡ u</code></pre>
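<p>At functions, <code>inl</code> is <code>Left</code> and <code>(▽)</code> is <code>either</code>, so the dual law can also be checked pointwise:</p>

```haskell
-- (u ▽ v) ∘ inl ≡ u, instantiated at functions, for one arbitrary
-- choice of u and v:
coLawHolds :: Int -> Bool
coLawHolds x = (either u v . Left) x == u x
  where
    u = (+ 1)
    v = length :: String -> Int
```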

<p>Just as the <code>IsSource</code> definition for sums above is more complex than the one for products, similarly, the <code>CoproductCat</code> instance I’ve found is much trickier than the <code>ProductCat</code> instance. I’d really love to find much simpler definitions, as the extra complexity worries me. If you think of simpler angles, please do suggest them in comments on this post. Alternatively, if you understand the essential cause of the loss of simplicity in going from products to coproducts, please chime in as well.</p>

<p>For the left-injection, <code>inl ∷ a ⇴ a + b</code>, flatten the <code>a</code> pins, pad to the longer of the two representations as needed, and add a flag of <code>False</code> (left):</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">inl <span class="fu">=</span> <span class="dt">C</span> ∘ <span class="dt">Kleisli</span> <span class="fu">$</span> λ a <span class="ot">→</span>
  <span class="kw">do</span> x <span class="ot">←</span> constM <span class="kw">False</span> a
     <span class="kw">let</span> na  <span class="fu">=</span> numPins (⊥ <span class="ot">∷</span> <span class="dt">Pins</span> a)
         nb  <span class="fu">=</span> numPins (⊥ <span class="ot">∷</span> <span class="dt">Pins</span> b)
         pad <span class="fu">=</span> Seq.replicate (<span class="fu">max</span> na nb <span class="fu">-</span> na) x
     <span class="fu">return</span> (<span class="dt">UP</span> (toPins a ⊕ pad) x)</code></pre>

<p>Similarly for <code>inr</code>. (The <a href="https://github.com/conal/circat/blob/master/src/Circat/Circuit.hs" title="source module">implementation</a> refactors to remove redundancy.)</p>

<p>There is a problem with this definition, however. Its type is</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">inl <span class="ot">∷</span> <span class="dt">IsSourceP2</span> a b <span class="ot">⇒</span> a ⇴ a <span class="fu">+</span> b</code></pre>

<p>where</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">type</span> <span class="dt">IsSourceP</span>  a   <span class="fu">=</span> <span class="dt">IsSource</span> (<span class="dt">Pins</span> a)
<span class="kw">type</span> <span class="dt">IsSourceP2</span> a b <span class="fu">=</span> (<span class="dt">IsSourceP</span> a, <span class="dt">IsSourceP</span> b)</code></pre>

<p>In contrast, the <code>CoproductCat</code> class definition insists on full generality (unconstrained <code>a</code> and <code>b</code>). I don’t know how to resolve this problem. We can change the <code>CoproductCat</code> class definition to add associated constraints, but when I tried, the types of derived operations (definable via the class methods) became terribly complicated. For now, I’ll settle for a near miss, implementing operations like those of <code>CoproductCat</code> but with the extra constraints that thwart the instance definition I’m seeking.</p>

<p>For the <code>(▽)</code> operation, let’s assume we have a conditional operation, taking two values and a boolean, with the <code>False</code>/<code>else</code> case coming first:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">condC <span class="ot">∷</span> <span class="dt">IsSource</span> (<span class="dt">Pins</span> c) <span class="ot">⇒</span> ((c × c) × <span class="dt">Bool</span>) ⇴ c</code></pre>

<p>Now, given an <code>a :++ b</code> representation,</p>

<ul>
<li>extract the <code>sumFlag</code> for the <code>Bool</code>,</li>
<li>extract pins for <code>a</code> and feed them to <code>f</code>,</li>
<li>extract pins for <code>b</code> and feed them to <code>g</code>, and</li>
<li>feed these three results into <code>condC</code>:</li>
</ul>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">f ▽ g <span class="fu">=</span> condC ∘ ((f × g) ∘ extractBoth △ pureC sumFlag)</code></pre>

<p>The <code>(×)</code> operation here is simple parallel composition and is defined for all categories with products:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(×) <span class="ot">∷</span> <span class="dt">ProductCat</span> k <span class="ot">⇒</span> 
      (a <span class="ot">`k`</span> c) <span class="ot">→</span> (b <span class="ot">`k`</span> d) <span class="ot">→</span> ((a × b) <span class="ot">`k`</span> (c × d))
f × g <span class="fu">=</span> f ∘ exl △ g ∘ exr</code></pre>

<p>The <code>pureC</code> function wraps a pins-to-pins function as a circuit and is easily defined thanks to our use of the <code>Kleisli</code> arrow:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">pureC <span class="ot">∷</span> (<span class="dt">Pins</span> a <span class="ot">→</span> <span class="dt">Pins</span> b) <span class="ot">→</span> (a ⇴ b)
pureC <span class="fu">=</span> <span class="dt">C</span> ∘ arr</code></pre>

<p>The <code>extractBoth</code> function extracts <em>both</em> interpretations of a sum:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">extractBoth <span class="ot">∷</span> <span class="dt">IsSourceP2</span> a b <span class="ot">⇒</span> a <span class="fu">+</span> b ⇴ a × b
extractBoth <span class="fu">=</span> pureC ((pinsSource △ pinsSource) ∘ sumPins)</code></pre>

<p>Finally, <code>pinsSource</code> builds a source from a pin sequence, using the <code>genSource</code> method from <code>IsSource</code>:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">pinsSource <span class="ot">∷</span> <span class="dt">IsSource</span> a <span class="ot">⇒</span> <span class="dt">Seq</span> <span class="dt">Pin</span> <span class="ot">→</span> a
pinsSource pins <span class="fu">=</span> Mtl.evalState genSource (toList pins)</code></pre>

<p>It is for this function that I wanted <code>genSource</code> to work with monads other than <code>CircuitM</code>. Here, we’re simply using <code>State PinSupply</code>.</p>
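<p>The <code>evalState</code> pattern here can be illustrated with a hand-rolled supply (hypothetical names; the real code uses <code>Control.Monad.State</code> and the <code>IsSource</code> class):</p>

```haskell
-- A pin supply threads a list of available pins, mirroring
-- evalState genSource (toList pins). Pins are just Ints here.
type Supply a = [Int] -> (a, [Int])

-- Draw one pin from the supply, as genSource might for a single pin.
genPin :: Supply Int
genPin (p : ps) = (p, ps)
genPin []       = error "pin supply exhausted"

-- Draw two pins, as genSource might for a product source.
genPair :: Supply (Int, Int)
genPair s0 = let (a, s1) = genPin s0
                 (b, s2) = genPin s1
             in ((a, b), s2)

-- Run a supply computation and keep only its result.
evalSupply :: Supply a -> [Int] -> a
evalSupply m = fst . m
```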

<p>So here we have circuits as a category with coproducts. It “seems to work”, but I have a few points of dissatisfaction:</p>

<ul>
<li>We don’t quite get the <code>CoproductCat</code> instance, because of the <code>IsSource</code> constraints imposed by all three would-be method definitions.</li>
<li>The definitions are considerably more complex than the <code>ProductCat</code> instance and don’t exhibit an apparent duality to those definitions.</li>
<li>The use of <code>extractBoth</code> frightens me, as it implies a sort of dynamic cast between any two types. (Consider <code>exr ∘ extractBoth ∘ inl</code>.)</li>
</ul>

<h4 id="closure">Closure</h4>

<p>My compilation scheme relies on translating Haskell programs to biCCC (bicartesian closed category) form. We’ve seen above how to interpret the category, cartesian, and cocartesian (coproduct) aspects as circuits. What about closed? I don’t have a precise and implemented answer. Below are some thoughts.</p>

<p>Recall from <a href="http://conal.net/blog/posts/overloading-lambda/" title="blog post"><em>Overloading lambda</em></a>, that a (cartesian) <em>closed</em> category is one with exponential objects <code>b ⇨ c</code> (often written “<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msup><mi>c</mi><mi>b</mi></msup></mrow></math>”) with the following operations:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply   <span class="ot">∷</span> (a ⇨ b) × a ↝ b
<span class="fu">curry</span>   <span class="ot">∷</span> (a × b ↝ c) <span class="ot">→</span> (a ↝ (b ⇨ c))
<span class="fu">uncurry</span> <span class="ot">∷</span> (a ↝ (b ⇨ c)) <span class="ot">→</span> (a × b ↝ c)</code></pre>
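<p>At the function category these three operations are ordinary application, <code>curry</code>, and <code>uncurry</code>; for instance (with <code>applyF</code> as an illustrative stand-in for <code>apply</code>):</p>

```haskell
-- apply at (->): uncurried function application
applyF :: (a -> b, a) -> b
applyF (f, x) = f x

-- curry and uncurry are mutually inverse:
roundTrip :: ((Int, Int) -> Int) -> (Int, Int) -> Bool
roundTrip h p = uncurry (curry h) p == h p
```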

<p>What is a circuit exponential, i.e., a “hardware closure”?</p>

<p>We could operate on lambda expressions, removing inner lambdas, as traditional (I think) in defunctionalization. (See, e.g., <a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.164.8417" title="paper by Olivier Danvy and Lasse R. Nielsen (2001)"><em>Defunctionalization at Work</em></a> and <a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.9.4715" title="paper by Francois Pottier and Nadji Gauthier (2004)"><em>Polymorphic Typed Defunctionalization</em></a>.) In this case, <code>curry</code> would not appear in the generated CCC term, and application (by other than statically known primitives) would be replaced by pattern matching and invocation of statically known functions/circuits.</p>

<p>Alternatively, first convert to CCC form, simplify, and then look at the remaining uses of <code>curry</code>, <code>uncurry</code>, and <code>apply</code>. I’m not sure I really need to handle <code>uncurry</code>, which is not generated by the lambda-to-CCC translation. I think I currently use it only for uncurrying primitives. In any case, focus on <code>curry</code> and <code>apply</code>.</p>

<p>As in defunctionalization, do a global sweep of the code, and extract all of the closure formations. If we’ve already translated to CCC form, those formations are just the explicit arguments to <code>curry</code> applications. Assemble all of these <code>curry</code> applications into a single GADT:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">data</span> b ⇨ c <span class="kw">where</span>
  ⋯</code></pre>

<p>For every application <code>curry f</code> in our program, where <code>f ∷ A × B ↣ C</code> for some types <code>A</code>, <code>B</code>, and <code>C</code>, generate a GADT constructor/tag like the following:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  <span class="dt">Clo_xyz</span> <span class="ot">∷</span> <span class="dt">A</span> <span class="ot">→</span> (<span class="dt">B</span> ⇨ <span class="dt">C</span>)</code></pre>

<p>where “<code>xyz</code>” is generated automatically for distinctness. Note that we cannot use simple algebraic data types, because the type <code>B ⇨ C</code> is restricted. Furthermore, if <code>f</code> is polymorphic, we may have an existential constructor. Since there are only finitely many occurrences of <code>curry</code>, we can represent the GADT constructors with finitely many bits, generalizing the treatment of coproducts described above. If we monomorphize, then we can use several different closure data types for different types <code>b</code> and <code>c</code>, reducing the required number of bits.</p>

<p>Now consider <code>apply</code>. Each occurrence will have some type of the form <code>(b ⇨ c) × b ↝ c</code>. The implementation of <code>apply</code> will extract the closure constructor and its argument of some type <code>a</code>, use the constructor to identify the intended circuit <code>f ∷ a × b ↣ c</code>, and feed the <code>a</code> and <code>b</code> into <code>f</code>, yielding <code>c</code>.</p>
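<p>A toy version of this defunctionalization scheme, in which each closure tag carries its captured environment and <code>apply</code> dispatches on the tag (constructor names and captured types are invented for illustration):</p>

```haskell
{-# LANGUAGE GADTs #-}

-- Each `curry f` occurrence contributes one tag carrying its
-- captured environment (here just an Int).
data Fun b c where
  CloAdd   :: Int -> Fun Int Int   -- from curry (\(a, b) -> a + b)
  CloScale :: Int -> Fun Int Int   -- from curry (\(a, b) -> a * b)

-- apply dispatches on the tag to statically known code.
applyFun :: (Fun b c, b) -> c
applyFun (CloAdd   a, b) = a + b
applyFun (CloScale a, b) = a * b
```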

<p>What about <code>uncurry</code>?</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="fu">uncurry</span> <span class="ot">∷</span> (a ↝ (b ⇨ c)) <span class="ot">→</span> (a × b ↝ c)</code></pre>

<p>The constructed circuit would work as follows: given <code>(a,b)</code>, feed <code>a</code> to the argument morphism to get a closure of type <code>b ⇨ c</code>, which has an <code>a'</code> and a tag that refers to some <code>a' × b → c</code>. Feed the <code>a'</code> and the <code>b</code> into that circuit to get a <code>c</code>, which is returned.</p>

<h3 id="status">Status</h3>

<p>I have not yet made the general plan for exponentials precise enough to implement, so I expect some surprises. And perhaps there are better approaches. Please offer suggestions!</p>

<p>Recursive types need thought. If simply translated to sums and products, we’d get infinite representations. Instead, I think we’ll have to use indirections through some kind of memory, as is typically done in software implementations. In this case, dynamic memory management seems inevitable. Indirection might best be used for sums whether or not they appear in a recursive type, depending on the disparity and magnitude of the representation sizes of the summand types.</p>
]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/circuits-as-a-bicartesian-closed-category/feed</wfw:commentRss>
		<slash:comments>8</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fcircuits-as-a-bicartesian-closed-category&amp;language=en_GB&amp;category=text&amp;title=Circuits+as+a+bicartesian+closed+category&amp;description=My+previous+few+posts+have+been+about+cartesian+closed+categories+%28CCCs%29.+In+From+Haskell+to+hardware+via+cartesian+closed+categories%2C+I+gave+a+brief+motivation%3A+typed+lambda+expressions+and+the...&amp;tags=blog" type="text/html" />
	</item>
		<item>
		<title>Optimizing CCCs</title>
		<link>http://conal.net/blog/posts/optimizing-cccs</link>
		<comments>http://conal.net/blog/posts/optimizing-cccs#comments</comments>
		<pubDate>Sat, 14 Sep 2013 01:27:22 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[category]]></category>
		<category><![CDATA[CCC]]></category>
		<category><![CDATA[overloading]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=537</guid>
		<description><![CDATA[In the post Overloading lambda, I gave a translation from a typed lambda calculus into the vocabulary of cartesian closed categories (CCCs). This simple translation leads to unnecessarily complex expressions. For instance, the simple lambda term, “λ ds → (λ (a,b) → (b,a)) ds”, translated to a rather complicated CCC term: apply ∘ (curry (apply [&#8230;]]]></description>
				<content:encoded><![CDATA[

<p>In the post <a href="http://conal.net/blog/posts/overloading-lambda/" title="blog post"><em>Overloading lambda</em></a>, I gave a translation from a typed lambda calculus into the vocabulary of cartesian closed categories (CCCs). This simple translation leads to unnecessarily complex expressions. For instance, the simple lambda term, “<code>λ ds → (λ (a,b) → (b,a)) ds</code>”, translated to a rather complicated CCC term:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ∘ (<span class="fu">curry</span> (apply ∘ (apply ∘ (<span class="fu">const</span> (,) △ (<span class="fu">id</span> ∘ exr) ∘ exr) △ (<span class="fu">id</span> ∘ exl) ∘ exr)) △ <span class="fu">id</span>)</code></pre>

<p>(Recall from the previous post that <code>(∘)</code> binds more tightly than <code>(△)</code> and <code>(▽)</code>.)</p>

<p>However, we can do much better, translating to</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">exr △ exl</code></pre>

<p>which says to pair the right and left halves of the argument pair, i.e., swap.</p>

<p>This post applies some equational properties to greatly simplify/optimize the result of translation to CCC form, including the example above. First I’ll show the equational reasoning and then how it’s automated in the <a href="https://github.com/conal/lambda-ccc" title="Github project">lambda-ccc</a> library.</p>

<p><span id="more-537"></span></p>

<h3 id="equational-reasoning-on-ccc-terms">Equational reasoning on CCC terms</h3>

<p>First, use the identity/composition laws:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">f ∘ <span class="fu">id</span> ≡ f
<span class="fu">id</span> ∘ g ≡ g</code></pre>

<p>Our example is now slightly simpler:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ∘ (<span class="fu">curry</span> (apply ∘ (apply ∘ (<span class="fu">const</span> (,) △ exr ∘ exr) △ exl ∘ exr)) △ <span class="fu">id</span>)</code></pre>

<p>Next, consider the subterm <code>apply ∘ (const (,) △ exr ∘ exr)</code>:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  apply ∘ (<span class="fu">const</span> (,) △ exr ∘ exr)
≡ <span class="co">{- definition of (∘)  -}</span>
  λ x <span class="ot">→</span> apply ((<span class="fu">const</span> (,) △ exr ∘ exr) x)
≡ <span class="co">{- definition of (△) -}</span>
  λ x <span class="ot">→</span> apply (<span class="fu">const</span> (,) x, (exr ∘ exr) x)
≡ <span class="co">{- definition of apply -}</span>
  λ x <span class="ot">→</span> <span class="fu">const</span> (,) x ((exr ∘ exr) x)
≡ <span class="co">{- definition of const -}</span>
  λ x <span class="ot">→</span> (,) ((exr ∘ exr) x)
≡ <span class="co">{- η-reduce -}</span>
  (,) ∘ (exr ∘ exr)</code></pre>

<p>We didn’t use any properties of <code>(,)</code> or of <code>(exr ∘ exr)</code>, so let’s generalize:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  apply ∘ (<span class="fu">const</span> g △ f)
≡ λ x <span class="ot">→</span> apply ((<span class="fu">const</span> g △ f) x)
≡ λ x <span class="ot">→</span> apply (<span class="fu">const</span> g x, f x)
≡ λ x <span class="ot">→</span> <span class="fu">const</span> g x (f x)
≡ λ x <span class="ot">→</span> g (f x)
≡ g ∘ f</code></pre>

<p>(Note that I’ve cheated here by appealing to the <em>function</em> interpretations of <code>apply</code> and <code>const</code>. <em>Question:</em> Is there a purely algebraic proof, using only the CCC laws?)</p>
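<p>The generalized rule can at least be confirmed pointwise at functions (with arbitrarily chosen <code>f</code> and <code>g</code>, and a lambda playing the role of <code>apply</code>):</p>

```haskell
import Control.Arrow ((&&&))

-- apply ∘ (const g △ f) ≡ g ∘ f, instantiated at functions:
lhs, rhs :: Int -> String
lhs x = (\(k, y) -> k y) ((const g &&& f) x)
rhs x = (g . f) x

g :: Int -> String
g = show

f :: Int -> Int
f = (+ 3)
```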

<p>With this equivalence, our example simplifies further:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ∘ (<span class="fu">curry</span> (apply ∘ ((,) ∘ exr ∘ exr △ exl ∘ exr)) △ <span class="fu">id</span>)</code></pre>

<p>Next, let’s focus on <code>apply ∘ ((,) ∘ exr ∘ exr △ exl ∘ exr)</code>. Generalize to <code>apply ∘ (h ∘ f △ g)</code> and fiddle about:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  apply ∘ (h ∘ f △ g)
≡ λ x <span class="ot">→</span> apply (h (f x), g x)
≡ λ x <span class="ot">→</span> h (f x) (g x)
≡ λ x <span class="ot">→</span> <span class="fu">uncurry</span> h (f x, g x)
≡ <span class="fu">uncurry</span> h ∘ (λ x <span class="ot">→</span> (f x, g x))
≡ <span class="fu">uncurry</span> h ∘ (f △ g)</code></pre>

<p>Apply to our example:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ∘ (<span class="fu">curry</span> (<span class="fu">uncurry</span> (,) ∘ (exr ∘ exr △ exl ∘ exr)) △ <span class="fu">id</span>)</code></pre>

<p>We can simplify <code>uncurry (,)</code> as follows:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  <span class="fu">uncurry</span> (,)
≡ λ (x,y) <span class="ot">→</span> <span class="fu">uncurry</span> (,) (x,y)
≡ λ (x,y) <span class="ot">→</span> (,) x y
≡ λ (x,y) <span class="ot">→</span> (x,y)
≡ <span class="fu">id</span></code></pre>
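<p>This identity is immediate to confirm at any concrete pair type:</p>

```haskell
-- uncurry (,) repackages a pair unchanged, i.e. it acts as id:
check :: Bool
check = uncurry (,) (1 :: Int, 'a') == (1, 'a')
```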

<p>Together with the left identity law, our example now becomes</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ∘ (<span class="fu">curry</span> (exr ∘ exr △ exl ∘ exr) △ <span class="fu">id</span>)</code></pre>

<p>Next use the law that relates <code>(∘)</code> and <code>(△)</code>:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">f ∘ r △ g ∘ r ≡ (f △ g) ∘ r</code></pre>

<p>In our example, <code>exr ∘ exr △ exl ∘ exr</code> becomes <code>(exr △ exl) ∘ exr</code>, so we have</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ∘ (<span class="fu">curry</span> ((exr △ exl) ∘ exr) △ <span class="fu">id</span>)</code></pre>

<p>Let’s now look at how <code>apply</code>, <code>(△)</code>, and <code>curry</code> interact:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  apply ∘ (<span class="fu">curry</span> h △ g)
≡ λ p <span class="ot">→</span> apply ((<span class="fu">curry</span> h △ g) p)
≡ λ p <span class="ot">→</span> apply (<span class="fu">curry</span> h p, g p)
≡ λ p <span class="ot">→</span> <span class="fu">curry</span> h p (g p)
≡ λ p <span class="ot">→</span> h (p, g p)
≡ h ∘ (<span class="fu">id</span> △ g)</code></pre>

<p>A slightly more general variant covers other uses as well:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  apply ∘ (<span class="fu">curry</span> h ∘ f △ g)
≡ λ p <span class="ot">→</span> apply ((<span class="fu">curry</span> h ∘ f △ g) p)
≡ λ p <span class="ot">→</span> apply (<span class="fu">curry</span> h (f p), g p)
≡ λ p <span class="ot">→</span> <span class="fu">curry</span> h (f p) (g p)
≡ λ p <span class="ot">→</span> h (f p, g p)
≡ h ∘ (f △ g)</code></pre>
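<p>This more general rule also checks out pointwise at functions (again with arbitrarily chosen <code>h</code>, <code>f</code>, and <code>g</code>, and a lambda playing the role of <code>apply</code>):</p>

```haskell
import Control.Arrow ((&&&))

-- apply ∘ (curry h ∘ f △ g) ≡ h ∘ (f △ g), instantiated at functions:
lhs, rhs :: Int -> Int
lhs p = (\(k, y) -> k y) ((curry h . f &&& g) p)
rhs p = (h . (f &&& g)) p

h :: (Int, Int) -> Int
h (a, b) = a * 10 + b

f, g :: Int -> Int
f = (+ 1)
g = (* 2)
```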

<p>With this rule (even in its more specialized form),</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ∘ (<span class="fu">curry</span> ((exr △ exl) ∘ exr) △ <span class="fu">id</span>)</code></pre>

<p>becomes</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(exr △ exl) ∘ exr ∘ (<span class="fu">id</span> △ <span class="fu">id</span>)</code></pre>

<p>Next use the universal property of <code>(△)</code>, which is that it is the unique solution of the following two equations (universally quantified over <code>f</code> and <code>g</code>):</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">exl ∘ (f △ g) ≡ f
exr ∘ (f △ g) ≡ g</code></pre>

<p>(See <a href="http://www.cs.ox.ac.uk/jeremy.gibbons/publications/acmmpc-calcfp.pdf" title="Paper by Jeremy Gibbons"><em>Calculating Functional Programs</em></a>, Section 1.3.6.)</p>

<p>Applying the second rule to <code>exr ∘ (id △ id)</code> gives <code>id</code>, so our <code>swap</code> example becomes</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">exr △ exl</code></pre>

<h3 id="automation">Automation</h3>

<p>By using a collection of equational properties, we’ve greatly simplified our CCC example. These properties and more are used in <a href="https://github.com/conal/lambda-ccc/blob/master/src/LambdaCCC/CCC.hs" title="Source module"><code>LambdaCCC.CCC</code></a> to simplify CCC terms during construction. As a general technique, whenever building terms, rather than applying the GADT constructors directly, we’ll use so-called “smart constructors” with built-in optimizations. I’ll show a few smart constructor definitions here. See the <a href="https://github.com/conal/lambda-ccc/blob/master/src/LambdaCCC/CCC.hs" title="Source module"><code>LambdaCCC.CCC</code></a> source code for others.</p>

<p>As a first simple example, consider the identity laws for composition:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">f ∘ <span class="fu">id</span> ≡ f
<span class="fu">id</span> ∘ g ≡ g</code></pre>

<p>Since the top-level operator on the LHSs (left-hand sides) is <code>(∘)</code>, we can easily implement these laws in a “smart constructor” for <code>(∘)</code>, which handles special cases and uses the plain (dumb) constructor if no simplifications apply:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">infixr</span> <span class="dv">9</span> <span class="fu">@</span>∘
(<span class="fu">@</span>∘) <span class="ot">∷</span> (b ↣ c) <span class="ot">→</span> (a ↣ b) <span class="ot">→</span> (a ↣ c)
⋯ <span class="co">-- simplifications go here</span>
g <span class="fu">@</span>∘ f  <span class="fu">=</span> g <span class="fu">:</span>∘ f</code></pre>

<p>where <code>↣</code> is the GADT that represents biCCC terms, as shown in <a href="http://conal.net/blog/posts/overloading-lambda/" title="blog post"><em>Overloading lambda</em></a>.</p>

<p>The identity laws are easy to implement:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">f <span class="fu">@</span>∘ <span class="dt">Id</span> <span class="fu">=</span> f
<span class="dt">Id</span> <span class="fu">@</span>∘ g <span class="fu">=</span> g</code></pre>

<p>Next, the <code>apply</code>/<code>const</code> law derived above:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ∘ (<span class="fu">const</span> g △ f) ≡ g ∘ f</code></pre>

<p>This rule translates fairly easily:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="dt">Apply</span> <span class="fu">@</span>∘ (<span class="dt">Const</span> g <span class="fu">:</span>△ f) <span class="fu">=</span> prim g <span class="fu">@</span>∘ f</code></pre>

<p>where <code>prim</code> is a smart constructor for <code>Prim</code>.</p>

<p>There are some details worth noting:</p>

<ul>
<li>The LHS uses only dumb constructors and variables except for the smart constructor being defined (here <code>(@∘)</code>).</li>
<li>Besides variables bound on the LHS, the RHS uses only smart constructors, so that the constructed combinations are optimized as well. For instance, <code>f</code> might be <code>Id</code> here.</li>
</ul>

<p>Despite these details, this definition is inadequate in many cases. Consider the following example:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ∘ ((<span class="fu">const</span> u △ v) ∘ w)</code></pre>

<p><em>Syntactically</em>, the LHS of our rule <em>does not</em> match this term, because the two compositions are associated to the right instead of the left. <em>Semantically</em>, the rule does match, since composition is associative. In order to apply this rule, we can first left-associate and then apply the rule.</p>

<p>We could associate <em>all</em> compositions to the left during construction, in which case this rule will apply purely via syntactic matching. However, there will be other rewrites that require <em>right</em>-association in order to apply. Instead, for rules like this one, let’s explicitly left-decompose.</p>

<p>Suppose we have a smart constructor <code>composeApply g</code> that constructs an optimized version of <code>apply ∘ g</code>. This specification implies the following type:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">composeApply <span class="ot">∷</span> (z ↣ (a ⇨ b) × a) <span class="ot">→</span> (z ↣ b)</code></pre>

<p>Thus</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  apply ∘ (g ∘ f)
≡ (apply ∘ g) ∘ f
≡ composeApply g ∘ f</code></pre>

<p>Now we can define a general rule for composing <code>apply</code>:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="dt">Apply</span> <span class="fu">@</span>∘ (decompL <span class="ot">→</span> g <span class="fu">:</span>∘ f) <span class="fu">=</span> composeApply g <span class="fu">@</span>∘ f</code></pre>

<p>The function <code>decompL</code> (defined below) does a left-decomposition and is conveniently used here in a <a href="http://ghc.haskell.org/trac/ghc/wiki/ViewPatterns" title="GHC wiki page">view pattern</a>. It decomposes a given term into <code>g ∘ f</code>, where <code>g</code> is as small as possible, but not <code>Id</code>. When <code>decompL</code> finds such a decomposition, it yields a term with a top-level <code>(:∘)</code> constructor, and <code>composeApply</code> is used. Otherwise, the clause fails.</p>

<p>The implementation of <code>decompL</code>:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">decompL <span class="ot">∷</span> (a ↣ c) <span class="ot">→</span> (a ↣ c)
decompL <span class="dt">Id</span>                        <span class="fu">=</span> <span class="dt">Id</span>
decompL ((decompL <span class="ot">→</span> h <span class="fu">:</span>∘ g) <span class="fu">:</span>∘ f) <span class="fu">=</span> h <span class="fu">:</span>∘ (g <span class="fu">@</span>∘ f)
decompL comp<span class="fu">@</span>(_ <span class="fu">:</span>∘ _)             <span class="fu">=</span> comp
decompL f                         <span class="fu">=</span> f <span class="fu">:</span>∘ <span class="dt">Id</span></code></pre>

<p>There’s also <code>decompR</code> for right-factoring, similarly defined.</p>
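<p>For intuition, both decompositions can be sketched on a stripped-down, hypothetical composition language with only identity, composition, and opaque primitives (not the post’s <code>(↣)</code> type; <code>comp</code> stands in for the smart constructor <code>(@∘)</code>):</p>

```haskell
{-# LANGUAGE ViewPatterns #-}

-- A toy composition language: identity, composition, opaque primitives.
data M = Id | M :. M | Prim String deriving (Eq, Show)

-- Smart composition, collapsing identities (standing in for (@∘)).
comp :: M -> M -> M
comp Id g  = g
comp f  Id = f
comp f  g  = f :. g

-- Left decomposition: yield g :. f with g as small as possible, but not Id.
decompL :: M -> M
decompL Id                         = Id
decompL ((decompL -> h :. g) :. f) = h :. (g `comp` f)
decompL c@(_ :. _)                 = c
decompL f                          = f :. Id

-- Right decomposition, the mirror image: f as small as possible, but not Id.
decompR :: M -> M
decompR Id                         = Id
decompR (g :. (decompR -> h :. f)) = (g `comp` h) :. f
decompR c@(_ :. _)                 = c
decompR f                          = Id :. f
```

<p>For example, <code>decompL</code> re-associates <code>(a ∘ b) ∘ c</code> to <code>a ∘ (b ∘ c)</code>, exposing the leftmost factor.</p>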

<p>Note that I broke my rule of using only smart constructors on RHSs, since I specifically want to generate a <code>(:∘)</code> term.</p>

<p>With this re-association trick in place, we can now look at compose/apply rules.</p>

<p>The equivalence</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ∘ (<span class="fu">const</span> g △ f) ≡ g ∘ f</code></pre>

<p>becomes</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">composeApply (<span class="dt">Const</span> p <span class="fu">:</span>△ f) <span class="fu">=</span> prim p <span class="fu">@</span>∘ f</code></pre>

<p>Likewise, the equivalence</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ∘ (h ∘ f △ g) ≡ <span class="fu">uncurry</span> h ∘ (f △ g)</code></pre>

<p>becomes</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">composeApply (h <span class="fu">:</span>∘ f <span class="fu">:</span>△ g) <span class="fu">=</span> uncurryE h <span class="fu">@</span>∘ (f △ g)</code></pre>

<p>where <code>(△)</code> is the smart constructor for <code>(:△)</code>, and <code>uncurryE</code> is a smart constructor for <code>Uncurry</code>:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">uncurryE <span class="ot">∷</span> (a ↣ (b ⇨ c)) <span class="ot">→</span> (a × b ↣ c)
uncurryE (<span class="dt">Curry</span> f)    <span class="fu">=</span> f
uncurryE (<span class="dt">Prim</span> <span class="dt">PairP</span>) <span class="fu">=</span> <span class="dt">Id</span>
uncurryE h            <span class="fu">=</span> <span class="dt">Uncurry</span> h</code></pre>

<p>Two more <code>(∘)</code>/<code>apply</code> properties:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  apply ∘ (<span class="fu">curry</span> (g ∘ exr) △ f)
≡ λ x <span class="ot">→</span> <span class="fu">curry</span> (g ∘ exr) x (f x)
≡ λ x <span class="ot">→</span> (g ∘ exr) (x, f x)
≡ λ x <span class="ot">→</span> g (f x)
≡ g ∘ f</code></pre>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  apply ∘ first f
≡ λ p <span class="ot">→</span> apply (first f p)
≡ λ (a,b) <span class="ot">→</span> apply (first f (a,b))
≡ λ (a,b) <span class="ot">→</span> apply (f a, b)
≡ λ (a,b) <span class="ot">→</span> f a b
≡ <span class="fu">uncurry</span> f</code></pre>

<p>The <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Arrow.html#v:first"><code>first</code></a> combinator is not represented directly in our <code>(↣)</code> data type, but rather is defined via simpler parts in <a href="https://github.com/conal/lambda-ccc/blob/master/src/LambdaCCC/CCC.hs" title="Source module"><code>LambdaCCC.CCC</code></a>:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">first <span class="ot">∷</span> (a ↣ c) <span class="ot">→</span> (a × b ↣ c × b)
first f <span class="fu">=</span> f × <span class="dt">Id</span>

(×) <span class="ot">∷</span> (a ↣ c) <span class="ot">→</span> (b ↣ d) <span class="ot">→</span> (a × b ↣ c × d)
f × g <span class="fu">=</span> f <span class="fu">@</span>∘ <span class="dt">Exl</span> △ g <span class="fu">@</span>∘ <span class="dt">Exr</span></code></pre>

<p>Implementations of these two properties:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">composeApply (<span class="dt">Curry</span> (decompR <span class="ot">→</span> g <span class="fu">:</span>∘ <span class="dt">Exr</span>) <span class="fu">:</span>△ f) <span class="fu">=</span> g <span class="fu">@</span>∘ f

composeApply (f <span class="fu">:</span>∘ <span class="dt">Exl</span> <span class="fu">:</span>△ <span class="dt">Exr</span>) <span class="fu">=</span> uncurryE f</code></pre>

<p>These properties arose while examining CCC terms produced by translation from lambda terms. See the <a href="https://github.com/conal/lambda-ccc/blob/master/src/LambdaCCC/CCC.hs" title="Source module"><code>LambdaCCC.CCC</code></a> module for more optimizations. I expect that others will arise with more experience.</p>
]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/optimizing-cccs/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Foptimizing-cccs&amp;language=en_GB&amp;category=text&amp;title=Optimizing+CCCs&amp;description=In+the+post+Overloading+lambda%2C+I+gave+a+translation+from+a+typed+lambda+calculus+into+the+vocabulary+of+cartesian+closed+categories+%28CCCs%29.+This+simple+translation+leads+to+unnecessarily+complex+expressions....&amp;tags=category%2CCCC%2Coverloading%2Cblog" type="text/html" />
	</item>
		<item>
		<title>Overloading lambda</title>
		<link>http://conal.net/blog/posts/overloading-lambda</link>
		<comments>http://conal.net/blog/posts/overloading-lambda#comments</comments>
		<pubDate>Fri, 13 Sep 2013 16:31:40 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[category]]></category>
		<category><![CDATA[CCC]]></category>
		<category><![CDATA[overloading]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=533</guid>
		<description><![CDATA[Haskell’s type class facility is a powerful abstraction mechanism. Using it, we can overload multiple interpretations onto a single vocabulary, with each interpretation corresponding to a different type. The class laws constrain these interpretations and allow reasoning that is valid over all (law-abiding) instances—even ones not yet defined. As Haskell is a higher-order functional language [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>Haskell’s type class facility is a powerful abstraction mechanism. Using it, we can overload multiple interpretations onto a single vocabulary, with each interpretation corresponding to a different type. The class laws constrain these interpretations and allow reasoning that is valid over all (law-abiding) instances—even ones not yet defined.</p>

<p>As Haskell is a higher-order functional language in the heritage of Church’s (typed) lambda calculus, it also supports “lambda abstraction”.</p>

<p>Sadly, however, these two forms of abstraction don’t go together. When we use the vocabulary of lambda abstraction (“<code>λ x → ⋯</code>”) and application (“<code>u v</code>”), our expressions can only be interpreted as one type (constructor), namely functions. (Note that I am not talking about parametric polymorphism, which is available with both lambda abstraction and type-class-style overloading.) Is it possible to overload lambda and application using type classes, or perhaps in the same spirit? The answer is yes, and there are some wonderful benefits of doing so. I’ll explain the how in this post and hint at the why, to be elaborated in future posts.</p>

<p><span id="more-533"></span></p>

<h3 id="generalizing-functions">Generalizing functions</h3>

<p>First, let’s look at a related question. Instead of generalized interpretation of the particular <em>vocabulary</em> of lambda abstraction and application, let’s look at re-expressing functions via an alternative vocabulary that can be generalized more readily. If you are into math or have been using Haskell for a while, you may already know where I’m going: the mathematical notion of a <em>category</em> (and the embodiment in the <code>Category</code> and <code>Arrow</code> type classes).</p>

<p>Much has been written about categories, both in the setting of math and of Haskell, so I’ll give only the most cursory summary here.</p>

<p>Recall that every function has two associated sets (or types, CPOs, etc) often referred to as the function’s “domain” and “range”. (As <a href="http://math.stackexchange.com/questions/59432/domain-co-domain-range-of-a-function">explained elsewhere</a>, the term “range” can be misleading.) Moreover, there are two general building blocks (among others) for functions, namely the identity function and composition of compatibly typed functions, satisfying the following properties:</p>

<ul>
<li><em>left identity:</em>  <code>id ∘ f ≡ f</code></li>
<li><em>right identity:</em>  <code>f ∘ id ≡ f</code></li>
<li><em>associativity:</em>  <code>h ∘ (g ∘ f) ≡ (h ∘ g) ∘ f</code></li>
</ul>
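<p>For functions, these laws are easy to spot-check pointwise; for example:</p>

```haskell
-- Sample functions; the laws hold for all compatible f, g, h,
-- checked here at one sample point.
f, g, h :: Int -> Int
f = (+ 1)
g = (* 2)
h = subtract 3

leftIdentity, rightIdentity, associativity :: Bool
leftIdentity  = (id . f) 5 == f 5
rightIdentity = (f . id) 5 == f 5
associativity = (h . (g . f)) 5 == ((h . g) . f) 5
```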

<p>Now we can separate these properties from the other specifics of functions. A <em>category</em> is something that has these properties but needn’t be function-like in other ways. Each category has <em>objects</em> (e.g., sets) and <em>morphisms/arrows</em> (e.g., functions), and two building blocks <code>id</code> and <code>(∘)</code> on compatible morphisms. Rather than “domain” and “range”, we usually use the terms (a) “domain” and “codomain” or (b) “source” and “target”.</p>

<p>Examples of categories include sets &amp; functions (as we’ve seen), restricted sets &amp; functions (e.g., vector spaces &amp; linear transformations), preorders, and any monoid (as a one-object category).</p>

<p>The notion of category is very general and correspondingly weak. By imposing so few constraints, it embraces a wide range of mathematical notions (including many appearing in programming) but gives correspondingly little leverage with which to define and prove more specific ideas and theorems. Thus we’ll often want additional structure, including products, coproducts (with products distributing over coproducts) and a notion of “exponential”, which is an object that represents a morphism. For the familiar terrain of sets/types and functions, products correspond to pairing, coproducts to sums (and choice), and exponentials to functions as things/values. (In programming, we often refer to exponentials as the types of “first class functions”. Some languages have them, and some don’t.) These aspects—together with associated laws—are called “cartesian”, “cocartesian”, and “closed”, respectively. Altogether, we have “bicartesian closed categories”, more succinctly called “biCCCs” (or “CCCs”, without the cocartesian requirement).</p>

<p>The <em>cartesian</em> vocabulary consists of a product operation on objects, <code>a × b</code>, plus three morphism building blocks:</p>

<ul>
<li><code>exl ∷ a × b ↝ a</code></li>
<li><code>exr ∷ a × b ↝ b</code></li>
<li><code>f △ g ∷ a ↝ b × c</code> where <code>f ∷ a ↝ b</code> and <code>g ∷ a ↝ c</code></li>
</ul>

<p>I’m using “<code>↝</code>” to refer to morphisms.</p>

<p>We’ll also want the dual notion of coproducts, <code>a + b</code>, with building blocks and laws exactly dual to products:</p>

<ul>
<li><code>inl ∷ a ↝ a + b</code></li>
<li><code>inr ∷ b ↝ a + b</code></li>
<li><code>f ▽ g ∷ a + b ↝ c</code> where <code>f ∷ a ↝ c</code> and <code>g ∷ b ↝ c</code></li>
</ul>

<p>You may have noticed that (a) <code>exl</code> and <code>exr</code> generalize <code>fst</code> and <code>snd</code>, (b) <code>inl</code> and <code>inr</code> generalize <code>Left</code> and <code>Right</code>, and (c) <code>(△)</code> and <code>(▽)</code> come from <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Arrow.html" title="Hackage documentation"><code>Control.Arrow</code></a>, where they’re called “<code>(&amp;&amp;&amp;)</code>” and “<code>(|||)</code>”. I took the names above from <a href="http://www.cs.ox.ac.uk/jeremy.gibbons/publications/acmmpc-calcfp.pdf" title="Paper by Jeremy Gibbons"><em>Calculating Functional Programs</em></a>, where <code>(△)</code> and <code>(▽)</code> are also called “fork” and “join”.</p>
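<p>For ordinary functions we can play with these combinators directly via <code>Control.Arrow</code>:</p>

```haskell
import Control.Arrow ((&&&), (|||))

-- (&&&) is fork (△): send the input to both functions and pair the results.
forkExample :: (Int, Int)
forkExample = ((+ 1) &&& (* 2)) 3

-- (|||) is join (▽): eliminate an Either with one function per side.
joinExample :: Int
joinExample = ((+ 1) ||| (* 2)) (Left 3)
```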

<p>For product and coproduct laws, see <a href="http://www.cs.ox.ac.uk/jeremy.gibbons/publications/acmmpc-calcfp.pdf" title="Paper by Jeremy Gibbons"><em>Calculating Functional Programs</em></a> (pp 155–156) or <a href="http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.125" title="Paper by Erik Meijer, Maarten Fokkinga, and Ross Paterson"><em>Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire</em></a> (p 9).</p>

<p>The <em>closed</em> vocabulary consists of an exponential operation on objects, <code>a ⇨ b</code> (often written “<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msup><mi>b</mi><mi>a</mi></msup></mrow></math>”), plus three morphism building blocks:</p>

<ul>
<li><code>uncurry h ∷ a × b ↝ c</code> where <code>h ∷ a ↝ (b ⇨ c)</code></li>
<li><code>curry f ∷ a ↝ (b ⇨ c)</code> where <code>f ∷ a × b ↝ c</code></li>
<li><code>apply ∷ (a ⇨ b) × a ↝ b</code> (sometimes called “<code>eval</code>”)</li>
</ul>
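<p>In the category of sets and functions, exponentials are ordinary function values, <code>curry</code> and <code>uncurry</code> are the Prelude functions of those names, and <code>apply</code> is just application of a paired-up function:</p>

```haskell
-- apply, specialized to functions; curry and uncurry come from the Prelude.
applyFn :: (a -> b, a) -> b
applyFn (f, x) = f x
```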

<p>Again, there are laws associated with <code>exl</code>, <code>exr</code>, <code>(△)</code>, <code>inl</code>, <code>inr</code>, <code>(▽)</code>, and with <code>curry</code>, <code>uncurry</code>, and <code>apply</code>.</p>

<p>In reading the signatures above, the operators <code>×</code>, <code>+</code>, and <code>⇨</code> all bind more tightly than <code>↝</code>, and <code>(∘)</code> binds more tightly than <code>(△)</code> and <code>(▽)</code>.</p>

<p>Keep in mind the distinction between morphisms (“<code>↝</code>”) and exponentials (“<code>⇨</code>”). The latter is a sort of data/object representation of the former.</p>

<h3 id="where-are-we-going">Where are we going?</h3>

<p>I suggested that the <em>vocabulary</em> of the lambda calculus—namely lambda abstraction and application—can be generalized beyond functions. Then I showed something else, which is that an <em>alternative</em> vocabulary (biCCC) that applies to functions can be overloaded beyond functions. Instead of overloading the lambda calculus notation, we could simply use the alternative algebraic notation of biCCCs. Unfortunately, doing so leads to rather ugly results. The lambda calculus is a much more human-friendly notation than the algebraic language of biCCC.</p>

<p>I’m not just wasting your time and mine, however; there is a way to combine the flexibility of biCCC with the friendliness of lambda calculus: <em>automatically translate from lambda calculus to biCCC form</em>. The discovery that typed lambda calculus can be interpreted in any CCC is due to Joachim Lambek. See pointers <a href="http://math.ucr.edu/home/baez/qg-fall2006/ccc.html">on John Baez’s blog</a>. (Coproducts do not arise in translation unless the source language has a construct like <code>if-then-else</code> or definition by cases with pattern matching.)</p>

<h3 id="overview-from-lambda-expressions-to-biccc">Overview: from lambda expressions to biCCC</h3>

<p>We’re going to need a few pieces to complete this story and have it be useful in a language like Haskell:</p>

<ul>
<li>a representation of lambda expressions,</li>
<li>a representation of biCCC expressions,</li>
<li>a translation of lambda expressions to biCCC, and</li>
<li>a translation of Haskell to lambda expressions.</li>
</ul>

<p>This last step (which is actually the first step in turning Haskell into biCCC) is already done by a typical compiler. We start with a syntactically rich language and desugar it into a much smaller lambda calculus. GHC in particular has a small language called “Core”, which is much smaller than the Haskell source language.</p>

<p>I originally intended to convert from Core directly to biCCC form, but I found it difficult to do correctly. The Core representation is dynamically typed, so a type-correct Haskell program can manipulate Core in type-incorrect ways; in other words, a type-correct Haskell program can construct type-incorrect Core. Moreover, Core representations contain an enormous amount of type information, since all type inference has already been done and recorded, so keeping all of that type information consistent is tedious and error-prone. For just this reason, GHC includes an explicit type-checker, “<a href="http://www.haskell.org/ghc/docs/6.10.4/html/users_guide/options-debugging.html#checking-consistency">Core Lint</a>”, for catching type inconsistencies (but not their causes) after the fact. While Core Lint is much better than nothing, it is less helpful than static checking, which points to inconsistencies in the source code (of the Core manipulation).</p>

<p>Because I want static checking of my source code for lambda-to-biCCC conversion, I defined my own alternative to Core, using a generalized algebraic data type (GADT). The first step of translation then is conversion from GHC Core into this GADT.</p>

<p>The source fragments I’ll show below are from the Github project <a href="https://github.com/conal/lambda-ccc" title="Github project">lambda-ccc</a>.</p>

<h3 id="a-typeful-lambda-calculus-representation">A typeful lambda calculus representation</h3>

<p>In Haskell, pair types are usually written “<code>(a,b)</code>”, sums as “<code>Either a b</code>”, and functions as “<code>a → b</code>”. For the categorical generalizations (products, coproducts, and exponentials), I’ll instead use the notation “<code>a × b</code>”, “<code>a + b</code>”, and “<code>a ⇨ b</code>”. (My blogging software typesets some operators differently from what you’ll see in the <a href="https://github.com/conal/lambda-ccc/blob/master/src/LambdaCCC/Ty.hs" title="Source module">source code</a>.)</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">infixl</span> <span class="dv">7</span> ×
<span class="kw">infixl</span> <span class="dv">6</span> <span class="fu">+</span>
<span class="kw">infixr</span> <span class="dv">1</span> ⇨</code></pre>

<p>For reasons to become clearer in future posts, I’ll want a typed representation of types. The data constructors are named to reflect the types they construct:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">Ty</span> <span class="ot">∷</span> <span class="fu">*</span> <span class="ot">→</span> <span class="fu">*</span> <span class="kw">where</span>
  <span class="dt">Unit</span> <span class="ot">∷</span> <span class="dt">Ty</span> <span class="dt">Unit</span>
  (×)  <span class="ot">∷</span> <span class="dt">Ty</span> a <span class="ot">→</span> <span class="dt">Ty</span> b <span class="ot">→</span> <span class="dt">Ty</span> (a × b)
  (<span class="fu">+</span>)  <span class="ot">∷</span> <span class="dt">Ty</span> a <span class="ot">→</span> <span class="dt">Ty</span> b <span class="ot">→</span> <span class="dt">Ty</span> (a <span class="fu">+</span> b)
  (⇨)  <span class="ot">∷</span> <span class="dt">Ty</span> a <span class="ot">→</span> <span class="dt">Ty</span> b <span class="ot">→</span> <span class="dt">Ty</span> (a ⇨ b)</code></pre>

<p>Note that <code>Ty a</code> is a singleton or empty for every type <code>a</code>. I could instead use promoted data type constructors and <a href="http://www.cis.upenn.edu/~eir/packages/singletons/" title="Haskell library">singletons</a>.</p>

<p>Next, names and typed variables:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">type</span> <span class="dt">Name</span> <span class="fu">=</span> <span class="dt">String</span>
<span class="kw">data</span> <span class="dt">V</span> a <span class="fu">=</span> <span class="dt">V</span> <span class="dt">Name</span> (<span class="dt">Ty</span> a)</code></pre>

<p>Lambda expressions contain binding patterns. For now, we’ll have just the unit pattern, variables, and pairs of patterns:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">Pat</span> <span class="ot">∷</span> <span class="fu">*</span> <span class="ot">→</span> <span class="fu">*</span> <span class="kw">where</span>
  <span class="dt">UnitPat</span> <span class="ot">∷</span> <span class="dt">Pat</span> <span class="dt">Unit</span>
  <span class="dt">VarPat</span>  <span class="ot">∷</span> <span class="dt">V</span> a <span class="ot">→</span> <span class="dt">Pat</span> a
  (<span class="fu">:#</span>)    <span class="ot">∷</span> <span class="dt">Pat</span> a <span class="ot">→</span> <span class="dt">Pat</span> b <span class="ot">→</span> <span class="dt">Pat</span> (a × b)</code></pre>

<p>Finally, we have lambda expressions, with constructors for variables, constants, application, and abstraction:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">infixl</span> <span class="dv">9</span> <span class="fu">:^</span>
<span class="kw">data</span> <span class="dt">E</span> <span class="ot">∷</span> <span class="fu">*</span> <span class="ot">→</span> <span class="fu">*</span> <span class="kw">where</span>
  <span class="dt">Var</span>    <span class="ot">∷</span> <span class="dt">V</span> a <span class="ot">→</span> <span class="dt">E</span> a
  <span class="dt">ConstE</span> <span class="ot">∷</span> <span class="dt">Prim</span> a <span class="ot">→</span> <span class="dt">Ty</span> a <span class="ot">→</span> <span class="dt">E</span> a
  (<span class="fu">:^</span>)   <span class="ot">∷</span> <span class="dt">E</span> (a ⇨ b) <span class="ot">→</span> <span class="dt">E</span> a <span class="ot">→</span> <span class="dt">E</span> b
  <span class="dt">Lam</span>    <span class="ot">∷</span> <span class="dt">Pat</span> a <span class="ot">→</span> <span class="dt">E</span> b <span class="ot">→</span> <span class="dt">E</span> (a ⇨ b)</code></pre>

<p>The <code>Prim</code> GADT contains typed primitives. The <code>ConstE</code> constructor accompanies a <code>Prim</code> with its specific type, since primitives can be polymorphic.</p>

<h3 id="a-typeful-biccc-representation">A typeful biCCC representation</h3>

<p>The data type <code>a ↣ b</code> contains biCCC expressions that represent morphisms from <code>a</code> to <code>b</code>:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">data</span> (↣) <span class="ot">∷</span> <span class="fu">*</span> <span class="ot">→</span> <span class="fu">*</span> <span class="ot">→</span> <span class="fu">*</span> <span class="kw">where</span>
  <span class="co">-- Category</span>
  <span class="dt">Id</span>      <span class="ot">∷</span> a ↣ a
  (<span class="fu">:</span>∘)    <span class="ot">∷</span> (b ↣ c) <span class="ot">→</span> (a ↣ b) <span class="ot">→</span> (a ↣ c)
  <span class="co">-- Products</span>
  <span class="dt">Exl</span>     <span class="ot">∷</span> a × b ↣ a
  <span class="dt">Exr</span>     <span class="ot">∷</span> a × b ↣ b
  (<span class="fu">:</span>△)    <span class="ot">∷</span> (a ↣ b) <span class="ot">→</span> (a ↣ c) <span class="ot">→</span> (a ↣ b × c)
  <span class="co">-- Coproducts</span>
  <span class="dt">Inl</span>     <span class="ot">∷</span> a ↣ a <span class="fu">+</span> b
  <span class="dt">Inr</span>     <span class="ot">∷</span> b ↣ a <span class="fu">+</span> b
  (<span class="fu">:</span>▽)    <span class="ot">∷</span> (b ↣ a) <span class="ot">→</span> (c ↣ a) <span class="ot">→</span> (b <span class="fu">+</span> c ↣ a)
  <span class="co">-- Exponentials</span>
  <span class="dt">Apply</span>   <span class="ot">∷</span> (a ⇨ b) × a ↣ b
  <span class="dt">Curry</span>   <span class="ot">∷</span> (a × b ↣ c) <span class="ot">→</span> (a ↣ (b ⇨ c))
  <span class="dt">Uncurry</span> <span class="ot">∷</span> (a ↣ (b ⇨ c)) <span class="ot">→</span> (a × b ↣ c)
  <span class="co">-- Primitives</span>
  <span class="dt">Prim</span>    <span class="ot">∷</span> <span class="dt">Prim</span> (a <span class="ot">→</span> b) <span class="ot">→</span> (a ↣ b)
  <span class="dt">Const</span>   <span class="ot">∷</span> <span class="dt">Prim</span>       b  <span class="ot">→</span> (a ↣ b)</code></pre>

<p>The actual representation has some constraints on the type variables involved. I could have used type classes instead of a GADT here, except that the existing classes do not allow polymorphism constraints on the methods. The <code>ConstraintKinds</code> language extension allows instance-specific constraints, but I’ve been unable to work out the details in this case.</p>

<p>I’m not happy with the similarity of <code>Prim</code> and <code>Const</code>. Perhaps there’s a simpler formulation.</p>

<h3 id="lambda-to-ccc">Lambda to CCC</h3>

<p>We’ll always convert terms of the form <code>λ p → e</code>, and we’ll keep the pattern <code>p</code> and expression <code>e</code> separate:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">convert <span class="ot">∷</span> <span class="dt">Pat</span> a <span class="ot">→</span> <span class="dt">E</span> b <span class="ot">→</span> (a ↣ b)</code></pre>

<p>The pattern argument gets built up from patterns appearing in lambdas and serves as a variable binding “context”. To begin, we’ll strip the pattern off of a lambda, eta-expanding if necessary:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">toCCC <span class="ot">∷</span> <span class="dt">E</span> (a ⇨ b) <span class="ot">→</span> (a ↣ b)
toCCC (<span class="dt">Lam</span> p e) <span class="fu">=</span> convert p e
toCCC e <span class="fu">=</span> toCCC (etaExpand e)</code></pre>

<p>(We could instead begin with a dummy unit pattern/context, giving <code>toCCC</code> the type <code>E c → (() ↣ c)</code>.)</p>

<p>The conversion algorithm uses a collection of simple equivalences.</p>

<p>For constants, we have a simple equivalence:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">λ p <span class="ot">→</span> c ≡ <span class="fu">const</span> c</code></pre>

<p>Thus the implementation:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">convert _ (<span class="dt">ConstE</span> o _) <span class="fu">=</span> <span class="dt">Const</span> o</code></pre>

<p>For applications, split the expression in two (repeating the context), compute the function and argument parts separately, combine with <code>(△)</code>, and then <code>apply</code>:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">λ p <span class="ot">→</span> u v ≡ apply ∘ ((λ p <span class="ot">→</span> u) △ (λ p <span class="ot">→</span> v))</code></pre>

<p>The implementation:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">convert p (u <span class="fu">:^</span> v) <span class="fu">=</span> <span class="dt">Apply</span> <span class="fu">:</span>∘ (convert p u <span class="fu">:</span>△ convert p v)</code></pre>

<p>For lambda expressions, simply curry:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">λ p <span class="ot">→</span> λ q <span class="ot">→</span> e  ≡ <span class="fu">curry</span> (λ (p,q) <span class="ot">→</span> e)</code></pre>

<p>Assume that there is no variable shadowing, so that <code>p</code> and <code>q</code> have no variables in common. The implementation:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">convert p (<span class="dt">Lam</span> q e) <span class="fu">=</span> <span class="dt">Curry</span> (convert (p <span class="fu">:#</span> q) e)</code></pre>

<p>Finally, we have to deal with variables. Given <code>λ p → v</code> for a pattern <code>p</code> and variable <code>v</code> appearing in <code>p</code>, either <code>v ≡ p</code>, or <code>p</code> is a pair pattern with <code>v</code> appearing in the left or the right part. To handle these three possibilities, appeal to the following equivalences:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">λ v <span class="ot">→</span> v     ≡ <span class="fu">id</span>
λ (p,q) <span class="ot">→</span> e ≡ (λ p <span class="ot">→</span> e) ∘ exl  <span class="co">-- if q not free in e</span>
λ (p,q) <span class="ot">→</span> e ≡ (λ q <span class="ot">→</span> e) ∘ exr  <span class="co">-- if p not free in e</span></code></pre>
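<p>Again, a quick pointwise check for functions, where <code>exl</code> is <code>fst</code> (illustrating the second equivalence with a body in which <code>q</code> is not free):</p>

```haskell
-- λ (x,_) → x + 1  versus  (λ x → x + 1) ∘ exl, compared at a sample point
lhs, rhs :: (Int, Int) -> Int
lhs = \(x, _) -> x + 1
rhs = (\x -> x + 1) . fst
```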

<p>By a pattern not occurring freely, I mean that no variable in the pattern occurs freely.</p>

<p>These properties lead to an implementation:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">convert (<span class="dt">VarPat</span> u) (<span class="dt">Var</span> v) <span class="fu">|</span> u ≡ v              <span class="fu">=</span> <span class="dt">Id</span>
convert (p <span class="fu">:#</span> q)   e       <span class="fu">|</span> <span class="fu">not</span> (q <span class="ot">`occurs`</span> e) <span class="fu">=</span> convert p e <span class="fu">:</span>∘ <span class="dt">Exl</span>
convert (p <span class="fu">:#</span> q)   e       <span class="fu">|</span> <span class="fu">not</span> (p <span class="ot">`occurs`</span> e) <span class="fu">=</span> convert q e <span class="fu">:</span>∘ <span class="dt">Exr</span></code></pre>

<p>There are two problems with this code. The first is a performance issue. The recursive <code>convert</code> calls will do considerable redundant work due to the recursive nature of <code>occurs</code>.</p>

<p>To fix this performance problem, handle only <code>λ p → v</code> (variables), and search through the pattern structure only once, returning a <code>Maybe (a ↣ b)</code>. The return value is <code>Nothing</code> when <code>v</code> does not occur in <code>p</code>.</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">convert p (<span class="dt">Var</span> v) <span class="fu">=</span>
  fromMaybe (<span class="fu">error</span> (<span class="st">&quot;convert: unbound variable: &quot;</span> <span class="fu">++</span> <span class="fu">show</span> v)) <span class="fu">$</span>
  convertVar v p</code></pre>

<p>If a sub-pattern search succeeds, tack on the <code>( ∘ Exl)</code> or <code>( ∘ Exr)</code> using <code>(&lt;$&gt;)</code> (i.e., <code>fmap</code>). Backtrack using <code>mplus</code>.</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">convertVar <span class="ot">∷</span> <span class="ot">∀</span> b a<span class="fu">.</span> <span class="dt">V</span> b <span class="ot">→</span> <span class="dt">Pat</span> a <span class="ot">→</span> <span class="dt">Maybe</span> (a ↣ b)
convertVar u <span class="fu">=</span> conv
 <span class="kw">where</span>
   conv <span class="ot">∷</span> <span class="dt">Pat</span> c <span class="ot">→</span> <span class="dt">Maybe</span> (c ↣ b)
   conv (<span class="dt">VarPat</span> v) <span class="fu">|</span> u ≡ v    <span class="fu">=</span> <span class="kw">Just</span> <span class="dt">Id</span>
                   <span class="fu">|</span> <span class="fu">otherwise</span> <span class="fu">=</span> <span class="kw">Nothing</span>
   conv <span class="dt">UnitPat</span>  <span class="fu">=</span> <span class="kw">Nothing</span>
   conv (p <span class="fu">:#</span> q) <span class="fu">=</span> ((<span class="fu">:</span>∘ <span class="dt">Exr</span>) <span class="fu">&lt;$&gt;</span> conv q) <span class="ot">`mplus`</span> ((<span class="fu">:</span>∘ <span class="dt">Exl</span>) <span class="fu">&lt;$&gt;</span> conv p)</code></pre>

<p>(The explicit type quantification and the <code>ScopedTypeVariables</code> language extension relate the <code>b</code> in the signatures of <code>convertVar</code> and <code>conv</code>.) Note that we’ve solved the problem of redundant <code>occurs</code> testing, eliminating those tests altogether.</p>

<p>The second problem is more troubling: the definitions of <code>convert</code> for <code>Var</code> above do not type-check. Look again at the first try:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">convert <span class="ot">∷</span> <span class="dt">Pat</span> a <span class="ot">→</span> <span class="dt">E</span> b <span class="ot">→</span> (a ↣ b)
convert (<span class="dt">VarPat</span> u) (<span class="dt">Var</span> v) <span class="fu">|</span> u ≡ v <span class="fu">=</span> <span class="dt">Id</span></code></pre>

<p>The error message:</p>

<pre><code>Could not deduce (b ~ a)
...
Expected type: V a
  Actual type: V b
In the second argument of `(==)&#39;, namely `v&#39;
In the expression: u == v</code></pre>

<p>The bug here is that we cannot compare <code>u</code> and <code>v</code> for equality, because their types may differ. The definition of <code>convertVar</code> has a similar type error.</p>

<h3 id="taking-care-with-types">Taking care with types</h3>

<p>There’s a trick I’ve used in many libraries to handle this situation of wanting to compare for equality two values that may or may not have the same type. For equal values, don’t return simply <code>True</code>, but rather a proof that the types do indeed match. For unequal values, we simply fail to return an equality proof. Thus the comparison operation on <code>V</code> has the following type:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">varTyEq <span class="ot">∷</span> <span class="dt">V</span> a <span class="ot">→</span> <span class="dt">V</span> b <span class="ot">→</span> <span class="dt">Maybe</span> (a <span class="fu">:=:</span> b)</code></pre>

<p>where <code>a :=: b</code> is populated only by proofs that <code>a</code> and <code>b</code> are the same type.</p>

<p>The type of type equality proofs is defined in <a href="http://hackage.haskell.org/packages/archive/ty/0.1.4/doc/html/Data-Proof-EQ.html">Data.Proof.EQ</a> from the <a href="http://hackage.haskell.org/package/ty" title="Haskell package">ty</a> package:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">data</span> (<span class="fu">:=:</span>) <span class="ot">∷</span> <span class="fu">*</span> <span class="ot">→</span> <span class="fu">*</span> <span class="ot">→</span> <span class="fu">*</span> <span class="kw">where</span> <span class="dt">Refl</span> <span class="ot">∷</span> a <span class="fu">:=:</span> a</code></pre>

<p>The <code>Refl</code> constructor is named to suggest the axiom of reflexivity, which says that anything is equal to itself. There are other utilities for commutativity, associativity, and lifting of equality to type constructors.</p>
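<p>Such utilities fall out of pattern matching on <code>Refl</code>. Here is a minimal self-contained sketch; the names are illustrative and may differ from what the <code>ty</code> package actually exports:</p>

```haskell
{-# LANGUAGE GADTs, TypeOperators #-}

-- The type of type-equality proofs, as in Data.Proof.EQ.
data a :=: b where Refl :: a :=: a

-- Symmetry: matching on Refl reveals a ~ b, so b :=: a follows.
symm :: a :=: b -> b :=: a
symm Refl = Refl

-- Transitivity: chaining two equality proofs.
trans :: a :=: b -> b :=: c -> a :=: c
trans Refl Refl = Refl

-- Lifting equality through a type constructor.
liftEq :: a :=: b -> f a :=: f b
liftEq Refl = Refl
```

<p>Each definition type-checks only because matching <code>Refl</code> brings the corresponding type equality into scope.</p>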

<p>In fact, this pattern comes up often enough that there’s a type class in the <a href="http://hackage.haskell.org/packages/archive/ty/0.1.4/doc/html/Data-IsTy.html">Data.IsTy</a> module of the <a href="http://hackage.haskell.org/package/ty" title="Haskell package">ty</a> package:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">class</span> <span class="dt">IsTy</span> f <span class="kw">where</span>
  tyEq <span class="ot">∷</span> f a <span class="ot">→</span> f b <span class="ot">→</span> <span class="dt">Maybe</span> (a <span class="fu">:=:</span> b)</code></pre>

<p>With this trick, we can fix our type-incorrect code above. Instead of</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">convert (<span class="dt">VarPat</span> u) (<span class="dt">Var</span> v) <span class="fu">|</span> u ≡ v <span class="fu">=</span> <span class="dt">Id</span></code></pre>

<p>define</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">convert (<span class="dt">VarPat</span> u) (<span class="dt">Var</span> v) <span class="fu">|</span> <span class="kw">Just</span> <span class="dt">Refl</span> <span class="ot">←</span> u <span class="ot">`tyEq`</span> v <span class="fu">=</span> <span class="dt">Id</span></code></pre>

<p>During type-checking, GHC uses the guard (“<code>Just Refl ← u `tyEq` v</code>”) to deduce an additional <em>local</em> constraint to use in type-checking the right-hand side (here <code>Id</code>). That constraint (<code>a ~ b</code>) suffices to make the definition type-correct.</p>
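<p>Here is a small, self-contained illustration of the trick. The <code>Ty</code> type and <code>cast</code> function are hypothetical stand-ins for the post’s <code>V</code> and <code>tyEq</code>:</p>

```haskell
{-# LANGUAGE GADTs, TypeOperators #-}

data a :=: b where Refl :: a :=: a

-- A toy type representation, standing in for typed variables.
data Ty a where
  IntT  :: Ty Int
  BoolT :: Ty Bool

-- Compare two representations, returning a proof on success.
tyEqTy :: Ty a -> Ty b -> Maybe (a :=: b)
tyEqTy IntT  IntT  = Just Refl
tyEqTy BoolT BoolT = Just Refl
tyEqTy _     _     = Nothing

-- The guard's Refl introduces the local constraint a ~ b,
-- which is what lets us return x at type b.
cast :: Ty a -> Ty b -> a -> Maybe b
cast ta tb x | Just Refl <- tyEqTy ta tb = Just x
             | otherwise                 = Nothing
```

<p>Without the <code>Just Refl</code> guard, returning <code>x</code> at type <code>b</code> would fail with the same “Could not deduce (b ~ a)” error seen above.</p>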

<p>In the same way, we can fix the more efficient implementation:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">convertVar <span class="ot">∷</span> <span class="ot">∀</span> b a<span class="fu">.</span> <span class="dt">V</span> b <span class="ot">→</span> <span class="dt">Pat</span> a <span class="ot">→</span> <span class="dt">Maybe</span> (a ↣ b)
convertVar u <span class="fu">=</span> conv
 <span class="kw">where</span>
   conv <span class="ot">∷</span> <span class="dt">Pat</span> c <span class="ot">→</span> <span class="dt">Maybe</span> (c ↣ b)
   conv (<span class="dt">VarPat</span> v) <span class="fu">|</span> <span class="kw">Just</span> <span class="dt">Refl</span> <span class="ot">←</span> v <span class="ot">`tyEq`</span> u <span class="fu">=</span> <span class="kw">Just</span> <span class="dt">Id</span>
                   <span class="fu">|</span> <span class="fu">otherwise</span>              <span class="fu">=</span> <span class="kw">Nothing</span>
   conv <span class="dt">UnitPat</span>  <span class="fu">=</span> <span class="kw">Nothing</span>
   conv (p <span class="fu">:#</span> q) <span class="fu">=</span> ((<span class="fu">:</span>∘ <span class="dt">Exr</span>) <span class="fu">&lt;$&gt;</span> conv q) <span class="ot">`mplus`</span> ((<span class="fu">:</span>∘ <span class="dt">Exl</span>) <span class="fu">&lt;$&gt;</span> conv p)</code></pre>

<h3 id="example">Example</h3>

<p>To see how conversion works in practice, consider a simple swap function:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">swap (a,b) <span class="fu">=</span> (b,a)</code></pre>

<p>When reified (as explained in a future post), we get</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">λ ds <span class="ot">→</span> (λ (a,b) <span class="ot">→</span> (b,a)) ds</code></pre>

<p>Lambda expressions can be optimized at construction, in which case an <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>η</mi></mrow></math>-reduction would yield the simpler <code>λ (a,b) → (b,a)</code>. However, to make the translation more interesting, I’ll leave the lambda term unoptimized.</p>

<p>With the conversion algorithm given above, the (unoptimized) lambda term gets translated into the following:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ∘ (<span class="fu">curry</span> (apply ∘ (apply ∘ (<span class="fu">const</span> (,) △ (<span class="fu">id</span> ∘ exr) ∘ exr) △ (<span class="fu">id</span> ∘ exl) ∘ exr)) △ <span class="fu">id</span>)</code></pre>

<p>Reformatted with line breaks:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  apply
∘ ( <span class="fu">curry</span> (apply ∘ ( apply ∘ (<span class="fu">const</span> (,) △ (<span class="fu">id</span> ∘ exr) ∘ exr)
                   △ (<span class="fu">id</span> ∘ exl) ∘ exr) )
  △ <span class="fu">id</span> )</code></pre>

<p>If you squint, you may be able to see how this CCC expression relates to the lambda expression. The “<code>λ ds →</code>” got stripped initially. The remaining application “<code>(λ (a,b) → (b,a)) ds</code>” became <code>apply ∘ (⋯ △ ⋯)</code>, where the right “<code>⋯</code>” is <code>id</code>, which came from <code>ds</code>. The left “<code>⋯</code>” has a <code>curry</code> from the “<code>λ (a,b) →</code>” and two <code>apply</code>s from the curried application of <code>(,)</code> to <code>b</code> and <code>a</code>. The variables <code>b</code> and <code>a</code> become <code>(id ∘ exr) ∘ exr</code> and <code>(id ∘ exl) ∘ exr</code>, which are paths to <code>b</code> and <code>a</code> in the constructed binding pattern <code>(ds,(a,b))</code>.</p>

<p>I hope this example gives you a feeling for how the lambda-to-CCC translation works in practice, <em>and</em> for the complexity of the result. Fortunately, we can simplify the CCC terms as they’re constructed. For this example, as we’ll see in the next post, we get a much simpler result:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">exr △ exl</code></pre>

<p>This combination is common enough that it pretty-prints as</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">swapP</code></pre>

<p>when CCC desugaring is turned on. (The “<code>P</code>” suffix refers to “product”, to distinguish from coproduct swap.)</p>
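<p>In the ordinary function interpretation, with <code>(&amp;&amp;&amp;)</code> from <code>Control.Arrow</code> playing the role of <code>(△)</code>, the simplified term is just the familiar pair swap:</p>

```haskell
import Control.Arrow ((&&&))

-- exr △ exl, read as a function: snd &&& fst
swapP :: (a, b) -> (b, a)
swapP = snd &&& fst
```

<p>Applied to <code>(1,2)</code>, this yields <code>(2,1)</code>, matching the original <code>swap</code>.</p>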

<h3 id="coming-up">Coming up</h3>

<p>I’ll close this blog post now to keep it digestible. Upcoming posts will address optimization of biCCC expressions, circuit generation and analysis as biCCCs, and the GHC plugin that handles conversion of Haskell code to biCCC form, among other topics.</p>
<p><a href="http://conal.net/blog/?flattrss_redirect&amp;id=533&amp;md5=bccca46f4a5502ee6f9f238c3df66880"><img src="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png" srcset="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@2x.png 2xhttp://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@3x.png 3x" alt="Flattr this!"/></a></p>]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/overloading-lambda/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Foverloading-lambda&amp;language=en_GB&amp;category=text&amp;title=Overloading+lambda&amp;description=Haskell%E2%80%99s+type+class+facility+is+a+powerful+abstraction+mechanism.+Using+it%2C+we+can+overload+multiple+interpretations+onto+a+single+vocabulary%2C+with+each+interpretation+corresponding+to+a+different+type.+The+class...&amp;tags=category%2CCCC%2Coverloading%2Cblog" type="text/html" />
	</item>
		<item>
		<title>From Haskell to hardware via cartesian closed categories</title>
		<link>http://conal.net/blog/posts/haskell-to-hardware-via-cccs</link>
		<comments>http://conal.net/blog/posts/haskell-to-hardware-via-cccs#comments</comments>
		<pubDate>Thu, 12 Sep 2013 23:20:44 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[category]]></category>
		<category><![CDATA[CCC]]></category>
		<category><![CDATA[compilation]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=523</guid>
		<description><![CDATA[Since fall of last year, I’ve been working at Tabula, a Silicon Valley start-up developing an innovative programmable hardware architecture called “Spacetime”, somewhat similar to an FPGA, but much more flexible and efficient. I met the founder, Steve Teig, at a Bay Area Haskell Hackathon in February of 2011. He described his Spacetime architecture, which [&#8230;]]]></description>
				<content:encoded><![CDATA[<p><!-- references --></p>

<p><!-- teaser --></p>

<p>Since fall of last year, I’ve been working at <a href="http://www.tabula.com">Tabula</a>, a Silicon Valley start-up developing an innovative programmable hardware architecture called “Spacetime”, somewhat similar to an FPGA, but much more flexible and efficient. I met the founder, Steve Teig, at a Bay Area Haskell Hackathon in February of 2011. He described his Spacetime architecture, which is based on the geometry of the same name, developed by Hermann Minkowski to elegantly capture Einstein’s theory of special relativity. Within the first 30 seconds or so of hearing what Steve was up to, I knew I wanted to help.</p>

<p>The vision Steve shared with me included not only a better alternative for <em>hardware</em> designers (programmed in hardware languages like Verilog and VHDL), but also a platform for massively parallel execution of <em>software</em> written in a purely functional language. Lately, I’ve been working mainly on this latter aspect, and specifically on the problem of how to compile Haskell. Our plan is to develop the Haskell compiler openly and encourage collaboration. If anything you see in this blog series interests you, and especially if you have advice or would like to collaborate on the project, please let me know.</p>

<p>In my next series of blog posts, I’ll describe some of the technical ideas I’ve been working with for compiling Haskell for massively parallel execution. For now, I want to introduce a central idea I’m using to approach the problem.</p>

<p><span id="more-523"></span></p>

<h3 id="lambda-calculus-and-cartesian-closed-categories">Lambda calculus and cartesian closed categories</h3>

<p>I’m used to thinking of the typed lambda calculi as languages for describing functions and other mathematical values. For instance, if the type of an expression <code>e</code> is <code>Bool → Bool</code>, then the meaning of <code>e</code> is a function from Booleans to Booleans. (In non-strict pure languages like Haskell, both Boolean types include <code>⊥</code>. In hypothetically pure strict languages, the range is extended to include <code>⊥</code>, but the domain isn’t.)</p>

<p>However, there are other ways to interpret typed lambda-calculi.</p>

<p>You may have heard of “cartesian closed categories” (CCCs). CCC is an abstraction having a small vocabulary with associated laws:</p>

<ul>
<li>The “category” part means we have a notion of “morphisms” (or “arrows”) each having a domain and codomain “object”. There is an identity morphism and an associative composition operator. If this description of morphisms and objects sounds like functions and types (or sets), it’s because functions and types are one example, with <code>id</code> and <code>(∘)</code>.</li>
<li>The “cartesian” part means that we have products, with projection functions and an operator to combine two functions into a pair-producing function. For Haskell functions, these operations are <code>fst</code> and <code>snd</code>, together with <code>(&amp;&amp;&amp;)</code> from <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Arrow.html" title="Hackage documentation"><code>Control.Arrow</code></a>.</li>
<li>The “closed” part means that we have a way to represent morphisms via objects, referred to as “exponentials”. The corresponding operations are <code>curry</code>, <code>uncurry</code>, and <code>apply</code>. Since Haskell is a higher-order language, these exponential objects are simply (first class) functions.</li>
</ul>
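<p>This three-layer vocabulary can be sketched as a small hierarchy of Haskell type classes, with functions as the standard instance. This is only a sketch; the class and method names here are illustrative, not any fixed library API:</p>

```haskell
-- A rough sketch of the CCC vocabulary as type classes.
-- Compare Control.Category and Control.Arrow for the real analogues.
class Category k where
  idC   :: k a a                         -- id
  compC :: k b c -> k a b -> k a c       -- (∘)

class Category k => Cartesian k where
  exlC  :: k (a, b) a                    -- fst
  exrC  :: k (a, b) b                    -- snd
  forkC :: k a c -> k a d -> k a (c, d)  -- (&&&), i.e. (△)

class Cartesian k => Closed k where
  applyC   :: k (a -> b, a) b
  curryC   :: k (a, b) c -> k a (b -> c)
  uncurryC :: k a (b -> c) -> k (a, b) c

-- Functions form a CCC:
instance Category (->) where
  idC   = id
  compC = (.)
instance Cartesian (->) where
  exlC = fst
  exrC = snd
  forkC f g a = (f a, g a)
instance Closed (->) where
  applyC (f, a) = f a
  curryC f a b = f (a, b)
  uncurryC f (a, b) = f a b
```

<p>Other instances of these classes give the other interpretations of lambda terms mentioned below.</p>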

<p>A wonderful thing about the CCC interface is that it suffices to translate any lambda expression, as discovered by Joachim Lambek. In other words, lambda expressions can be systematically translated into the CCC vocabulary. Any (law-abiding) interpretation of that vocabulary is thus an interpretation of the lambda calculus.</p>

<p>Besides intellectual curiosity, why might one care about interpreting lambda expressions in terms of CCCs other than the one we usually think of for functional programs? I got interested because I’ve been thinking about how to compile Haskell programs to “circuits”, both the standard static kind and more dynamic variants. Since Haskell is a typed lambda calculus, if we can formulate circuits as a CCC, we’ll have our Haskell-to-circuit compiler. Other interpretations enable analysis of timing and demand propagation (including strictness).</p>

<h3 id="some-future-topics">Some future topics</h3>

<ul>
<li>Converting lambda expressions to CCC form.</li>
<li>Optimizing CCC expressions.</li>
<li>Plugging into GHC, to convert from Haskell source to CCC.</li>
<li>Applications of this translation, including the following:
<ul>
<li>Circuits</li>
<li>Timing analysis</li>
<li>Strictness/demand analysis</li>
<li>Type simplification (normalization)</li>
</ul></li>
</ul>
<p><a href="http://conal.net/blog/?flattrss_redirect&amp;id=523&amp;md5=68f0f234c70be944131a783073bc4349"><img src="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png" srcset="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@2x.png 2xhttp://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@3x.png 3x" alt="Flattr this!"/></a></p>]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/haskell-to-hardware-via-cccs/feed</wfw:commentRss>
		<slash:comments>6</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fhaskell-to-hardware-via-cccs&amp;language=en_GB&amp;category=text&amp;title=From+Haskell+to+hardware+via+cartesian+closed+categories&amp;description=Since+fall+of+last+year%2C+I%E2%80%99ve+been+working+at+Tabula%2C+a+Silicon+Valley+start-up+developing+an+innovative+programmable+hardware+architecture+called+%E2%80%9CSpacetime%E2%80%9D%2C+somewhat+similar+to+an+FPGA%2C+but+much+more...&amp;tags=category%2CCCC%2Ccompilation%2Cblog" type="text/html" />
	</item>
		<item>
		<title>Reimagining matrices</title>
		<link>http://conal.net/blog/posts/reimagining-matrices</link>
		<comments>http://conal.net/blog/posts/reimagining-matrices#comments</comments>
		<pubDate>Mon, 17 Dec 2012 02:45:42 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[category]]></category>
		<category><![CDATA[denotational design]]></category>
		<category><![CDATA[linear algebra]]></category>
		<category><![CDATA[type class morphism]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=503</guid>
		<description><![CDATA[The function of the imagination is notto make strange things settled, so much asto make settled things strange.- G.K. Chesterton Why is matrix multiplication defined so very differently from matrix addition? If we didn’t know these procedures, could we derive them from first principles? What might those principles be? This post gives a simple semantic [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- LaTeX macros -->

<!-- teaser -->

<div class="flushright">
<em>The function of the imagination is not<br />to make strange things settled, so much as<br />to make settled things strange.</em><br />- G.K. Chesterton
</div>

<p>Why is matrix multiplication defined so very differently from matrix addition? If we didn’t know these procedures, could we derive them from first principles? What might those principles be?</p>

<p>This post gives a simple semantic model for matrices and then uses it to systematically <em>derive</em> the implementations that we call matrix addition and multiplication. The development illustrates what I call “denotational design”, particularly with type class morphisms. On the way, I give a somewhat unusual formulation of matrices and accompanying definition of matrix “multiplication”.</p>

<p>For more details, see the <a href="https://github.com/conal/linear-map-gadt" title="github repository">linear-map-gadt</a> source code.</p>

<p><strong>Edits:</strong></p>

<ul>
<li>2012–12–17: Replaced lost <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>B</mi></mrow></math> entries in description of matrix addition. Thanks to Travis Cardwell.</li>
<li>2012–12–18: Added note about math/browser compatibility.</li>
</ul>

<p><strong>Note:</strong> I’m using MathML for the math below, which appears to work well on Firefox but on neither Safari nor Chrome. I use Pandoc to generate the HTML+MathML from markdown+lhs+LaTeX. There’s probably a workaround using different Pandoc settings and requiring some tweaks to my WordPress installation. If anyone knows how (especially the WordPress end), I’d appreciate some pointers.</p>

<p><span id="more-503"></span></p>

<h3 id="matrices">Matrices</h3>

<p>For now, I’ll write matrices in the usual form: <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><mo stretchy="true">(</mo><mtable><mtr><mtd><msub><mi>A</mi><mrow><mn>1</mn><mn>1</mn></mrow></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>A</mi><mrow><mn>1</mn><mi>m</mi></mrow></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd><mtd><mo>⋱</mo></mtd><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><msub><mi>A</mi><mrow><mi>n</mi><mn>1</mn></mrow></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>A</mi><mrow><mi>n</mi><mi>m</mi></mrow></msub></mtd></mtr></mtable><mo stretchy="true">)</mo></mrow></mrow></math></p>

<h4 id="addition">Addition</h4>

<p>To add two matrices, we add their corresponding components. If <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>A</mi><mo>=</mo><mrow><mo stretchy="true">(</mo><mtable><mtr><mtd><msub><mi>A</mi><mn>11</mn></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>A</mi><mrow><mn>1</mn><mi>m</mi></mrow></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd><mtd><mo>⋱</mo></mtd><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><msub><mi>A</mi><mrow><mi>n</mi><mn>1</mn></mrow></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>A</mi><mrow><mi>n</mi><mi>m</mi></mrow></msub></mtd></mtr></mtable><mo stretchy="true">)</mo></mrow><mspace width="0.167em"></mspace><mspace width="0.167em"></mspace><mrow><mtext mathvariant="normal">and </mtext><mspace width="0.333em"></mspace></mrow><mi>B</mi><mo>=</mo><mrow><mo stretchy="true">(</mo><mtable><mtr><mtd><msub><mi>B</mi><mn>11</mn></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>B</mi><mrow><mn>1</mn><mi>m</mi></mrow></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd><mtd><mo>⋱</mo></mtd><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><msub><mi>B</mi><mrow><mi>n</mi><mn>1</mn></mrow></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>B</mi><mrow><mi>n</mi><mi>m</mi></mrow></msub></mtd></mtr></mtable><mo stretchy="true">)</mo></mrow><mo>,</mo></mrow></math> then <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>A</mi><mo>+</mo><mi>B</mi><mo>=</mo><mrow><mo 
stretchy="true">(</mo><mtable><mtr><mtd><msub><mi>A</mi><mn>11</mn></msub><mo>+</mo><msub><mi>B</mi><mn>11</mn></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>A</mi><mrow><mn>1</mn><mi>m</mi></mrow></msub><mo>+</mo><msub><mi>B</mi><mrow><mn>1</mn><mi>m</mi></mrow></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd><mtd><mo>⋱</mo></mtd><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><msub><mi>A</mi><mrow><mi>n</mi><mn>1</mn></mrow></msub><mo>+</mo><msub><mi>B</mi><mrow><mi>n</mi><mn>1</mn></mrow></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>A</mi><mrow><mi>n</mi><mi>m</mi></mrow></msub><mo>+</mo><msub><mi>B</mi><mrow><mi>n</mi><mi>m</mi></mrow></msub></mtd></mtr></mtable><mo stretchy="true">)</mo></mrow><mo>.</mo></mrow></math> More succinctly, <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo stretchy="false">(</mo><mi>A</mi><mo>+</mo><mi>B</mi><msub><mo stretchy="false">)</mo><mrow><mi>i</mi><mi>j</mi></mrow></msub><mo>=</mo><msub><mi>A</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub><mo>+</mo><msub><mi>B</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub><mo>.</mo></mrow></math></p>
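<p>As executable Haskell on a simple list-of-rows representation (a sketch; real code would also check that the dimensions match), componentwise addition is a one-liner:</p>

```haskell
-- Matrices as lists of rows (illustrative representation only).
type Matrix a = [[a]]

-- Componentwise addition: (A + B)_ij = A_ij + B_ij.
addM :: Num a => Matrix a -> Matrix a -> Matrix a
addM = zipWith (zipWith (+))
```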

<h4 id="multiplication">Multiplication</h4>

<p>Multiplication, on the other hand, works quite differently. If <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>A</mi><mo>=</mo><mrow><mo stretchy="true">(</mo><mtable><mtr><mtd><msub><mi>A</mi><mn>11</mn></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>A</mi><mrow><mn>1</mn><mi>m</mi></mrow></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd><mtd><mo>⋱</mo></mtd><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><msub><mi>A</mi><mrow><mi>n</mi><mn>1</mn></mrow></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>A</mi><mrow><mi>n</mi><mi>m</mi></mrow></msub></mtd></mtr></mtable><mo stretchy="true">)</mo></mrow><mspace width="0.167em"></mspace><mspace width="0.167em"></mspace><mrow><mtext mathvariant="normal">and </mtext><mspace width="0.333em"></mspace></mrow><mi>B</mi><mo>=</mo><mrow><mo stretchy="true">(</mo><mtable><mtr><mtd><msub><mi>B</mi><mn>11</mn></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>B</mi><mrow><mn>1</mn><mi>p</mi></mrow></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd><mtd><mo>⋱</mo></mtd><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><msub><mi>B</mi><mrow><mi>m</mi><mn>1</mn></mrow></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>B</mi><mrow><mi>m</mi><mi>p</mi></mrow></msub></mtd></mtr></mtable><mo stretchy="true">)</mo></mrow><mo>,</mo></mrow></math> then <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo stretchy="false">(</mo><mi>A</mi><mo>∙</mo><mi>B</mi><msub><mo stretchy="false">)</mo><mrow><mi>i</mi><mi>j</mi></mrow></msub><mo>=</mo><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>1</mn></mrow><mi>m</mi></munderover><msub><mi>A</mi><mrow><mi>i</mi><mi>k</mi></mrow></msub><mo>⋅</mo><msub><mi>B</mi><mrow><mi>k</mi><mi>j</mi></mrow></msub><mo>.</mo></mrow></math> This time, we form the dot product of each <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>A</mi></mrow></math> row and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>B</mi></mrow></math> column.</p>
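<p>In Haskell, on a list-of-rows representation (a sketch assuming compatible dimensions), multiplication is the dot product of each row of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>A</mi></mrow></math> with each column of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>B</mi></mrow></math>:</p>

```haskell
import Data.List (transpose)

-- Matrices as lists of rows (illustrative representation only).
type Matrix a = [[a]]

-- (A ∙ B)_ij = Σ_k A_ik * B_kj:
-- dot each row of A with each column of B.
mulM :: Num a => Matrix a -> Matrix a -> Matrix a
mulM a b = [ [ sum (zipWith (*) row col) | col <- transpose b ]
           | row <- a ]
```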

<p>Why are these two matrix operations defined so differently? Perhaps these two operations are <em>implementations</em> of more fundamental <em>specifications</em>. If so, then making those specifications explicit could lead us to clear and compelling explanations of matrix addition and multiplication.</p>

<h4 id="transforming-vectors">Transforming vectors</h4>

<p>Simplifying from matrix multiplication, we have transformation of a vector by a matrix. If <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>A</mi><mo>=</mo><mrow><mo stretchy="true">(</mo><mtable><mtr><mtd><msub><mi>A</mi><mn>11</mn></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>A</mi><mrow><mn>1</mn><mi>m</mi></mrow></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd><mtd><mo>⋱</mo></mtd><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><msub><mi>A</mi><mrow><mi>n</mi><mn>1</mn></mrow></msub></mtd><mtd><mo>⋯</mo></mtd><mtd><msub><mi>A</mi><mrow><mi>n</mi><mi>m</mi></mrow></msub></mtd></mtr></mtable><mo stretchy="true">)</mo></mrow><mspace width="0.167em"></mspace><mspace width="0.167em"></mspace><mrow><mtext mathvariant="normal">and </mtext><mspace width="0.333em"></mspace></mrow><mi>x</mi><mo>=</mo><mrow><mo stretchy="true">(</mo><mtable><mtr><mtd><msub><mi>x</mi><mn>1</mn></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><msub><mi>x</mi><mi>m</mi></msub></mtd></mtr></mtable><mo stretchy="true">)</mo></mrow><mo>,</mo></mrow></math> then <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>A</mi><mo>⋅</mo><mi>x</mi><mo>=</mo><mrow><mo stretchy="true">(</mo><mtable><mtr><mtd><msub><mi>A</mi><mrow><mn>1</mn><mn>1</mn></mrow></msub><mo>⋅</mo><msub><mi>x</mi><mn>1</mn></msub></mtd><mtd><mo>+</mo></mtd><mtd><mo>⋯</mo></mtd><mtd><mo>+</mo></mtd><mtd><msub><mi>A</mi><mrow><mn>1</mn><mi>m</mi></mrow></msub><mo>⋅</mo><msub><mi>x</mi><mi>m</mi></msub></mtd></mtr><mtr><mtd><mo>⋮</mo></mtd><mtd></mtd><mtd><mo>⋱</mo></mtd><mtd></mtd><mtd><mo>⋮</mo></mtd></mtr><mtr><mtd><msub><mi>A</mi><mrow><mi>n</mi><mn>1</mn></mrow></msub><mo>⋅</mo><msub><mi>x</mi><mn>1</mn></msub></mtd><mtd><mo>+</mo></mtd><mtd><mo>⋯</mo></mtd><mtd><mo>+</mo></mtd><mtd><msub><mi>A</mi><mrow><mi>n</mi><mi>m</mi></mrow></msub><mo>⋅</mo><msub><mi>x</mi><mi>m</mi></msub></mtd></mtr></mtable><mo stretchy="true">)</mo></mrow></mrow></math> More succinctly, <math 
display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo stretchy="false">(</mo><mi>A</mi><mo>⋅</mo><mi>x</mi><msub><mo stretchy="false">)</mo><mi>i</mi></msub><mo>=</mo><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>1</mn></mrow><mi>m</mi></munderover><msub><mi>A</mi><mrow><mi>i</mi><mi>k</mi></mrow></msub><mo>⋅</mo><msub><mi>x</mi><mi>k</mi></msub><mo>.</mo></mrow></math></p>
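<p>In the same list-of-rows sketch, transforming a vector is the dot product of each row with the vector:</p>

```haskell
-- (A · x)_i = Σ_k A_ik * x_k: dot each row of A with x.
mulV :: Num a => [[a]] -> [a] -> [a]
mulV a x = [ sum (zipWith (*) row x) | row <- a ]
```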

<h3 id="whats-it-all-about">What’s it all about?</h3>

<p>We can interpret matrices <em>as</em> transformations. Matrix addition then <em>adds</em> transformations:</p>

<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo stretchy="false">(</mo><mi>A</mi><mo>+</mo><mi>B</mi><mo stretchy="false">)</mo><mspace width="0.167em"></mspace><mi>x</mi><mo>=</mo><mi>A</mi><mspace width="0.167em"></mspace><mi>x</mi><mo>+</mo><mi>B</mi><mspace width="0.167em"></mspace><mi>x</mi></mrow></math></p>

<p>Matrix “multiplication” <em>composes</em> transformations:</p>

<p><math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo stretchy="false">(</mo><mi>A</mi><mo>∙</mo><mi>B</mi><mo stretchy="false">)</mo><mspace width="0.167em"></mspace><mi>x</mi><mo>=</mo><mi>A</mi><mspace width="0.167em"></mspace><mo stretchy="false">(</mo><mi>B</mi><mspace width="0.167em"></mspace><mi>x</mi><mo stretchy="false">)</mo></mrow></math></p>

<p>What kinds of transformations?</p>

<h4 id="linear-transformations">Linear transformations</h4>

<p>Matrices represent <em>linear</em> transformations. To say that a transformation (or “function” or “map”) <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>f</mi></mrow></math> is “linear” means that <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>f</mi></mrow></math> preserves the structure of addition and scalar multiplication. In other words, <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mtable><mtr><mtd columnalign="right"><mi>f</mi><mspace width="0.167em"></mspace><mspace width="0.167em"></mspace><mo stretchy="false">(</mo><mi>x</mi><mo>+</mo><mi>y</mi><mo stretchy="false">)</mo></mtd><mtd columnalign="center"><mo>=</mo></mtd><mtd columnalign="left"><mi>f</mi><mspace width="0.167em"></mspace><mi>x</mi><mo>+</mo><mi>f</mi><mspace width="0.167em"></mspace><mi>y</mi></mtd></mtr><mtr><mtd columnalign="right"><mi>f</mi><mspace width="0.167em"></mspace><mspace width="0.167em"></mspace><mo stretchy="false">(</mo><mi>c</mi><mo>⋅</mo><mi>x</mi><mo stretchy="false">)</mo></mtd><mtd columnalign="center"><mo>=</mo></mtd><mtd columnalign="left"><mi>c</mi><mo>⋅</mo><mi>f</mi><mspace width="0.167em"></mspace><mi>x</mi></mtd></mtr></mtable></mrow></math> Equivalently, <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>f</mi></mrow></math> preserves all <em>linear combinations</em>: <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>f</mi><mspace width="0.167em"></mspace><mo stretchy="false">(</mo><msub><mi>c</mi><mn>1</mn></msub><mo>⋅</mo><msub><mi>x</mi><mn>1</mn></msub><mo>+</mo><mo>⋯</mo><mo>+</mo><msub><mi>c</mi><mi>m</mi></msub><mo>⋅</mo><msub><mi>x</mi><mi>m</mi></msub><mo stretchy="false">)</mo><mo>=</mo><msub><mi>c</mi><mn>1</mn></msub><mo>⋅</mo><mi>f</mi><mspace width="0.167em"></mspace><msub><mi>x</mi><mn>1</mn></msub><mo>+</mo><mo>⋯</mo><mo>+</mo><msub><mi>c</mi><mi>m</mi></msub><mo>⋅</mo><mi>f</mi><mspace 
width="0.167em"></mspace><msub><mi>x</mi><mi>m</mi></msub></mrow></math></p>

<p>What does it mean to say that “matrices represent linear transformations”? As we saw in the previous section, we can use a matrix to transform a vector. Our semantic function will be exactly this use: the <em>meaning</em> of a matrix is a function (map) from vectors to vectors. Moreover, these functions will satisfy the linearity properties above.</p>

<h4 id="representation">Representation</h4>

<p>For simplicity, I’m going to structure matrices in an unconventional way. Instead of a rectangular arrangement of numbers, I’ll use the following generalized algebraic data type (GADT):</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">data</span> a ⊸ b <span class="kw">where</span>
  <span class="dt">Dot</span>   <span class="ot">∷</span> <span class="dt">InnerSpace</span> b <span class="ot">⇒</span>
          b <span class="ot">→</span> (b ⊸ <span class="dt">Scalar</span> b)
  (<span class="fu">:&amp;&amp;</span>) <span class="ot">∷</span> <span class="dt">VS3</span> a c d <span class="ot">⇒</span>  <span class="co">-- vector spaces with same scalar field</span>
          (a ⊸ c) <span class="ot">→</span> (a ⊸ d) <span class="ot">→</span> (a ⊸ c × d)</code></pre>

<p>I’m using the notation “<code>c × d</code>” in place of the usual “<code>(c,d)</code>”. Precedences are such that “<code>×</code>” binds more tightly than “<code>⊸</code>”, which binds more tightly than “<code>→</code>”.</p>

<p>This definition builds on the <a href="http://hackage.haskell.org/packages/archive/vector-space/latest/doc/html/Data-VectorSpace.html#t:VectorSpace" title="Hackage documentation"><code>VectorSpace</code></a> class, with its associated <code>Scalar</code> type and <a href="http://hackage.haskell.org/packages/archive/vector-space/latest/doc/html/Data-VectorSpace.html#t:InnerSpace" title="Hackage documentation"><code>InnerSpace</code></a> subclass. Using <code>VectorSpace</code> is overkill for linear maps. It suffices to use <a href="http://en.wikipedia.org/wiki/Module_%28mathematics%29" title="Wikipedia entry">module</a>s over <a href="http://en.wikipedia.org/wiki/Semiring" title="Wikipedia entry">semiring</a>s, which means that we don’t assume multiplicative or additive inverses. The more general setting enables many more useful applications than vector spaces do, some of which I will describe in future posts.</p>

<p>The idea here is that a linear map results in either (a) a scalar, in which case it’s equivalent to <code>dot v</code> (partially applied dot product) for some <code>v</code>, or (b) a product, in which case it can be decomposed into two linear maps with simpler range types. Each row in a conventional matrix corresponds to <code>Dot v</code> for some vector <code>v</code>, and the stacking of rows corresponds to nested applications of <code>(:&amp;&amp;)</code>.</p>

<h4 id="semantics">Semantics</h4>

<p>The semantic function, <code>apply</code>, interprets a representation of a linear map as a function (satisfying linearity):</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply <span class="ot">∷</span> (a ⊸ b) <span class="ot">→</span> (a <span class="ot">→</span> b)
apply (<span class="dt">Dot</span> b)   <span class="fu">=</span> dot b
apply (f <span class="fu">:&amp;&amp;</span> g) <span class="fu">=</span> apply f <span class="fu">&amp;&amp;&amp;</span> apply g</code></pre>

<p>where <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Arrow.html#v:-38--38--38-" title="Hackage documentation"><code>(&amp;&amp;&amp;)</code></a> is from <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Arrow.html" title="Hackage documentation"><code>Control.Arrow</code></a>.</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(<span class="fu">&amp;&amp;&amp;</span>) <span class="ot">∷</span> <span class="dt">Arrow</span> (↝) <span class="ot">⇒</span> (a ↝ b) <span class="ot">→</span> (a ↝ c) <span class="ot">→</span> (a ↝ (b,c))</code></pre>

<p>For functions,</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(f <span class="fu">&amp;&amp;&amp;</span> g) a <span class="fu">=</span> (f a, g a)</code></pre>

<h3 id="functions-linearity-and-multilinearity">Functions, linearity, and multilinearity</h3>

<p>Functions form a vector space, with scaling and addition defined “pointwise”. Instances from the <a href="http://hackage.haskell.org/package/vector-space" title="Hackage package">vector-space</a> package:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">AdditiveGroup</span> v <span class="ot">⇒</span> <span class="dt">AdditiveGroup</span> (a <span class="ot">→</span> v) <span class="kw">where</span>
  zeroV   <span class="fu">=</span> pure   zeroV
  (<span class="fu">^+^</span>)   <span class="fu">=</span> liftA2 (<span class="fu">^+^</span>)
  negateV <span class="fu">=</span> <span class="fu">fmap</span>   negateV

<span class="kw">instance</span> <span class="dt">VectorSpace</span> v <span class="ot">⇒</span> <span class="dt">VectorSpace</span> (a <span class="ot">→</span> v) <span class="kw">where</span>
  <span class="kw">type</span> <span class="dt">Scalar</span> (a <span class="ot">→</span> v) <span class="fu">=</span> a <span class="ot">→</span> <span class="dt">Scalar</span> v
  (<span class="fu">*^</span>) s <span class="fu">=</span> <span class="fu">fmap</span> (s <span class="fu">*^</span>)</code></pre>

<p>I wrote the definitions in this form to fit a template for applicative functors in general. Inlining the definitions of <code>pure</code>, <code>liftA2</code>, and <code>fmap</code> on functions, we get the following equivalent instances:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">AdditiveGroup</span> v <span class="ot">⇒</span> <span class="dt">AdditiveGroup</span> (a <span class="ot">→</span> v) <span class="kw">where</span>
  zeroV     <span class="fu">=</span> λ _ <span class="ot">→</span> zeroV
  f <span class="fu">^+^</span> g   <span class="fu">=</span> λ a <span class="ot">→</span> f a <span class="fu">^+^</span> g a
  negateV f <span class="fu">=</span> λ a <span class="ot">→</span> negateV (f a)

<span class="kw">instance</span> <span class="dt">VectorSpace</span> v <span class="ot">⇒</span> <span class="dt">VectorSpace</span> (a <span class="ot">→</span> v) <span class="kw">where</span>
  <span class="kw">type</span> <span class="dt">Scalar</span> (a <span class="ot">→</span> v) <span class="fu">=</span> a <span class="ot">→</span> <span class="dt">Scalar</span> v
  s <span class="fu">*^</span> f <span class="fu">=</span> λ a <span class="ot">→</span> s <span class="fu">*^</span> f a</code></pre>
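<p>As a quick sanity check of the pointwise structure, here is a sketch with the codomain fixed to <code>Double</code>. The names <code>zeroF</code>, <code>addF</code>, and <code>sclF</code> are my hypothetical stand-ins for <code>zeroV</code>, <code>(^+^)</code>, and <code>(*^)</code>; since <code>Scalar (a → v) = a → Scalar v</code>, I read the scalar in <code>sclF</code> as itself a function, applied pointwise.</p>

```haskell
-- Pointwise vector-space structure on functions, codomain fixed to Double
-- (sketch; names are hypothetical stand-ins for zeroV, (^+^), (*^)):
zeroF :: a -> Double
zeroF = const 0

addF :: (a -> Double) -> (a -> Double) -> (a -> Double)
addF f g = \a -> f a + g a

-- Per Scalar (a -> v) = a -> Scalar v, the "scalar" is itself a function:
sclF :: (a -> Double) -> (a -> Double) -> (a -> Double)
sclF s f = \a -> s a * f a

main :: IO ()
main = do
  print (addF (* 2) (* 3) 5)        -- 25.0
  print (sclF (const 10) (+ 1) 4)   -- 50.0
```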

<p>In math, we usually say that dot product is “bilinear”, or “linear in each argument”, i.e.,</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">dot (s <span class="fu">*^</span> u,v) ≡ s <span class="fu">*^</span> dot (u,v)
dot (u <span class="fu">^+^</span> w, v) ≡ dot (u,v) <span class="fu">^+^</span> dot (w,v)</code></pre>

<p>Similarly for the second argument:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">dot (u,s <span class="fu">*^</span> v) ≡ s <span class="fu">*^</span> dot (u,v)
dot (u, v <span class="fu">^+^</span> w) ≡ dot (u,v) <span class="fu">^+^</span> dot (u,w)</code></pre>

<p>Now recast the first of these properties in a curried form:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">dot (s <span class="fu">*^</span> u) v ≡ s <span class="fu">*^</span> dot u v</code></pre>

<p>i.e.,</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">dot (s <span class="fu">*^</span> u)
 ≡ <span class="co">{- η-expand -}</span>
λ v <span class="ot">→</span> dot (s <span class="fu">*^</span> u) v
 ≡ <span class="co">{- &quot;bilinearity&quot; -}</span>
λ v <span class="ot">→</span> s <span class="fu">*^</span> dot u v
 ≡ <span class="co">{- (*^) on functions -}</span>
λ v <span class="ot">→</span> (s <span class="fu">*^</span> dot u) v
 ≡ <span class="co">{- η-contract -}</span>
s <span class="fu">*^</span> dot u</code></pre>

<p>Likewise,</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">dot (u <span class="fu">^+^</span> v)
 ≡ <span class="co">{- η-expand -}</span>
λ w <span class="ot">→</span> dot (u <span class="fu">^+^</span> v) w
 ≡ <span class="co">{- &quot;bilinearity&quot; -}</span>
λ w <span class="ot">→</span> dot u w <span class="fu">^+^</span> dot v w
 ≡ <span class="co">{- (^+^) on functions -}</span>
dot u <span class="fu">^+^</span> dot v</code></pre>

<p>Thus, when “bilinearity” is recast in terms of curried functions, it becomes just linearity. (The same reasoning applies more generally to multilinearity.)</p>
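<p>A quick numeric check of these two curried-linearity properties, with vectors as lists of <code>Double</code>s (my sketch; <code>scale</code> and <code>addV</code> stand in for <code>(*^)</code> and <code>(^+^)</code>):</p>

```haskell
-- Checking that curried dot product is linear in its first argument:
dot :: [Double] -> [Double] -> Double
dot u v = sum (zipWith (*) u v)

scale :: Double -> [Double] -> [Double]   -- stands in for (*^)
scale s = map (s *)

addV :: [Double] -> [Double] -> [Double]  -- stands in for (^+^)
addV = zipWith (+)

main :: IO ()
main = do
  let u = [1,2]; v = [3,4]; w = [5,6]; s = 5
  -- dot (s *^ u) ≡ s *^ dot u, applied at v:
  print (dot (scale s u) v == s * dot u v)        -- True
  -- dot (u ^+^ w) ≡ dot u ^+^ dot w, applied at v:
  print (dot (addV u w) v == dot u v + dot w v)   -- True
```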

<p>Note that we could also define function addition as follows:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">f <span class="fu">^+^</span> g <span class="fu">=</span> add ∘ (f <span class="fu">&amp;&amp;&amp;</span> g)</code></pre>

<p>where</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">add <span class="fu">=</span> <span class="fu">uncurry</span> (<span class="fu">^+^</span>)</code></pre>

<p>This uncurried form will come in handy in derivations below.</p>

<h3 id="deriving-matrix-operations">Deriving matrix operations</h3>

<h4 id="addition-1">Addition</h4>

<p>We’ll add two linear maps using the <a href="http://hackage.haskell.org/packages/archive/vector-space/latest/doc/html/Data-AdditiveGroup.html#v:-94--43--94-" title="Hackage documentation"><code>(^+^)</code></a> operation from <a href="http://hackage.haskell.org/packages/archive/vector-space/latest/doc/html/Data-AdditiveGroup.html" title="Hackage documentation"><code>Data.AdditiveGroup</code></a>.</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(<span class="fu">^+^</span>) <span class="ot">∷</span> (a ⊸ b) <span class="ot">→</span> (a ⊸ b) <span class="ot">→</span> (a ⊸ b)</code></pre>

<p>Following the principle of semantic <a href="http://conal.net/blog/tag/type-class-morphism/" title="Posts on type class morphisms">type class morphism</a>s, the specification simply says that the meaning of the sum is the sum of the meanings:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply (f <span class="fu">^+^</span> g) ≡ apply f <span class="fu">^+^</span> apply g</code></pre>

<p>which is half of the definition of “linearity” for <code>apply</code>.</p>

<p>The game plan (as always) is to use the semantic specification to derive (or “calculate”) a correct implementation of each operation. For addition, this goal means we want to come up with a definition like</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">f <span class="fu">^+^</span> g <span class="fu">=</span> <span class="fu">&lt;</span>rhs<span class="fu">&gt;</span></code></pre>

<p>where <code>&lt;rhs&gt;</code> is some expression in terms of <code>f</code> and <code>g</code> whose <em>meaning</em> is the same as the meaning of <code>f ^+^ g</code>, i.e., where</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply (f <span class="fu">^+^</span> g) ≡ apply <span class="fu">&lt;</span>rhs<span class="fu">&gt;</span></code></pre>

<p>Since Haskell has convenient pattern matching, we’ll use it for our definition of <code>(^+^)</code> above. Because addition has two arguments and our data type has two constructors, there are at most four cases to consider.</p>

<p>First, add <code>Dot</code> and <code>Dot</code>. The specification</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply (f <span class="fu">^+^</span> g) ≡ apply f <span class="fu">^+^</span> apply g</code></pre>

<p>specializes to</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply (<span class="dt">Dot</span> b <span class="fu">^+^</span> <span class="dt">Dot</span> c) ≡ apply (<span class="dt">Dot</span> b) <span class="fu">^+^</span> apply (<span class="dt">Dot</span> c)</code></pre>

<p>Now simplify the right-hand side (RHS):</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply (<span class="dt">Dot</span> b) <span class="fu">^+^</span> apply (<span class="dt">Dot</span> c)
 ≡ <span class="co">{- apply definition -}</span>
dot b <span class="fu">^+^</span> dot c
 ≡ <span class="co">{- (bi)linearity of dot, as described above -}</span>
dot (b <span class="fu">^+^</span> c)
 ≡ <span class="co">{- apply definition -}</span>
apply (<span class="dt">Dot</span> (b <span class="fu">^+^</span> c))</code></pre>

<p>So our specialized specification becomes</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply (<span class="dt">Dot</span> b <span class="fu">^+^</span> <span class="dt">Dot</span> c) ≡ apply (<span class="dt">Dot</span> (b <span class="fu">^+^</span> c))</code></pre>

<p>which is implied by</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="dt">Dot</span> b <span class="fu">^+^</span> <span class="dt">Dot</span> c ≡ <span class="dt">Dot</span> (b <span class="fu">^+^</span> c)</code></pre>

<p>and easily satisfied by the following partial definition (replacing “<code>≡</code>” by “<code>=</code>”):</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="dt">Dot</span> b <span class="fu">^+^</span> <span class="dt">Dot</span> c <span class="fu">=</span> <span class="dt">Dot</span> (b <span class="fu">^+^</span> c)</code></pre>

<p>Now consider the case of addition with two <code>(:&amp;&amp;)</code> constructors.</p>

<p>The specification specializes to</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ((f <span class="fu">:&amp;&amp;</span> g) <span class="fu">^+^</span> (h <span class="fu">:&amp;&amp;</span> k)) ≡ apply (f <span class="fu">:&amp;&amp;</span> g) <span class="fu">^+^</span> apply (h <span class="fu">:&amp;&amp;</span> k)</code></pre>

<p>As with <code>Dot</code>, simplify the RHS:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply (f <span class="fu">:&amp;&amp;</span> g) <span class="fu">^+^</span> apply (h <span class="fu">:&amp;&amp;</span> k)
 ≡ <span class="co">{- apply definition -}</span>
(apply f <span class="fu">&amp;&amp;&amp;</span> apply g) <span class="fu">^+^</span> (apply h <span class="fu">&amp;&amp;&amp;</span> apply k)
 ≡ <span class="co">{- See below -}</span>
(apply f <span class="fu">^+^</span> apply h) <span class="fu">&amp;&amp;&amp;</span> (apply g <span class="fu">^+^</span> apply k)
 ≡ <span class="co">{- induction -}</span>
apply (f <span class="fu">^+^</span> h) <span class="fu">&amp;&amp;&amp;</span> apply (g <span class="fu">^+^</span> k)
 ≡ <span class="co">{- apply definition -}</span>
apply ((f <span class="fu">^+^</span> h) <span class="fu">:&amp;&amp;</span> (g <span class="fu">^+^</span> k))</code></pre>

<p>I used the following property (on functions):</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(f <span class="fu">&amp;&amp;&amp;</span> g) <span class="fu">^+^</span> (h <span class="fu">&amp;&amp;&amp;</span> k) ≡ (f <span class="fu">^+^</span> h) <span class="fu">&amp;&amp;&amp;</span> (g <span class="fu">^+^</span> k)</code></pre>

<p>Proof:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(f <span class="fu">&amp;&amp;&amp;</span> g) <span class="fu">^+^</span> (h <span class="fu">&amp;&amp;&amp;</span> k)
 ≡ <span class="co">{- η-expand -}</span>
λ x <span class="ot">→</span> ((f <span class="fu">&amp;&amp;&amp;</span> g) <span class="fu">^+^</span> (h <span class="fu">&amp;&amp;&amp;</span> k)) x
 ≡ <span class="co">{- (&amp;&amp;&amp;) definition for functions -}</span>
λ x <span class="ot">→</span> (f x, g x) <span class="fu">^+^</span> (h x, k x)
 ≡ <span class="co">{- (^+^) definition for pairs -}</span>
λ x <span class="ot">→</span> (f x <span class="fu">^+^</span> h x, g x <span class="fu">^+^</span> k x)
 ≡ <span class="co">{- (^+^) definition for functions -}</span>
λ x <span class="ot">→</span> ((f <span class="fu">^+^</span> h) x, (g <span class="fu">^+^</span> k) x)
 ≡ <span class="co">{- (&amp;&amp;&amp;) definition for functions -}</span>
(f <span class="fu">^+^</span> h) <span class="fu">&amp;&amp;&amp;</span> (g <span class="fu">^+^</span> k)</code></pre>

<p>The specification becomes</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ((f <span class="fu">:&amp;&amp;</span> g) <span class="fu">^+^</span> (h <span class="fu">:&amp;&amp;</span> k)) ≡ apply ((f <span class="fu">^+^</span> h) <span class="fu">:&amp;&amp;</span> (g <span class="fu">^+^</span> k))</code></pre>

<p>which is easily satisfied by the following partial definition</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(f <span class="fu">:&amp;&amp;</span> g) <span class="fu">^+^</span> (h <span class="fu">:&amp;&amp;</span> k) <span class="fu">=</span> (f <span class="fu">^+^</span> h) <span class="fu">:&amp;&amp;</span> (g <span class="fu">^+^</span> k)</code></pre>

<p>The other two cases are (a) <code>Dot</code> and <code>(:&amp;&amp;)</code>, and (b) <code>(:&amp;&amp;)</code> and <code>Dot</code>, but they don’t type-check (assuming that pairs are not scalars).</p>
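<p>The two derived cases can be checked against the specification <code>apply (f ^+^ g) ≡ apply f ^+^ apply g</code> in a self-contained miniature (my sketch, scalars fixed to <code>Double</code>, vectors as lists; <code>addL</code> is my name for the linear-map <code>(^+^)</code>):</p>

```haskell
{-# LANGUAGE GADTs #-}

-- Miniature representation and semantics, as in the post but with
-- Double-list vectors (sketch):
data Lin a b where
  Dot   :: [Double] -> Lin [Double] Double
  (:&&) :: Lin a b -> Lin a c -> Lin a (b, c)

apply :: Lin a b -> a -> b
apply (Dot v)   xs = sum (zipWith (*) v xs)
apply (f :&& g) x  = (apply f x, apply g x)

-- The two derived addition cases; the mixed Dot/(:&&) cases are
-- ruled out by the result types.
addL :: Lin a b -> Lin a b -> Lin a b
addL (Dot b)   (Dot c)   = Dot (zipWith (+) b c)
addL (f :&& g) (h :&& k) = addL f h :&& addL g k

m1, m2 :: Lin [Double] (Double, Double)
m1 = Dot [1,2] :&& Dot [3,4]
m2 = Dot [5,6] :&& Dot [7,8]

main :: IO ()
main = do
  let x = [1,1]
      (a,b) = apply (addL m1 m2) x
      (c,d) = apply m1 x
      (e,f) = apply m2 x
  print ((a,b) == (c + e, d + f))  -- True
```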

<h3 id="composing-linear-maps">Composing linear maps</h3>

<p>I’ll write linear map composition as “<code>g ∘ f</code>”, with type</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(∘) <span class="ot">∷</span> (b ⊸ c) <span class="ot">→</span> (a ⊸ b) <span class="ot">→</span> (a ⊸ c)</code></pre>

<p>This notation is thanks to a <code>Category</code> instance, which depends on a generalized <code>Category</code> class that uses the recent <code>ConstraintKinds</code> language extension. (See the <a href="https://github.com/conal/linear-map-gadt" title="github repository">source code</a>.)</p>

<p>Following the semantic <a href="http://conal.net/blog/tag/type-class-morphism/" title="Posts on type class morphisms">type class morphism</a> principle again, the specification says that the meaning of the composition is the composition of the meanings:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply (g ∘ f) ≡ apply g ∘ apply f</code></pre>

<p>In the following, note that the <code>∘</code> operator binds more tightly than <code>&amp;&amp;&amp;</code>, so <code>f ∘ h &amp;&amp;&amp; g ∘ h</code> means <code>(f ∘ h) &amp;&amp;&amp; (g ∘ h)</code>.</p>

<h4 id="derivation">Derivation</h4>

<p>Again, since there are two constructors, we have four possible cases. We can handle two of these cases together, namely an outer <code>(:&amp;&amp;)</code> composed with anything. The specification:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply ((f <span class="fu">:&amp;&amp;</span> g) ∘ h) ≡ apply (f <span class="fu">:&amp;&amp;</span> g) ∘ apply h</code></pre>

<p>Reasoning proceeds as above, simplifying the RHS of the constructor-specialized specification.</p>

<p>Simplify the RHS:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply (f <span class="fu">:&amp;&amp;</span> g) ∘ apply h
 ≡ <span class="co">{- apply definition -}</span>
(apply f <span class="fu">&amp;&amp;&amp;</span> apply g) ∘ apply h
 ≡ <span class="co">{- see below -}</span>
apply f ∘ apply h <span class="fu">&amp;&amp;&amp;</span> apply g ∘ apply h
 ≡ <span class="co">{- induction -}</span>
apply (f ∘ h) <span class="fu">&amp;&amp;&amp;</span> apply (g ∘ h)
 ≡ <span class="co">{- apply definition -}</span>
apply (f ∘ h <span class="fu">:&amp;&amp;</span> g ∘ h)</code></pre>

<p>This simplification uses the following property of functions:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(p <span class="fu">&amp;&amp;&amp;</span> q) ∘ r ≡ p ∘ r <span class="fu">&amp;&amp;&amp;</span> q ∘ r</code></pre>

<p>Sufficient definition:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(f <span class="fu">:&amp;&amp;</span> g) ∘ h <span class="fu">=</span> f ∘ h <span class="fu">:&amp;&amp;</span> g ∘ h</code></pre>

<p>We have two more cases, specified as follows:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply (<span class="dt">Dot</span> c ∘ <span class="dt">Dot</span> b) ≡ apply (<span class="dt">Dot</span> c) ∘ apply (<span class="dt">Dot</span> b)

apply (<span class="dt">Dot</span> c ∘ (f <span class="fu">:&amp;&amp;</span> g)) ≡ apply (<span class="dt">Dot</span> c) ∘ apply (f <span class="fu">:&amp;&amp;</span> g)</code></pre>

<p>Based on types, <code>c</code> must be a scalar in the first case and a pair in the second. (<code>Dot b</code> produces a scalar, while <code>f :&amp;&amp; g</code> produces a pair.) Thus, we can write these two cases more specifically:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply (<span class="dt">Dot</span> s ∘ <span class="dt">Dot</span> b) ≡ apply (<span class="dt">Dot</span> s) ∘ apply (<span class="dt">Dot</span> b)

apply (<span class="dt">Dot</span> (a,b) ∘ (f <span class="fu">:&amp;&amp;</span> g)) ≡ apply (<span class="dt">Dot</span> (a,b)) ∘ apply (f <span class="fu">:&amp;&amp;</span> g)</code></pre>

<p>In the derivation, I won’t spell out as many details as before. Simplify the RHSs:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  apply (<span class="dt">Dot</span> s) ∘ apply (<span class="dt">Dot</span> b)
≡ dot s ∘ dot b
≡ dot (s <span class="fu">*^</span> b)
≡ apply (<span class="dt">Dot</span> (s <span class="fu">*^</span> b))</code></pre>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  apply (<span class="dt">Dot</span> (a,b)) ∘ apply (f <span class="fu">:&amp;&amp;</span> g)
≡ dot (a,b) ∘ (apply f <span class="fu">&amp;&amp;&amp;</span> apply g)
≡ add ∘ (dot a ∘ apply f <span class="fu">&amp;&amp;&amp;</span> dot b ∘ apply g)
≡ dot a ∘ apply f <span class="fu">^+^</span> dot b ∘ apply g
≡ apply (<span class="dt">Dot</span> a ∘ f <span class="fu">^+^</span> <span class="dt">Dot</span> b ∘ g)</code></pre>

<p>I’ve used the following properties of functions:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">dot (a,b)             ≡ add ∘ (dot a <span class="fu">***</span> dot b)

(r <span class="fu">***</span> s) ∘ (p <span class="fu">&amp;&amp;&amp;</span> q) ≡ r ∘ p <span class="fu">&amp;&amp;&amp;</span> s ∘ q

add ∘ (p <span class="fu">&amp;&amp;&amp;</span> q)       ≡ p <span class="fu">^+^</span> q

apply (f <span class="fu">^+^</span> g)       ≡ apply f <span class="fu">^+^</span> apply g</code></pre>

<p>Implementation:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"> <span class="dt">Dot</span> s     ∘ <span class="dt">Dot</span> b     <span class="fu">=</span> <span class="dt">Dot</span> (s <span class="fu">*^</span> b)
 <span class="dt">Dot</span> (a,b) ∘ (f <span class="fu">:&amp;&amp;</span> g) <span class="fu">=</span> <span class="dt">Dot</span> a ∘ f <span class="fu">^+^</span> <span class="dt">Dot</span> b ∘ g</code></pre>
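<p>The derived composition cases can likewise be checked against <code>apply (g ∘ f) ≡ apply g ∘ apply f</code>. Here is a self-contained miniature (my sketch, not the post’s library): scalars are fixed to <code>Double</code>, and a small hypothetical <code>Vec</code> class stands in for <code>VectorSpace</code>/<code>InnerSpace</code> so that <code>Dot</code> can dot against pairs as well as scalars; <code>compL</code> and <code>addL</code> are my names for <code>(∘)</code> and <code>(^+^)</code>.</p>

```haskell
{-# LANGUAGE GADTs #-}

-- A small stand-in for VectorSpace/InnerSpace, scalars fixed to Double:
class Vec a where
  dotV :: a -> a -> Double
  addV :: a -> a -> a
  sclV :: Double -> a -> a

instance Vec Double where
  dotV = (*); addV = (+); sclV = (*)

instance (Vec a, Vec b) => Vec (a, b) where
  dotV (a,b) (c,d) = dotV a c + dotV b d
  addV (a,b) (c,d) = (addV a c, addV b d)
  sclV s (a,b)     = (sclV s a, sclV s b)

data Lin a b where
  Dot   :: Vec a => a -> Lin a Double
  (:&&) :: (Vec b, Vec c) => Lin a b -> Lin a c -> Lin a (b, c)

apply :: Lin a b -> a -> b
apply (Dot v)   x = dotV v x
apply (f :&& g) x = (apply f x, apply g x)

addL :: Lin a b -> Lin a b -> Lin a b
addL (Dot u)   (Dot v)   = Dot (addV u v)
addL (f :&& g) (h :&& k) = addL f h :&& addL g k

-- The derived composition cases:
compL :: Lin b c -> Lin a b -> Lin a c
compL (f :&& g)    h         = compL f h :&& compL g h
compL (Dot s)      (Dot b)   = Dot (sclV s b)
compL (Dot (a, b)) (f :&& g) = addL (compL (Dot a) f) (compL (Dot b) g)

m1, m2 :: Lin (Double, Double) (Double, Double)
m1 = Dot (1,2) :&& Dot (3,4)   -- [[1,2],[3,4]]
m2 = Dot (5,6) :&& Dot (7,8)   -- [[5,6],[7,8]]

main :: IO ()
main = print (apply (compL m1 m2) (1,1) == apply m1 (apply m2 (1,1)))
```

The final check confirms that composing the representations matches composing their meanings on a sample vector.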

<h3 id="cross-products">Cross products</h3>

<p>Another <code>Arrow</code> operation handy for linear maps is the parallel composition (product):</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">(<span class="fu">***</span>) <span class="ot">∷</span> (a ⊸ c) <span class="ot">→</span> (b ⊸ d) <span class="ot">→</span> (a × b ⊸ c × d)</code></pre>

<p>The specification says that <code>apply</code> distributes over <code>(***)</code>. In other words, the meaning of the product is the product of the meanings.</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply (f <span class="fu">***</span> g) ≡ apply f <span class="fu">***</span> apply g</code></pre>

<p>where, on functions,</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">p <span class="fu">***</span> q <span class="fu">=</span> λ (a,b) <span class="ot">→</span> (p a, q b)
        ≡ p ∘ <span class="fu">fst</span> <span class="fu">&amp;&amp;&amp;</span> q ∘ <span class="fu">snd</span></code></pre>

<p>Simplify the specification’s RHS:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  apply f <span class="fu">***</span> apply g
≡ apply f ∘ <span class="fu">fst</span> <span class="fu">&amp;&amp;&amp;</span> apply g ∘ <span class="fu">snd</span></code></pre>

<p>If we knew how to represent <code>fst</code> and <code>snd</code> via our linear map constructors, we’d be nearly done. Instead, let’s suppose we have the following functions.</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">compFst <span class="ot">∷</span> <span class="dt">VS3</span> a b c <span class="ot">⇒</span> a ⊸ c <span class="ot">→</span> a × b ⊸ c
compSnd <span class="ot">∷</span> <span class="dt">VS3</span> a b c <span class="ot">⇒</span> b ⊸ c <span class="ot">→</span> a × b ⊸ c</code></pre>

<p>specified as follows:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">apply (compFst f) ≡ apply f ∘ <span class="fu">fst</span>
apply (compSnd g) ≡ apply g ∘ <span class="fu">snd</span></code></pre>

<p>With these two functions (to be defined) in hand, let’s try again.</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">  apply f <span class="fu">***</span> apply g
≡ apply f ∘ <span class="fu">fst</span> <span class="fu">&amp;&amp;&amp;</span> apply g ∘ <span class="fu">snd</span>
≡ apply (compFst f) <span class="fu">&amp;&amp;&amp;</span> apply (compSnd g)
≡ apply (compFst f <span class="fu">:&amp;&amp;</span> compSnd g)</code></pre>

<h4 id="composing-with-fst-and-snd">Composing with <code>fst</code> and <code>snd</code></h4>

<p>I’ll elide even more of the derivation this time, focusing the reasoning on the meanings. Relating it back to the representation is left as an exercise. The key steps in the derivation:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">dot a     ∘ <span class="fu">fst</span> ≡ dot (a,<span class="dv">0</span>)

(f <span class="fu">&amp;&amp;&amp;</span> g) ∘ <span class="fu">fst</span> ≡ f ∘ <span class="fu">fst</span> <span class="fu">&amp;&amp;&amp;</span> g ∘ <span class="fu">fst</span>

dot b     ∘ <span class="fu">snd</span> ≡ dot (<span class="dv">0</span>,b)

(f <span class="fu">&amp;&amp;&amp;</span> g) ∘ <span class="fu">snd</span> ≡ f ∘ <span class="fu">snd</span> <span class="fu">&amp;&amp;&amp;</span> g ∘ <span class="fu">snd</span></code></pre>

<p>Implementation:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">compFst (<span class="dt">Dot</span> a)   <span class="fu">=</span> <span class="dt">Dot</span> (a,zeroV)
compFst (f <span class="fu">:&amp;&amp;</span> g) <span class="fu">=</span> compFst f <span class="fu">&amp;&amp;&amp;</span> compFst g

compSnd (<span class="dt">Dot</span> b)   <span class="fu">=</span> <span class="dt">Dot</span> (zeroV,b)
compSnd (f <span class="fu">:&amp;&amp;</span> g) <span class="fu">=</span> compSnd f <span class="fu">&amp;&amp;&amp;</span> compSnd g</code></pre>

<p>where <code>zeroV</code> is the zero vector.</p>

<p>Given <code>compFst</code> and <code>compSnd</code>, we can implement <code>fst</code> and <code>snd</code> as linear maps simply as <code>compFst id</code> and <code>compSnd id</code>, where <code>id</code> is the (polymorphic) identity linear map.</p>
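<p>Here is a self-contained miniature of <code>compFst</code>, <code>compSnd</code>, and the resulting <code>(***)</code> (my sketch: scalars fixed to <code>Double</code>, a hypothetical <code>Vec</code> class in place of the vector-space classes, and <code>crossL</code> as my name for the linear-map <code>(***)</code>):</p>

```haskell
{-# LANGUAGE GADTs #-}

-- A small stand-in for the vector-space classes, scalars fixed to Double:
class Vec a where
  dotV  :: a -> a -> Double
  zeroV :: a                 -- the zero vector

instance Vec Double where
  dotV  = (*)
  zeroV = 0

instance (Vec a, Vec b) => Vec (a, b) where
  dotV (a,b) (c,d) = dotV a c + dotV b d
  zeroV            = (zeroV, zeroV)

data Lin a b where
  Dot   :: Vec a => a -> Lin a Double
  (:&&) :: (Vec b, Vec c) => Lin a b -> Lin a c -> Lin a (b, c)

apply :: Lin a b -> a -> b
apply (Dot v)   x = dotV v x
apply (f :&& g) x = (apply f x, apply g x)

-- Compose with fst / snd by padding each row with zeros:
compFst :: Vec b => Lin a c -> Lin (a, b) c
compFst (Dot a)   = Dot (a, zeroV)
compFst (f :&& g) = compFst f :&& compFst g

compSnd :: Vec a => Lin b c -> Lin (a, b) c
compSnd (Dot b)   = Dot (zeroV, b)
compSnd (f :&& g) = compSnd f :&& compSnd g

-- Parallel composition, as derived: f *** g = compFst f :&& compSnd g
crossL :: (Vec a, Vec b, Vec c, Vec d)
       => Lin a c -> Lin b d -> Lin (a, b) (c, d)
crossL f g = compFst f :&& compSnd g

f1, g1 :: Lin (Double, Double) Double
f1 = Dot (1,2)
g1 = Dot (3,4)

main :: IO ()
main = print (apply (crossL f1 g1) ((1,1), (1,1)))  -- (3.0,7.0)
```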

<h3 id="reflections">Reflections</h3>

<p>This post reflects an approach to programming that I apply wherever I’m able. As a summary:</p>

<ul>
<li>Look for an elegant <em>what</em> behind a familiar <em>how</em>.</li>
<li><em>Define</em> a semantic function for each data type.</li>
<li><em>Derive</em> a correct implementation from the semantics.</li>
</ul>

<p>You can find more examples of this methodology elsewhere in this blog and in the paper <a href="http://conal.net/papers/type-class-morphisms/"><em>Denotational design with type class morphisms</em></a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/reimagining-matrices/feed</wfw:commentRss>
		<slash:comments>10</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Freimagining-matrices&amp;language=en_GB&amp;category=text&amp;title=Reimagining+matrices&amp;description=The+function+of+the+imagination+is+notto+make+strange+things+settled%2C+so+much+asto+make+settled+things+strange.-+G.K.+Chesterton+Why+is+matrix+multiplication+defined+so+very+differently+from+matrix...&amp;tags=category%2Cdenotational+design%2Clinear+algebra%2Ctype+class+morphism%2Cblog" type="text/html" />
	</item>
		<item>
		<title>Parallel speculative addition via memoization</title>
		<link>http://conal.net/blog/posts/parallel-speculative-addition-via-memoization</link>
		<comments>http://conal.net/blog/posts/parallel-speculative-addition-via-memoization#comments</comments>
		<pubDate>Tue, 27 Nov 2012 23:39:42 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[number]]></category>
		<category><![CDATA[parallelism]]></category>
		<category><![CDATA[speculation]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=493</guid>
		<description><![CDATA[I’ve been thinking much more about parallel computation for the last couple of years, especially since starting to work at Tabula a year ago. Until getting into parallelism explicitly, I’d naïvely thought that my pure functional programming style was mostly free of sequential bias. After all, functional programming lacks the implicit accidental dependencies imposed by [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- teaser -->

<p>I’ve been thinking much more about parallel computation for the last couple of years, especially since starting to work at <a href="http://www.tabula.com">Tabula</a> a year ago. Until getting into parallelism explicitly, I’d naïvely thought that my pure functional programming style was mostly free of sequential bias. After all, functional programming lacks the implicit accidental dependencies imposed by the imperative model. Now, however, I’m coming to see that designing parallel-friendly algorithms takes attention to minimizing the depth of the remaining, explicit data dependencies.</p>

<p>As an example, consider binary addition, carried out from least to most significant bit (as usual). We can immediately compute the first (least significant) bit of the result, but in order to compute the second bit, we’ll have to know whether or not a carry resulted from the first addition. More generally, the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo stretchy="false">(</mo><mi>n</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></math><em>th</em> sum &amp; carry require knowing the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi></mrow></math><em>th</em> carry, so this algorithm does not allow parallel execution. Even if we have one processor per bit position, only one processor will be able to work at a time, due to the linear chain of dependencies.</p>

<p>One general technique for improving parallelism is <em>speculation</em>—doing more work than might be needed so that we don’t have to wait to find out exactly what <em>will</em> be needed. In this post, we’ll see a progression of definitions for bitwise addition. We’ll start with a linear-depth chain of carry dependencies and end with logarithmic depth. Moreover, by making careful use of abstraction, these versions will be simply different type specializations of a single, extremely terse polymorphic definition.</p>

<p><span id="more-493"></span></p>

<h3 id="a-full-adder">A full adder</h3>

<p>Let’s start with an adder for two one-bit numbers. Because of the possibility of overflow, the result will be two bits, which I’ll call “sum” and “carry”. So that we can chain these one-bit adders, we’ll also add a carry input.</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">addB <span class="ot">∷</span> (<span class="dt">Bool</span>,<span class="dt">Bool</span>) <span class="ot">→</span> <span class="dt">Bool</span> <span class="ot">→</span> (<span class="dt">Bool</span>,<span class="dt">Bool</span>)</code></pre>

<p>In the result, the first <code>Bool</code> will be the sum, and the second will be the carry. I’ve curried the carry input to make it stand out from the (other) addends.</p>

<p>There are a few ways to define <code>addB</code> in terms of logic operations. I like the following definition, as it shares a little work between sum &amp; carry:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">addB (a,b) cin <span class="fu">=</span> (axb ≠ cin, anb ∨ (cin ∧ axb))
 <span class="kw">where</span>
   axb <span class="fu">=</span> a ≠ b
   anb <span class="fu">=</span> a ∧ b</code></pre>

<p>I’m using <code>(≠)</code> on <code>Bool</code> for exclusive or.</p>
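<p>In plain ASCII Haskell—with <code>(/=)</code> playing the role of exclusive or—the same definition is a direct transliteration:</p>

<pre class="sourceCode"><code class="sourceCode haskell">addB :: (Bool,Bool) -> Bool -> (Bool,Bool)
addB (a,b) cin = (axb /= cin, anb || (cin &amp;&amp; axb))
 where
   axb = a /= b
   anb = a &amp;&amp; b</code></pre>

<p>A quick check: <code>addB (True,True) False == (False,True)</code>, i.e., adding 1+1 with no carry in yields sum 0 and carry 1.</p>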

<h3 id="a-ripple-carry-adder">A ripple carry adder</h3>

<p>Now suppose we have not just two bits, but two <em>sequences</em> of bits, interpreted as binary numbers arranged from least to most significant bit. For simplicity, I’d like to assume that these sequences have the same length, so rather than taking a pair of bit lists, let’s take a list of bit pairs:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">add <span class="ot">∷</span> [(<span class="dt">Bool</span>,<span class="dt">Bool</span>)] <span class="ot">→</span> <span class="dt">Bool</span> <span class="ot">→</span> ([<span class="dt">Bool</span>],<span class="dt">Bool</span>)</code></pre>

<p>To implement <code>add</code>, traverse the list of bit pairs, threading the carries:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">add [] c     <span class="fu">=</span> ([]  , c)
add (p<span class="fu">:</span>ps) c <span class="fu">=</span> (s<span class="fu">:</span>ss, c&#39;&#39;)
 <span class="kw">where</span>
   (s ,c&#39; ) <span class="fu">=</span> addB p c
   (ss,c&#39;&#39;) <span class="fu">=</span> add ps c&#39;</code></pre>

<h3 id="state">State</h3>

<p>This <code>add</code> definition contains a familiar pattern. The carry values act as a sort of <em>state</em> that gets updated in a linear (non-branching) way. The <code>State</code> monad captures this pattern of computation:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">newtype</span> <span class="dt">State</span> s a <span class="fu">=</span> <span class="dt">State</span> (s <span class="ot">→</span> (a,s))</code></pre>
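<p>For reference, here is a sketch of the usual helpers and <code>Monad</code> instance for this <code>State</code> (monomorphic versions of <code>get</code> and <code>put</code>; mtl’s <code>MonadState</code> class generalizes them):</p>

<pre class="sourceCode"><code class="sourceCode haskell">runState ∷ State s a → s → (a,s)
runState (State g) = g

instance Monad (State s) where
  return a = State (\ s → (a,s))
  m >>= f  = State (\ s → let (a,s') = runState m s
                          in  runState (f a) s')

get ∷ State s s
get = State (\ s → (s,s))

put ∷ s → State s ()
put s = State (\ _ → ((),s))</code></pre>

<p>The <code>Functor</code> and <code>Applicative</code> instances follow mechanically from the <code>Monad</code> one.</p>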

<p>By using <code>State</code> and its <code>Monad</code> instance, we can shorten our <code>add</code> definition. First we’ll need a new full adder definition, tweaked for <code>State</code>:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">addB <span class="ot">∷</span> (<span class="dt">Bool</span>,<span class="dt">Bool</span>) <span class="ot">→</span> <span class="dt">State</span> <span class="dt">Bool</span> <span class="dt">Bool</span>
addB (a,b) <span class="fu">=</span> <span class="kw">do</span> cin <span class="ot">←</span> get
                put (anb ∨ cin ∧ axb)
                <span class="fu">return</span> (axb ≠ cin)
 <span class="kw">where</span>
   anb <span class="fu">=</span> a ∧ b
   axb <span class="fu">=</span> a ≠ b</code></pre>

<p>And then the multi-bit adder:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">add <span class="ot">∷</span> [(<span class="dt">Bool</span>,<span class="dt">Bool</span>)] <span class="ot">→</span> <span class="dt">State</span> <span class="dt">Bool</span> [<span class="dt">Bool</span>]
add []     <span class="fu">=</span> <span class="fu">return</span> []
add (p<span class="fu">:</span>ps) <span class="fu">=</span> <span class="kw">do</span> s  <span class="ot">←</span> addB p
                ss <span class="ot">←</span> add ps
                <span class="fu">return</span> (s<span class="fu">:</span>ss)</code></pre>

<p>We don’t really need the <code>Monad</code> interface to define <code>add</code>. The simpler and more general <code>Applicative</code> interface suffices:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">add []     <span class="fu">=</span> pure []
add (p<span class="fu">:</span>ps) <span class="fu">=</span> liftA2 (<span class="fu">:</span>) (addB p) (add ps)</code></pre>

<p>This pattern also looks familiar. Oh — the <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Data-Traversable.html#t:Traversable"><code>Traversable</code></a> instance for lists makes for a very compact definition:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">add <span class="fu">=</span> traverse addB</code></pre>

<p>Wow. The definition is now so simple that it doesn’t depend on the specific choice of lists. To find out the most general type <code>add</code> can have (with this definition), remove the type signature, turn off the monomorphism restriction, and see what GHCi has to say:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">add <span class="ot">∷</span> <span class="kw">Traversable</span> t <span class="ot">⇒</span> t (<span class="dt">Bool</span>,<span class="dt">Bool</span>) <span class="ot">→</span> <span class="dt">State</span> <span class="dt">Bool</span> (t <span class="dt">Bool</span>)</code></pre>

<p>This constraint is <em>very</em> lenient. <code>Traversable</code> can be derived automatically for <em>all</em> algebraic data types, including nested/non-regular ones.</p>

<p>For instance,</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">Tree</span> a <span class="fu">=</span> <span class="dt">Leaf</span> a <span class="fu">|</span> <span class="dt">Branch</span> (<span class="dt">Tree</span> a) (<span class="dt">Tree</span> a)
  <span class="kw">deriving</span> (<span class="kw">Functor</span>,<span class="kw">Foldable</span>,<span class="kw">Traversable</span>)</code></pre>

<p>We can now specialize this general <code>add</code> back to lists:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">addLS <span class="ot">∷</span> [(<span class="dt">Bool</span>,<span class="dt">Bool</span>)] <span class="ot">→</span> <span class="dt">State</span> <span class="dt">Bool</span> [<span class="dt">Bool</span>]
addLS <span class="fu">=</span> add</code></pre>
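<p>A quick sanity check, using the <code>runState</code> accessor for the <code>State</code> wrapper—adding 3 (bits <code>[True,True]</code>) to 2 (bits <code>[False,True]</code>) with no carry in:</p>

<pre class="sourceCode"><code class="sourceCode haskell">runState (addLS [(True,False),(True,True)]) False
  -- ⇒ ([True,False],True)</code></pre>

<p>Reading the sum bits least-significant first and appending the carry out gives 101 in binary, i.e., 5, as expected.</p>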

<p>We can also specialize for trees:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">addTS <span class="ot">∷</span> <span class="dt">Tree</span> (<span class="dt">Bool</span>,<span class="dt">Bool</span>) <span class="ot">→</span> <span class="dt">State</span> <span class="dt">Bool</span> (<span class="dt">Tree</span> <span class="dt">Bool</span>)
addTS <span class="fu">=</span> add</code></pre>

<p>Or for depth-typed perfect trees (e.g., as described in <a href="http://conal.net/blog/posts/from-tries-to-trees/" title="blog post"><em>From tries to trees</em></a>):</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">addTnS <span class="ot">∷</span> <span class="dt">IsNat</span> n <span class="ot">⇒</span>
         <span class="dt">T</span> n (<span class="dt">Bool</span>,<span class="dt">Bool</span>) <span class="ot">→</span> <span class="dt">State</span> <span class="dt">Bool</span> (<span class="dt">T</span> n <span class="dt">Bool</span>)
addTnS <span class="fu">=</span> add</code></pre>

<p>Binary trees are often better than lists for parallelism, because they allow quick recursive splitting and joining. In the case of ripple adders, we don’t really get parallelism, however, because of the single-threaded (linear) nature of <code>State</code>. Can we get around this unfortunate linearization?</p>

<h3 id="speculation">Speculation</h3>

<p>The linearity of carry propagation interferes with parallel execution even when using a tree representation. The problem is that each <code>addB</code> (full adder) invocation must access the carry out from the previous (immediately less significant) bit position and so must wait for that carry to be computed. Since each bit addition must wait for the previous one to finish, we get linear running time, even with unlimited parallel processing available. If we didn’t have to wait for carries, we could instead get logarithmic running time using the tree representation, since subtrees could be added in parallel.</p>

<p>A way out of this dilemma is to speculatively compute the bit sums for <em>both</em> possibilities, i.e., for carry and no carry. We’ll do more work, but much less waiting.</p>
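<p>Speculation in miniature: since a carry is just a <code>Bool</code>, a computation awaiting a carry can instead be <em>tabulated</em> over both possible carry values, deferring the selection until the actual carry arrives:</p>

<pre class="sourceCode"><code class="sourceCode haskell">speculate ∷ (Bool → a) → (a,a)
speculate f = (f False, f True)

select ∷ (a,a) → Bool → a
select (e,t) c = if c then t else e</code></pre>

<p>The memoizing state monad developed next packages exactly this trick.</p>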

<h3 id="state-memoization">State memoization</h3>

<p>Recall the <code>State</code> definition:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">newtype</span> <span class="dt">State</span> s a <span class="fu">=</span> <span class="dt">State</span> (s <span class="ot">→</span> (a,s))</code></pre>

<p>Rather than using a <em>function</em> of <code>s</code>, let’s use a <em>table</em> indexed by <code>s</code>. Since <code>s</code> is <code>Bool</code> in our use, a table is simply a uniform pair, so we could replace <code>State Bool a</code> with the following:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">newtype</span> <span class="dt">BoolStateTable</span> a <span class="fu">=</span> <span class="dt">BST</span> ((a,<span class="dt">Bool</span>), (a,<span class="dt">Bool</span>))</code></pre>

<p><em>Exercise:</em> define <code>Functor</code>, <code>Applicative</code>, and <code>Monad</code> instances for <code>BoolStateTable</code>.</p>

<p>Rather than defining such a specialized type, let’s stand back and consider what’s going on. We’re replacing a function by an isomorphic data type. This replacement is exactly what memoization is about. So let’s define a general <em>memoizing state monad</em>:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell"><span class="kw">newtype</span> <span class="dt">StateTrie</span> s a <span class="fu">=</span> <span class="dt">StateTrie</span> (s ⇰ (a,s))</code></pre>

<p>Note that the definition of memoizing state is nearly identical to <code>State</code>. I’ve simply replaced “<code>→</code>” by “<code>⇰</code>”, i.e., <a href="http://conal.net/blog/tag/memoization/" title="Posts on memoization">memo</a> <a href="http://conal.net/blog/tag/trie/" title="Posts on tries">tries</a>. For the (simple) source code of <code>StateTrie</code>, see <a href="http://github.com/conal/state-trie.git">the github project</a>. (Poking around on Hackage, I just found <a href="http://hackage.haskell.org/package/monad-memo">monad-memo</a>, which looks related.)</p>
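<p>To give the flavor, here is roughly what the <code>Monad</code> instance looks like (a sketch: <code>trie</code>, <code>untrie</code>, and <code>HasTrie</code> are from the MemoTrie vocabulary, and <code>runStateTrie</code> is an assumed helper):</p>

<pre class="sourceCode"><code class="sourceCode haskell">runStateTrie ∷ HasTrie s ⇒ StateTrie s a → s → (a,s)
runStateTrie (StateTrie t) = untrie t

instance HasTrie s ⇒ Monad (StateTrie s) where
  return a = StateTrie (trie (\ s → (a,s)))
  m >>= f  = StateTrie (trie (\ s → let (a,s') = runStateTrie m s
                                    in  runStateTrie (f a) s'))</code></pre>

<p>Each bind re-tabulates over all possible states, which is affordable precisely because <code>s</code> is <code>Bool</code> here.</p>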

<p>The full-adder function <code>addB</code> is restricted to <code>State</code>, but unnecessarily so. The most general type is inferred as</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">addB <span class="ot">∷</span> <span class="dt">MonadState</span> <span class="dt">Bool</span> m <span class="ot">⇒</span> (<span class="dt">Bool</span>,<span class="dt">Bool</span>) <span class="ot">→</span> m <span class="dt">Bool</span></code></pre>

<p>where the <a href="http://hackage.haskell.org/packages/archive/mtl/latest/doc/html/Control-Monad-State-Class.html#t:MonadState"><code>MonadState</code></a> class comes from the mtl package.</p>

<p>With the type-generalized <code>addB</code>, we get a more general type for <code>add</code> as well:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">add <span class="ot">∷</span> (<span class="kw">Traversable</span> t, <span class="kw">Applicative</span> m, <span class="dt">MonadState</span> <span class="dt">Bool</span> m) <span class="ot">⇒</span>
      t (<span class="dt">Bool</span>,<span class="dt">Bool</span>) <span class="ot">→</span> m (t <span class="dt">Bool</span>)
add <span class="fu">=</span> traverse addB</code></pre>

<p>Now we can specialize <code>add</code> to work with memoized state:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">addLM <span class="ot">∷</span> [(<span class="dt">Bool</span>,<span class="dt">Bool</span>)] <span class="ot">→</span> <span class="dt">StateTrie</span> <span class="dt">Bool</span> [<span class="dt">Bool</span>]
addLM <span class="fu">=</span> add

addTM <span class="ot">∷</span> <span class="dt">Tree</span> (<span class="dt">Bool</span>,<span class="dt">Bool</span>) <span class="ot">→</span> <span class="dt">StateTrie</span> <span class="dt">Bool</span> (<span class="dt">Tree</span> <span class="dt">Bool</span>)
addTM <span class="fu">=</span> add</code></pre>

<h3 id="what-have-we-done">What have we done?</h3>

<p>The essential tricks in this post are to (a) boost parallelism by speculative evaluation (an old idea) and (b) express speculation as memoization (new, to me at least). The technique wins for binary addition thanks to the small number of possible states, which then makes memoization (full speculation) affordable.</p>

<p>I’m not suggesting that the code above has impressive parallel execution when compiled under GHC. Perhaps it could with some <a href="http://www.haskell.org/ghc/docs/latest/html/users_guide/lang-parallel.html#id653837"><code>par</code> and <code>pseq</code> annotations</a>. I haven’t tried. This exploration helps me understand a little of the space of hardware-oriented algorithms.</p>

<p>The <a href="http://www.aoki.ecei.tohoku.ac.jp/arith/mg/algorithm.html#fsa_csu">conditional sum adder</a> looks quite similar to the development above. It has the twist, however, of speculating carries on blocks of a few bits rather than single bits. It’s astonishingly easy to adapt the development above for such a hybrid scheme, forming traversable structures of sequences of bits:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">addH <span class="ot">∷</span> <span class="dt">Tree</span> [(<span class="dt">Bool</span>,<span class="dt">Bool</span>)] <span class="ot">→</span> <span class="dt">StateTrie</span> <span class="dt">Bool</span> (<span class="dt">Tree</span> [<span class="dt">Bool</span>])
addH <span class="fu">=</span> traverse (fromState ∘ add)</code></pre>

<p>I’m using the adapter <code>fromState</code> so that the inner list additions will use <code>State</code> while the outer tree additions will use <code>StateTrie</code>, thanks to type inference. This adapter memoizes and rewraps the state transition function:</p>

<pre class="sourceCode literate haskell"><code class="sourceCode haskell">fromState <span class="ot">∷</span> <span class="dt">HasTrie</span> s <span class="ot">⇒</span> <span class="dt">State</span> s a <span class="ot">→</span> <span class="dt">StateTrie</span> s a
fromState <span class="fu">=</span> <span class="dt">StateTrie</span> ∘ trie ∘ runState</code></pre>
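<p>For completeness, the inverse adapter forgets the memo table back into a state-transition function (a sketch, using <code>untrie</code> from the same memo-trie vocabulary):</p>

<pre class="sourceCode"><code class="sourceCode haskell">toState ∷ HasTrie s ⇒ StateTrie s a → State s a
toState (StateTrie t) = State (untrie t)</code></pre>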
<p><a href="http://conal.net/blog/?flattrss_redirect&amp;id=493&amp;md5=0136cef7713a1f1858ef9075530a873c"><img src="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png" srcset="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@2x.png 2x, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@3x.png 3x" alt="Flattr this!"/></a></p>]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/parallel-speculative-addition-via-memoization/feed</wfw:commentRss>
		<slash:comments>6</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fparallel-speculative-addition-via-memoization&amp;language=en_GB&amp;category=text&amp;title=Parallel+speculative+addition+via+memoization&amp;description=I%E2%80%99ve+been+thinking+much+more+about+parallel+computation+for+the+last+couple+of+years%2C+especially+since+starting+to+work+at+Tabula+a+year+ago.+Until+getting+into+parallelism+explicitly%2C+I%E2%80%99d...&amp;tags=number%2Cparallelism%2Cspeculation%2Cblog" type="text/html" />
	</item>
		<item>
		<title>A third view on trees</title>
		<link>http://conal.net/blog/posts/a-third-view-on-trees</link>
		<comments>http://conal.net/blog/posts/a-third-view-on-trees#comments</comments>
		<pubDate>Sat, 04 Jun 2011 02:46:20 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[functor]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=460</guid>
		<description><![CDATA[A few recent posts have played with trees from two perspectives. The more commonly used I call &#34;top-down&#34;, because the top-level structure is most immediately apparent. A top-down binary tree is either a leaf or a pair of such trees, and that pair can be accessed without wading through intervening structure. Much less commonly used [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- teaser -->

<p>A few recent posts have played with trees from two perspectives. The more commonly used I call &quot;top-down&quot;, because the top-level structure is most immediately apparent. A top-down binary tree is either a leaf or a pair of such trees, and that pair can be accessed without wading through intervening structure. Much less commonly used are &quot;bottom-up&quot; trees. A bottom-up binary tree is either a leaf or a single such tree of pairs. In the non-leaf case, the pair structure of the tree elements is accessible by operations like mapping, folding, or scanning. The difference is between a pair of trees and a tree of pairs.</p>

<p>As an alternative to the top-down and bottom-up views on trees, I now want to examine a third view, which is a hybrid of the two. Instead of pairs of trees or trees of pairs, this hybrid view is of trees of trees, and more specifically of bottom-up trees of top-down trees. As we&#8217;ll see, these hybrid trees emerge naturally from the top-down and bottom-up views. A later post will show how this third view lends itself to an <em>in-place</em> (destructive) scan algorithm, suitable for execution on modern GPUs.</p>

<p><strong>Edits:</strong></p>

<ul>
<li>2011-06-04: &quot;Suppose we have a bottom-up tree of top-down trees, i.e., <code>t ∷ TB (TT a)</code>.&quot; Was backwards. (Thanks to Noah Easterly.)</li>
<li>2011-06-04: Notation: &quot;<code>f ➶ n</code>&quot; and &quot;<code>f ➴ n</code>&quot;.</li>
</ul>

<p><span id="more-460"></span></p>

<p>The post <a href="http://conal.net/blog/posts/parallel-tree-scanning-by-composition/" title="blog post"><em>Parallel tree scanning by composition</em></a> defines &quot;top-down&quot; and &quot;bottom-up&quot; binary trees as follows (modulo type and constructor names):</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">TT</span> a <span class="fu">=</span> <span class="kw">LT</span> a <span class="fu">|</span> <span class="dt">BT</span> { unBT <span class="ot">&#8759;</span> <span class="dt">Pair</span> (<span class="dt">TT</span> a) } <span class="kw">deriving</span> <span class="kw">Functor</span><br /><br /><span class="kw">data</span> <span class="dt">TB</span> a <span class="fu">=</span> <span class="dt">LB</span> a <span class="fu">|</span> <span class="dt">BB</span> { unBB <span class="ot">&#8759;</span> <span class="dt">TB</span> (<span class="dt">Pair</span> a) } <span class="kw">deriving</span> <span class="kw">Functor</span></code></pre>

<p>So, while a non-leaf <code>TT</code> (top-down tree) has a pair at the top (outside), a non-leaf <code>TB</code> (bottom-up tree) has pairs at the bottom (inside).</p>
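<p>For concreteness, here is the four-leaf tree of <code>1..4</code> in both views, writing <code>(:#)</code> for the <code>Pair</code> constructor (name assumed):</p>

<pre class="sourceCode"><code class="sourceCode haskell">tt ∷ TT Int
tt = BT (BT (LT 1 :# LT 2) :# BT (LT 3 :# LT 4))

tb ∷ TB Int
tb = BB (BB (LB ((1 :# 2) :# (3 :# 4))))</code></pre>

<p>The pair structure sits outermost in <code>tt</code> and innermost in <code>tb</code>.</p>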

<p>Combining these two observations leads to an interesting possibility. Suppose we have a bottom-up tree of top-down trees, i.e., <code>t ∷ TB (TT a)</code>. If <code>t</code> is not a leaf, then <code>t ≡ BB tt</code> where <code>tt</code> is a bottom-up tree whose leaves are pairs of top-down trees, i.e., <code>tt ∷ TB (Pair (TT a))</code>. Each of those leaves of type <code>Pair (TT a)</code> can be converted to type <code>TT a</code> (single tree), simply by applying the <code>BT</code> constructor. Moreover, this transformation is invertible. For convenience, define a type alias for hybrid trees:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">type</span> <span class="dt">TH</span> a <span class="fu">=</span> <span class="dt">TB</span> (<span class="dt">TT</span> a)</code></pre>

<p>Then the two conversions:</p>

<pre class="sourceCode"><code class="sourceCode haskell">upT   <span class="ot">&#8759;</span> <span class="dt">TH</span> a <span class="ot">&#8594;</span> <span class="dt">TH</span> a<br />upT   <span class="fu">=</span> <span class="fu">fmap</span> <span class="dt">BT</span> &#8728; unBB<br /><br />downT <span class="ot">&#8759;</span> <span class="dt">TH</span> a <span class="ot">&#8594;</span> <span class="dt">TH</span> a<br />downT <span class="fu">=</span> <span class="dt">BB</span> &#8728; <span class="fu">fmap</span> unBT</code></pre>

<div class=exercise>

<p><em>Exercise:</em> Prove <code>upT</code> and <code>downT</code> are inverses where defined.</p>
<p>Answer:</p>

<div class=toggle>

<pre class="sourceCode"><code class="sourceCode haskell">  upT &#8728; downT<br />&#8801; <span class="fu">fmap</span> <span class="dt">BT</span> &#8728; unBB &#8728; <span class="dt">BB</span> &#8728; <span class="fu">fmap</span> unBT<br />&#8801; <span class="fu">fmap</span> <span class="dt">BT</span> &#8728; <span class="fu">fmap</span> unBT<br />&#8801; <span class="fu">fmap</span> (<span class="dt">BT</span> &#8728; unBT)<br />&#8801; <span class="fu">fmap</span> <span class="fu">id</span><br />&#8801; <span class="fu">id</span><br /><br />  downT &#8728; upT<br />&#8801; <span class="dt">BB</span> &#8728; <span class="fu">fmap</span> unBT &#8728; <span class="fu">fmap</span> <span class="dt">BT</span> &#8728; unBB<br />&#8801; <span class="dt">BB</span> &#8728; <span class="fu">fmap</span> (unBT &#8728; <span class="dt">BT</span>) &#8728; unBB<br />&#8801; <span class="dt">BB</span> &#8728; <span class="fu">fmap</span> <span class="fu">id</span> &#8728; unBB<br />&#8801; <span class="dt">BB</span> &#8728; <span class="fu">id</span> &#8728; unBB<br />&#8801; <span class="dt">BB</span> &#8728; unBB<br />&#8801; <span class="fu">id</span></code></pre>

</div>


</div>

<p>Consider a perfect binary leaf tree of depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi></mrow></math>, i.e., an <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi></mrow></math>-deep binary tree with each level full and data only at the leaves (where a leaf is a depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mn>0</mn></mrow></math> tree). We can view such a tree as top-down, or bottom-up, or as a hybrid.</p>

<p>Each of these three views is really <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi><mo>+</mo><mn>1</mn></mrow></math> views:</p>

<ul>
<li>Top-down: a depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi></mrow></math> tree, or a pair of depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi><mo>-</mo><mn>1</mn></mrow></math> trees, or a pair of pairs of depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi><mo>-</mo><mn>2</mn></mrow></math> trees, etc.</li>
<li>Bottom-up: a depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi></mrow></math> tree, or a depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi><mo>-</mo><mn>1</mn></mrow></math> tree of pairs, or a depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi><mo>-</mo><mn>2</mn></mrow></math> tree of pairs of pairs, etc.</li>
<li>Hybrid: a depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi></mrow></math> tree of depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mn>0</mn></mrow></math> trees, or a depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi><mo>-</mo><mn>1</mn></mrow></math> tree of depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mn>1</mn></mrow></math> trees, or, &#8230;, or a depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mn>0</mn></mrow></math> tree of depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi></mrow></math> trees.</li>
</ul>

<p>In the hybrid case, counting from <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mn>0</mn></mrow></math> to <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi></mrow></math>, the <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msup><mi>k</mi><mrow><mi>t</mi><mi>h</mi></mrow></msup></mrow></math> such view is a depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi><mo>-</mo><mi>k</mi></mrow></math> bottom-up tree whose elements (leaf values) are depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>k</mi></mrow></math> top-down trees. When <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>k</mi><mo>=</mo><mi>n</mi></mrow></math>, we have a bottom-up tree whose leaves are all single-leaf trees, and when <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>k</mi><mo>=</mo><mn>0</mn></mrow></math>, we have a single-leaf bottom-up tree containing a top-down tree. Imagine a horizontal line at depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>k</mi></mrow></math>, dividing the bottom-up outer structure from the top-down inner structure. The <code>downT</code> function moves the dividing line downward, and the <code>upT</code> function moves the line upward. Both functions are partial.</p>
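<p>The two extreme views embed directly into the hybrid type (a small sketch):</p>

<pre class="sourceCode"><code class="sourceCode haskell">fromTT ∷ TT a → TH a
fromTT = LB        -- k = 0: the whole tree is top-down, inside a single bottom-up leaf

fromTB ∷ TB a → TH a
fromTB = fmap LT   -- k = n: each element becomes a single-leaf top-down tree</code></pre>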

<h3 id="generalizing">Generalizing</h3>

<p>The role of <code>Pair</code> in the tree types above is simple and regular. We can abstract out this particular type constructor, generalizing to an arbitrary functor. I&#8217;ll call this generalization &quot;functor trees&quot;. Again, there are top-down and bottom-up versions:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">FT</span> f a <span class="fu">=</span> <span class="dt">FLT</span> a <span class="fu">|</span> <span class="dt">FBT</span> { unFBT <span class="ot">&#8759;</span> f (<span class="dt">FT</span> f a) } <span class="kw">deriving</span> <span class="kw">Functor</span><br /><br /><span class="kw">data</span> <span class="dt">FB</span> f a <span class="fu">=</span> <span class="dt">FLB</span> a <span class="fu">|</span> <span class="dt">FBB</span> { unFBB <span class="ot">&#8759;</span> <span class="dt">FB</span> f (f a) } <span class="kw">deriving</span> <span class="kw">Functor</span></code></pre>

<p>And a hybrid version, with generalized versions of <code>upT</code> and <code>downT</code>:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">type</span> <span class="dt">FH</span> f a <span class="fu">=</span> <span class="dt">FB</span> f (<span class="dt">FT</span> f a)<br /><br />upH   <span class="ot">&#8759;</span> <span class="kw">Functor</span> f <span class="ot">&#8658;</span> <span class="dt">FH</span> f a <span class="ot">&#8594;</span> <span class="dt">FH</span> f a<br />upH   <span class="fu">=</span> <span class="fu">fmap</span> <span class="dt">FBT</span> &#8728; unFBB<br /><br />downH <span class="ot">&#8759;</span> <span class="kw">Functor</span> f <span class="ot">&#8658;</span> <span class="dt">FH</span> f a <span class="ot">&#8594;</span> <span class="dt">FH</span> f a<br />downH <span class="fu">=</span> <span class="dt">FBB</span> &#8728; <span class="fu">fmap</span> unFBT</code></pre>

<p>These definitions specialize to the binary-tree versions above by substituting <code>Pair</code> for the functor parameter <code>f</code>.</p>

<h3 id="depth-typing">Depth-typing</h3>

<p>The upward and downward view-changing functions above are partial, as they can fail at extreme tree views (at depth <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mn>0</mn></mrow></math> or <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi></mrow></math>). We could make this partiality explicit by changing the result type to <code>Maybe (TH a)</code> for binary hybrid trees and to <code>Maybe (FH f a)</code> for the functor generalization. Alternatively, make the tree sizes <em>explicit</em> in the types, as in a few recent posts, including <a href="http://conal.net/blog/posts/a-trie-for-length-typed-vectors/" title="blog post"><em>A trie for length-typed vectors</em></a>. (In those posts, I used the terms &quot;right-folded&quot; and &quot;left-folded&quot; in place of &quot;top-down&quot; and &quot;bottom-up&quot;, reflecting the right- or left-folding of functor composition. The &quot;folded&quot; terms led to some confusion, especially in the context of data type folds and scans.) In the depth-typed versions, &quot;leaves&quot; are zero-ary compositions, and &quot;branches&quot; are <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo stretchy="false">(</mo><mi>m</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></math>-ary compositions for some <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>m</mi></mrow></math>.</p>

<p>Top-down:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> (&#10164;) <span class="ot">&#8759;</span> (<span class="fu">*</span> <span class="ot">&#8594;</span> <span class="fu">*</span>) <span class="ot">&#8594;</span> <span class="fu">*</span> <span class="ot">&#8594;</span> (<span class="fu">*</span> <span class="ot">&#8594;</span> <span class="fu">*</span>) <span class="kw">where</span><br />  <span class="dt">ZeroT</span> <span class="ot">&#8759;</span> a <span class="ot">&#8594;</span> (f &#10164; <span class="dt">Z</span>) a<br />  <span class="dt">SuccT</span> <span class="ot">&#8759;</span> <span class="dt">IsNat</span> n <span class="ot">&#8658;</span> f ((f &#10164; n) a) <span class="ot">&#8594;</span> (f &#10164; <span class="dt">S</span> n) a<br /><br />unZeroT <span class="ot">&#8759;</span> (f &#10164; <span class="dt">Z</span>) a <span class="ot">&#8594;</span> a<br />unZeroT (<span class="dt">ZeroT</span> a) <span class="fu">=</span> a<br /><br />unSuccT <span class="ot">&#8759;</span> (f &#10164; <span class="dt">S</span> n) a <span class="ot">&#8594;</span> f ((f &#10164; n) a)<br />unSuccT (<span class="dt">SuccT</span> fsa) <span class="fu">=</span> fsa<br /><br /><span class="kw">instance</span> <span class="kw">Functor</span> f <span class="ot">&#8658;</span> <span class="kw">Functor</span> (f &#10164; n) <span class="kw">where</span><br />  <span class="fu">fmap</span> h (<span class="dt">ZeroT</span> a)  <span class="fu">=</span> <span class="dt">ZeroT</span> (h a)<br />  <span class="fu">fmap</span> h (<span class="dt">SuccT</span> fs) <span class="fu">=</span> <span class="dt">SuccT</span> ((fmap&#8728;fmap) h fs)</code></pre>

<p>Bottom-up:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> (&#10166;) <span class="ot">&#8759;</span> (<span class="fu">*</span> <span class="ot">&#8594;</span> <span class="fu">*</span>) <span class="ot">&#8594;</span> <span class="fu">*</span> <span class="ot">&#8594;</span> (<span class="fu">*</span> <span class="ot">&#8594;</span> <span class="fu">*</span>) <span class="kw">where</span><br />  <span class="dt">ZeroB</span> <span class="ot">&#8759;</span> a <span class="ot">&#8594;</span> (f &#10166; <span class="dt">Z</span>) a<br />  <span class="dt">SuccB</span> <span class="ot">&#8759;</span> <span class="dt">IsNat</span> n <span class="ot">&#8658;</span> (f &#10166; n) (f a) <span class="ot">&#8594;</span> (f &#10166; <span class="dt">S</span> n) a<br /><br />unZeroB <span class="ot">&#8759;</span> (f &#10166; <span class="dt">Z</span>) a <span class="ot">&#8594;</span> a<br />unZeroB (<span class="dt">ZeroB</span> a) <span class="fu">=</span> a<br /><br />unSuccB <span class="ot">&#8759;</span> (f &#10166; <span class="dt">S</span> n) a <span class="ot">&#8594;</span> (f &#10166; n) (f a)<br />unSuccB (<span class="dt">SuccB</span> fsa) <span class="fu">=</span> fsa<br /><br /><span class="kw">instance</span> <span class="kw">Functor</span> f <span class="ot">&#8658;</span> <span class="kw">Functor</span> (f &#10166; n) <span class="kw">where</span><br />  <span class="fu">fmap</span> h (<span class="dt">ZeroB</span> a)  <span class="fu">=</span> <span class="dt">ZeroB</span> (h a)<br />  <span class="fu">fmap</span> h (<span class="dt">SuccB</span> fs) <span class="fu">=</span> <span class="dt">SuccB</span> ((fmap&#8728;fmap) h fs)</code></pre>

<p>Hybrid:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">type</span> <span class="dt">H</span> p q f a <span class="fu">=</span> (f &#10166; p) ((f &#10164; q) a)</code></pre>

<p>Upward and downward shift become total functions, and their types explicitly describe how the line shifts between <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo stretchy="false">(</mo><mi>p</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo><mo>/</mo><mi>q</mi></mrow></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>p</mi><mo>/</mo><mo stretchy="false">(</mo><mi>q</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></math>:</p>

<pre class="sourceCode"><code class="sourceCode haskell">up   <span class="ot">&#8759;</span> (<span class="kw">Functor</span> f, <span class="dt">IsNat</span> q) <span class="ot">&#8658;</span> <span class="dt">H</span> (<span class="dt">S</span> p) q f a <span class="ot">&#8594;</span> <span class="dt">H</span> p (<span class="dt">S</span> q) f a<br />up   <span class="fu">=</span> <span class="fu">fmap</span> <span class="dt">SuccT</span> &#8728; unSuccB<br /><br />down <span class="ot">&#8759;</span> (<span class="kw">Functor</span> f, <span class="dt">IsNat</span> p) <span class="ot">&#8658;</span> <span class="dt">H</span> p (<span class="dt">S</span> q) f a <span class="ot">&#8594;</span> <span class="dt">H</span> (<span class="dt">S</span> p) q f a<br />down <span class="fu">=</span> <span class="dt">SuccB</span> &#8728; <span class="fu">fmap</span> unSuccT</code></pre>

<h3 id="so-what">So what?</h3>

<p>Why care about the multitude of views on trees?</p>

<ul>
<li>It&#8217;s pretty.</li>
<li>A future post will show how these hybrid trees enable an elegant formulation of parallel scanning that lends itself to an in-place, GPU-friendly implementation.</li>
</ul>
<p><a href="http://conal.net/blog/?flattrss_redirect&amp;id=460&amp;md5=3a9aceb3d722783b8e463cbfc78be6e4"><img src="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png" srcset="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@2x.png 2x, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@3x.png 3x" alt="Flattr this!"/></a></p>]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/a-third-view-on-trees/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fa-third-view-on-trees&amp;language=en_GB&amp;category=text&amp;title=A+third+view+on+trees&amp;description=A+few+recent+posts+have+played+with+trees+from+two+perspectives.+The+more+commonly+used+I+call+%26quot%3Btop-down%26quot%3B%2C+because+the+top-level+structure+is+most+immediately+apparent.+A+top-down+binary+tree...&amp;tags=functor%2Cblog" type="text/html" />
	</item>
		<item>
		<title>Parallel tree scanning by composition</title>
		<link>http://conal.net/blog/posts/parallel-tree-scanning-by-composition</link>
		<comments>http://conal.net/blog/posts/parallel-tree-scanning-by-composition#comments</comments>
		<pubDate>Tue, 24 May 2011 20:31:23 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[functor]]></category>
		<category><![CDATA[program derivation]]></category>
		<category><![CDATA[scan]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=429</guid>
		<description><![CDATA[My last few blog posts have been on the theme of scans, and particularly on parallel scans. In Composable parallel scanning, I tackled parallel scanning in a very general setting. There are five simple building blocks out of which a vast assortment of data structures can be built, namely constant (no value), identity (one value), [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- teaser -->

<p>My last few blog posts have been on the theme of <em>scans</em>, and particularly on <em>parallel</em> scans. In <a href="http://conal.net/blog/posts/composable-parallel-scanning/" title="blog post"><em>Composable parallel scanning</em></a>, I tackled parallel scanning in a very general setting. There are five simple building blocks out of which a vast assortment of data structures can be built, namely constant (no value), identity (one value), sum, product, and composition. The post defined parallel prefix and suffix scan for each of these five &quot;functor combinators&quot;, in terms of the same scan operation on each of the component functors. Every functor built out of this basic set thus has a parallel scan. Functors defined more conventionally can be given scan implementations simply by converting to a composition of the basic set, scanning, and then back to the original functor. Moreover, I expect this implementation could be generated automatically, similarly to GHC&#8217;s <code>DeriveFunctor</code> extension.</p>

<p>Now I&#8217;d like to show two examples of parallel scan composition in terms of binary trees, namely the top-down and bottom-up variants of perfect binary leaf trees used in previous posts. (In previous posts, I used the terms &quot;right-folded&quot; and &quot;left-folded&quot; instead of &quot;top-down&quot; and &quot;bottom-up&quot;.) The resulting two algorithms are expressed nearly identically, but differ significantly in the work performed. The top-down version does <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo>&#920;</mo><mo stretchy="false">(</mo><mi>n</mi><mspace width="0.167em"></mspace><mi>log</mi><mspace width="0.167em"></mspace><mi>n</mi><mo stretchy="false">)</mo></mrow></math> work, while the bottom-up version does only <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo>&#920;</mo><mo stretchy="false">(</mo><mi>n</mi><mo stretchy="false">)</mo></mrow></math>, and thus the latter algorithm is work-efficient, while the former is not. Moreover, with a <em>very</em> simple optimization, the bottom-up tree algorithm corresponds closely to Guy Blelloch&#8217;s parallel prefix scan for arrays, given in <a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.53.5739" title="Paper by Guy Blelloch"><em>Programming parallel algorithms</em></a>. I&#8217;m delighted with this result, as I had been wondering how to think about Guy&#8217;s algorithm.</p>

<p><strong>Edit:</strong></p>

<ul>
<li>2011-05-31: Added <code>Scan</code> and <code>Applicative</code> instances for <code>T2</code> and <code>T4</code>.</li>
</ul>

<p><span id="more-429"></span></p>

<h3 id="scanning-via-functor-combinators">Scanning via functor combinators</h3>

<p>In <a href="http://conal.net/blog/posts/composable-parallel-scanning/" title="blog post"><em>Composable parallel scanning</em></a>, we saw the <code>Scan</code> class:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">class</span> <span class="dt">Scan</span> f <span class="kw">where</span><br />  prefixScan, suffixScan <span class="ot">&#8759;</span> <span class="dt">Monoid</span> m <span class="ot">&#8658;</span> f m <span class="ot">&#8594;</span> (m, f m)</code></pre>

<p>Given a structure of values, the prefix and suffix scan methods generate the overall <code>fold</code> (of type <code>m</code>), plus a structure of the same type as the input. (In contrast, the usual Haskell <code>scanl</code> and <code>scanr</code> functions on lists yield a single list with one more element than the source list. I changed the interface for generality and composability.) The <a href="http://conal.net/blog/posts/composable-parallel-scanning/" title="blog post">post</a> gave instances for the basic set of five functor combinators.</p>
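<p>To make this interface concrete, here is a hedged, list-only analogue of <code>prefixScan</code> (the name <code>prefixScanList</code> and the use of <code>mapAccumL</code> are my choices, not from the post; the generic class works over arbitrary functors):</p>

<pre class="sourceCode"><code class="sourceCode haskell">import Data.List (mapAccumL)
import Data.Monoid (Sum (..))

-- List-only sketch of prefixScan: return the overall fold together with
-- a same-shape list of exclusive prefix folds.
prefixScanList :: Monoid m => [m] -> (m, [m])
prefixScanList = mapAccumL (\acc m -> (mappend acc m, acc)) mempty

-- prefixScanList (map Sum [1,2,3,4]) == (Sum 10, [Sum 0, Sum 1, Sum 3, Sum 6])</code></pre>

<p>Note that the fold comes out alongside the scanned structure, rather than as an extra element tacked onto it, matching the shape-preserving interface described above.</p>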

<p>Most functors are not defined via the basic combinators, but as mentioned above, we can scan by conversion to and from the basic set. For convenience, encapsulate this conversion in a type class:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">class</span> <span class="dt">EncodeF</span> f <span class="kw">where</span><br />  <span class="kw">type</span> <span class="dt">Enc</span> f <span class="ot">&#8759;</span> <span class="fu">*</span> <span class="ot">&#8594;</span> <span class="fu">*</span><br />  encode <span class="ot">&#8759;</span> f a <span class="ot">&#8594;</span> <span class="dt">Enc</span> f a<br />  decode <span class="ot">&#8759;</span> <span class="dt">Enc</span> f a <span class="ot">&#8594;</span> f a</code></pre>

<p>and define scan functions via <code>EncodeF</code>:</p>

<pre class="sourceCode"><code class="sourceCode haskell">prefixScanEnc, suffixScanEnc <span class="ot">&#8759;</span><br />  (<span class="dt">EncodeF</span> f, <span class="dt">Scan</span> (<span class="dt">Enc</span> f), <span class="dt">Monoid</span> m) <span class="ot">&#8658;</span> f m <span class="ot">&#8594;</span> (m, f m)<br />prefixScanEnc <span class="fu">=</span> second decode &#8728; prefixScan &#8728; encode<br />suffixScanEnc <span class="fu">=</span> second decode &#8728; suffixScan &#8728; encode</code></pre>

<h4 id="lists">Lists</h4>

<p>As a first example, consider</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">EncodeF</span> [] <span class="kw">where</span><br />  <span class="kw">type</span> <span class="dt">Enc</span> [] <span class="fu">=</span> <span class="dt">Const</span> () <span class="fu">+</span> <span class="dt">Id</span> &#215; []<br />  encode [] <span class="fu">=</span> <span class="dt">InL</span> (<span class="dt">Const</span> ())<br />  encode (a <span class="fu">:</span> <span class="kw">as</span>) <span class="fu">=</span> <span class="dt">InR</span> (<span class="dt">Id</span> a &#215; <span class="kw">as</span>)<br />  decode (<span class="dt">InL</span> (<span class="dt">Const</span> ())) <span class="fu">=</span> []<br />  decode (<span class="dt">InR</span> (<span class="dt">Id</span> a &#215; <span class="kw">as</span>)) <span class="fu">=</span> a <span class="fu">:</span> <span class="kw">as</span></code></pre>

<p>And declare a boilerplate <code>Scan</code> instance via <code>EncodeF</code>:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">Scan</span> [] <span class="kw">where</span><br />  prefixScan <span class="fu">=</span> prefixScanEnc<br />  suffixScan <span class="fu">=</span> suffixScanEnc</code></pre>

<p>I haven&#8217;t checked the details, but I think with this instance, suffix scanning has okay performance, while prefix scan does quadratic work. The reason is that in the <code>Scan</code> instance for products, the two components are scanned independently (in parallel), and then the whole second component is adjusted for <code>prefixScan</code>, while the whole first component is adjusted for <code>suffixScan</code>. In the case of lists, the first component is the list head, and the second component is the list tail.</p>

<p>For your reading convenience, here&#8217;s that <code>Scan</code> instance again:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> (<span class="dt">Scan</span> f, <span class="dt">Scan</span> g, <span class="kw">Functor</span> f, <span class="kw">Functor</span> g) <span class="ot">&#8658;</span> <span class="dt">Scan</span> (f &#215; g) <span class="kw">where</span><br />  prefixScan (fa &#215; ga) <span class="fu">=</span> (af &#8853; ag, fa' &#215; ((af &#8853;) <span class="fu">&lt;$&gt;</span> ga'))<br />   <span class="kw">where</span> (af,fa') <span class="fu">=</span> prefixScan fa<br />         (ag,ga') <span class="fu">=</span> prefixScan ga<br /><br />  suffixScan (fa &#215; ga) <span class="fu">=</span> (af &#8853; ag, ((&#8853; ag) <span class="fu">&lt;$&gt;</span> fa') &#215; ga')<br />   <span class="kw">where</span> (af,fa') <span class="fu">=</span> suffixScan fa<br />         (ag,ga') <span class="fu">=</span> suffixScan ga</code></pre>

<p>The lop-sidedness of the list type thus interferes with parallelization, and makes the parallel scans perform much worse than cumulative sequential scans.</p>

<p>Let&#8217;s next look at a more balanced type.</p>

<h3 id="binary-trees">Binary Trees</h3>

<p>We&#8217;ll get better parallel performance by organizing our data so that we can cheaply partition it into roughly equal pieces. Tree types allow such partitioning.</p>

<h4 id="top-down-trees">Top-down trees</h4>

<p>We&#8217;ll try a few variations, starting with a simple binary tree.</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">T1</span> a <span class="fu">=</span> <span class="dt">L1</span> a <span class="fu">|</span> <span class="dt">B1</span> (<span class="dt">T1</span> a) (<span class="dt">T1</span> a) <span class="kw">deriving</span> <span class="kw">Functor</span></code></pre>

<p>Encoding and decoding is straightforward:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">EncodeF</span> <span class="dt">T1</span> <span class="kw">where</span><br />  <span class="kw">type</span> <span class="dt">Enc</span> <span class="dt">T1</span> <span class="fu">=</span> <span class="dt">Id</span> <span class="fu">+</span> <span class="dt">T1</span> &#215; <span class="dt">T1</span><br />  encode (<span class="dt">L1</span> a)   <span class="fu">=</span> <span class="dt">InL</span> (<span class="dt">Id</span> a)<br />  encode (<span class="dt">B1</span> s t) <span class="fu">=</span> <span class="dt">InR</span> (s &#215; t)<br />  decode (<span class="dt">InL</span> (<span class="dt">Id</span> a))  <span class="fu">=</span> <span class="dt">L1</span> a<br />  decode (<span class="dt">InR</span> (s &#215; t)) <span class="fu">=</span> <span class="dt">B1</span> s t<br /><br /><span class="kw">instance</span> <span class="dt">Scan</span> <span class="dt">T1</span> <span class="kw">where</span><br />  prefixScan <span class="fu">=</span> prefixScanEnc<br />  suffixScan <span class="fu">=</span> suffixScanEnc</code></pre>

<p>Note that these definitions could be generated automatically from the data type definition.</p>

<p>For <em>balanced trees</em>, prefix and suffix scan divide the problem in half at each step, solve each half, and do linear work to patch up one of the two halves. Letting <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi></mrow></math> be the number of elements, and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>W</mi><mo stretchy="false">(</mo><mi>n</mi><mo stretchy="false">)</mo></mrow></math> the work, we have the recurrence <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>W</mi><mo stretchy="false">(</mo><mi>n</mi><mo stretchy="false">)</mo><mo>=</mo><mn>2</mn><mspace width="0.167em"></mspace><mi>W</mi><mo stretchy="false">(</mo><mi>n</mi><mo>/</mo><mn>2</mn><mo stretchy="false">)</mo><mo>+</mo><mi>c</mi><mspace width="0.167em"></mspace><mi>n</mi></mrow></math> for some constant factor <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>c</mi></mrow></math>. By the <a href="http://en.wikipedia.org/wiki/Master_theorem" title="Wikipedia entry">Master theorem</a>, therefore, the work done is <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo>&#920;</mo><mo stretchy="false">(</mo><mi>n</mi><mspace width="0.167em"></mspace><mi>log</mi><mspace width="0.167em"></mspace><mi>n</mi><mo stretchy="false">)</mo></mrow></math>. (Use case 2, with <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>a</mi><mo>=</mo><mi>b</mi><mo>=</mo><mn>2</mn></mrow></math>, <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>f</mi><mo stretchy="false">(</mo><mi>n</mi><mo stretchy="false">)</mo><mo>=</mo><mi>c</mi><mspace width="0.167em"></mspace><mi>n</mi></mrow></math>, and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>k</mi><mo>=</mo><mn>0</mn></mrow></math>.)</p>
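<p>We can sanity-check that recurrence numerically (taking <code>c = 1</code> and base case <code>work 1 = 0</code>, both arbitrary choices of mine). For <code>n</code> a power of two, the solution is then exactly <code>n * log2 n</code>:</p>

<pre class="sourceCode"><code class="sourceCode haskell">-- W(n) = 2 W(n/2) + c n, with c = 1 and W(1) = 0.
work :: Int -> Int
work 1 = 0
work n = 2 * work (n `div` 2) + n

-- map work [2,4,8,16] == [2,8,24,64], i.e., n * log2 n for each n</code></pre>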

<p>Again assuming a <em>balanced</em> tree, the computation dependencies have logarithmic depth, so the ideal parallel running time (assuming sufficient processors) is <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo>&#920;</mo><mo stretchy="false">(</mo><mi>log</mi><mi>n</mi><mo stretchy="false">)</mo></mrow></math>. Thus we have an algorithm that is depth-efficient (modulo constant factors) but work-inefficient.</p>

<h4 id="composition">Composition</h4>

<p>A binary tree as defined above is either a leaf or a pair of binary trees. We can make this pair-ness more explicit with a reformulation:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">T2</span> a <span class="fu">=</span> <span class="dt">L2</span> a <span class="fu">|</span> <span class="dt">B2</span> (<span class="dt">Pair</span> (<span class="dt">T2</span> a)) <span class="kw">deriving</span> <span class="kw">Functor</span></code></pre>

<p>where <code>Pair</code>, as in <a href="http://conal.net/blog/posts/composable-parallel-scanning/" title="blog post"><em>Composable parallel scanning</em></a>, is defined as</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">Pair</span> a <span class="fu">=</span> a <span class="fu">:#</span> a <span class="kw">deriving</span> <span class="kw">Functor</span></code></pre>

<p>or even</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">type</span> <span class="dt">Pair</span> <span class="fu">=</span> <span class="dt">Id</span> &#215; <span class="dt">Id</span></code></pre>

<p>For encoding and decoding, we could use the same representation as with <code>T1</code>, but let&#8217;s instead use a more natural one for the definition of <code>T2</code>:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">EncodeF</span> <span class="dt">T2</span> <span class="kw">where</span><br />  <span class="kw">type</span> <span class="dt">Enc</span> <span class="dt">T2</span> <span class="fu">=</span> <span class="dt">Id</span> <span class="fu">+</span> <span class="dt">Pair</span> &#8728; <span class="dt">T2</span><br />  encode (<span class="dt">L2</span> a)  <span class="fu">=</span> <span class="dt">InL</span> (<span class="dt">Id</span> a)<br />  encode (<span class="dt">B2</span> st) <span class="fu">=</span> <span class="dt">InR</span> (<span class="dt">O</span> st)<br />  decode (<span class="dt">InL</span> (<span class="dt">Id</span> a)) <span class="fu">=</span> <span class="dt">L2</span> a<br />  decode (<span class="dt">InR</span> (<span class="dt">O</span> st)) <span class="fu">=</span> <span class="dt">B2</span> st</code></pre>

<p>Boilerplate scanning:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">Scan</span> <span class="dt">T2</span> <span class="kw">where</span><br />  prefixScan <span class="fu">=</span> prefixScanEnc<br />  suffixScan <span class="fu">=</span> suffixScanEnc</code></pre>

<p>for which we&#8217;ll need an applicative instance:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">Applicative</span> <span class="dt">T2</span> <span class="kw">where</span><br />  pure <span class="fu">=</span> <span class="dt">L2</span><br />  <span class="dt">L2</span> f <span class="fu">&lt;*&gt;</span> <span class="dt">L2</span> x <span class="fu">=</span> <span class="dt">L2</span> (f x)<br />  <span class="dt">B2</span> (fs <span class="fu">:#</span> gs) <span class="fu">&lt;*&gt;</span> <span class="dt">B2</span> (xs <span class="fu">:#</span> ys) <span class="fu">=</span> <span class="dt">B2</span> ((fs <span class="fu">&lt;*&gt;</span> xs) <span class="fu">:#</span> (gs <span class="fu">&lt;*&gt;</span> ys))<br />  _ <span class="fu">&lt;*&gt;</span> _ <span class="fu">=</span> <span class="fu">error</span> <span class="st">&quot;T2 (&lt;*&gt;): structure mismatch&quot;</span></code></pre>

<p>The <code>O</code> constructor is for functor composition.</p>

<p>With a small change to the tree type, we can make the composition of <code>Pair</code> and <code>T</code> more explicit:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">T3</span> a <span class="fu">=</span> <span class="dt">L3</span> a <span class="fu">|</span> <span class="dt">B3</span> ((<span class="dt">Pair</span> &#8728; <span class="dt">T3</span>) a) <span class="kw">deriving</span> <span class="kw">Functor</span></code></pre>

<p>Then the conversion becomes even simpler, since there&#8217;s no need to add or remove <code>O</code> wrappers:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">EncodeF</span> <span class="dt">T3</span> <span class="kw">where</span><br />  <span class="kw">type</span> <span class="dt">Enc</span> <span class="dt">T3</span> <span class="fu">=</span> <span class="dt">Id</span> <span class="fu">+</span> <span class="dt">Pair</span> &#8728; <span class="dt">T3</span><br />  encode (<span class="dt">L3</span> a)  <span class="fu">=</span> <span class="dt">InL</span> (<span class="dt">Id</span> a)<br />  encode (<span class="dt">B3</span> st) <span class="fu">=</span> <span class="dt">InR</span> st<br />  decode (<span class="dt">InL</span> (<span class="dt">Id</span> a)) <span class="fu">=</span> <span class="dt">L3</span> a<br />  decode (<span class="dt">InR</span> st)     <span class="fu">=</span> <span class="dt">B3</span> st</code></pre>

<h4 id="bottom-up-trees">Bottom-up trees</h4>

<p>In the formulations above, a non-leaf tree consists of a pair of trees. I&#8217;ll call these trees &quot;top-down&quot;, since visible pair structure begins at the top.</p>

<p>With a very small change, we can instead use a tree of pairs:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">T4</span> a <span class="fu">=</span> <span class="dt">L4</span> a <span class="fu">|</span> <span class="dt">B4</span> (<span class="dt">T4</span> (<span class="dt">Pair</span> a)) <span class="kw">deriving</span> <span class="kw">Functor</span></code></pre>

<p>Again an applicative instance allows a standard <code>Scan</code> instance:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">Scan</span> <span class="dt">T4</span> <span class="kw">where</span><br />  prefixScan <span class="fu">=</span> prefixScanEnc<br />  suffixScan <span class="fu">=</span> suffixScanEnc<br /><br /><span class="kw">instance</span> <span class="dt">Applicative</span> <span class="dt">T4</span> <span class="kw">where</span><br />  pure <span class="fu">=</span> <span class="dt">L4</span><br />  <span class="dt">L4</span> f   <span class="fu">&lt;*&gt;</span> <span class="dt">L4</span> x   <span class="fu">=</span> <span class="dt">L4</span> (f x)<br />  <span class="dt">B4</span> fgs <span class="fu">&lt;*&gt;</span> <span class="dt">B4</span> xys <span class="fu">=</span> <span class="dt">B4</span> (liftA2 h fgs xys)<br />   <span class="kw">where</span> h (f <span class="fu">:#</span> g) (x <span class="fu">:#</span> y) <span class="fu">=</span> f x <span class="fu">:#</span> g y<br />  _ <span class="fu">&lt;*&gt;</span> _ <span class="fu">=</span> <span class="fu">error</span> <span class="st">&quot;T4 (&lt;*&gt;): structure mismatch&quot;</span></code></pre>

<p>or a more explicitly composed form:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">T5</span> a <span class="fu">=</span> <span class="dt">L5</span> a <span class="fu">|</span> <span class="dt">B5</span> ((<span class="dt">T5</span> &#8728; <span class="dt">Pair</span>) a) <span class="kw">deriving</span> <span class="kw">Functor</span></code></pre>

<p>I&#8217;ll call these new variations &quot;bottom-up&quot; trees, since visible pair structure begins at the bottom. After stripping off the branch constructor, <code>B4</code>, we can get at the pair-valued leaves by means of <code>fmap</code>, <code>fold</code>, or <code>traverse</code> (or variations). For <code>B5</code>, we&#8217;d also have to strip off the <code>O</code> wrapper (functor composition).</p>
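<p>Here is a small, self-contained example of the bottom-up shape (the <code>Foldable</code> derivations are my addition, so that the elements can be summed):</p>

<pre class="sourceCode"><code class="sourceCode haskell">{-# LANGUAGE DeriveFunctor, DeriveFoldable #-}

data Pair a = a :# a deriving (Functor, Foldable)

data T4 a = L4 a | B4 (T4 (Pair a)) deriving (Functor, Foldable)

-- A depth-two bottom-up tree of 1..4: the pair structure sits at the
-- leaves, beneath two B4 constructors.
t :: T4 Int
t = B4 (B4 (L4 ((1 :# 2) :# (3 :# 4))))

-- fmap reaches every element through the nested pairs:
-- sum t == 10, and sum (fmap (* 10) t) == 100</code></pre>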

<p>Encoding is nearly the same as with top-down trees. For instance,</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">EncodeF</span> <span class="dt">T4</span> <span class="kw">where</span><br />  <span class="kw">type</span> <span class="dt">Enc</span> <span class="dt">T4</span> <span class="fu">=</span> <span class="dt">Id</span> <span class="fu">+</span> <span class="dt">T4</span> &#8728; <span class="dt">Pair</span><br />  encode (<span class="dt">L4</span> a) <span class="fu">=</span> <span class="dt">InL</span> (<span class="dt">Id</span> a)<br />  encode (<span class="dt">B4</span> t) <span class="fu">=</span> <span class="dt">InR</span> (<span class="dt">O</span> t)<br />  decode (<span class="dt">InL</span> (<span class="dt">Id</span> a)) <span class="fu">=</span> <span class="dt">L4</span> a<br />  decode (<span class="dt">InR</span> (<span class="dt">O</span> t))  <span class="fu">=</span> <span class="dt">B4</span> t</code></pre>

<h3 id="scanning-pairs">Scanning pairs</h3>

<p>We&#8217;ll need to scan on the <code>Pair</code> functor. If we use the definition of <code>Pair</code> above in terms of <code>Id</code> and <code>(×)</code>, then we&#8217;ll get scanning for free. For <em>using</em> <code>Pair</code>, I find the explicit data type definition above more convenient. We can then derive a <code>Scan</code> instance by conversion. Start with a standard specification:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">Pair</span> a <span class="fu">=</span> a <span class="fu">:#</span> a <span class="kw">deriving</span> <span class="kw">Functor</span></code></pre>

<p>And encode &amp; decode explicitly:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">EncodeF</span> <span class="dt">Pair</span> <span class="kw">where</span><br />  <span class="kw">type</span> <span class="dt">Enc</span> <span class="dt">Pair</span> <span class="fu">=</span> <span class="dt">Id</span> &#215; <span class="dt">Id</span><br />  encode (a <span class="fu">:#</span> b) <span class="fu">=</span> <span class="dt">Id</span> a &#215; <span class="dt">Id</span> b<br />  decode (<span class="dt">Id</span> a &#215; <span class="dt">Id</span> b) <span class="fu">=</span> a <span class="fu">:#</span> b</code></pre>

<p>Then use our boilerplate <code>Scan</code> instance for <code>EncodeF</code> instances:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">Scan</span> <span class="dt">Pair</span> <span class="kw">where</span><br />  prefixScan <span class="fu">=</span> prefixScanEnc<br />  suffixScan <span class="fu">=</span> suffixScanEnc</code></pre>

<p>We&#8217;ve seen the <code>Scan</code> instance for <code>(×)</code> above. The instance for <code>Id</code> is very simple:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">newtype</span> <span class="dt">Id</span> a <span class="fu">=</span> <span class="dt">Id</span> a<br /><br /><span class="kw">instance</span> <span class="dt">Scan</span> <span class="dt">Id</span> <span class="kw">where</span><br />  prefixScan (<span class="dt">Id</span> m) <span class="fu">=</span> (m, <span class="dt">Id</span> &#8709;)<br />  suffixScan        <span class="fu">=</span> prefixScan</code></pre>

<p>Given these definitions, we can calculate a more streamlined <code>Scan</code> instance for <code>Pair</code>:</p>

<pre class="sourceCode"><code class="sourceCode haskell">  prefixScan (a <span class="fu">:#</span> b)<br />&#8801;  <span class="co">{- specification -}</span><br />  prefixScanEnc (a <span class="fu">:#</span> b)<br />&#8801;  <span class="co">{- prefixScanEnc definition -}</span><br />  (second decode &#8728; prefixScan &#8728; encode) (a <span class="fu">:#</span> b)<br />&#8801;  <span class="co">{- (&#8728;) -}</span><br />  second decode (prefixScan (encode (a <span class="fu">:#</span> b)))<br />&#8801;  <span class="co">{- encode definition for Pair -}</span><br />  second decode (prefixScan (<span class="dt">Id</span> a &#215; <span class="dt">Id</span> b))<br />&#8801;  <span class="co">{- prefixScan definition for f &#215; g -}</span><br />  second decode<br />    (af &#8853; ag, fa' &#215; ((af &#8853;) <span class="fu">&lt;$&gt;</span> ga'))<br />     <span class="kw">where</span> (af,fa') <span class="fu">=</span> prefixScan (<span class="dt">Id</span> a)<br />           (ag,ga') <span class="fu">=</span> prefixScan (<span class="dt">Id</span> b)<br />&#8801;  <span class="co">{- Definition of second on functions -}</span><br />  (af &#8853; ag, decode (fa' &#215; ((af &#8853;) <span class="fu">&lt;$&gt;</span> ga')))<br />   <span class="kw">where</span> (af,fa') <span class="fu">=</span> prefixScan (<span class="dt">Id</span> a)<br />         (ag,ga') <span class="fu">=</span> prefixScan (<span class="dt">Id</span> b)<br />&#8801;  <span class="co">{- prefixScan definition for Id -}</span><br />  (af &#8853; ag, decode (fa' &#215; ((af &#8853;) <span class="fu">&lt;$&gt;</span> ga')))<br />   <span class="kw">where</span> (af,fa') <span class="fu">=</span> (a, <span class="dt">Id</span> &#8709;)<br />         (ag,ga') <span class="fu">=</span> (b, <span class="dt">Id</span> &#8709;)<br />&#8801;  <span class="co">{- substitution -}</span><br />  (a &#8853; b, decode (<span class="dt">Id</span> &#8709; &#215; ((a &#8853;) <span 
class="fu">&lt;$&gt;</span> <span class="dt">Id</span> &#8709;)))<br />&#8801;  <span class="co">{- fmap/(&lt;$&gt;) for Id -}</span><br />  (a &#8853; b, decode (<span class="dt">Id</span> &#8709; &#215; <span class="dt">Id</span> (a &#8853; &#8709;)))<br />&#8801;  <span class="co">{- Monoid law -}</span><br />  (a &#8853; b, decode (<span class="dt">Id</span> &#8709; &#215; <span class="dt">Id</span> a))<br />&#8801;  <span class="co">{- decode definition on Pair -}</span><br />  (a &#8853; b, (&#8709; <span class="fu">:#</span> a))</code></pre>

<p>Whew! And similarly for <code>suffixScan</code>.</p>

<p>Now let&#8217;s recall the <code>Scan</code> instance for <code>Pair</code> given in <a href="http://conal.net/blog/posts/composable-parallel-scanning/" title="blog post"><em>Composable parallel scanning</em></a>:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">Scan</span> <span class="dt">Pair</span> <span class="kw">where</span><br />  prefixScan (a <span class="fu">:#</span> b) <span class="fu">=</span> (a &#8853; b, (&#8709; <span class="fu">:#</span> a))<br />  suffixScan (a <span class="fu">:#</span> b) <span class="fu">=</span> (a &#8853; b, (b <span class="fu">:#</span> &#8709;))</code></pre>

<p>Hurray! The derivation led us to the same definition. A &quot;sufficiently smart&quot; compiler could do this derivation automatically.</p>
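<p>As a quick sanity check, the derived instance transcribes directly into plain ASCII Haskell. This is my own standalone sketch (names like <code>prefixScanP</code> are hypothetical, not from the post):</p>

<pre class="sourceCode"><code class="sourceCode haskell">import Data.Monoid (Sum (..))<br /><br />data Pair a = a :# a deriving (Eq, Show)<br /><br />-- Total on the left; shifted partial results on the right.<br />prefixScanP :: Monoid m =&gt; Pair m -&gt; (m, Pair m)<br />prefixScanP (a :# b) = (a `mappend` b, mempty :# a)<br /><br />suffixScanP :: Monoid m =&gt; Pair m -&gt; (m, Pair m)<br />suffixScanP (a :# b) = (a `mappend` b, b :# mempty)<br /><br />-- prefixScanP (Sum 3 :# Sum 4) == (Sum 7, Sum 0 :# Sum 3)<br />-- suffixScanP (Sum 3 :# Sum 4) == (Sum 7, Sum 4 :# Sum 0)</code></pre>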

<p>With this warm-up derivation, let&#8217;s now turn to trees.</p>

<h3 id="scanning-trees">Scanning trees</h3>

<p>Given the tree encodings above, how does scan work? We&#8217;ll have to consult <code>Scan</code> instances for some of the functor combinators. The product instance is repeated above. We&#8217;ll also want the instances for sum and composition. Omitting the <code>suffixScan</code> definitions for brevity:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> (f <span class="fu">+</span> g) a <span class="fu">=</span> <span class="dt">InL</span> (f a) <span class="fu">|</span> <span class="dt">InR</span> (g a)<br /><br /><span class="kw">instance</span> (<span class="dt">Scan</span> f, <span class="dt">Scan</span> g) <span class="ot">&#8658;</span> <span class="dt">Scan</span> (f <span class="fu">+</span> g) <span class="kw">where</span><br />  prefixScan (<span class="dt">InL</span> fa) <span class="fu">=</span> second <span class="dt">InL</span> (prefixScan fa)<br />  prefixScan (<span class="dt">InR</span> ga) <span class="fu">=</span> second <span class="dt">InR</span> (prefixScan ga)<br /><br /><span class="kw">newtype</span> (g &#8728; f) a <span class="fu">=</span> <span class="dt">O</span> (g (f a))<br /><br /><span class="kw">instance</span> (<span class="dt">Scan</span> g, <span class="dt">Scan</span> f, <span class="kw">Functor</span> f, <span class="dt">Applicative</span> g) <span class="ot">&#8658;</span> <span class="dt">Scan</span> (g &#8728; f) <span class="kw">where</span><br />  prefixScan <span class="fu">=</span> second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>)<br />             &#8728; assocR<br />             &#8728; first prefixScan<br />             &#8728; <span class="fu">unzip</span><br />             &#8728; <span class="fu">fmap</span> prefixScan<br />             &#8728; unO</code></pre>

<p>This last definition uses a few utility functions:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="fu">zip</span> <span class="ot">&#8759;</span> <span class="dt">Applicative</span> g <span class="ot">&#8658;</span> (g a, g b) <span class="ot">&#8594;</span> g (a,b)<br /><span class="fu">zip</span> <span class="fu">=</span> <span class="fu">uncurry</span> (liftA2 (,))<br /><br /><span class="fu">unzip</span> <span class="ot">&#8759;</span> <span class="kw">Functor</span> g <span class="ot">&#8658;</span> g (a,b) <span class="ot">&#8594;</span> (g a, g b)<br /><span class="fu">unzip</span> <span class="fu">=</span> <span class="fu">fmap</span> <span class="fu">fst</span> <span class="fu">&amp;&amp;&amp;</span> <span class="fu">fmap</span> <span class="fu">snd</span><br /><br />assocR <span class="ot">&#8759;</span> ((a,b),c) <span class="ot">&#8594;</span> (a,(b,c))<br />assocR   ((a,b),c) <span class="fu">=</span>  (a,(b,c))<br /><br />adjustL <span class="ot">&#8759;</span> (<span class="kw">Functor</span> f, <span class="dt">Monoid</span> m) <span class="ot">&#8658;</span> (m, f m) <span class="ot">&#8594;</span> f m<br />adjustL (m, ms) <span class="fu">=</span> (m &#8853;) <span class="fu">&lt;$&gt;</span> ms</code></pre>
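<p>These utilities are ordinary Haskell one-liners. A standalone ASCII rendering (I rename them <code>zipA</code> and <code>unzipF</code> only to avoid clashing with the Prelude; the names are mine, not the post&#8217;s):</p>

<pre class="sourceCode"><code class="sourceCode haskell">import Control.Applicative (liftA2)<br />import Control.Arrow ((&amp;&amp;&amp;))<br /><br />zipA :: Applicative g =&gt; (g a, g b) -&gt; g (a, b)<br />zipA = uncurry (liftA2 (,))<br /><br />unzipF :: Functor g =&gt; g (a, b) -&gt; (g a, g b)<br />unzipF = fmap fst &amp;&amp;&amp; fmap snd<br /><br />assocR :: ((a, b), c) -&gt; (a, (b, c))<br />assocR ((a, b), c) = (a, (b, c))<br /><br />adjustL :: (Functor f, Monoid m) =&gt; (m, f m) -&gt; f m<br />adjustL (m, ms) = fmap (m `mappend`) ms<br /><br />-- zipA (Just 1, Just 'a')  == Just (1, 'a')<br />-- adjustL ("x", ["a","b"]) == ["xa","xb"]</code></pre>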

<p>Let&#8217;s consider how the <code>Scan (g ∘ f)</code> instance plays out for top-down vs bottom-up trees, given the functor-composition encodings above. The critical definitions:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">type</span> <span class="dt">Enc</span> <span class="dt">T2</span> <span class="fu">=</span> <span class="dt">Id</span> <span class="fu">+</span> <span class="dt">Pair</span> &#8728; <span class="dt">T2</span><br /><br /><span class="kw">type</span> <span class="dt">Enc</span> <span class="dt">T4</span> <span class="fu">=</span> <span class="dt">Id</span> <span class="fu">+</span> <span class="dt">T4</span> &#8728; <span class="dt">Pair</span></code></pre>

<p>Focusing on the branch case, we have <code>Pair ∘ T2</code> vs <code>T4 ∘ Pair</code>, so we&#8217;ll use the <code>Scan (g ∘ f)</code> instance either way. Let&#8217;s consider the work implied by that instance. There are two calls to <code>prefixScan</code>, plus a linear amount of other work. The meanings of those two calls differ, however:</p>

<ul>
<li>For top-down trees (<code>T2</code>), the recursive tree scans are in <code>fmap prefixScan</code>, mapping over the pair of trees. The <code>first prefixScan</code> is a pair scan and so does constant work. Since there are two recursive calls, each working on a tree of half size (assuming balance), plus linear other work, the total work is <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo>&#920;</mo><mo stretchy="false">(</mo><mi>n</mi><mspace width="0.167em"></mspace><mi>log</mi><mspace width="0.167em"></mspace><mi>n</mi><mo stretchy="false">)</mo></mrow></math>, as explained above.</li>
<li>For bottom-up trees (<code>T4</code>), there is only one recursive tree scan, which appears in <code>first prefixScan</code>. The <code>prefixScan</code> in <code>fmap prefixScan</code> is a pair scan and so does constant work, but it is mapped over the half-sized tree (of pairs) and so does linear work altogether. Since there is only one recursive tree scan, at half size, plus linear other work, the total work is then proportional to <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi><mo>+</mo><mi>n</mi><mo>/</mo><mn>2</mn><mo>+</mo><mi>n</mi><mo>/</mo><mn>4</mn><mo>+</mo><mo>&#8230;</mo><mo>&#8776;</mo><mn>2</mn><mspace width="0.167em"></mspace><mi>n</mi><mo>=</mo><mo>&#920;</mo><mo stretchy="false">(</mo><mi>n</mi><mo stretchy="false">)</mo></mrow></math>. So we have a work-efficient algorithm!</li>
</ul>
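<p>The two recurrences behind this comparison can also be checked numerically. A rough work-counting sketch (unit leaf cost; the constants are made up, only the growth matters):</p>

<pre class="sourceCode"><code class="sourceCode haskell">-- Top-down:  W(n) = 2 W(n/2) + n   (two half-size tree scans plus linear work)<br />-- Bottom-up: W(n) =   W(n/2) + n   (one half-size tree scan plus linear work)<br />workTD, workBU :: Int -&gt; Int<br />workTD n | n &lt;= 1    = 1<br />         | otherwise = 2 * workTD (n `div` 2) + n<br />workBU n | n &lt;= 1    = 1<br />         | otherwise = workBU (n `div` 2) + n<br /><br />-- workTD 1024 == 11264   (about n * log n)<br />-- workBU 1024 == 2047    (about 2 * n)</code></pre>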

<h3 id="looking-deeper">Looking deeper</h3>

<p>In addition to the simple analysis above of scanning over top-down and bottom-up trees, let&#8217;s look in detail at what transpires and how each case can be optimized. This section may well have more detail than you&#8217;re interested in. If so, feel free to skip ahead.</p>

<h4 id="top-down">Top-down</h4>

<p>Beginning as with <code>Pair</code>,</p>

<pre class="sourceCode"><code class="sourceCode haskell">  prefixScan t<br />&#8801;  <span class="co">{- specification -}</span><br />  prefixScanEnc t<br />&#8801;  <span class="co">{- prefixScanEnc definition -}</span><br />  (second decode &#8728; prefixScan &#8728; encode) t<br />&#8801;  <span class="co">{- (&#8728;) -}</span><br />  second decode (prefixScan (encode t))</code></pre>

<p>Take <code>T2</code>, with <code>T3</code> being quite similar. Now split into two cases for the two constructors of <code>T2</code>. First leaf:</p>

<pre class="sourceCode"><code class="sourceCode haskell">  prefixScan (<span class="dt">L2</span> m)<br />&#8801;  <span class="co">{- as above -}</span><br />  second decode (prefixScan (encode (<span class="dt">L2</span> m)))<br />&#8801;  <span class="co">{- encode for L2 -}</span><br />  second decode (prefixScan (<span class="dt">InL</span> (<span class="dt">Id</span> m)))<br />&#8801;  <span class="co">{- prefixScan for functor sum -}</span><br />  second decode (second <span class="dt">InL</span> (prefixScan (<span class="dt">Id</span> m)))<br />&#8801;  <span class="co">{- prefixScan for Id -}</span><br />  second decode (second <span class="dt">InL</span> (m, <span class="dt">Id</span> &#8709;))<br />&#8801;  <span class="co">{- second for functions -}</span><br />  second decode (m, <span class="dt">InL</span> (<span class="dt">Id</span> &#8709;))<br />&#8801;  <span class="co">{- second for functions -}</span><br />  (m, decode (<span class="dt">InL</span> (<span class="dt">Id</span> &#8709;)))<br />&#8801;  <span class="co">{- decode for L2 -}</span><br />  (m, <span class="dt">L2</span> &#8709;)</code></pre>

<p>Then branch:</p>

<pre class="sourceCode"><code class="sourceCode haskell">  prefixScan (<span class="dt">B2</span> (s <span class="fu">:#</span> t))<br />&#8801;  <span class="co">{- as above -}</span><br />  second decode (prefixScan (encode (<span class="dt">B2</span> (s <span class="fu">:#</span> t))))<br />&#8801;  <span class="co">{- encode for B2 -}</span><br />  second decode (prefixScan (<span class="dt">InR</span> (<span class="dt">O</span> (s <span class="fu">:#</span> t))))<br />&#8801;  <span class="co">{- prefixScan for (+) -}</span><br />  second decode (second <span class="dt">InR</span> (prefixScan (<span class="dt">O</span> (s <span class="fu">:#</span> t))))<br />&#8801;  <span class="co">{- property of second -}</span><br />  second (decode &#8728; <span class="dt">InR</span>) (prefixScan (<span class="dt">O</span> (s <span class="fu">:#</span> t)))</code></pre>

<p>Focus on the <code>prefixScan</code> application:</p>

<pre class="sourceCode"><code class="sourceCode haskell">  prefixScan (<span class="dt">O</span> (s <span class="fu">:#</span> t)) <span class="fu">=</span><br />&#8801;  <span class="co">{- prefixScan for (&#8728;) -}</span><br /> ( second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) &#8728; assocR &#8728; first prefixScan<br /> &#8728; <span class="fu">unzip</span> &#8728; <span class="fu">fmap</span> prefixScan &#8728; unO ) (<span class="dt">O</span> (s <span class="fu">:#</span> t))<br />&#8801;  <span class="co">{- unO/O -}</span><br />  ( second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) &#8728; assocR &#8728; first prefixScan<br />  &#8728; <span class="fu">unzip</span> &#8728; <span class="fu">fmap</span> prefixScan ) (s <span class="fu">:#</span> t)<br />&#8801;  <span class="co">{- fmap on Pair -}</span><br />  (second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) &#8728; assocR &#8728; first prefixScan &#8728; <span class="fu">unzip</span>)<br />    (prefixScan s <span class="fu">:#</span> prefixScan t)<br />&#8801;  <span class="co">{- expand prefixScan -}</span><br />  (second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) &#8728; assocR &#8728; first prefixScan &#8728; <span class="fu">unzip</span>)<br />    ((ms,s') <span class="fu">:#</span> (mt,t'))<br />      <span class="kw">where</span> (ms,s') <span class="fu">=</span> prefixScan s<br />            (mt,t') <span class="fu">=</span> prefixScan t<br />&#8801;  <span class="co">{- unzip -}</span><br />  (second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) &#8728; assocR &#8728; first prefixScan)<br />    ((ms <span class="fu">:#</span> mt), (s' <span class="fu">:#</span> t')) <span 
class="kw">where</span> &#8943;<br />&#8801;  <span class="co">{- first -}</span><br />  (second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) &#8728; assocR)<br />    (prefixScan (ms <span class="fu">:#</span> mt), (s' <span class="fu">:#</span> t')) <span class="kw">where</span> &#8943;<br />&#8801;  <span class="co">{- prefixScan for Pair -}</span><br />  (second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) &#8728; assocR)<br />    ((ms &#8853; mt, (&#8709; <span class="fu">:#</span> ms)), (s' <span class="fu">:#</span> t')) <span class="kw">where</span> &#8943;<br />&#8801;  <span class="co">{- assocR -}</span><br />  (second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>))<br />    (ms &#8853; mt, ((&#8709; <span class="fu">:#</span> ms), (s' <span class="fu">:#</span> t'))) <span class="kw">where</span> &#8943;<br />&#8801;  <span class="co">{- second -}</span><br />  ( ms &#8853; mt<br />  , (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) ((&#8709; <span class="fu">:#</span> ms), (s' <span class="fu">:#</span> t')) ) <span class="kw">where</span> &#8943;<br />&#8801;  <span class="co">{- zip -}</span><br />  ( ms &#8853; mt<br />  , (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL) ((&#8709;,s') <span class="fu">:#</span> (ms,t')) )  <span class="kw">where</span> &#8943;<br />&#8801;  <span class="co">{- fmap for Pair -}</span><br />  ( ms &#8853; mt<br />  , <span class="dt">O</span> (adjustL (&#8709;,s') <span class="fu">:#</span> adjustL (ms,t')) )  <span class="kw">where</span> &#8943;<br />&#8801;  <span class="co">{- adjustL -}</span><br />  ( ms &#8853; mt<br />  , <span class="dt">O</span> (((&#8709; &#8853;) <span class="fu">&lt;$&gt;</span> s') <span class="fu">:#</span> ((ms 
&#8853;) <span class="fu">&lt;$&gt;</span> t')) )  <span class="kw">where</span> &#8943;<br />&#8801;  <span class="co">{- Monoid law (left identity) -}</span><br />  ( ms &#8853; mt<br />  , <span class="dt">O</span> ((<span class="fu">id</span> <span class="fu">&lt;$&gt;</span> s') <span class="fu">:#</span> ((ms &#8853;) <span class="fu">&lt;$&gt;</span> t')) )  <span class="kw">where</span> &#8943;<br />&#8801;  <span class="co">{- Functor law (fmap id) -}</span><br />  ( ms &#8853; mt<br />  , <span class="dt">O</span> (s' <span class="fu">:#</span> ((ms &#8853;) <span class="fu">&lt;$&gt;</span> t')) )<br />      <span class="kw">where</span> (ms,s') <span class="fu">=</span> prefixScan s<br />            (mt,t') <span class="fu">=</span> prefixScan t</code></pre>

<p>Continuing from above,</p>

<pre class="sourceCode"><code class="sourceCode haskell">  prefixScan (<span class="dt">B2</span> (s <span class="fu">:#</span> t))<br />&#8801;  <span class="co">{- see above -}</span><br />  second (decode &#8728; <span class="dt">InR</span>) (prefixScan (<span class="dt">O</span> (s <span class="fu">:#</span> t)))<br />&#8801;  <span class="co">{- prefixScan focus from above -}</span><br />  second (decode &#8728; <span class="dt">InR</span>)<br />    ( ms &#8853; mt<br />    , <span class="dt">O</span> (s' <span class="fu">:#</span> ((ms &#8853;) <span class="fu">&lt;$&gt;</span> t')) )<br />        <span class="kw">where</span> (ms,s') <span class="fu">=</span> prefixScan s<br />              (mt,t') <span class="fu">=</span> prefixScan t<br />&#8801;  <span class="co">{- definition of second on functions -}</span><br />    (ms &#8853; mt, (decode &#8728; <span class="dt">InR</span>) (<span class="dt">O</span> (s' <span class="fu">:#</span> ((ms &#8853;) <span class="fu">&lt;$&gt;</span> t')))) <span class="kw">where</span> &#8943;<br />&#8801;  <span class="co">{- (&#8728;) -}</span><br />    (ms &#8853; mt, decode (<span class="dt">InR</span> (<span class="dt">O</span> (s' <span class="fu">:#</span> ((ms &#8853;) <span class="fu">&lt;$&gt;</span> t'))))) <span class="kw">where</span> &#8943;<br />&#8801;  <span class="co">{- decode for B2 -}</span><br />    (ms &#8853; mt, <span class="dt">B2</span> (s' <span class="fu">:#</span> ((ms &#8853;) <span class="fu">&lt;$&gt;</span> t'))) <span class="kw">where</span> &#8943;</code></pre>

<p>This final form is as in <a href="http://conal.net/blog/posts/deriving-parallel-tree-scans/" title="blog post"><em>Deriving parallel tree scans</em></a>, changed for the new scan interface. The derivation saved some work in wrapping &amp; unwrapping and method invocation, plus one of the two adjustment passes over the sub-trees. As explained above, this algorithm performs <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo>&#920;</mo><mo stretchy="false">(</mo><mi>n</mi><mspace width="0.167em"></mspace><mi>log</mi><mspace width="0.167em"></mspace><mi>n</mi><mo stretchy="false">)</mo></mrow></math> work.</p>
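<p>Collecting the leaf and branch cases, the whole derivation can be summarized as a standalone <code>T2</code> scan in ASCII Haskell (my transcription; <code>prefixScanT2</code> stands in for the <code>Scan</code> method):</p>

<pre class="sourceCode"><code class="sourceCode haskell">data Pair a = a :# a<br />data T2 a = L2 a | B2 (Pair (T2 a))<br /><br />instance Functor Pair where<br />  fmap h (a :# b) = h a :# h b<br /><br />instance Functor T2 where<br />  fmap h (L2 a)        = L2 (h a)<br />  fmap h (B2 (s :# t)) = B2 (fmap h s :# fmap h t)<br /><br />-- Two recursive calls per branch: Theta(n log n) work on balanced trees.<br />prefixScanT2 :: Monoid m =&gt; T2 m -&gt; (m, T2 m)<br />prefixScanT2 (L2 m) = (m, L2 mempty)<br />prefixScanT2 (B2 (s :# t)) =<br />  (ms `mappend` mt, B2 (s' :# fmap (ms `mappend`) t'))<br /> where (ms, s') = prefixScanT2 s<br />       (mt, t') = prefixScanT2 t</code></pre>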

<p>I&#8217;ll leave <code>suffixScan</code> for you to do yourself.</p>

<h4 id="bottom-up">Bottom-up</h4>

<p>What happens if we switch from top-down to bottom-up binary trees? I&#8217;ll use <code>T4</code> (though <code>T5</code> would work as well):</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">T4</span> a <span class="fu">=</span> <span class="dt">L4</span> a <span class="fu">|</span> <span class="dt">B4</span> (<span class="dt">T4</span> (<span class="dt">Pair</span> a))</code></pre>
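<p>For concreteness, here is a hypothetical four-element value of this type (using <code>Pair</code> from earlier); note how the pairing nests toward the single leaf:</p>

<pre class="sourceCode"><code class="sourceCode haskell">t4 :: T4 Int<br />t4 = B4 (B4 (L4 ((1 :# 2) :# (3 :# 4))))<br />-- The leaf holds a Pair (Pair Int): all four elements at once.</code></pre>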

<p>The leaf case is just as with <code>T2</code> above, so let&#8217;s get right to branches.</p>

<pre class="sourceCode"><code class="sourceCode haskell">  prefixScan (<span class="dt">B4</span> t)<br />&#8801;  <span class="co">{- as above -}</span><br />  second decode (prefixScan (encode (<span class="dt">B4</span> t)))<br />&#8801;  <span class="co">{- encode for B4 -}</span><br />  second decode (prefixScan (<span class="dt">InR</span> (<span class="dt">O</span> t)))<br />&#8801;  <span class="co">{- prefixScan for (+) -}</span><br />  second decode (second <span class="dt">InR</span> (prefixScan (<span class="dt">O</span> t)))<br />&#8801;  <span class="co">{- property of second -}</span><br />  second (decode &#8728; <span class="dt">InR</span>) (prefixScan (<span class="dt">O</span> t))</code></pre>

<p>As before, now focus on the <code>prefixScan</code> call.</p>

<pre class="sourceCode"><code class="sourceCode haskell">  prefixScan (<span class="dt">O</span> t) <span class="fu">=</span><br />&#8801;  <span class="co">{- prefixScan for (&#8728;) -}</span><br /> ( second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) &#8728; assocR &#8728; first prefixScan<br /> &#8728; <span class="fu">unzip</span> &#8728; <span class="fu">fmap</span> prefixScan &#8728; unO ) (<span class="dt">O</span> t)<br />&#8801;  <span class="co">{- unO/O -}</span><br />  ( second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) &#8728; assocR &#8728; first prefixScan<br />  &#8728; <span class="fu">unzip</span> &#8728; <span class="fu">fmap</span> prefixScan ) t<br />&#8801;  <span class="co">{- prefixScan on Pair (derived above) -}</span><br />  (second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) &#8728; assocR &#8728; first prefixScan &#8728; <span class="fu">unzip</span>)<br />    (<span class="fu">fmap</span> (&#955; (a <span class="fu">:#</span> b) <span class="ot">&#8594;</span> (a &#8853; b, (&#8709; <span class="fu">:#</span> a))) t)<br />&#8801;  <span class="co">{- unzip/fmap -}</span><br />  (second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) &#8728; assocR &#8728; first prefixScan)<br />    ( <span class="fu">fmap</span> (&#955; (a <span class="fu">:#</span> b) <span class="ot">&#8594;</span> (a &#8853; b)) t<br />    , <span class="fu">fmap</span> (&#955; (a <span class="fu">:#</span> b) <span class="ot">&#8594;</span> (&#8709; <span class="fu">:#</span> a))   t )<br />&#8801;  <span class="co">{- first on functions -}</span><br />  (second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) &#8728; assocR)<br />    ( prefixScan (<span 
class="fu">fmap</span> (&#955; (a <span class="fu">:#</span> b) <span class="ot">&#8594;</span> (a &#8853; b)) t)<br />    , <span class="fu">fmap</span> (&#955; (a <span class="fu">:#</span> b) <span class="ot">&#8594;</span> (&#8709; <span class="fu">:#</span> a))   t )<br />&#8801;  <span class="co">{- expand prefixScan -}</span><br />  (second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) &#8728; assocR)<br />    ((mp,p'), <span class="fu">fmap</span> (&#955; (a <span class="fu">:#</span> b) <span class="ot">&#8594;</span> (&#8709; <span class="fu">:#</span> a)) t)<br />   <span class="kw">where</span> (mp,p') <span class="fu">=</span> prefixScan (<span class="fu">fmap</span> (&#955; (a <span class="fu">:#</span> b) <span class="ot">&#8594;</span> (a &#8853; b)) t)<br />&#8801;  <span class="co">{- assocR -}</span><br />  (second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>))<br />    (mp, (p', <span class="fu">fmap</span> (&#955; (a <span class="fu">:#</span> b) <span class="ot">&#8594;</span> (&#8709; <span class="fu">:#</span> a)) t))<br />   <span class="kw">where</span> &#8943;<br />&#8801;  <span class="co">{- second on functions -}</span><br />  (mp, (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>) (p', <span class="fu">fmap</span> (&#955; (a <span class="fu">:#</span> b) <span class="ot">&#8594;</span> (&#8709; <span class="fu">:#</span> a)) t))<br />    <span class="kw">where</span> &#8943;<br />&#8801;  <span class="co">{- fmap/zip/fmap -}</span><br />  (mp, <span class="dt">O</span> (liftA2 tweak p' t))<br />    <span class="kw">where</span> tweak s (a <span class="fu">:#</span> _) <span class="fu">=</span> adjustL (s, (&#8709; <span class="fu">:#</span> a))<br />          (mp,p') <span class="fu">=</span> prefixScan (<span class="fu">fmap</span> (&#955; (a <span 
class="fu">:#</span> b) <span class="ot">&#8594;</span> (a &#8853; b)) t)<br />&#8801;  <span class="co">{- adjustL, then simplify -}</span><br />  (mp, <span class="dt">O</span> (liftA2 tweak p' t))<br />    <span class="kw">where</span> tweak s (a <span class="fu">:#</span> _) <span class="fu">=</span> (s <span class="fu">:#</span> s &#8853; a)<br />          (mp,p') <span class="fu">=</span> prefixScan (<span class="fu">fmap</span> (&#955; (a <span class="fu">:#</span> b) <span class="ot">&#8594;</span> (a &#8853; b)) t)</code></pre>

<p>Now re-introduce the context of <code>prefixScan (O t)</code>:</p>

<pre class="sourceCode"><code class="sourceCode haskell">  prefixScan (<span class="dt">B4</span> t)<br />&#8801;  <span class="co">{- see above -}</span><br />  second (decode &#8728; <span class="dt">InR</span>) (prefixScan (<span class="dt">O</span> t))<br />&#8801;  <span class="co">{- see above -}</span><br />  second (decode &#8728; <span class="dt">InR</span>)<br />    (mp, <span class="dt">O</span> (liftA2 tweak p' t))<br />      <span class="kw">where</span> &#8943;<br />&#8801;  <span class="co">{- decode for T4 -}</span><br />  (mp, <span class="dt">B4</span> (liftA2 tweak p' t))<br />    <span class="kw">where</span> p <span class="fu">=</span> <span class="fu">fmap</span> (&#955; (e <span class="fu">:#</span> o) <span class="ot">&#8594;</span> (e &#8853; o)) t<br />          (mp,p') <span class="fu">=</span> prefixScan p<br />          tweak s (e <span class="fu">:#</span> _) <span class="fu">=</span> (s <span class="fu">:#</span> s &#8853; e)</code></pre>

<p>Notice how much this bottom-up tree scan algorithm differs from the top-down algorithm derived above. In particular, there&#8217;s only one recursive tree scan (on a half-sized tree) instead of two, plus linear additional work, for a total of <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo>&#920;</mo><mo stretchy="false">(</mo><mi>n</mi><mo stretchy="false">)</mo></mrow></math> work.</p>
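<p>The derived branch case also transcribes into standalone ASCII Haskell. Since <code>liftA2</code> on <code>T4</code> amounts to zipping two same-shape trees, I spell it out as an explicit <code>zipWithT4</code> (my names; a sketch, not the post&#8217;s code):</p>

<pre class="sourceCode"><code class="sourceCode haskell">data Pair a = a :# a<br />data T4 a = L4 a | B4 (T4 (Pair a))<br /><br />instance Functor Pair where<br />  fmap h (a :# b) = h a :# h b<br /><br />instance Functor T4 where<br />  fmap h (L4 a) = L4 (h a)<br />  fmap h (B4 t) = B4 (fmap (fmap h) t)<br /><br />zipWithT4 :: (a -&gt; b -&gt; c) -&gt; T4 a -&gt; T4 b -&gt; T4 c<br />zipWithT4 h (L4 a) (L4 b) = L4 (h a b)<br />zipWithT4 h (B4 s) (B4 t) =<br />  B4 (zipWithT4 (\(a :# b) (c :# d) -&gt; h a c :# h b d) s t)<br />zipWithT4 _ _ _ = error "zipWithT4: shape mismatch"<br /><br />-- One recursive call, on a half-size tree: Theta(n) work.<br />prefixScanT4 :: Monoid m =&gt; T4 m -&gt; (m, T4 m)<br />prefixScanT4 (L4 m) = (m, L4 mempty)<br />prefixScanT4 (B4 t) = (mp, B4 (zipWithT4 tweak p' t))<br /> where<br />  p        = fmap (\(e :# o) -&gt; e `mappend` o) t  -- pairwise combine<br />  (mp, p') = prefixScanT4 p                       -- single half-size scan<br />  tweak s (e :# _) = s :# (s `mappend` e)</code></pre>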

<h3 id="guy-blellochs-parallel-scan-algorithm">Guy Blelloch&#8217;s parallel scan algorithm</h3>

<p>In <a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.53.5739" title="Paper by Guy Blelloch"><em>Programming parallel algorithms</em></a>, Guy Blelloch gives the following algorithm for parallel prefix scan, expressed in the parallel functional language NESL:</p>

<pre class="sourceCode"><code class="sourceCode haskell">function scan(a) <span class="fu">=</span><br /><span class="kw">if</span> <span class="fu">#</span>a &#8801; <span class="dv">1</span> <span class="kw">then</span> [<span class="dv">0</span>]<br /><span class="kw">else</span><br />  <span class="kw">let</span> es <span class="fu">=</span> even_elts(a);<br />      os <span class="fu">=</span> odd_elts(a);<br />      ss <span class="fu">=</span> scan({e<span class="fu">+</span>o<span class="fu">:</span> e <span class="kw">in</span> es; o <span class="kw">in</span> os})<br />  <span class="kw">in</span> interleave(ss,{s<span class="fu">+</span>e<span class="fu">:</span> s <span class="kw">in</span> ss; e <span class="kw">in</span> es})</code></pre>

<p>This algorithm is nearly identical to the <code>T4</code> scan algorithm above. I was very glad to find this route to Guy&#8217;s algorithm, which had been fairly mysterious to me. I mean, I could believe that the algorithm worked, but I had no idea how I might have discovered it myself. With the functor composition approach to scanning, I now see how Guy&#8217;s algorithm emerges as well as how it generalizes to other data structures.</p>
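<p>For comparison, Blelloch&#8217;s algorithm transcribes almost directly to Haskell lists. My sketch below specializes the monoid to numeric addition, as in the NESL original, and is total only on power-of-two-length inputs:</p>

<pre class="sourceCode"><code class="sourceCode haskell">scanB :: Num a =&gt; [a] -&gt; [a]<br />scanB [_] = [0]<br />scanB a   = interleave ss (zipWith (+) ss es)<br /> where<br />  es = evenElts a<br />  os = oddElts a<br />  ss = scanB (zipWith (+) es os)   -- recursive scan at half size<br /><br />evenElts, oddElts :: [a] -&gt; [a]<br />evenElts (x : _ : xs) = x : evenElts xs<br />evenElts xs           = xs<br />oddElts  (_ : y : xs) = y : oddElts xs<br />oddElts  _            = []<br /><br />interleave :: [a] -&gt; [a] -&gt; [a]<br />interleave (x : xs) ys = x : interleave ys xs<br />interleave []       ys = ys<br /><br />-- scanB [1,2,3,4] == [0,1,3,6]   (exclusive prefix sums)</code></pre>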

<h3 id="nested-data-types-and-parallelism">Nested data types and parallelism</h3>

<p>Most of the recursive algebraic data types that appear in Haskell programs are <em>regular</em>, meaning that the recursive instances are instantiated with the same type parameter as the containing type. For instance, a top-down tree of elements of type <code>a</code> is either a leaf or has two trees whose elements have that same type <code>a</code>. In contrast, in a bottom-up tree, the (single) recursively contained tree is over elements of type <code>(a,a)</code>. Such non-regular data types are called &quot;nested&quot;. The two tree scan algorithms above suggest to me that nested data types are particularly useful for efficient parallel algorithms.</p>
]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/parallel-tree-scanning-by-composition/feed</wfw:commentRss>
		<slash:comments>7</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fparallel-tree-scanning-by-composition&amp;language=en_GB&amp;category=text&amp;title=Parallel+tree+scanning+by+composition&amp;description=My+last+few+blog+posts+have+been+on+the+theme+of+scans%2C+and+particularly+on+parallel+scans.+In+Composable+parallel+scanning%2C+I+tackled+parallel+scanning+in+a+very+general+setting....&amp;tags=functor%2Cprogram+derivation%2Cscan%2Cblog" type="text/html" />
	</item>
		<item>
		<title>Composable parallel scanning</title>
		<link>http://conal.net/blog/posts/composable-parallel-scanning</link>
		<comments>http://conal.net/blog/posts/composable-parallel-scanning#comments</comments>
		<pubDate>Tue, 01 Mar 2011 22:33:36 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[functor]]></category>
		<category><![CDATA[scan]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=411</guid>
		<description><![CDATA[The post Deriving list scans gave a simple specification of the list-scanning functions scanl and scanr, and then transformed those specifications into the standard optimized implementations. Next, the post Deriving parallel tree scans adapted the specifications and derivations to a type of binary trees. The resulting implementations are parallel-friendly, but not work-efficient, in that they [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- teaser -->

<p>The post <a href="http://conal.net/blog/posts/deriving-list-scans/" title="blog post"><em>Deriving list scans</em></a> gave a simple specification of the list-scanning functions <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Data-List.html#v:scanl"><code>scanl</code></a> and <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Data-List.html#v:scanr"><code>scanr</code></a>, and then transformed those specifications into the standard optimized implementations. Next, the post <a href="http://conal.net/blog/posts/deriving-parallel-tree-scans/" title="blog post"><em>Deriving parallel tree scans</em></a> adapted the specifications and derivations to a type of binary trees. The resulting implementations are parallel-friendly, but not work-efficient, in that they perform <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi><mspace width="0.167em"></mspace><mi>log</mi><mspace width="0.167em"></mspace><mi>n</mi></mrow></math> work vs linear work as in the best-known sequential algorithm.</p>

<p>Besides the work-inefficiency, I don&#8217;t know how to extend the critical <code>initTs</code> and <code>tailTs</code> functions (analogs of <code>inits</code> and <code>tails</code> on lists) to depth-typed, perfectly balanced trees, of the sort I played with in <a href="http://conal.net/blog/posts/a-trie-for-length-typed-vectors/" title="blog post"><em>A trie for length-typed vectors</em></a> and <a href="http://conal.net/blog/posts/from-tries-to-trees/" title="blog post"><em>From tries to trees</em></a>. The difficulty I encounter is that the functions <code>initTs</code> and <code>tailTs</code> make unbalanced trees out of balanced ones, so I don&#8217;t know how to adapt the specifications when types prevent the existence of unbalanced trees.</p>

<p>This new post explores an approach to generalized scanning via type classes. After defining the classes and giving a simple example, I&#8217;ll give a simple &amp; general framework based on composing functor combinators.</p>

<p><strong>Edits:</strong></p>

<ul>
<li>2011-03-02: Fixed typo. &quot;constant functor is easiest&quot; (instead of &quot;identity functor&quot;). Thanks, frguybob.</li>
<li>2011-03-05: Removed final unfinished sentence.</li>
<li>2011-07-28: Replace &quot;<code>assocL</code>&quot; with &quot;<code>assocR</code>&quot; in <code>prefixScan</code> derivation for <code>g ∘ f</code>.</li>
</ul>

<p><span id="more-411"></span></p>

<h3 id="generalizing-list-scans">Generalizing list scans</h3>

<p>The left and right scan functions on lists have an awkward feature. The output list has one more element than the input list, corresponding to the fact that the number of prefixes (<a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Data-List.html#v:inits"><code>inits</code></a>) of a list is one more than the number of elements, and similarly for suffixes (<a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Data-List.html#v:tails"><code>tails</code></a>).</p>

<p>While it&#8217;s easy to extend a list by adding one more element, it&#8217;s not easy with other functors. In <a href="http://conal.net/blog/posts/deriving-parallel-tree-scans/" title="blog post"><em>Deriving parallel tree scans</em></a>, I simply removed the <code>∅</code> element from the scan. In this post, I&#8217;ll instead change the interface to produce an output of exactly the same shape, plus one extra element. The extra element will equal a <code>fold</code> over the complete input. If you recall, we had to search for that complete fold in an input subtree in order to adjust the other subtree. (See <code>headT</code> and <code>lastT</code> and their generalizations in <a href="http://conal.net/blog/posts/deriving-parallel-tree-scans/" title="blog post"><em>Deriving parallel tree scans</em></a>.) Separating out this value eliminates the search.</p>

<p>Define a class with methods for prefix and suffix scan:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">class</span> <span class="dt">Scan</span> f <span class="kw">where</span><br />  prefixScan, suffixScan <span class="ot">&#8759;</span> <span class="dt">Monoid</span> m <span class="ot">&#8658;</span> f m <span class="ot">&#8594;</span> (m, f m)</code></pre>

<p>Prefix scans (<code>prefixScan</code>) accumulate moving left-to-right, while suffix scans (<code>suffixScan</code>) accumulate moving right-to-left.</p>

<h4 id="a-simple-example-pairs">A simple example: pairs</h4>

<p>To get a first sense of generalized scans, let&#8217;s see how to scan over a pair functor.</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">Pair</span> a <span class="fu">=</span> a <span class="fu">:#</span> a <span class="kw">deriving</span> (<span class="kw">Eq</span>,<span class="kw">Ord</span>,<span class="kw">Show</span>)</code></pre>

<p>With GHC&#8217;s <code>DeriveFunctor</code> option, we could also derive a <code>Functor</code> instance, but for clarity, define it explicitly:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="kw">Functor</span> <span class="dt">Pair</span> <span class="kw">where</span><br />  <span class="fu">fmap</span> f (a <span class="fu">:#</span> b) <span class="fu">=</span> (f a <span class="fu">:#</span> f b)</code></pre>

<p>The scans:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">Scan</span> <span class="dt">Pair</span> <span class="kw">where</span><br />  prefixScan (a <span class="fu">:#</span> b) <span class="fu">=</span> (a &#8853; b, (&#8709; <span class="fu">:#</span> a))<br />  suffixScan (a <span class="fu">:#</span> b) <span class="fu">=</span> (a &#8853; b, (b <span class="fu">:#</span> &#8709;))</code></pre>

<p>As you can see, if we eliminated the <code>∅</code> elements, we could shift to the left or right and forgo the extra result.</p>
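<p>To make this concrete, here is a plain-ASCII transcription of the <code>Pair</code> scans (with <code>(&lt;&gt;)</code> and <code>mempty</code> in place of &#8220;⊕&#8221; and &#8220;∅&#8221;), exercised with the <code>Sum</code> monoid. It is an illustrative sketch, not the post&#8217;s actual source:</p>

```haskell
import Data.Monoid (Sum (..))

-- The Scan class and Pair type from the post, in ASCII notation.
class Scan f where
  prefixScan, suffixScan :: Monoid m => f m -> (m, f m)

data Pair a = a :# a deriving (Eq, Ord, Show)

instance Scan Pair where
  prefixScan (a :# b) = (a <> b, mempty :# a)  -- total, exclusive left-to-right partials
  suffixScan (a :# b) = (a <> b, b :# mempty)  -- total, exclusive right-to-left partials
```

<p>For example, <code>prefixScan (Sum 3 :# Sum 4)</code> is <code>(Sum 7, Sum 0 :# Sum 3)</code>: the complete fold, plus the fold of everything strictly to the left of each position.</p>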

<p>Naturally, there is also a <code>Foldable</code> instance, and the scans produce the complete fold result as well as the sub-folds:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">Foldable</span> <span class="dt">Pair</span> <span class="kw">where</span><br />  fold (a <span class="fu">:#</span> b) <span class="fu">=</span> a &#8853; b</code></pre>

<p>The <code>Pair</code> functor also has unsurprising instances for <code>Applicative</code> and <code>Traversable</code>.</p>

<div class=toggle>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">Applicative</span> <span class="dt">Pair</span> <span class="kw">where</span><br />  pure a <span class="fu">=</span> a <span class="fu">:#</span> a<br />  (f <span class="fu">:#</span> g) <span class="fu">&lt;*&gt;</span> (x <span class="fu">:#</span> y) <span class="fu">=</span> (f x <span class="fu">:#</span> g y)<br /><br /><span class="kw">instance</span> <span class="dt">Traversable</span> <span class="dt">Pair</span> <span class="kw">where</span><br />  sequenceA (fa <span class="fu">:#</span> fb) <span class="fu">=</span> (<span class="fu">:#</span>) <span class="fu">&lt;$&gt;</span> fa <span class="fu">&lt;*&gt;</span> fb</code></pre>

</div>

<p>We don&#8217;t really have to figure out how to define scans for every functor separately. We can instead look at how functors are composed out of their essential building blocks.</p>

<h3 id="scans-for-functor-combinators">Scans for functor combinators</h3>

<p>To see how to scan over a broad range of functors, let&#8217;s look at each of the functor combinators, e.g., as in <a href="http://conal.net/blog/posts/elegant-memoization-with-higher-order-types/" title="blog post"><em>Elegant memoization with higher-order types</em></a>.</p>

<h4 id="constant">Constant</h4>

<p>The constant functor is easiest.</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">newtype</span> <span class="dt">Const</span> x a <span class="fu">=</span> <span class="dt">Const</span> x</code></pre>

<p>There are no values to accumulate, so the final result (fold) is <code>∅</code>.</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">Scan</span> (<span class="dt">Const</span> x) <span class="kw">where</span><br />  prefixScan (<span class="dt">Const</span> x) <span class="fu">=</span> (&#8709;, <span class="dt">Const</span> x)<br />  suffixScan           <span class="fu">=</span> prefixScan</code></pre>

<h4 id="identity">Identity</h4>

<p>The identity functor is nearly as easy.</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">newtype</span> <span class="dt">Id</span> a <span class="fu">=</span> <span class="dt">Id</span> a</code></pre>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">Scan</span> <span class="dt">Id</span> <span class="kw">where</span><br />  prefixScan (<span class="dt">Id</span> m) <span class="fu">=</span> (m, <span class="dt">Id</span> &#8709;)<br />  suffixScan        <span class="fu">=</span> prefixScan</code></pre>

<h4 id="sum">Sum</h4>

<p>Scanning in a sum is just scanning in a summand:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> (f <span class="fu">+</span> g) a <span class="fu">=</span> <span class="dt">InL</span> (f a) <span class="fu">|</span> <span class="dt">InR</span> (g a)</code></pre>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> (<span class="dt">Scan</span> f, <span class="dt">Scan</span> g) <span class="ot">&#8658;</span> <span class="dt">Scan</span> (f <span class="fu">+</span> g) <span class="kw">where</span><br />  prefixScan (<span class="dt">InL</span> fa) <span class="fu">=</span> second <span class="dt">InL</span> (prefixScan fa)<br />  prefixScan (<span class="dt">InR</span> ga) <span class="fu">=</span> second <span class="dt">InR</span> (prefixScan ga)<br /><br />  suffixScan (<span class="dt">InL</span> fa) <span class="fu">=</span> second <span class="dt">InL</span> (suffixScan fa)<br />  suffixScan (<span class="dt">InR</span> ga) <span class="fu">=</span> second <span class="dt">InR</span> (suffixScan ga)</code></pre>

<p>These definitions correspond to simple &quot;commutative diagram&quot; properties, e.g.,</p>

<pre class="sourceCode"><code class="sourceCode haskell">prefixScan &#8728; <span class="dt">InL</span> &#8801; second <span class="dt">InL</span> &#8728; prefixScan</code></pre>

<h4 id="product">Product</h4>

<p>Product scanning is a little trickier.</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> (f &#215; g) a <span class="fu">=</span> f a &#215; g a</code></pre>

<p>Scan each of the two parts separately, and then combine the final (<code>fold</code>) part of one result with each of the non-final elements of the other.</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> (<span class="dt">Scan</span> f, <span class="dt">Scan</span> g, <span class="kw">Functor</span> f, <span class="kw">Functor</span> g) <span class="ot">&#8658;</span> <span class="dt">Scan</span> (f &#215; g) <span class="kw">where</span><br />  prefixScan (fa &#215; ga) <span class="fu">=</span> (af &#8853; ag, fa' &#215; ((af &#8853;) <span class="fu">&lt;$&gt;</span> ga'))<br />   <span class="kw">where</span> (af,fa') <span class="fu">=</span> prefixScan fa<br />         (ag,ga') <span class="fu">=</span> prefixScan ga<br /><br />  suffixScan (fa &#215; ga) <span class="fu">=</span> (af &#8853; ag, ((&#8853; ag) <span class="fu">&lt;$&gt;</span> fa') &#215; ga')<br />   <span class="kw">where</span> (af,fa') <span class="fu">=</span> suffixScan fa<br />         (ag,ga') <span class="fu">=</span> suffixScan ga</code></pre>
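<p>The same idea can be checked on ordinary lists, where a product of two list-shaped parts corresponds to concatenation: scan the halves independently, then shift the right half&#8217;s results by the left half&#8217;s total. Here <code>prefixScanL</code> is a hypothetical list analog of <code>prefixScan</code>, not a function from the post:</p>

```haskell
import Data.Monoid (Sum (..))

-- Exclusive prefix scan on a list: the total, plus one partial result per element.
prefixScanL :: Monoid m => [m] -> (m, [m])
prefixScanL xs = (mconcat xs, init (scanl (<>) mempty xs))

-- Product-style combination: scan the parts separately, then combine the
-- left part's total with each of the right part's partial results.
prefixScanPair :: Monoid m => ([m], [m]) -> (m, ([m], [m]))
prefixScanPair (xs, ys) = (tx <> ty, (xs', (tx <>) <$> ys'))
 where
   (tx, xs') = prefixScanL xs
   (ty, ys') = prefixScanL ys
```

<p>With <code>xs = [Sum 1, Sum 2]</code> and <code>ys = [Sum 3, Sum 4]</code>, both <code>prefixScanPair (xs, ys)</code> and <code>prefixScanL (xs ++ ys)</code> compute the total <code>Sum 10</code> and the partials <code>0, 1, 3, 6</code>.</p>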

<h4 id="composition">Composition</h4>

<p>Finally, composition is the trickiest.</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">newtype</span> (g &#8728; f) a <span class="fu">=</span> <span class="dt">O</span> (g (f a))</code></pre>

<p>The target signatures:</p>

<pre class="sourceCode"><code class="sourceCode haskell">  prefixScan, suffixScan <span class="ot">&#8759;</span> <span class="dt">Monoid</span> m <span class="ot">&#8658;</span> (g &#8728; f) m <span class="ot">&#8594;</span> (m, (g &#8728; f) m)</code></pre>

<p>To find the prefix and suffix scan definitions, fiddle with types beginning at the domain type for <code>prefixScan</code> or <code>suffixScan</code> and arriving at the range type.</p>

<p>Some helpers:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="fu">zip</span> <span class="ot">&#8759;</span> <span class="dt">Applicative</span> g <span class="ot">&#8658;</span> (g a, g b) <span class="ot">&#8594;</span> g (a,b)<br /><span class="fu">zip</span> <span class="fu">=</span> <span class="fu">uncurry</span> (liftA2 (,))<br /><br /><span class="fu">unzip</span> <span class="ot">&#8759;</span> <span class="kw">Functor</span> g <span class="ot">&#8658;</span> g (a,b) <span class="ot">&#8594;</span> (g a, g b)<br /><span class="fu">unzip</span> <span class="fu">=</span> <span class="fu">fmap</span> <span class="fu">fst</span> <span class="fu">&amp;&amp;&amp;</span> <span class="fu">fmap</span> <span class="fu">snd</span></code></pre>

<pre class="sourceCode"><code class="sourceCode haskell">assocR <span class="ot">&#8759;</span> ((a,b),c) <span class="ot">&#8594;</span> (a,(b,c))<br />assocR   ((a,b),c) <span class="fu">=</span>  (a,(b,c))</code></pre>

<pre class="sourceCode"><code class="sourceCode haskell">adjustL <span class="ot">&#8759;</span> (<span class="kw">Functor</span> f, <span class="dt">Monoid</span> m) <span class="ot">&#8658;</span> (m, f m) <span class="ot">&#8594;</span> f m<br />adjustL (m, ms) <span class="fu">=</span> (m &#8853;) <span class="fu">&lt;$&gt;</span> ms<br /><br />adjustR <span class="ot">&#8759;</span> (<span class="kw">Functor</span> f, <span class="dt">Monoid</span> m) <span class="ot">&#8658;</span> (m, f m) <span class="ot">&#8594;</span> f m<br />adjustR (m, ms) <span class="fu">=</span> (&#8853; m) <span class="fu">&lt;$&gt;</span> ms</code></pre>
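<p>These helpers are easy to exercise in isolation (a sketch in ASCII notation, renamed <code>zipF</code>/<code>unzipF</code> to avoid clashing with the Prelude):</p>

```haskell
import Control.Arrow ((&&&))
import Data.Monoid (Sum (..))

-- Uncurried zip for an Applicative, and its companion unzip.
zipF :: Applicative g => (g a, g b) -> g (a, b)
zipF (ga, gb) = (,) <$> ga <*> gb

unzipF :: Functor g => g (a, b) -> (g a, g b)
unzipF = fmap fst &&& fmap snd

-- Combine an accumulated value with every element, on the left or the right.
adjustL, adjustR :: (Functor f, Monoid m) => (m, f m) -> f m
adjustL (m, ms) = (m <>) <$> ms
adjustR (m, ms) = (<> m) <$> ms
```

<p>For instance, <code>adjustL (Sum 5, [Sum 1, Sum 2])</code> is <code>[Sum 6, Sum 7]</code>. Note that the derivation below pairs up two same-shaped <code>g</code> structures with <code>zip</code>, so it relies on <code>g</code> being a fixed-shape (&#8220;zippy&#8221;) applicative like <code>Pair</code>, for which <code>liftA2</code> combines elements pointwise.</p>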

<p>First <code>prefixScan</code>:</p>

<pre class="sourceCode"><code class="sourceCode haskell">gofm                     <span class="ot">&#8759;</span> (g &#8728; f) m<br />unO                   <span class="ch">''</span> <span class="ot">&#8759;</span> g (f m)<br /><span class="fu">fmap</span> prefixScan       <span class="ch">''</span> <span class="ot">&#8759;</span> g (m, f m)<br /><span class="fu">unzip</span>                 <span class="ch">''</span> <span class="ot">&#8759;</span> (g m, g (f m))<br />first prefixScan      <span class="ch">''</span> <span class="ot">&#8759;</span> ((m, g m), g (f m))<br />assocR                <span class="ch">''</span> <span class="ot">&#8759;</span> (m, (g m, g (f m)))<br />second <span class="fu">zip</span>            <span class="ch">''</span> <span class="ot">&#8759;</span> (m, g (m, f m))<br />second (<span class="fu">fmap</span> adjustL) <span class="ch">''</span> <span class="ot">&#8759;</span> (m, g (f m))<br />second <span class="dt">O</span>              <span class="ch">''</span> <span class="ot">&#8759;</span> (m, (g &#8728; f) m)</code></pre>

<p>Then <code>suffixScan</code>:</p>

<pre class="sourceCode"><code class="sourceCode haskell">gofm                     <span class="ot">&#8759;</span> (g &#8728; f) m<br />unO                   <span class="ch">''</span> <span class="ot">&#8759;</span> g (f m)<br /><span class="fu">fmap</span> suffixScan       <span class="ch">''</span> <span class="ot">&#8759;</span> g (m, f m)<br /><span class="fu">unzip</span>                 <span class="ch">''</span> <span class="ot">&#8759;</span> (g m, g (f m))<br />first suffixScan      <span class="ch">''</span> <span class="ot">&#8759;</span> ((m, g m), g (f m))<br />assocR                <span class="ch">''</span> <span class="ot">&#8759;</span> (m, (g m, g (f m)))<br />second <span class="fu">zip</span>            <span class="ch">''</span> <span class="ot">&#8759;</span> (m, g (m, f m))<br />second (<span class="fu">fmap</span> adjustR) <span class="ch">''</span> <span class="ot">&#8759;</span> (m, g (f m))<br />second <span class="dt">O</span>              <span class="ch">''</span> <span class="ot">&#8759;</span> (m, (g &#8728; f) m)</code></pre>

<p>Putting together the pieces and simplifying just a bit leads to the method definitions:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> (<span class="dt">Scan</span> g, <span class="dt">Scan</span> f, <span class="kw">Functor</span> f, <span class="dt">Applicative</span> g) <span class="ot">&#8658;</span> <span class="dt">Scan</span> (g &#8728; f) <span class="kw">where</span><br />  prefixScan <span class="fu">=</span> second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustL &#8728; <span class="fu">zip</span>)<br />             &#8728; assocR<br />             &#8728; first prefixScan<br />             &#8728; <span class="fu">unzip</span><br />             &#8728; <span class="fu">fmap</span> prefixScan<br />             &#8728; unO<br /><br />  suffixScan <span class="fu">=</span> second (<span class="dt">O</span> &#8728; <span class="fu">fmap</span> adjustR &#8728; <span class="fu">zip</span>)<br />             &#8728; assocR<br />             &#8728; first suffixScan<br />             &#8728; <span class="fu">unzip</span><br />             &#8728; <span class="fu">fmap</span> suffixScan<br />             &#8728; unO</code></pre>
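<p>The shape of this instance can be sanity-checked on a list of lists, taking both <code>g</code> and <code>f</code> to be the list functor: scan each inner list, prefix-scan the inner totals, and then add each accumulated total onto the corresponding inner results. This is a hypothetical list analog (with <code>prefixScanL</code> as above), not the post&#8217;s code:</p>

```haskell
import Control.Arrow (first, second, (&&&))
import Data.Monoid (Sum (..))

-- Exclusive prefix scan on a list (hypothetical helper).
prefixScanL :: Monoid m => [m] -> (m, [m])
prefixScanL xs = (mconcat xs, init (scanl (<>) mempty xs))

-- Mirror of the (g ∘ f) derivation with g = f = []:
-- scan inner lists, scan their totals, zip, adjust, reassemble.
prefixScanComp :: Monoid m => [[m]] -> (m, [[m]])
prefixScanComp = second (fmap adjust . uncurry zip)  -- pair totals with inner results, adjustL
               . assocR
               . first prefixScanL                   -- scan the inner totals
               . (fmap fst &&& fmap snd)             -- unzip
               . fmap prefixScanL                    -- scan each inner list
 where
   assocR ((a, b), c) = (a, (b, c))
   adjust (m, ms) = (m <>) <$> ms
```

<p>For example, <code>prefixScanComp (map (map Sum) [[1,2],[3,4]])</code> gives the total <code>Sum 10</code> and partials <code>[[0,1],[3,6]]</code>, matching <code>prefixScanL</code> applied to the flattened list.</p>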

<h3 id="whats-coming-up">What&#8217;s coming up?</h3>

<ul>
<li>What might not be easy to spot at this point is that the <code>prefixScan</code> and <code>suffixScan</code> methods given in this post do essentially the same job as in <a href="http://conal.net/blog/posts/deriving-parallel-tree-scans/" title="blog post"><em>Deriving parallel tree scans</em></a>, when the binary tree type is deconstructed into functor combinators. A future post will show this connection.</li>
<li>Switch from standard (right-folded) trees to left-folded trees (in the sense of <a href="http://conal.net/blog/posts/a-trie-for-length-typed-vectors/" title="blog post"><em>A trie for length-typed vectors</em></a> and <a href="http://conal.net/blog/posts/from-tries-to-trees/" title="blog post"><em>From tries to trees</em></a>), which reduces the running time from <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo>&#920;</mo><mspace width="0.167em"></mspace><mo stretchy="false">(</mo><mi>n</mi><mspace width="0.167em"></mspace><mi>log</mi><mspace width="0.167em"></mspace><mi>n</mi><mo stretchy="false">)</mo></mrow></math> to <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo>&#920;</mo><mspace width="0.167em"></mspace><mi>n</mi></mrow></math>.</li>
<li>Scanning in place, i.e., destructively replacing the values in the input structure rather than allocating a new structure.</li>
</ul>
<p><a href="http://conal.net/blog/?flattrss_redirect&amp;id=411&amp;md5=9870e39e2e5552b7c42709138945e306"><img src="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png" srcset="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@2x.png 2x, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@3x.png 3x" alt="Flattr this!"/></a></p>]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/composable-parallel-scanning/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fcomposable-parallel-scanning&amp;language=en_GB&amp;category=text&amp;title=Composable+parallel+scanning&amp;description=The+post+Deriving+list+scans+gave+a+simple+specification+of+the+list-scanning+functions+scanl+and+scanr%2C+and+then+transformed+those+specifications+into+the+standard+optimized+implementations.+Next%2C+the+post+Deriving...&amp;tags=functor%2Cscan%2Cblog" type="text/html" />
	</item>
		<item>
		<title>Deriving parallel tree scans</title>
		<link>http://conal.net/blog/posts/deriving-parallel-tree-scans</link>
		<comments>http://conal.net/blog/posts/deriving-parallel-tree-scans#comments</comments>
		<pubDate>Tue, 01 Mar 2011 20:41:09 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[program derivation]]></category>
		<category><![CDATA[scan]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=330</guid>
		<description><![CDATA[The post Deriving list scans explored folds and scans on lists and showed how the usual, efficient scan implementations can be derived from simpler specifications. Let&#8217;s see now how to apply the same techniques to scans over trees. This new post is one of a series leading toward algorithms optimized for execution on massively parallel, [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- teaser -->

<p>The post <a href="http://conal.net/blog/posts/deriving-list-scans/" title="blog post"><em>Deriving list scans</em></a> explored folds and scans on lists and showed how the usual, efficient scan implementations can be derived from simpler specifications.</p>

<p>Let&#8217;s see now how to apply the same techniques to scans over trees.</p>

<p>This new post is one of a series leading toward algorithms optimized for execution on massively parallel, consumer hardware, using CUDA or OpenCL.</p>

<p><strong>Edits:</strong></p>

<ul>
<li>2011-03-01: Added clarification about &quot;<code>∅</code>&quot; and &quot;<code>(⊕)</code>&quot;.</li>
<li>2011-03-23: corrected &quot;linear-time&quot; to &quot;linear-work&quot; in two places.</li>
</ul>

<p><span id="more-330"></span></p>

<h3 id="trees">Trees</h3>

<p>Our trees will be non-empty and binary:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">data</span> <span class="dt">T</span> a <span class="fu">=</span> <span class="dt">Leaf</span> a <span class="fu">|</span> <span class="dt">Branch</span> (<span class="dt">T</span> a) (<span class="dt">T</span> a)<br /><br /><span class="kw">instance</span> <span class="kw">Show</span> a <span class="ot">&#8658;</span> <span class="kw">Show</span> (<span class="dt">T</span> a) <span class="kw">where</span><br />  <span class="fu">show</span> (<span class="dt">Leaf</span> a)     <span class="fu">=</span> <span class="fu">show</span> a<br />  <span class="fu">show</span> (<span class="dt">Branch</span> s t) <span class="fu">=</span> <span class="st">&quot;(&quot;</span><span class="fu">++</span><span class="fu">show</span> s<span class="fu">++</span><span class="st">&quot;,&quot;</span><span class="fu">++</span><span class="fu">show</span> t<span class="fu">++</span><span class="st">&quot;)&quot;</span></code></pre>

<p>Nothing surprising in the instances:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="kw">Functor</span> <span class="dt">T</span> <span class="kw">where</span><br />  <span class="fu">fmap</span> f (<span class="dt">Leaf</span> a)     <span class="fu">=</span> <span class="dt">Leaf</span> (f a)<br />  <span class="fu">fmap</span> f (<span class="dt">Branch</span> s t) <span class="fu">=</span> <span class="dt">Branch</span> (<span class="fu">fmap</span> f s) (<span class="fu">fmap</span> f t)<br /><br /><span class="kw">instance</span> <span class="dt">Foldable</span> <span class="dt">T</span> <span class="kw">where</span><br />  fold (<span class="dt">Leaf</span> a)     <span class="fu">=</span> a<br />  fold (<span class="dt">Branch</span> s t) <span class="fu">=</span> fold s &#8853; fold t<br /><br /><span class="kw">instance</span> <span class="dt">Traversable</span> <span class="dt">T</span> <span class="kw">where</span><br />  sequenceA (<span class="dt">Leaf</span> a)     <span class="fu">=</span> <span class="fu">fmap</span> <span class="dt">Leaf</span> a<br />  sequenceA (<span class="dt">Branch</span> s t) <span class="fu">=</span><br />    liftA2 <span class="dt">Branch</span> (sequenceA s) (sequenceA t)</code></pre>

<p>BTW, <a href="https://github.com/conal/fix-symbols-gitit/">my type-setting software</a> uses &quot;<code>∅</code>&quot; and &quot;<code>(⊕)</code>&quot; for Haskell&#8217;s &quot;mempty&quot; and &quot;mappend&quot;.</p>

<p>Also handy will be extracting the first and last (i.e., leftmost and rightmost) leaves in a tree:</p>

<pre class="sourceCode"><code class="sourceCode haskell">headT <span class="ot">&#8759;</span> <span class="dt">T</span> a <span class="ot">&#8594;</span> a<br />headT (<span class="dt">Leaf</span> a)       <span class="fu">=</span> a<br />headT (s <span class="ot">`Branch`</span> _) <span class="fu">=</span> headT s<br /><br />lastT <span class="ot">&#8759;</span> <span class="dt">T</span> a <span class="ot">&#8594;</span> a<br />lastT (<span class="dt">Leaf</span> a)       <span class="fu">=</span> a<br />lastT (_ <span class="ot">`Branch`</span> t) <span class="fu">=</span> lastT t</code></pre>

<div class=exercise>
<p><em>Exercise:</em> Prove that</p>
<pre class="sourceCode"><code class="sourceCode haskell">headT &#8728; <span class="fu">fmap</span> f &#8801; f &#8728; headT<br />lastT &#8728; <span class="fu">fmap</span> f &#8801; f &#8728; lastT</code></pre>
<p>Answer:</p>

<div class=toggle>

<p>Consider the <code>Leaf</code> and <code>Branch</code> cases separately:</p>
<pre class="sourceCode"><code class="sourceCode haskell">  headT (<span class="fu">fmap</span> f (<span class="dt">Leaf</span> a))<br />&#8801;  <span class="co">{- fmap on T -}</span><br />  headT (<span class="dt">Leaf</span> (f a))<br />&#8801;  <span class="co">{- headT def -}</span><br />  f a<br />&#8801;  <span class="co">{- headT def -}</span><br />  f (headT (<span class="dt">Leaf</span> a))</code></pre>
<pre class="sourceCode"><code class="sourceCode haskell">  headT (<span class="fu">fmap</span> f (<span class="dt">Branch</span> s t))<br />&#8801;  <span class="co">{- fmap on T -}</span><br />  headT (<span class="dt">Branch</span> (<span class="fu">fmap</span> f s) (<span class="fu">fmap</span> f t))<br />&#8801;  <span class="co">{- headT def -}</span><br />  headT (<span class="fu">fmap</span> f s)<br />&#8801;  <span class="co">{- induction -}</span><br />  f (headT s)<br />&#8801;  <span class="co">{- headT def -}</span><br />  f (headT (<span class="dt">Branch</span> s t))</code></pre>
<p>Similarly for <code>lastT</code>.</p>

</div>
 </div>

<h3 id="from-lists-to-trees-and-back">From lists to trees and back</h3>

<p>We can flatten trees into lists:</p>

<pre class="sourceCode"><code class="sourceCode haskell">flatten <span class="ot">&#8759;</span> <span class="dt">T</span> a <span class="ot">&#8594;</span> [a]<br />flatten <span class="fu">=</span> fold &#8728; <span class="fu">fmap</span> (<span class="fu">:</span>[])</code></pre>

<p>Equivalently, using <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Data-Foldable.html#v:foldMap"><code>foldMap</code></a>:</p>

<pre class="sourceCode"><code class="sourceCode haskell">flatten <span class="fu">=</span> foldMap (<span class="fu">:</span>[])</code></pre>

<p>Alternatively, we could define <code>fold</code> via <code>flatten</code>:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">instance</span> <span class="dt">Foldable</span> <span class="dt">T</span> <span class="kw">where</span> fold <span class="fu">=</span> fold &#8728; flatten</code></pre>

<p>where <code>flatten</code> is defined directly by structural recursion:</p>

<pre class="sourceCode"><code class="sourceCode haskell">flatten <span class="ot">&#8759;</span> <span class="dt">T</span> a <span class="ot">&#8594;</span> [a]<br />flatten (<span class="dt">Leaf</span> a)     <span class="fu">=</span> [a]<br />flatten (<span class="dt">Branch</span> s t) <span class="fu">=</span> flatten s <span class="fu">++</span> flatten t</code></pre>

<p>We can also &quot;unflatten&quot; lists into balanced trees:</p>

<pre class="sourceCode"><code class="sourceCode haskell">unflatten <span class="ot">&#8759;</span> [a] <span class="ot">&#8594;</span> <span class="dt">T</span> a<br />unflatten []  <span class="fu">=</span> <span class="fu">error</span> <span class="st">&quot;unflatten: Oops! Empty list&quot;</span><br />unflatten [a] <span class="fu">=</span> <span class="dt">Leaf</span> a<br />unflatten xs  <span class="fu">=</span> <span class="dt">Branch</span> (unflatten prefix) (unflatten suffix)<br /> <span class="kw">where</span><br />   (prefix,suffix) <span class="fu">=</span> <span class="fu">splitAt</span> (<span class="fu">length</span> xs <span class="ot">`div`</span> <span class="dv">2</span>) xs</code></pre>

<p>Both <code>flatten</code> and <code>unflatten</code> can be implemented more efficiently.</p>

<p>For instance,</p>

<pre class="sourceCode"><code class="sourceCode haskell">t1,t2 <span class="ot">&#8759;</span> <span class="dt">T</span> <span class="dt">Int</span><br />t1 <span class="fu">=</span> unflatten [<span class="dv">1</span><span class="fu">&#8229;</span><span class="dv">3</span>]<br />t2 <span class="fu">=</span> unflatten [<span class="dv">1</span><span class="fu">&#8229;</span><span class="dv">16</span>]</code></pre>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> t1<br />(<span class="dv">1</span>,(<span class="dv">2</span>,<span class="dv">3</span>))<br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> t2<br />((((<span class="dv">1</span>,<span class="dv">2</span>),(<span class="dv">3</span>,<span class="dv">4</span>)),((<span class="dv">5</span>,<span class="dv">6</span>),(<span class="dv">7</span>,<span class="dv">8</span>))),(((<span class="dv">9</span>,<span class="dv">10</span>),(<span class="dv">11</span>,<span class="dv">12</span>)),((<span class="dv">13</span>,<span class="dv">14</span>),(<span class="dv">15</span>,<span class="dv">16</span>))))</code></pre>

<h3 id="specifying-tree-scans">Specifying tree scans</h3>

<h4 id="prefixes-and-suffixes">Prefixes and suffixes</h4>

<p>The post <a href="http://conal.net/blog/posts/deriving-list-scans/" title="blog post"><em>Deriving list scans</em></a> gave specifications for list scanning in terms of <code>inits</code> and <code>tails</code>. One consequence of this specification is that the output of scanning has one more element than the input. Alternatively, we could use non-empty variants of <code>inits</code> and <code>tails</code>, so that the input &amp; output are in one-to-one correspondence.</p>

<pre class="sourceCode"><code class="sourceCode haskell">inits' <span class="ot">&#8759;</span> [a] <span class="ot">&#8594;</span> [[a]]<br />inits' []     <span class="fu">=</span> []<br />inits' (x<span class="fu">:</span>xs) <span class="fu">=</span> <span class="fu">map</span> (x<span class="fu">:</span>) ([] <span class="fu">:</span> inits' xs)</code></pre>

<p>The cons case can also be written as</p>

<pre class="sourceCode"><code class="sourceCode haskell">inits' (x<span class="fu">:</span>xs) <span class="fu">=</span> [x] <span class="fu">:</span> <span class="fu">map</span> (x<span class="fu">:</span>) (inits' xs)</code></pre>

<pre class="sourceCode"><code class="sourceCode haskell">tails' <span class="ot">&#8759;</span> [a] <span class="ot">&#8594;</span> [[a]]<br />tails' []         <span class="fu">=</span> []<br />tails' xs<span class="fu">@</span>(_<span class="fu">:</span>xs') <span class="fu">=</span> xs <span class="fu">:</span> tails' xs'</code></pre>

<p>For instance,</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> inits' <span class="st">&quot;abcd&quot;</span><br />[<span class="st">&quot;a&quot;</span>,<span class="st">&quot;ab&quot;</span>,<span class="st">&quot;abc&quot;</span>,<span class="st">&quot;abcd&quot;</span>]<br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> tails' <span class="st">&quot;abcd&quot;</span><br />[<span class="st">&quot;abcd&quot;</span>,<span class="st">&quot;bcd&quot;</span>,<span class="st">&quot;cd&quot;</span>,<span class="st">&quot;d&quot;</span>]</code></pre>

<p>Our tree functor has a symmetric definition, so we get more symmetry in the counterparts to <code>inits'</code> and <code>tails'</code>:</p>

<pre class="sourceCode"><code class="sourceCode haskell">initTs <span class="ot">&#8759;</span> <span class="dt">T</span> a <span class="ot">&#8594;</span> <span class="dt">T</span> (<span class="dt">T</span> a)<br />initTs (<span class="dt">Leaf</span> a)       <span class="fu">=</span> <span class="dt">Leaf</span> (<span class="dt">Leaf</span> a)<br />initTs (s <span class="ot">`Branch`</span> t) <span class="fu">=</span><br />  <span class="dt">Branch</span> (initTs s) (<span class="fu">fmap</span> (s <span class="ot">`Branch`</span>) (initTs t))<br /><br />tailTs <span class="ot">&#8759;</span> <span class="dt">T</span> a <span class="ot">&#8594;</span> <span class="dt">T</span> (<span class="dt">T</span> a)<br />tailTs (<span class="dt">Leaf</span> a)       <span class="fu">=</span> <span class="dt">Leaf</span> (<span class="dt">Leaf</span> a)<br />tailTs (s <span class="ot">`Branch`</span> t) <span class="fu">=</span><br />  <span class="dt">Branch</span> (<span class="fu">fmap</span> (<span class="ot">`Branch`</span> t) (tailTs s)) (tailTs t)</code></pre>

<p>Try it:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> t1<br />(<span class="dv">1</span>,(<span class="dv">2</span>,<span class="dv">3</span>))<br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> initTs t1<br />(<span class="dv">1</span>,((<span class="dv">1</span>,<span class="dv">2</span>),(<span class="dv">1</span>,(<span class="dv">2</span>,<span class="dv">3</span>))))<br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> tailTs t1<br />((<span class="dv">1</span>,(<span class="dv">2</span>,<span class="dv">3</span>)),((<span class="dv">2</span>,<span class="dv">3</span>),<span class="dv">3</span>))<br /><br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> unflatten [<span class="dv">1</span><span class="fu">&#8229;</span><span class="dv">5</span>]<br />((<span class="dv">1</span>,<span class="dv">2</span>),(<span class="dv">3</span>,(<span class="dv">4</span>,<span class="dv">5</span>)))<br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> initTs (unflatten [<span class="dv">1</span><span class="fu">&#8229;</span><span class="dv">5</span>])<br />((<span class="dv">1</span>,(<span class="dv">1</span>,<span class="dv">2</span>)),(((<span class="dv">1</span>,<span class="dv">2</span>),<span class="dv">3</span>),(((<span class="dv">1</span>,<span class="dv">2</span>),(<span class="dv">3</span>,<span class="dv">4</span>)),((<span class="dv">1</span>,<span class="dv">2</span>),(<span class="dv">3</span>,(<span class="dv">4</span>,<span class="dv">5</span>))))))<br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> tailTs (unflatten [<span class="dv">1</span><span class="fu">&#8229;</span><span class="dv">5</span>])<br />((((<span class="dv">1</span>,<span class="dv">2</span>),(<span class="dv">3</span>,(<span 
class="dv">4</span>,<span class="dv">5</span>))),(<span class="dv">2</span>,(<span class="dv">3</span>,(<span class="dv">4</span>,<span class="dv">5</span>)))),((<span class="dv">3</span>,(<span class="dv">4</span>,<span class="dv">5</span>)),((<span class="dv">4</span>,<span class="dv">5</span>),<span class="dv">5</span>)))</code></pre>

<div class=exercise>
<p><em>Exercise:</em> Prove that</p>
<pre class="sourceCode"><code class="sourceCode haskell">lastT &#8728; initTs &#8801; <span class="fu">id</span><br />headT &#8728; tailTs &#8801; <span class="fu">id</span></code></pre>
<p>Answer:</p>

<div class=toggle>

<pre class="sourceCode"><code class="sourceCode haskell">  lastT (initTs (<span class="dt">Leaf</span> a))<br />&#8801;  <span class="co">{- initTs def -}</span><br />  lastT (<span class="dt">Leaf</span> (<span class="dt">Leaf</span> a))<br />&#8801;  <span class="co">{- lastT def -}</span><br />  <span class="dt">Leaf</span> a<br /><br />  lastT (initTs (s <span class="ot">`Branch`</span> t))<br />&#8801;  <span class="co">{- initTs def -}</span><br />  lastT (<span class="dt">Branch</span> (&#8943;) (<span class="fu">fmap</span> (s <span class="ot">`Branch`</span>) (initTs t)))<br />&#8801;  <span class="co">{- lastT def -}</span><br />  lastT (<span class="fu">fmap</span> (s <span class="ot">`Branch`</span>) (initTs t))<br />&#8801;  <span class="co">{- lastT &#8728; fmap f -}</span><br />  (s <span class="ot">`Branch`</span>) (lastT (initTs t))<br />&#8801;  <span class="co">{- trivial -}</span><br />  s <span class="ot">`Branch`</span> lastT (initTs t)<br />&#8801;  <span class="co">{- induction -}</span><br />  s <span class="ot">`Branch`</span> t</code></pre>

</div>
 </div>

<h4 id="scan-specification">Scan specification</h4>

<p>Now we can specify prefix &amp; suffix scanning:</p>

<pre class="sourceCode"><code class="sourceCode haskell">scanlT, scanrT <span class="ot">&#8759;</span> <span class="dt">Monoid</span> a <span class="ot">&#8658;</span> <span class="dt">T</span> a <span class="ot">&#8594;</span> <span class="dt">T</span> a<br />scanlT <span class="fu">=</span> <span class="fu">fmap</span> fold &#8728; initTs<br />scanrT <span class="fu">=</span> <span class="fu">fmap</span> fold &#8728; tailTs</code></pre>

<p>Try it out:</p>

<pre class="sourceCode"><code class="sourceCode haskell">t3 <span class="ot">&#8759;</span> <span class="dt">T</span> <span class="dt">String</span><br />t3 <span class="fu">=</span> <span class="fu">fmap</span> (<span class="fu">:</span>[]) (unflatten <span class="st">&quot;abcde&quot;</span>)</code></pre>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> t3<br />((<span class="st">&quot;a&quot;</span>,<span class="st">&quot;b&quot;</span>),(<span class="st">&quot;c&quot;</span>,(<span class="st">&quot;d&quot;</span>,<span class="st">&quot;e&quot;</span>)))<br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> scanlT t3<br />((<span class="st">&quot;a&quot;</span>,<span class="st">&quot;ab&quot;</span>),(<span class="st">&quot;abc&quot;</span>,(<span class="st">&quot;abcd&quot;</span>,<span class="st">&quot;abcde&quot;</span>)))<br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> scanrT t3<br />((<span class="st">&quot;abcde&quot;</span>,<span class="st">&quot;bcde&quot;</span>),(<span class="st">&quot;cde&quot;</span>,(<span class="st">&quot;de&quot;</span>,<span class="st">&quot;e&quot;</span>)))</code></pre>

<p>To test on numbers, I&#8217;ll use a <a href="http://matt.immute.net/content/pointless-fun" title="blog post by Matt Hellige">handy notation from Matt Hellige</a> to add pre- and post-processing:</p>

<pre class="sourceCode"><code class="sourceCode haskell">(&#8605;) <span class="ot">&#8759;</span> (a' <span class="ot">&#8594;</span> a) <span class="ot">&#8594;</span> (b <span class="ot">&#8594;</span> b') <span class="ot">&#8594;</span> ((a <span class="ot">&#8594;</span> b) <span class="ot">&#8594;</span> (a' <span class="ot">&#8594;</span> b'))<br />(f &#8605; h) g <span class="fu">=</span> h &#8728; g &#8728; f</code></pre>

<p>And a version specialized to functors:</p>

<pre class="sourceCode"><code class="sourceCode haskell">(&#8605;<span class="fu">*</span>) <span class="ot">&#8759;</span> <span class="kw">Functor</span> f <span class="ot">&#8658;</span> (a' <span class="ot">&#8594;</span> a) <span class="ot">&#8594;</span> (b <span class="ot">&#8594;</span> b')<br />     <span class="ot">&#8594;</span> (f a <span class="ot">&#8594;</span> f b) <span class="ot">&#8594;</span> (f a' <span class="ot">&#8594;</span> f b')<br />f &#8605;<span class="fu">*</span> g <span class="fu">=</span> <span class="fu">fmap</span> f &#8605; <span class="fu">fmap</span> g</code></pre>

<pre class="sourceCode"><code class="sourceCode haskell">t4 <span class="ot">&#8759;</span> <span class="dt">T</span> <span class="dt">Integer</span><br />t4 <span class="fu">=</span> unflatten [<span class="dv">1</span><span class="fu">&#8229;</span><span class="dv">6</span>]<br /><br />t5 <span class="ot">&#8759;</span> <span class="dt">T</span> <span class="dt">Integer</span><br />t5 <span class="fu">=</span> (<span class="dt">Sum</span> &#8605;<span class="fu">*</span> getSum) scanlT t4</code></pre>

<p>Try it:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> t4<br />((<span class="dv">1</span>,(<span class="dv">2</span>,<span class="dv">3</span>)),(<span class="dv">4</span>,(<span class="dv">5</span>,<span class="dv">6</span>)))<br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> initTs t4<br />((<span class="dv">1</span>,((<span class="dv">1</span>,<span class="dv">2</span>),(<span class="dv">1</span>,(<span class="dv">2</span>,<span class="dv">3</span>)))),(((<span class="dv">1</span>,(<span class="dv">2</span>,<span class="dv">3</span>)),<span class="dv">4</span>),(((<span class="dv">1</span>,(<span class="dv">2</span>,<span class="dv">3</span>)),(<span class="dv">4</span>,<span class="dv">5</span>)),((<span class="dv">1</span>,(<span class="dv">2</span>,<span class="dv">3</span>)),(<span class="dv">4</span>,(<span class="dv">5</span>,<span class="dv">6</span>))))))<br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> t5<br />((<span class="dv">1</span>,(<span class="dv">3</span>,<span class="dv">6</span>)),(<span class="dv">10</span>,(<span class="dv">15</span>,<span class="dv">21</span>)))</code></pre>

<div class=exercise>
<p><em>Exercise</em>: Prove that we have properties similar to the ones relating <code>fold</code>, <code>scanl</code>, and <code>scanr</code> on lists:</p>
<pre class="sourceCode"><code class="sourceCode haskell">fold &#8801; lastT &#8728; scanlT<br />fold &#8801; headT &#8728; scanrT</code></pre>
<p>Answer:</p>

<div class=toggle>

<pre class="sourceCode"><code class="sourceCode haskell">  lastT &#8728; scanlT<br />&#8801;  <span class="co">{- scanlT spec -}</span><br />  lastT &#8728; <span class="fu">fmap</span> fold &#8728; initTs<br />&#8801;  <span class="co">{- lastT &#8728; fmap f -}</span><br />  fold &#8728; lastT &#8728; initTs<br />&#8801;  <span class="co">{- lastT &#8728; initTs -}</span><br />  fold<br /><br />  headT &#8728; scanrT <br />&#8801;  <span class="co">{- scanrT def -}</span><br />  headT &#8728; <span class="fu">fmap</span> fold &#8728; tailTs<br />&#8801;  <span class="co">{- headT &#8728; fmap f -}</span><br />  fold &#8728; headT &#8728; tailTs<br />&#8801;  <span class="co">{- headT &#8728; tailTs -}</span><br />  fold</code></pre>

</div>

<p>For instance,</p>
<pre class="sourceCode"><code class="sourceCode haskell"><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> fold t3<br /><span class="st">&quot;abcde&quot;</span><br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> (lastT &#8728; scanlT) t3<br /><span class="st">&quot;abcde&quot;</span><br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> (headT &#8728; scanrT) t3<br /><span class="st">&quot;abcde&quot;</span></code></pre>

</div>

<h3 id="deriving-faster-scans">Deriving faster scans</h3>

<p>Recall the specifications:</p>

<pre class="sourceCode"><code class="sourceCode haskell">scanlT <span class="fu">=</span> <span class="fu">fmap</span> fold &#8728; initTs<br />scanrT <span class="fu">=</span> <span class="fu">fmap</span> fold &#8728; tailTs</code></pre>

<p>To derive more efficient implementations, proceed as in <a href="http://conal.net/blog/posts/deriving-list-scans/" title="blog post"><em>Deriving list scans</em></a>. Start with prefix scan (<code>scanlT</code>), and consider the <code>Leaf</code> and <code>Branch</code> cases separately.</p>

<pre class="sourceCode"><code class="sourceCode haskell">  scanlT (<span class="dt">Leaf</span> a)<br />&#8801;  <span class="co">{- scanlT spec -}</span><br />  <span class="fu">fmap</span> fold (initTs (<span class="dt">Leaf</span> a))<br />&#8801;  <span class="co">{- initTs def -}</span><br />  <span class="fu">fmap</span> fold (<span class="dt">Leaf</span> (<span class="dt">Leaf</span> a))<br />&#8801;  <span class="co">{- fmap def -}</span><br />  <span class="dt">Leaf</span> (fold (<span class="dt">Leaf</span> a))<br />&#8801;  <span class="co">{- fold def -}</span><br />  <span class="dt">Leaf</span> a<br /><br />  scanlT (s <span class="ot">`Branch`</span> t)<br />&#8801;  <span class="co">{- scanlT spec -}</span><br />  <span class="fu">fmap</span> fold (initTs (s <span class="ot">`Branch`</span> t))<br />&#8801;  <span class="co">{- initTs def -}</span><br />  <span class="fu">fmap</span> fold (<span class="dt">Branch</span> (initTs s) (<span class="fu">fmap</span> (s <span class="ot">`Branch`</span>) (initTs t)))<br />&#8801;  <span class="co">{- fmap def -}</span><br />   <span class="dt">Branch</span> (<span class="fu">fmap</span> fold (initTs s)) (<span class="fu">fmap</span> fold (<span class="fu">fmap</span> (s <span class="ot">`Branch`</span>) (initTs t)))<br />&#8801;  <span class="co">{- scanlT spec -}</span><br />  <span class="dt">Branch</span> (scanlT s) (<span class="fu">fmap</span> fold (<span class="fu">fmap</span> (s <span class="ot">`Branch`</span>) (initTs t)))<br />&#8801;  <span class="co">{- functor law -}</span><br />  <span class="dt">Branch</span> (scanlT s) (<span class="fu">fmap</span> (fold &#8728; (s <span class="ot">`Branch`</span>)) (initTs t))<br />&#8801;  <span class="co">{- rework as &#955; -}</span><br />  <span class="dt">Branch</span> (scanlT s) (<span class="fu">fmap</span> (&#955; t' <span class="ot">&#8594;</span> fold (s <span class="ot">`Branch`</span> t')) (initTs t))<br />&#8801;  <span class="co">{- fold def 
-}</span><br />  <span class="dt">Branch</span> (scanlT s) (<span class="fu">fmap</span> (&#955; t' <span class="ot">&#8594;</span> fold s &#8853; fold t') (initTs t))<br />&#8801;  <span class="co">{- rework &#955; -}</span><br />  <span class="dt">Branch</span> (scanlT s) (<span class="fu">fmap</span> ((fold s &#8853;) &#8728; fold) (initTs t))<br />&#8801;  <span class="co">{- functor law -}</span><br />  <span class="dt">Branch</span> (scanlT s) (<span class="fu">fmap</span> (fold s &#8853;) (<span class="fu">fmap</span> fold (initTs t)))<br />&#8801;  <span class="co">{- scanlT spec -}</span><br />  <span class="dt">Branch</span> (scanlT s) (<span class="fu">fmap</span> (fold s &#8853;) (scanlT t))<br />&#8801;  <span class="co">{- lastT &#8728; scanlT &#8801; fold -}</span><br />  <span class="dt">Branch</span> (scanlT s) (<span class="fu">fmap</span> (lastT (scanlT s) &#8853;) (scanlT t))<br />&#8801;  <span class="co">{- factor out defs -}</span><br />  <span class="dt">Branch</span> s' (<span class="fu">fmap</span> (lastT s' &#8853;) t')<br />     <span class="kw">where</span> s' <span class="fu">=</span> scanlT s<br />           t' <span class="fu">=</span> scanlT t</code></pre>

<p>Suffix scan has a similar derivation.</p>

<div class=toggle>

<pre class="sourceCode"><code class="sourceCode haskell">  scanrT (<span class="dt">Leaf</span> a)<br />&#8801;  <span class="co">{- scanrT def -}</span><br />  <span class="fu">fmap</span> fold (tailTs (<span class="dt">Leaf</span> a))<br />&#8801;  <span class="co">{- tailTs def -}</span><br />  <span class="fu">fmap</span> fold (<span class="dt">Leaf</span> (<span class="dt">Leaf</span> a))<br />&#8801;  <span class="co">{- fmap on T -}</span><br />  <span class="dt">Leaf</span> (fold (<span class="dt">Leaf</span> a))<br />&#8801;  <span class="co">{- fold def -}</span><br />  <span class="dt">Leaf</span> a<br /><br />  scanrT (s <span class="ot">`Branch`</span> t)<br />&#8801;  <span class="co">{- scanrT spec -}</span><br />  <span class="fu">fmap</span> fold (tailTs (s <span class="ot">`Branch`</span> t))<br />&#8801;  <span class="co">{- tailTs def -}</span><br />  <span class="fu">fmap</span> fold (<span class="dt">Branch</span> (<span class="fu">fmap</span> (<span class="ot">`Branch`</span> t) (tailTs s)) (tailTs t))<br />&#8801;  <span class="co">{- fmap def -}</span><br />  <span class="dt">Branch</span> (<span class="fu">fmap</span> fold (<span class="fu">fmap</span> (<span class="ot">`Branch`</span> t) (tailTs s))) (<span class="fu">fmap</span> fold (tailTs t))<br />&#8801;  <span class="co">{- scanrT spec -}</span><br />  <span class="dt">Branch</span> (<span class="fu">fmap</span> fold (<span class="fu">fmap</span> (<span class="ot">`Branch`</span> t) (tailTs s))) (scanrT t)<br />&#8801;  <span class="co">{- functor law -}</span><br />  <span class="dt">Branch</span> (<span class="fu">fmap</span> (fold &#8728; (<span class="ot">`Branch`</span> t)) (tailTs s)) (scanrT t)<br />&#8801;  <span class="co">{- rework as &#955; -}</span><br />  <span class="dt">Branch</span> (<span class="fu">fmap</span> (&#955; s' <span class="ot">&#8594;</span> fold (s' <span class="ot">`Branch`</span> t)) (tailTs s)) (scanrT t)<br />&#8801;  <span class="co">{- fold def 
-}</span><br />  <span class="dt">Branch</span> (<span class="fu">fmap</span> (&#955; s' <span class="ot">&#8594;</span> fold s' &#8853; fold t) (tailTs s)) (scanrT t)<br />&#8801;  <span class="co">{- rework &#955; -}</span><br />  <span class="dt">Branch</span> (<span class="fu">fmap</span> ((&#8853; fold t) &#8728; fold) (tailTs s)) (scanrT t)<br />&#8801;  <span class="co">{- functor law; scanrT spec -}</span><br />  <span class="dt">Branch</span> (<span class="fu">fmap</span> (&#8853; fold t) (scanrT s)) (scanrT t)<br />&#8801;  <span class="co">{- headT &#8728; scanrT -}</span><br />  <span class="dt">Branch</span> (<span class="fu">fmap</span> (&#8853; headT (scanrT t)) (scanrT s)) (scanrT t)<br />&#8801;  <span class="co">{- factor out defs -}</span><br />  <span class="dt">Branch</span> (<span class="fu">fmap</span> (&#8853; headT t') s') t'<br />    <span class="kw">where</span> s' <span class="fu">=</span> scanrT s<br />          t' <span class="fu">=</span> scanrT t</code></pre>

</div>

<p>Extract code from these derivations:</p>

<pre class="sourceCode"><code class="sourceCode haskell">scanlT' <span class="ot">&#8759;</span> <span class="dt">Monoid</span> a <span class="ot">&#8658;</span> <span class="dt">T</span> a <span class="ot">&#8594;</span> <span class="dt">T</span> a<br />scanlT' (<span class="dt">Leaf</span> a)       <span class="fu">=</span> <span class="dt">Leaf</span> a<br />scanlT' (s <span class="ot">`Branch`</span> t) <span class="fu">=</span><br />  <span class="dt">Branch</span> s' (<span class="fu">fmap</span> (lastT s' &#8853;) t')<br />     <span class="kw">where</span> s' <span class="fu">=</span> scanlT' s<br />           t' <span class="fu">=</span> scanlT' t<br /><br />scanrT' <span class="ot">&#8759;</span> <span class="dt">Monoid</span> a <span class="ot">&#8658;</span> <span class="dt">T</span> a <span class="ot">&#8594;</span> <span class="dt">T</span> a<br />scanrT' (<span class="dt">Leaf</span> a)       <span class="fu">=</span> <span class="dt">Leaf</span> a<br />scanrT' (s <span class="ot">`Branch`</span> t) <span class="fu">=</span><br />  <span class="dt">Branch</span> (<span class="fu">fmap</span> (&#8853; headT t') s') t'<br />    <span class="kw">where</span> s' <span class="fu">=</span> scanrT' s<br />          t' <span class="fu">=</span> scanrT' t</code></pre>

<p>Try it:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> t3<br />((<span class="st">&quot;a&quot;</span>,<span class="st">&quot;b&quot;</span>),(<span class="st">&quot;c&quot;</span>,(<span class="st">&quot;d&quot;</span>,<span class="st">&quot;e&quot;</span>)))<br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> scanlT' t3<br />((<span class="st">&quot;a&quot;</span>,<span class="st">&quot;ab&quot;</span>),(<span class="st">&quot;abc&quot;</span>,(<span class="st">&quot;abcd&quot;</span>,<span class="st">&quot;abcde&quot;</span>)))<br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> scanrT' t3<br />((<span class="st">&quot;abcde&quot;</span>,<span class="st">&quot;bcde&quot;</span>),(<span class="st">&quot;cde&quot;</span>,(<span class="st">&quot;de&quot;</span>,<span class="st">&quot;e&quot;</span>)))</code></pre>
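<p>As a sanity check (my addition, not part of the original post), we can compare the derived scans against their specifications on a small sample tree. The block restates <code>T</code> and its helpers so that it stands alone, writing Haskell&#8217;s <code>&lt;&gt;</code> for the post&#8217;s <code>&#8853;</code>:</p>

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable #-}
-- Sanity check: the derived scans agree with their specifications.
import Data.Foldable (fold)
import Data.Monoid (Sum (..))

data T a = Leaf a | Branch (T a) (T a)
  deriving (Eq, Show, Functor, Foldable)

initTs, tailTs :: T a -> T (T a)
initTs (Leaf a)     = Leaf (Leaf a)
initTs (Branch s t) = Branch (initTs s) (fmap (Branch s) (initTs t))
tailTs (Leaf a)     = Leaf (Leaf a)
tailTs (Branch s t) = Branch (fmap (`Branch` t) (tailTs s)) (tailTs t)

lastT, headT :: T a -> a
lastT (Leaf a)     = a
lastT (Branch _ t) = lastT t
headT (Leaf a)     = a
headT (Branch s _) = headT s

-- Specifications
scanlT, scanrT :: Monoid a => T a -> T a
scanlT = fmap fold . initTs
scanrT = fmap fold . tailTs

-- Derived implementations
scanlT', scanrT' :: Monoid a => T a -> T a
scanlT' (Leaf a)     = Leaf a
scanlT' (Branch s t) = Branch s' (fmap (lastT s' <>) t')
  where s' = scanlT' s
        t' = scanlT' t
scanrT' (Leaf a)     = Leaf a
scanrT' (Branch s t) = Branch (fmap (<> headT t') s') t'
  where s' = scanrT' s
        t' = scanrT' t

main :: IO ()
main = do
  let t = Branch (Leaf (Sum 1)) (Branch (Leaf (Sum 2)) (Leaf (Sum 3)))
  print (scanlT' t == scanlT t)  -- True
  print (scanrT' t == scanrT t)  -- True
```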

<h3 id="efficiency">Efficiency</h3>

<p>Although I was just following my nose, without trying to get anywhere in particular, this result is exactly the algorithm I first thought of when considering how to parallelize tree scanning.</p>

<p>Let&#8217;s now consider the running time of this algorithm. Assume that the tree is <em>balanced</em>, to maximize parallelism. (I think balancing is optimal for parallelism here, but I&#8217;m not certain.)</p>

<p>For a tree with <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi></mrow></math> leaves, the work <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>W</mi><mspace width="0.167em"></mspace><mi>n</mi></mrow></math> will be constant when <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi><mo>=</mo><mn>1</mn></mrow></math> and <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mn>2</mn><mo>&#8901;</mo><mi>W</mi><mspace width="0.167em"></mspace><mo stretchy="false">(</mo><mi>n</mi><mo>/</mo><mn>2</mn><mo stretchy="false">)</mo><mo>+</mo><mi>n</mi></mrow></math> when <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi><mo>&gt;</mo><mn>1</mn></mrow></math>. Using <a href="http://en.wikipedia.org/wiki/Master_theorem#Case_2">the <em>Master Theorem</em></a> (explained more <a href="http://www.math.dartmouth.edu/archive/m19w03/public_html/Section5-2.pdf">here</a>), <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>W</mi><mspace width="0.167em"></mspace><mi>n</mi><mo>=</mo><mo>&#920;</mo><mspace width="0.167em"></mspace><mo stretchy="false">(</mo><mi>n</mi><mspace width="0.167em"></mspace><mi>log</mi><mi>n</mi><mo stretchy="false">)</mo></mrow></math>.</p>
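<p>To make the recurrence concrete, here is a small work-counting function (my own sketch, not from the post), following the recurrence above for a balanced tree whose leaf count is a power of two:</p>

```haskell
-- Work recurrence for the derived scan on a balanced tree of n leaves:
-- constant at a leaf; at a branch, two half-size scans plus the fmap
-- over the scanned right half, costing on the order of n combining steps.
work :: Integer -> Integer
work 1 = 1
work n = 2 * work (n `div` 2) + n
-- For n = 2^k this solves to n * (k + 1), which is Θ(n log n).
-- For example, work 1024 == 1024 * 11 == 11264.
```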

<p>This result is disappointing, since scanning can be done with linear work by threading a single accumulator while traversing the input tree and building up the output tree.</p>
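<p>For comparison, here is one way to write that accumulator-threading sequential scan (a sketch of my own, not the post&#8217;s code). The helper returns the scanned subtree together with the accumulator extended by the subtree&#8217;s fold, so each leaf costs one combining step:</p>

```haskell
import Data.Monoid (Sum (..))

data T a = Leaf a | Branch (T a) (T a) deriving (Eq, Show)

-- Linear-work sequential prefix scan: thread a single accumulator
-- through the leaves while rebuilding the tree.
scanlSeq :: Monoid a => T a -> T a
scanlSeq = fst . go mempty
 where
   -- go acc u = (inclusive prefix scan of u, offset by acc; acc <> fold u)
   go acc (Leaf a)     = (Leaf acc', acc')  where acc' = acc <> a
   go acc (Branch u v) = (Branch u' v', acc'')
     where (u' , acc' ) = go acc  u
           (v' , acc'') = go acc' v

main :: IO ()
main = print (scanlSeq (Branch (Leaf (Sum 1)) (Branch (Leaf (Sum 2)) (Leaf (Sum 3)))))
```

The data dependence through the accumulator is exactly what makes this version linear in work but inherently sequential.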

<p>I&#8217;m using the term &quot;work&quot; instead of &quot;time&quot; here, since I&#8217;m not assuming sequential execution.</p>

<p>We have a parallel algorithm that performs <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>n</mi><mspace width="0.167em"></mspace><mi>log</mi><mspace width="0.167em"></mspace><mi>n</mi></mrow></math> work, and a sequential program that performs linear work. Can we construct a linear-work parallel algorithm?</p>

<p>Yes. Guy Blelloch came up with a clever linear-work parallel algorithm, which I&#8217;ll derive in another post.</p>

<h3 id="generalizing-head-and-last">Generalizing <code>head</code> and <code>last</code></h3>

<p>Can we replace the ad hoc (tree-specific) <code>headT</code> and <code>lastT</code> functions with general versions that work on all foldables? I&#8217;d want the generalization to cover the list functions <code>head</code> and <code>last</code> as well or, rather, <em>total</em> variants of them (ones that cannot fail on an empty structure). For totality, provide a default value to use when there are no elements.</p>

<pre class="sourceCode"><code class="sourceCode haskell">headF, lastF <span class="ot">&#8759;</span> <span class="dt">Foldable</span> f <span class="ot">&#8658;</span> a <span class="ot">&#8594;</span> f a <span class="ot">&#8594;</span> a</code></pre>

<p>I also want these functions to be as efficient on lists as <code>head</code> and <code>last</code> and as efficient on trees as <code>headT</code> and <code>lastT</code>.</p>

<p>The <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Data-Monoid.html#v:First"><code>First</code></a> and <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Data-Monoid.html#v:Last"><code>Last</code></a> monoids provide left-biased and right-biased choice. They&#8217;re implemented as <code>newtype</code> wrappers around <code>Maybe</code>:</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">newtype</span> <span class="dt">First</span> a <span class="fu">=</span> <span class="dt">First</span> { getFirst <span class="ot">&#8759;</span> <span class="dt">Maybe</span> a }<br /><br /><span class="kw">instance</span> <span class="dt">Monoid</span> (<span class="dt">First</span> a) <span class="kw">where</span><br />  &#8709; <span class="fu">=</span> <span class="dt">First</span> <span class="kw">Nothing</span><br />  r<span class="fu">@</span>(<span class="dt">First</span> (<span class="kw">Just</span> _)) &#8853; _ <span class="fu">=</span> r<br />  <span class="dt">First</span> <span class="kw">Nothing</span>      &#8853; r <span class="fu">=</span> r</code></pre>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="kw">newtype</span> <span class="dt">Last</span> a <span class="fu">=</span> <span class="dt">Last</span> { getLast <span class="ot">&#8759;</span> <span class="dt">Maybe</span> a }<br /><br /><span class="kw">instance</span> <span class="dt">Monoid</span> (<span class="dt">Last</span> a) <span class="kw">where</span><br />  &#8709; <span class="fu">=</span> <span class="dt">Last</span> <span class="kw">Nothing</span><br />  _ &#8853; r<span class="fu">@</span>(<span class="dt">Last</span> (<span class="kw">Just</span> _)) <span class="fu">=</span> r<br />  r &#8853; <span class="dt">Last</span> <span class="kw">Nothing</span>      <span class="fu">=</span> r</code></pre>

<p>For <code>headF</code>, embed all of the elements into the <code>First</code> monoid (via <code>First &#8728; Just</code>), fold over the result, and extract the final value, using the provided default in case there are no elements. Similarly for <code>lastF</code>.</p>

<pre class="sourceCode"><code class="sourceCode haskell">headF dflt <span class="fu">=</span> fromMaybe dflt &#8728; getFirst &#8728; foldMap (<span class="dt">First</span> &#8728; <span class="kw">Just</span>)<br />lastF dflt <span class="fu">=</span> fromMaybe dflt &#8728; getLast  &#8728; foldMap (<span class="dt">Last</span>  &#8728; <span class="kw">Just</span>)</code></pre>

<p>For instance,</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> headF <span class="dv">3</span> [<span class="dv">1</span>,<span class="dv">2</span>,<span class="dv">4</span>,<span class="dv">8</span>]<br /><span class="dv">1</span><br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> headF <span class="dv">3</span> []<br /><span class="dv">3</span></code></pre>

<p>When our elements belong to a monoid, we can use <code>∅</code> as the default:</p>

<pre class="sourceCode"><code class="sourceCode haskell">headFM <span class="ot">&#8759;</span> (<span class="dt">Foldable</span> f, <span class="dt">Monoid</span> m) <span class="ot">&#8658;</span> f m <span class="ot">&#8594;</span> m<br />headFM <span class="fu">=</span> headF &#8709;<br /><br />lastFM <span class="ot">&#8759;</span> (<span class="dt">Foldable</span> f, <span class="dt">Monoid</span> m) <span class="ot">&#8658;</span> f m <span class="ot">&#8594;</span> m<br />lastFM <span class="fu">=</span> lastF &#8709;</code></pre>

<p>For instance,</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> lastFM ([] <span class="ot">&#8759;</span> [<span class="dt">String</span>])<br /><span class="st">&quot;&quot;</span></code></pre>

<p>Using <code>headFM</code> and <code>lastFM</code> in place of <code>headT</code> and <code>lastT</code>, we can easily handle the addition of an <code>Empty</code> case to the tree functor used in this post. The key choice is that <code>fold Empty ≡ ∅</code> and <code>fmap _ Empty ≡ Empty</code>. Then <code>headFM</code> will choose the first <em>leaf</em>, and <code>lastFM</code> the last one.</p>
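<p>A sketch of that extension (my own, using derived instances, which yield exactly the stated choices for <code>fold</code> and <code>fmap</code> on <code>Empty</code>):</p>

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable #-}
import Data.Maybe (fromMaybe)
import Data.Monoid (First (..), Last (..))

-- Tree functor extended with an Empty case.  The derived instances
-- give fmap _ Empty == Empty and foldMap _ Empty == mempty (hence
-- fold Empty == mempty), matching the key choices above.
data T a = Empty | Leaf a | Branch (T a) (T a)
  deriving (Show, Functor, Foldable)

headF, lastF :: Foldable f => a -> f a -> a
headF dflt = fromMaybe dflt . getFirst . foldMap (First . Just)
lastF dflt = fromMaybe dflt . getLast  . foldMap (Last  . Just)

-- headF skips Empty subtrees and picks the first actual leaf:
main :: IO ()
main = print (headF 0 (Branch Empty (Branch (Leaf 3) (Leaf 4))))  -- prints 3
```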

<p>What about efficiency? Because <code>headF</code> and <code>lastF</code> are defined via <code>foldMap</code>, which is a composition of <code>fold</code> and <code>fmap</code>, one might think that they have to traverse the entire structure of functors like <code>[]</code> or <code>T</code>.</p>

<p>Laziness saves us, however, and we can even extract the head of an infinite list or a partially defined one. For instance,</p>

<pre class="sourceCode"><code class="sourceCode haskell">  foldMap (<span class="dt">First</span> &#8728; <span class="kw">Just</span>) [<span class="dv">5</span> <span class="fu">&#8229;</span>]<br />&#8801; foldMap (<span class="dt">First</span> &#8728; <span class="kw">Just</span>) (<span class="dv">5</span> <span class="fu">:</span> [<span class="dv">6</span> <span class="fu">&#8229;</span>])<br />&#8801; <span class="dt">First</span> (<span class="kw">Just</span> <span class="dv">5</span>) &#8853; foldMap (<span class="dt">First</span> &#8728; <span class="kw">Just</span>) [<span class="dv">6</span> <span class="fu">&#8229;</span>]<br />&#8801; <span class="dt">First</span> (<span class="kw">Just</span> <span class="dv">5</span>)</code></pre>

<p>So</p>

<pre class="sourceCode"><code class="sourceCode haskell">  headF d [<span class="dv">5</span> <span class="fu">&#8229;</span>]<br />&#8801; fromMaybe d (getFirst (foldMap (<span class="dt">First</span> &#8728; <span class="kw">Just</span>) [<span class="dv">5</span> <span class="fu">&#8229;</span>]))<br />&#8801; fromMaybe d (getFirst (<span class="dt">First</span> (<span class="kw">Just</span> <span class="dv">5</span>)))<br />&#8801; fromMaybe d (<span class="kw">Just</span> <span class="dv">5</span>)<br />&#8801; <span class="dv">5</span></code></pre>

<p>And, sure enough,</p>

<pre class="sourceCode"><code class="sourceCode haskell"><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> foldMap (<span class="dt">First</span> &#8728; <span class="kw">Just</span>) [<span class="dv">5</span> <span class="fu">&#8229;</span>]<br /><span class="dt">First</span> {getFirst <span class="fu">=</span> <span class="kw">Just</span> <span class="dv">5</span>}<br /><span class="fu">*</span><span class="dt">T</span><span class="fu">&gt;</span> headF &#8869; [<span class="dv">5</span> <span class="fu">&#8229;</span>]<br /><span class="dv">5</span></code></pre>

<h3 id="where-to-go-from-here">Where to go from here?</h3>

<ul>
<li>As mentioned above, the derived scanning implementations perform asymptotically more work than necessary. Future posts explore how to derive parallel-friendly, linear-work algorithms. Then we&#8217;ll see how to transform the parallel-friendly algorithms so that they work <em>destructively</em>, overwriting their input as they go, making them suitable for execution entirely in CUDA or OpenCL.</li>
<li>The functions <code>initTs</code> and <code>tailTs</code> are still tree-specific. To generalize the specification and derivation of list and tree scanning, find a way to generalize these two functions. The types of <code>initTs</code> and <code>tailTs</code> fit with the <a href="http://hackage.haskell.org/packages/archive/comonad/1.0.1/doc/html/Data-Functor-Extend.html#v:duplicate"><code>duplicate</code></a> method on comonads. Moreover, <code>tails</code> is the usual definition of <code>duplicate</code> on lists, and I think <code>inits</code> would be <code>duplicate</code> for &quot;snoc lists&quot;. For trees, however, I don&#8217;t think the correspondence holds. Am I missing something?</li>
<li>In particular, I want to extend the derivation to depth-typed, perfectly balanced trees, of the sort I played with in <a href="http://conal.net/blog/posts/a-trie-for-length-typed-vectors/" title="blog post"><em>A trie for length-typed vectors</em></a> and <a href="http://conal.net/blog/posts/from-tries-to-trees/" title="blog post"><em>From tries to trees</em></a>. The functions <code>initTs</code> and <code>tailTs</code> make unbalanced trees out of balanced ones, so I don&#8217;t know how to adapt the specifications given here to the setting of depth-typed balanced trees. Maybe I could just fill up the to-be-ignored elements with <code>∅</code>.</li>
</ul>
<p><a href="http://conal.net/blog/?flattrss_redirect&amp;id=330&amp;md5=0fc7825e5d47f397d7ee6f3f19c7c416"><img src="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png" srcset="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@2x.png 2x, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@3x.png 3x" alt="Flattr this!"/></a></p>]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/deriving-parallel-tree-scans/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fderiving-parallel-tree-scans&amp;language=en_GB&amp;category=text&amp;title=Deriving+parallel+tree+scans&amp;description=The+post+Deriving+list+scans+explored+folds+and+scans+on+lists+and+showed+how+the+usual%2C+efficient+scan+implementations+can+be+derived+from+simpler+specifications.+Let%26%238217%3Bs+see+now+how+to...&amp;tags=program+derivation%2Cscan%2Cblog" type="text/html" />
	</item>
	</channel>
</rss>
