<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
  <channel>
    <title>Vidar Hokstad V2.0</title>
    <link>http://www.hokstad.com/</link>
    <description>Vidar Hokstad's blog posts</description>
    <pubDate>Sat, 12 Jun 2010 17:21:29 -0400</pubDate>
    <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/VidarHokstad" /><feedburner:info uri="vidarhokstad" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><feedburner:emailServiceId>VidarHokstad</feedburner:emailServiceId><feedburner:feedburnerHostname>http://feedburner.google.com</feedburner:feedburnerHostname><feedburner:browserFriendly>This is an XML content feed. It is intended to be viewed in a newsreader or syndicated to another site.</feedburner:browserFriendly><item>
      <title>Writing a (Ruby) compiler in Ruby bottom up - step 25</title>
      <link>http://feedproxy.google.com/~r/VidarHokstad/~3/b_UToQUyOdI/writing-a-compiler-in-ruby-bottom-up-step-25.html</link>
      <description>&lt;P&gt;&lt;SPAN style="color: red; "&gt;This is &lt;A href="http://www.hokstad.com/compiler"&gt;part of a series&lt;/A&gt; I started in March 2008 - you may want to go back and look at older parts if you're new to this series.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;First of all, here's the biggest reason I've been so excruciatingly slow
with getting this part together (at least that's my story, I guess I've
had other things on my plate too...):&lt;/P&gt;

&lt;P&gt;&lt;IMG src="static/images/tristan1year-small.jpg" style="padding-left: 10%"&gt;&lt;/P&gt;

&lt;P&gt;Tristan is 13 months now, and a real menace to my laptop (pulling off keys
and drooling on the screen) and anything else within reach, and a real
attention seeker...&lt;/P&gt;

&lt;P&gt;So, whenever I'm slow at posting a new part, I'll blame him. I have no part
in the delays at all, of course. None. I'm completely faultless...&lt;/P&gt;

&lt;H1&gt;Towards define_method via closures&lt;/H1&gt;

&lt;P&gt;&lt;A href="http://github.com/vidarh/writing-a-compiler-in-ruby/commit/c55a2797be7c3fd396f0a21dcaa5b7428252f7a1"&gt;For starters, here's the commit that covers most of this&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;To support &lt;CODE&gt;define_method&lt;/CODE&gt; we need to support blocks with arguments.
Really, full closures.&lt;/P&gt;

&lt;P&gt;If you haven't already, you should read my &lt;A href="http://www.hokstad.com/how-to-implement-closures.html"&gt;post on closures&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;We want this:&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="ident"&gt;define_method&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:foo&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt; &lt;SPAN class="keyword"&gt;do&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;a&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;b&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;
       &lt;SPAN class="comment"&gt;# Do something with a,b&lt;/SPAN&gt;
    &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;to be mostly equivalent to:&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="keyword"&gt;def &lt;/SPAN&gt;&lt;SPAN class="method"&gt;foo&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;a&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;b&lt;/SPAN&gt;
       &lt;SPAN class="comment"&gt;# Do something with a,b&lt;/SPAN&gt;
    &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
    
&lt;/CODE&gt;&lt;/PRE&gt;
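
&lt;P&gt;For reference, here's a small sketch in plain (MRI) Ruby of the equivalence we're after. The class name &lt;CODE&gt;Greeter&lt;/CODE&gt;, the method &lt;CODE&gt;bar&lt;/CODE&gt; and the method bodies are made up for illustration; none of it comes from the compiler itself.&lt;/P&gt;

```ruby
class Greeter
  # Method created from a block that takes two arguments.
  define_method(:foo) do |a, b|
    "#{a} and #{b}"
  end

  # The plain def we want the above to behave like.
  def bar a, b
    "#{a} and #{b}"
  end
end

g = Greeter.new
foo_result = g.foo(1, 2)
bar_result = g.bar(1, 2)
```

&lt;P&gt;The "mostly" matters: a &lt;CODE&gt;def&lt;/CODE&gt; starts a fresh scope, while the block handed to &lt;CODE&gt;define_method&lt;/CODE&gt; can still see local variables from where it was written.&lt;/P&gt;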


&lt;P&gt;For our &lt;CODE&gt;attr_*&lt;/CODE&gt; implementation we actually want more:&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="keyword"&gt;def &lt;/SPAN&gt;&lt;SPAN class="method"&gt;attr_reader&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;sym&lt;/SPAN&gt;
      &lt;SPAN class="ident"&gt;define_method&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;sym&lt;/SPAN&gt; &lt;SPAN class="keyword"&gt;do&lt;/SPAN&gt;
        &lt;SPAN class="punct"&gt;%s(&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;ivar self sym&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt; &lt;SPAN class="comment"&gt;# FIXME: Create the "ivar" s-exp directive.&lt;/SPAN&gt;
      &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
    &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;(or, as it turns out, not really that, as I realize the above can
be simplified - if we have a lookup function that maps the symbol
to its instance variable slot, we can just use our &lt;CODE&gt;index&lt;/CODE&gt; primitive to
get the address; we'll return to that, but the above is the current
state of &lt;CODE&gt;attr_reader&lt;/CODE&gt;)&lt;/P&gt;
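
&lt;P&gt;To see why this needs a closure, here's a sketch of the same idea in standard Ruby. The names &lt;CODE&gt;my_attr_reader&lt;/CODE&gt; and &lt;CODE&gt;Point&lt;/CODE&gt; are hypothetical, and &lt;CODE&gt;instance_variable_get&lt;/CODE&gt; stands in for the &lt;CODE&gt;ivar&lt;/CODE&gt; s-expression, which our compiler doesn't have yet.&lt;/P&gt;

```ruby
class Module
  # Hypothetical stand-in for attr_reader, named differently so the
  # built-in one is left alone.
  def my_attr_reader sym
    define_method sym do
      # The block closes over sym, which has to outlive this call to
      # my_attr_reader. instance_variable_get plays the role of "ivar".
      instance_variable_get("@#{sym}")
    end
  end
end

class Point
  my_attr_reader :x

  def initialize x
    @x = x
  end
end

point_x = Point.new(42).x
```

&lt;P&gt;The generated method runs long after &lt;CODE&gt;my_attr_reader&lt;/CODE&gt; has returned, yet it still needs &lt;CODE&gt;sym&lt;/CODE&gt;.&lt;/P&gt;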

&lt;P&gt;What that calls for is actually proper closures. One step harder.&lt;/P&gt;

&lt;P&gt;There's the other issue that the above will be really inefficient.
As in, requiring a hashtable lookup on every call inefficient, when
we can statically determine the offset for any instance variable we
know about at compile time. We'll get back to that at some point too,
but if you've been following this series you know I care about getting
things working first, efficient second.&lt;/P&gt;

&lt;P&gt;For now attr_reader and friends still serve as a useful test case
for basic closure support.&lt;/P&gt;

&lt;H2&gt;Baby steps&lt;/H2&gt;

&lt;P&gt;The first step towards closures is actually easy (we've already done
it). We have our &lt;CODE&gt;:lambda&lt;/CODE&gt; primitive that creates an anonymous function
(which isn't really anonymous, it's just that the name is quietly created
in the compiler and not accessible to the programmer).&lt;/P&gt;

&lt;H2&gt;Adding an environment&lt;/H2&gt;

&lt;P&gt;But an anonymous function is only of relatively limited use if you can't
bind variables to it - it's not much more than a function pointer in that
case.&lt;/P&gt;

&lt;P&gt;So how do we let "sym" hang around, and how do we take advantage of that
for &lt;CODE&gt;define_method&lt;/CODE&gt;?&lt;/P&gt;

&lt;P&gt;First, let us consider how you can call a &lt;CODE&gt;lambda&lt;/CODE&gt; in Ruby:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;&lt;CODE&gt;yield&lt;/CODE&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;Proc#call&lt;/CODE&gt;&lt;/LI&gt;
&lt;/UL&gt;


&lt;P&gt;We could implement &lt;CODE&gt;Proc#call&lt;/CODE&gt; via &lt;CODE&gt;yield&lt;/CODE&gt; by passing the block as an argument to
a method, if we wanted to, but that would probably be less efficient than
implementing both in terms of a primitive.&lt;/P&gt;

&lt;P&gt;But this means those are the only things that have to explicitly know how
to deal with the environment.&lt;/P&gt;

&lt;P&gt;That gives us an interesting option. We could turn:&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="ident"&gt;c&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;d&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;somevalues&lt;/SPAN&gt;
    &lt;SPAN class="keyword"&gt;do&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;a&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;b&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;
       &lt;SPAN class="punct"&gt;...&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;uses&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;c&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;d&lt;/SPAN&gt; 
    &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
    
&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;into&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="keyword"&gt;class &lt;/SPAN&gt;&lt;SPAN class="class"&gt;SomePrivateUniqueClassName&lt;/SPAN&gt;
        &lt;SPAN class="keyword"&gt;def &lt;/SPAN&gt;&lt;SPAN class="method"&gt;initialize&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;c&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;d&lt;/SPAN&gt;
           &lt;SPAN class="attribute"&gt;@c&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="attribute"&gt;@d&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;c&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;d&lt;/SPAN&gt;
        &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
        
        &lt;SPAN class="keyword"&gt;def &lt;/SPAN&gt;&lt;SPAN class="method"&gt;call&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;a&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;b&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
            &lt;SPAN class="punct"&gt;...&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;code&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;here&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;with&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;access&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;to&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;c&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;d&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;rewritten&lt;/SPAN&gt;
        &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
    &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
    
&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;And we could have it inherit from &lt;CODE&gt;Proc&lt;/CODE&gt;. &lt;CODE&gt;yield&lt;/CODE&gt; in that case is just rewritten to
&lt;CODE&gt;block.call&lt;/CODE&gt;.&lt;/P&gt;

&lt;P&gt;This does however have the troubling implication of messing with self, and
as this irb session shows, we'd have to fix that:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&gt;&gt; lambda { p self }.call
main
=&gt; nil
&gt;&gt; 
&lt;/CODE&gt;&lt;/PRE&gt;
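
&lt;P&gt;Here's a rough illustration of that problem in plain Ruby. &lt;CODE&gt;Holder&lt;/CODE&gt; and &lt;CODE&gt;NaiveClosure&lt;/CODE&gt; are made-up names; the second one plays the part of the generated class above, with the block body moved into &lt;CODE&gt;call&lt;/CODE&gt; and nothing done about &lt;CODE&gt;self&lt;/CODE&gt;.&lt;/P&gt;

```ruby
class Holder
  # Returns a lambda; inside it, self is still the Holder instance.
  def make_lambda
    lambda { self }
  end

  # Returns a hand-rolled "closure" object in the style of
  # SomePrivateUniqueClassName.
  def make_object
    NaiveClosure.new
  end
end

class NaiveClosure
  # The block body now lives in a method, so self is the
  # NaiveClosure instance, not the object that created it.
  def call
    self
  end
end

holder = Holder.new
lambda_keeps_self = holder.make_lambda.call.equal?(holder)
object_keeps_self = holder.make_object.call.equal?(holder)
```

&lt;P&gt;The first flag comes out true and the second false, which is exactly the mismatch we'd have to paper over.&lt;/P&gt;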

&lt;P&gt;The other alternative is to create a lightweight environment binding of sorts.
Oh wait, we already have that. It's called an object. The problem is not
the class above as such, but creating a full class for each block. We could
do something like this instead:&lt;/P&gt;


&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="keyword"&gt;class &lt;/SPAN&gt;&lt;SPAN class="class"&gt;Proc&lt;/SPAN&gt;
      &lt;SPAN class="keyword"&gt;def &lt;/SPAN&gt;&lt;SPAN class="method"&gt;initialize&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;&amp;&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;block&lt;/SPAN&gt;
         &lt;SPAN class="comment"&gt;# So Proc#initialize takes a block, but we're using this&lt;/SPAN&gt;
         &lt;SPAN class="comment"&gt;# to create blocks... Uh oh, I sense endless recursion.&lt;/SPAN&gt;
         &lt;SPAN class="comment"&gt;# We need to make it work so we can initialize a Proc&lt;/SPAN&gt;
         &lt;SPAN class="comment"&gt;# both from a block and from a raw function pointer&lt;/SPAN&gt;
      &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
      
      &lt;SPAN class="keyword"&gt;def &lt;/SPAN&gt;&lt;SPAN class="method"&gt;call&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;*&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;args&lt;/SPAN&gt;
         &lt;SPAN class="punct"&gt;%s(&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;call @block_func (res args&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;))&lt;/SPAN&gt;
      &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
    &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;Our code needs to be rewritten, so that blocks get wrapped in code to create
the appropriate &lt;CODE&gt;Proc&lt;/CODE&gt; object. Additionally, if the block uses variables
from the surrounding scope, we need to alter both the surrounding method and the
block to refer to instance variables in this object, and if any of those
variables are arguments to the surrounding method, we need to copy them in.&lt;/P&gt;

&lt;H3&gt;An example&lt;/H3&gt;

&lt;P&gt;Here's a simple example using a closure. Note the use of s-expressions here
because we're now starting to get bitten by the changes we've made to turn
Ruby code by default into method calls (the way it should be) without having
put in the prerequisite work to implement the number classes and &lt;CODE&gt;Kernel&lt;/CODE&gt;
methods (such as &lt;CODE&gt;print&lt;/CODE&gt; to replace the libc &lt;CODE&gt;printf&lt;/CODE&gt;). Ignore that for now.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="keyword"&gt;def &lt;/SPAN&gt;&lt;SPAN class="method"&gt;mkcounter&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;step&lt;/SPAN&gt;
      &lt;SPAN class="ident"&gt;cnt&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="number"&gt;0&lt;/SPAN&gt;
      &lt;SPAN class="ident"&gt;lambda&lt;/SPAN&gt; &lt;SPAN class="keyword"&gt;do&lt;/SPAN&gt;
        &lt;SPAN class="punct"&gt;%s(&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;assign cnt (add cnt step&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;))&lt;/SPAN&gt;
        &lt;SPAN class="punct"&gt;%s(&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;printf "cnt: %d\n" cnt&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
      &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
    &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
    
    &lt;SPAN class="ident"&gt;cnt&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;mkcounter&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="number"&gt;5&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
    &lt;SPAN class="ident"&gt;cnt&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;call&lt;/SPAN&gt;
    &lt;SPAN class="ident"&gt;puts&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;"&lt;/SPAN&gt;&lt;SPAN class="string"&gt;Calling again...&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;"&lt;/SPAN&gt;
    &lt;SPAN class="ident"&gt;cnt&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;call&lt;/SPAN&gt;
    &lt;SPAN class="ident"&gt;puts&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;"&lt;/SPAN&gt;&lt;SPAN class="string"&gt;Done&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;"&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;And here's the expected output:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;cnt: 5
Calling again...
cnt: 10
Done
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;And here's the syntax tree after we've done the required transformations
(more about that in a second):&lt;/P&gt;


&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:do&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
     &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:defm&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
      &lt;SPAN class="symbol"&gt;:mkcounter&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
      &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:step&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;
      &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:let&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
       &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__tmp_proc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;
       &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:sexp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:call&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:malloc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;8&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]]]],&lt;/SPAN&gt;
       &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:step&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;
       &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt; &lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;
       &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:do&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
        &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
         &lt;SPAN class="symbol"&gt;:__tmp_proc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
         &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:defun&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
          &lt;SPAN class="punct"&gt;"&lt;/SPAN&gt;&lt;SPAN class="string"&gt;.L2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;",&lt;/SPAN&gt;
          &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:self&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__closure__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;
          &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:let&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
           &lt;SPAN class="punct"&gt;[],&lt;/SPAN&gt;
           &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:sexp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
            &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
             &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;
             &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:add&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]]]],&lt;/SPAN&gt;
           &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:sexp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:printf&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;"&lt;/SPAN&gt;&lt;SPAN class="string"&gt;cnt: %d&lt;SPAN class="escape"&gt;\&lt;/SPAN&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;",&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]]]]]],&lt;/SPAN&gt;
        &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:sexp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:call&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__new_proc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:__tmp_proc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]]]]]],&lt;/SPAN&gt;
     &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:cnt&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:call&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:mkcounter&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="number"&gt;5&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]],&lt;/SPAN&gt;
     &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:callm&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:cnt&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:call&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;
     &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:call&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:puts&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:sexp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:call&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__get_string&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:"&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;.L0&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;"&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]]]],&lt;/SPAN&gt;
     &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:callm&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:cnt&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:call&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;
     &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:call&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:puts&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:sexp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:call&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__get_string&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:"&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;.L1&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;"&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]]]]]&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;That's a bit of a mouthful, so let's go through the new bits:&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:let&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
       &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__tmp_proc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;This is added to hold a pointer to the environment we create to hold
&lt;CODE&gt;cnt&lt;/CODE&gt; and &lt;CODE&gt;step&lt;/CODE&gt;, and to hold a pointer to the function we use to create
the &lt;CODE&gt;Proc&lt;/CODE&gt; respectively.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
       &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:sexp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:call&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:malloc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;8&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]]]],&lt;/SPAN&gt;
       &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:step&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;
       &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt; &lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;Then we allocate the environment. We copy the argument, and from now
on we use &lt;CODE&gt;(index __env__ 1)&lt;/CODE&gt; to refer to &lt;CODE&gt;step&lt;/CODE&gt;. We then carry out the
&lt;CODE&gt;cnt = 0&lt;/CODE&gt; statement. &lt;CODE&gt;cnt&lt;/CODE&gt; and &lt;CODE&gt;step&lt;/CODE&gt; are both moved into the environment
because they are used inside the closure.&lt;/P&gt;

&lt;P&gt;(You may have already picked up that we currently won't handle nested
closures - let's leave some fun for later)&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
        &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
         &lt;SPAN class="symbol"&gt;:__tmp_proc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
         &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:defun&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
          &lt;SPAN class="punct"&gt;"&lt;/SPAN&gt;&lt;SPAN class="string"&gt;.L2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;",&lt;/SPAN&gt;
          &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:self&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__closure__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;Then we create a function and assign the address of the function to
&lt;CODE&gt;__tmp_proc&lt;/CODE&gt;. As you can see there are three synthetic arguments to it:&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;self&lt;/CODE&gt;, &lt;CODE&gt;__closure__&lt;/CODE&gt; and &lt;CODE&gt;__env__&lt;/CODE&gt;. The first is, as you'd expect, the
object the method is called on. The second is to be used when passing
blocks to a method, and the third is used to refer to the environment
when inside.&lt;/P&gt;

&lt;P&gt;(Something I just realized: We're begging for a name clash, if there's
a closure defined inside that needs a separate environment. Obnoxious
details; later)&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
            &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
             &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;
             &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:add&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]]]],&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;And this monstrosity is what &lt;CODE&gt;cnt = cnt + step&lt;/CODE&gt; was turned into.&lt;/P&gt;
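
&lt;P&gt;To make the rewrite a bit more concrete, here's roughly the same thing done by hand in plain Ruby. This is only a sketch under a few assumptions: &lt;CODE&gt;mkcounter_explicit&lt;/CODE&gt; is a made-up name, an &lt;CODE&gt;Array&lt;/CODE&gt; stands in for the malloc'ed environment, and the body returns the new count instead of calling &lt;CODE&gt;printf&lt;/CODE&gt;.&lt;/P&gt;

```ruby
# Plain-Ruby imitation of the transformed mkcounter. The two-element
# array plays the role of __env__: slot 0 is cnt and slot 1 is step,
# matching the slot numbers in the syntax tree.
def mkcounter_explicit step
  env = Array.new(2)
  env[1] = step          # copy the argument into the environment
  env[0] = 0             # cnt = 0
  # The body takes the environment as an explicit parameter instead of
  # capturing cnt and step itself.
  body = lambda do |e|
    e[0] = e[0] + e[1]   # cnt = cnt + step
    e[0]
  end
  [body, env]            # the pair that would be handed to __new_proc
end

body, env = mkcounter_explicit(5)
first = body.call(env)
second = body.call(env)
```

&lt;P&gt;Because both calls share the same &lt;CODE&gt;env&lt;/CODE&gt;, the count carries over from one call to the next, even though the body itself captures nothing.&lt;/P&gt;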

&lt;P&gt;Now, to make this work, we obviously need this &lt;CODE&gt;__new_proc&lt;/CODE&gt; function,
and a &lt;CODE&gt;Proc#call&lt;/CODE&gt; that's sensible. Here's our current &lt;CODE&gt;Proc&lt;/CODE&gt;:&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="keyword"&gt;class &lt;/SPAN&gt;&lt;SPAN class="class"&gt;Proc&lt;/SPAN&gt;
      &lt;SPAN class="comment"&gt;# FIXME: Add support for handling arguments (and blocks...)&lt;/SPAN&gt;
      &lt;SPAN class="keyword"&gt;def &lt;/SPAN&gt;&lt;SPAN class="method"&gt;call&lt;/SPAN&gt;
        &lt;SPAN class="punct"&gt;%s(&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;call (index self 1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="constant"&gt;self&lt;/SPAN&gt; &lt;SPAN class="number"&gt;0&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;index&lt;/SPAN&gt; &lt;SPAN class="constant"&gt;self&lt;/SPAN&gt; &lt;SPAN class="number"&gt;2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)))&lt;/SPAN&gt;
      &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
    &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
    
    &lt;SPAN class="comment"&gt;# We can't create a Proc from a raw function directly, so we get&lt;/SPAN&gt;
    &lt;SPAN class="comment"&gt;# nasty. The advantage of this rather ugly method is that we&lt;/SPAN&gt;
    &lt;SPAN class="comment"&gt;# don't in any way expose a constructor that takes raw functions&lt;/SPAN&gt;
    &lt;SPAN class="comment"&gt;# to normal Ruby&lt;/SPAN&gt;
    &lt;SPAN class="comment"&gt;#&lt;/SPAN&gt;
    &lt;SPAN class="punct"&gt;%s(&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;defun __new_proc (addr env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
    &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;let&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;p&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
     &lt;SPAN class="comment"&gt;# Assuming 3 pointers for the instance size. Need a better way for this&lt;/SPAN&gt;
       &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;assign&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;p&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;malloc&lt;/SPAN&gt; &lt;SPAN class="number"&gt;12&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;))&lt;/SPAN&gt;
       &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;assign&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;index&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;p&lt;/SPAN&gt; &lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt; &lt;SPAN class="constant"&gt;Proc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
       &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;assign&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;index&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;p&lt;/SPAN&gt; &lt;SPAN class="number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;addr&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
       &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;assign&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;index&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;p&lt;/SPAN&gt; &lt;SPAN class="number"&gt;2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
       &lt;SPAN class="ident"&gt;p&lt;/SPAN&gt;
    &lt;SPAN class="punct"&gt;))&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;Very simplistic and rather ugly, but it does the work. As you can see,
the environment pointer is stored in the &lt;CODE&gt;Proc&lt;/CODE&gt; object, and &lt;CODE&gt;Proc#call&lt;/CODE&gt;
hides the ugliness.&lt;/P&gt;
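&lt;P&gt;To make the layout concrete, here is a rough plain-Ruby model of the same idea. It is only a sketch under stated assumptions: &lt;CODE&gt;new_proc_model&lt;/CODE&gt; and &lt;CODE&gt;call_model&lt;/CODE&gt; are hypothetical names, a three-element &lt;CODE&gt;Array&lt;/CODE&gt; stands in for the 12 bytes of raw memory, and a Ruby lambda stands in for the raw function address.&lt;/P&gt;

```ruby
# Plain-Ruby model of the three-slot Proc layout described above.
# Slot 0: class pointer, slot 1: function address, slot 2: environment.
# new_proc_model and call_model are illustrative names, not compiler code.

def new_proc_model(addr, env)
  p = Array.new(3)   # stands in for (malloc 12): three pointer-sized slots
  p[0] = :Proc       # stands in for the class pointer
  p[1] = addr
  p[2] = env
  p
end

# Mirrors %s(call (index self 1) (self 0 (index self 2))):
# the function receives self, a 0 in the __closure__ position,
# and the environment pointer.
def call_model(p)
  p[1].call(p, 0, p[2])
end

env = [41]   # a one-slot environment holding a captured variable
body = lambda { |_self, _closure, e| e[0] += 1 }
counter = new_proc_model(body, env)

call_model(counter)
puts env[0]   # the captured slot was updated through the environment
```

&lt;P&gt;The point of the model is that the caller never touches the environment directly: it only holds the object, and the call path fetches slot 2 and passes it along.&lt;/P&gt;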

&lt;P&gt;You may notice that this is inefficient - an extra level of indirection.
Indeed it is - we can in theory make the environment part of the &lt;CODE&gt;Proc&lt;/CODE&gt;
object and save an indexed lookup, and there's a bunch of other tricks
waiting. But again, those are optimizations we won't deal with now.&lt;/P&gt;

&lt;P&gt;For now, let's move on to the changes needed to get the
example output above.&lt;/P&gt;

&lt;H2&gt;Some refactoring, and a few rewrites&lt;/H2&gt;

&lt;P&gt;Almost all the changes for this are in the &lt;CODE&gt;Compiler&lt;/CODE&gt; class, and the
code in question is now in the &lt;CODE&gt;transform.rb&lt;/CODE&gt; file. There are some
other small changes, but I don't think they need to be covered in
detail.&lt;/P&gt;

&lt;P&gt;Specifically, &lt;CODE&gt;transform.rb&lt;/CODE&gt; represents the start of a bit of refactoring.&lt;/P&gt;

&lt;P&gt;Some constructs in the &lt;CODE&gt;Compiler&lt;/CODE&gt; class can be built easily on top of
more primitive operations. &lt;CODE&gt;lambda&lt;/CODE&gt; was an example of that.&lt;/P&gt;

&lt;P&gt;Overall, a lot of things can be done by rewriting the parse tree. Doing
it that way has the distinct advantage that it's possible to switch
the various rewrites on/off to debug their effects, and to see the
result long before it ends up as machine code.&lt;/P&gt;

&lt;P&gt;It also helps isolate the CPU-architecture-specific parts further.&lt;/P&gt;

&lt;P&gt;As a first step, &lt;CODE&gt;transform.rb&lt;/CODE&gt; contains methods that solely rewrite
the syntax tree. They are still currently part of the &lt;CODE&gt;Compiler&lt;/CODE&gt; class,
and do make use of the &lt;CODE&gt;Emitter&lt;/CODE&gt; (specifically &lt;CODE&gt;Emitter.get_local&lt;/CODE&gt; to
get a unique label), but this can be changed later.&lt;/P&gt;

&lt;P&gt;Unfortunately not all of these rewrites are small and simple.&lt;/P&gt;

&lt;P&gt;In this round, we're left with three:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;&lt;CODE&gt;rewrite_strconst&lt;/CODE&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;rewrite_let_env&lt;/CODE&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;rewrite_lambda&lt;/CODE&gt;&lt;/LI&gt;
&lt;/UL&gt;


&lt;P&gt;&lt;CODE&gt;rewrite_strconst&lt;/CODE&gt; used to be in &lt;CODE&gt;compiler.rb&lt;/CODE&gt; and is mostly unchanged.&lt;/P&gt;

&lt;H3&gt;rewrite_lambda&lt;/H3&gt;

&lt;P&gt;&lt;CODE&gt;rewrite_lambda&lt;/CODE&gt; replaces the &lt;CODE&gt;lambda&lt;/CODE&gt; handling in &lt;CODE&gt;compiler.rb&lt;/CODE&gt; with a
rewriting step that looks like this:&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="keyword"&gt;def &lt;/SPAN&gt;&lt;SPAN class="method"&gt;rewrite_lambda&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;exp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
      &lt;SPAN class="ident"&gt;exp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;depth_first&lt;/SPAN&gt; &lt;SPAN class="keyword"&gt;do&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;
        &lt;SPAN class="keyword"&gt;next&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:skip&lt;/SPAN&gt; &lt;SPAN class="keyword"&gt;if&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;==&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:sexp&lt;/SPAN&gt;
        &lt;SPAN class="keyword"&gt;if&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;==&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:lambda&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;args&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;||&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[]&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;body&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;||&lt;/SPAN&gt; &lt;SPAN class="constant"&gt;nil&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;clear&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:do&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__tmp_proc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; 
            &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:defun&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="attribute"&gt;@e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;get_local&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
              &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:self&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:__closure__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]+&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;args&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;
              &lt;SPAN class="ident"&gt;body&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt;
          &lt;SPAN class="punct"&gt;]&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:sexp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:call&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__new_proc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:__tmp_proc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]]]&lt;/SPAN&gt;
        &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
      &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
    &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;This rewrite uses our &lt;CODE&gt;Array#depth_first&lt;/CODE&gt; method from
&lt;CODE&gt;extensions.rb&lt;/CODE&gt; to descend the parse tree looking for &lt;CODE&gt;:lambda&lt;/CODE&gt;
nodes. It will skip any &lt;CODE&gt;:sexp&lt;/CODE&gt; nodes, as they are used as "guards"
to mark areas that should not be touched by rewrites.&lt;/P&gt;
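&lt;P&gt;For readers without &lt;CODE&gt;extensions.rb&lt;/CODE&gt; at hand, the traversal contract can be sketched roughly as below. This is an assumption about how &lt;CODE&gt;Array#depth_first&lt;/CODE&gt; behaves, based only on how it is used in this part; &lt;CODE&gt;depth_first_sketch&lt;/CODE&gt; is a hypothetical stand-in, and the real method may differ.&lt;/P&gt;

```ruby
# Hypothetical stand-in for Array#depth_first from extensions.rb.
# It only models the behaviour the text relies on: yield each array node
# (optionally only those whose head matches `action`), and do not descend
# into a node when the block returns :skip.
def depth_first_sketch(node, action = nil)
  result = nil
  result = yield(node) if action.nil? || node[0] == action
  return if result == :skip
  node.each do |child|
    next unless child.is_a?(Array)
    depth_first_sketch(child, action) { |e| yield e }
  end
end

tree = [:do,
  [:lambda, [:a], [:puts, :x]],
  [:sexp, [:lambda, [:b], [:puts, :y]]]]

seen = []
depth_first_sketch(tree) do |e|
  next :skip if e[0] == :sexp          # guarded subtree: leave untouched
  seen.push(e[0]) if e[0] == :lambda   # record only the lambdas we reach
  nil
end

p seen
```

&lt;P&gt;Only the first &lt;CODE&gt;:lambda&lt;/CODE&gt; is recorded, because the second sits below a &lt;CODE&gt;:sexp&lt;/CODE&gt; guard and the traversal never descends into it.&lt;/P&gt;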

&lt;P&gt;When it finds a &lt;CODE&gt;:lambda&lt;/CODE&gt;, it will replace the &lt;CODE&gt;:lambda&lt;/CODE&gt; node with this:&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="punct"&gt;%s(&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;do
       (assign __tmp_proc
               (defun [result of @e.get_local]
                 (self __closure__ __env__ [ + args]&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
                   &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;body&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]))&lt;/SPAN&gt;
       &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;sexp&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;call&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;__new_proc&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;__tmp_proc&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)))&lt;/SPAN&gt;
      &lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;This effectively defines the lambda (as we did before), but then
calls &lt;CODE&gt;__new_proc&lt;/CODE&gt; with the result. You'll note that &lt;CODE&gt;__tmp_proc&lt;/CODE&gt;
is seemingly superfluous. It is in fact a workaround for a slight
problem: the &lt;CODE&gt;call&lt;/CODE&gt; needs to be within a &lt;CODE&gt;sexp&lt;/CODE&gt; to not be rewritten
to a &lt;CODE&gt;callm&lt;/CODE&gt; (Ruby method call), but the &lt;CODE&gt;defun&lt;/CODE&gt; can't be within a
&lt;CODE&gt;sexp&lt;/CODE&gt;, or other rewrites won't touch the body of the &lt;CODE&gt;defun&lt;/CODE&gt;.&lt;/P&gt;

&lt;P&gt;We'll clean that up later.&lt;/P&gt;
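&lt;P&gt;As a worked example, here is a self-contained model of the rewrite applied to a tiny tree. It is a sketch rather than the compiler's code: &lt;CODE&gt;walk&lt;/CODE&gt; is a hypothetical stand-in for &lt;CODE&gt;Array#depth_first&lt;/CODE&gt;, and the generated label name is made up, since the format returned by &lt;CODE&gt;get_local&lt;/CODE&gt; is not shown here.&lt;/P&gt;

```ruby
# Self-contained model of the lambda rewrite on a tiny tree.
# walk stands in for Array#depth_first, and the "__lambda_N" labels stand in
# for whatever Emitter#get_local returns; both are assumptions.

def walk(node)
  return if yield(node) == :skip
  node.each { |child| walk(child) { |e| yield e } if child.is_a?(Array) }
end

def rewrite_lambda_model(exp)
  labels = 0
  walk(exp) do |e|
    next :skip if e[0] == :sexp   # guarded: do not rewrite inside
    if e[0] == :lambda
      args = e[1] || []
      body = e[2] || nil
      labels += 1
      e.clear                     # rewrite the node in place
      e[0] = :do
      e[1] = [:assign, :__tmp_proc,
        [:defun, "__lambda_#{labels}".to_sym,
          [:self, :__closure__, :__env__] + args,
          body]]
      e[2] = [:sexp, [:call, :__new_proc, [:__tmp_proc, :__env__]]]
    end
  end
  exp
end

tree = [:assign, :f, [:lambda, [:x], [:puts, :x]]]
rewrite_lambda_model(tree)
p tree
```

&lt;P&gt;Because the node is cleared and refilled in place, the parent (&lt;CODE&gt;:assign&lt;/CODE&gt; here) needs no update; it simply finds a &lt;CODE&gt;:do&lt;/CODE&gt; node where the &lt;CODE&gt;:lambda&lt;/CODE&gt; used to be.&lt;/P&gt;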

&lt;H3&gt;rewrite_let_env&lt;/H3&gt;

&lt;P&gt;Now for the hairy bit. Really hairy.&lt;/P&gt;

&lt;P&gt;The goal of this one is to identify the variables in each method (this
used to happen in the parser in a really simplistic way) and, at the same time,
to identify &lt;CODE&gt;:lambda&lt;/CODE&gt; nodes (so this must run before &lt;CODE&gt;rewrite_lambda&lt;/CODE&gt;).
Variables that are used inside a closure are lifted into an environment,
and the &lt;CODE&gt;:let&lt;/CODE&gt; in the surrounding method definition is updated to have a suitable
environment.&lt;/P&gt;

&lt;P&gt;It then also adds code to allocate the environment, as well
as to shadow method arguments. Finally it rewrites accesses to variables
that have been lifted into the environment, so that they use &lt;CODE&gt;%s(index __env__ offset)&lt;/CODE&gt; instead of the variable name.&lt;/P&gt;
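&lt;P&gt;The last step, replacing variable names with indexed accesses, can be illustrated with a deliberately naive sketch. The compiler's own &lt;CODE&gt;rewrite_env_vars&lt;/CODE&gt; is not shown in this part, so &lt;CODE&gt;rewrite_env_vars_sketch&lt;/CODE&gt; below is a hypothetical stand-in that assumes the offset is simply the variable's position in the environment array.&lt;/P&gt;

```ruby
# Hypothetical sketch of the variable rewrite described above; the real
# rewrite_env_vars may differ in detail (this one has no special cases).
# Every occurrence of a lifted variable becomes [:index, :__env__, offset],
# where offset is the variable's position in the env array.
def rewrite_env_vars_sketch(node, env)
  node.each_with_index do |child, i|
    if child.is_a?(Array)
      rewrite_env_vars_sketch(child, env)
    elsif env.include?(child)
      node[i] = [:index, :__env__, env.index(child)]
    end
  end
  node
end

body = [[:assign, :x, 1], [:call, :puts, [:x, :y]]]
p rewrite_env_vars_sketch(body, [:x])
```

&lt;P&gt;Here &lt;CODE&gt;x&lt;/CODE&gt; is in the environment and &lt;CODE&gt;y&lt;/CODE&gt; is not, so only the two uses of &lt;CODE&gt;x&lt;/CODE&gt; are replaced, both with slot 0.&lt;/P&gt;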

&lt;P&gt;Let's go through the code.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="keyword"&gt;def &lt;/SPAN&gt;&lt;SPAN class="method"&gt;rewrite_let_env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;exp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
      &lt;SPAN class="ident"&gt;exp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;depth_first&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:defm&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt; &lt;SPAN class="keyword"&gt;do&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;We're hunting for &lt;CODE&gt;:defm&lt;/CODE&gt; nodes only. Everything else will be ignored.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
        &lt;SPAN class="ident"&gt;vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;find_vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;3&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],[&lt;/SPAN&gt;&lt;SPAN class="constant"&gt;Set&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[*&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]]],&lt;/SPAN&gt;&lt;SPAN class="constant"&gt;Set&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;new&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
        
&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;The first step is to find a candidate set of variables. &lt;CODE&gt;find_vars&lt;/CODE&gt; is
another new method, and we'll go over that next. &lt;CODE&gt;e[3]&lt;/CODE&gt; is the body of
the method. &lt;CODE&gt;find_vars&lt;/CODE&gt; is recursive, and we pass in the method's arguments
(&lt;CODE&gt;e[2]&lt;/CODE&gt;) as the starting "scope". Third, we pass in an empty &lt;CODE&gt;Set&lt;/CODE&gt; as
the starting point for the environment.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
        &lt;SPAN class="ident"&gt;vars&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;-=&lt;/SPAN&gt; &lt;SPAN class="constant"&gt;Set&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[*&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]].&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;to_a&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;We remove the method arguments, as we'll use &lt;CODE&gt;vars&lt;/CODE&gt; to create a &lt;CODE&gt;:let&lt;/CODE&gt;
node later on.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
        &lt;SPAN class="keyword"&gt;if&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;size&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;&gt;&lt;/SPAN&gt; &lt;SPAN class="number"&gt;0&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;If &lt;CODE&gt;find_vars&lt;/CODE&gt; found any variables to make part of the environment, we
need to:&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
           &lt;SPAN class="ident"&gt;body&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;3&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt;
           &lt;SPAN class="ident"&gt;rewrite_env_vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;body&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;to_a&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
           
&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;Rewrite accesses to the members of the environment.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
           &lt;SPAN class="ident"&gt;notargs&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;-&lt;/SPAN&gt; &lt;SPAN class="constant"&gt;Set&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[*&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]]&lt;/SPAN&gt;
           &lt;SPAN class="ident"&gt;aenv&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;to_a&lt;/SPAN&gt;
           &lt;SPAN class="ident"&gt;extra_assigns&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;env&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;-&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;notargs&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;).&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;to_a&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;collect&lt;/SPAN&gt; &lt;SPAN class="keyword"&gt;do&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;a&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;
             &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;aenv&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;a&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)],&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;a&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt;
           &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
           
&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;Create assigns to copy all arguments that are used in the closures from the arguments on the stack into the environment.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
           &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;3&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:sexp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:call&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:malloc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;  &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;size&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;*&lt;/SPAN&gt; &lt;SPAN class="number"&gt;4&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]]]]]&lt;/SPAN&gt;
           
&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;Allocate the environment.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
           &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;3&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;].&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;concat&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;extra_assigns&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
           &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;3&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;].&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;concat&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;body&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
 
&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;Tack on the extra assigns and the body.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
         &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
         &lt;SPAN class="comment"&gt;# Always adding __env__ here is a waste, but it saves us (for now)&lt;/SPAN&gt;
         &lt;SPAN class="comment"&gt;# to have to intelligently decide whether or not to reference __env__&lt;/SPAN&gt;
         &lt;SPAN class="comment"&gt;# in the rewrite_lambda method&lt;/SPAN&gt;
         &lt;SPAN class="ident"&gt;vars&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;&lt;&lt;&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;
         &lt;SPAN class="ident"&gt;vars&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;&lt;&lt;&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__tmp_proc&lt;/SPAN&gt; &lt;SPAN class="comment"&gt;# Used in rewrite_lambda. Same caveats as for __env__&lt;/SPAN&gt;
    
         &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;3&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:let&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,*&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;3&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]]&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;Then we create a &lt;CODE&gt;:let&lt;/CODE&gt; node with the new variable set, including &lt;CODE&gt;__env__&lt;/CODE&gt; and
&lt;CODE&gt;__tmp_proc&lt;/CODE&gt; which we've seen elsewhere.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
         &lt;SPAN class="symbol"&gt;:skip&lt;/SPAN&gt;
       &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
     &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;Finally we skip any nodes below the &lt;CODE&gt;:defm&lt;/CODE&gt; so the &lt;CODE&gt;depth_first&lt;/CODE&gt; call does
not waste time.&lt;/P&gt;
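&lt;P&gt;To see what the pieces above add up to, here is a small worked example that builds the same node shapes by hand. It is a sketch under assumptions: the values of &lt;CODE&gt;env&lt;/CODE&gt; and &lt;CODE&gt;vars&lt;/CODE&gt; are written out manually (in the compiler they come from &lt;CODE&gt;find_vars&lt;/CODE&gt;), and &lt;CODE&gt;vars&lt;/CODE&gt; is modelled as a plain &lt;CODE&gt;Array&lt;/CODE&gt;. The factor of 4 matches the 4-byte slots implied by the 12 bytes allocated for three pointers in &lt;CODE&gt;__new_proc&lt;/CODE&gt;.&lt;/P&gt;

```ruby
require 'set'

# Worked example of the node shapes built above, for a method with arguments
# (a, b) whose closure captures the argument b and the local x.
# env and vars are written by hand here; they are assumptions, not output
# from find_vars.
args = [:a, :b]
env  = Set[:b, :x]
vars = [:y]
body = [[:puts, :y]]

notargs = env - Set[*args]   # lifted variables that are not arguments: x
aenv    = env.to_a           # fixes each variable's slot number
extra_assigns = (env - notargs).to_a.collect do |a|
  # copy the captured argument from the stack into its environment slot
  [:assign, [:index, :__env__, aenv.index(a)], a]
end

# Allocate the environment, then the argument copies, then the old body.
new_body = [[:sexp, [:assign, :__env__, [:call, :malloc, [env.size * 4]]]]]
new_body.concat(extra_assigns)
new_body.concat(body)

vars.push(:__env__)
vars.push(:__tmp_proc)
let_node = [:let, vars, *new_body]

p extra_assigns
p let_node
```

&lt;P&gt;Only &lt;CODE&gt;b&lt;/CODE&gt; gets a copying assign: &lt;CODE&gt;a&lt;/CODE&gt; is never captured, and &lt;CODE&gt;x&lt;/CODE&gt; is a local that lives in the environment from the start rather than on the stack.&lt;/P&gt;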

&lt;H3&gt;find_vars&lt;/H3&gt;

&lt;P&gt;The purpose of &lt;CODE&gt;find_vars&lt;/CODE&gt; is to identify variables that should be put in a &lt;CODE&gt;let&lt;/CODE&gt;
node, as well as those that should be part of a closure environment. This one is hairy,
and a clear candidate for cleaning up later.&lt;/P&gt;

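&lt;PRE class="ruby"&gt;&lt;CODE&gt;
  &lt;SPAN class="keyword"&gt;def &lt;/SPAN&gt;&lt;SPAN class="method"&gt;find_vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;scopes&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;in_lambda&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="constant"&gt;false&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;in_assign&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="constant"&gt;false&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;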

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="keyword"&gt;return&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[],&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="constant"&gt;false&lt;/SPAN&gt; &lt;SPAN class="keyword"&gt;if&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;!&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;First of all, if we've been called with a nil expression, there are no
variables to return, and we return the same environment that was passed in.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="keyword"&gt;if&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;!&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;is_a?&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="constant"&gt;Array&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
    &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;each&lt;/SPAN&gt; &lt;SPAN class="keyword"&gt;do&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;
      &lt;SPAN class="keyword"&gt;if&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;is_a?&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="constant"&gt;Array&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
      
&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;An expression?&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
        &lt;SPAN class="keyword"&gt;if&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;==&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:assign&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;vars1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env1&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;find_vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;     &lt;SPAN class="ident"&gt;scopes&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;+&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="constant"&gt;Set&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;new&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;in_lambda&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="constant"&gt;true&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;vars2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env2&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;find_vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;..-&lt;/SPAN&gt;&lt;SPAN class="number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;scopes&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;+&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="constant"&gt;Set&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;new&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;in_lambda&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;env&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env1&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;+&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env2&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;vars&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;vars1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;+&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;vars2&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;each&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;{|&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;v&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;|&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;push_var&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;scopes&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;v&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;}&lt;/SPAN&gt;
          
&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;If it's an assignment, then we want to process both the left-hand and right-hand sides,
but we need to mark the left-hand side, since variables on the left-hand side introduce a new
variable in the scope.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
        &lt;SPAN class="keyword"&gt;elsif&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;==&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:lambda&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;find_vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;scopes&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;+&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="constant"&gt;Set&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;new&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;],&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="constant"&gt;true&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:let&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;*&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]]&lt;/SPAN&gt;
          
&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;We special case &lt;CODE&gt;:lambda&lt;/CODE&gt; because, when we're handling a closure, any variables
found &lt;EM&gt;inside&lt;/EM&gt; the closure that were defined &lt;EM&gt;outside&lt;/EM&gt; it
need to be lifted into the closure environment we're creating, so that they're
not allocated on the stack.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
        &lt;SPAN class="keyword"&gt;else&lt;/SPAN&gt;
          &lt;SPAN class="keyword"&gt;if&lt;/SPAN&gt;    &lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;==&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:callm&lt;/SPAN&gt; &lt;SPAN class="keyword"&gt;then&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;sub&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;3&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;..-&lt;/SPAN&gt;&lt;SPAN class="number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt;
          &lt;SPAN class="keyword"&gt;elsif&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;0&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;==&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:call&lt;/SPAN&gt;  &lt;SPAN class="keyword"&gt;then&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;sub&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;2&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;..-&lt;/SPAN&gt;&lt;SPAN class="number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt;
          &lt;SPAN class="keyword"&gt;else&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;sub&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;..-&lt;/SPAN&gt;&lt;SPAN class="number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt;
          &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;find_vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;sub&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;scopes&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;in_lambda&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;each&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;{|&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;v&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;|&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;push_var&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;scopes&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;v&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;}&lt;/SPAN&gt;
        &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;Finally, we special case &lt;CODE&gt;:callm&lt;/CODE&gt; and &lt;CODE&gt;:call&lt;/CODE&gt; when handling the remainder, because
some of their elements are not sub-expressions to be considered when creating the
closure environment.
&lt;/P&gt;
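&lt;P&gt;As a rough illustration of that selection step in isolation, here is a small sketch. Note that &lt;CODE&gt;sub_expressions&lt;/CODE&gt; is a hypothetical helper name that does not exist in the compiler, and the node layouts are only assumed from the slices used in the code above:&lt;/P&gt;

```ruby
# Hypothetical helper (not part of the compiler) mirroring the
# if/elsif/else above: pick the part of a node to scan for variables.
def sub_expressions(n)
  case n[0]
  when :callm then n[3..-1]  # skip the node type and the next two elements
  when :call  then n[2..-1]  # skip the node type and the next element
  else             n[1..-1]  # skip only the node type
  end
end

p sub_expressions([:call, :puts, [:a, :b]])
p sub_expressions([:callm, :obj, :foo, [:a]])
p sub_expressions([:assign, :x, 1])
```

&lt;P&gt;Only the returned slice would then be handed on to the recursive &lt;CODE&gt;find_vars&lt;/CODE&gt; call.&lt;/P&gt;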

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
      &lt;SPAN class="keyword"&gt;elsif&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;is_a?&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="constant"&gt;Symbol&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
        &lt;SPAN class="ident"&gt;sc&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;in_scopes&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;scopes&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
        &lt;SPAN class="keyword"&gt;if&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;sc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;size&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;==&lt;/SPAN&gt; &lt;SPAN class="number"&gt;0&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;push_var&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;scopes&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt; &lt;SPAN class="keyword"&gt;if&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;in_assign&lt;/SPAN&gt;
          
&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;If &lt;CODE&gt;n&lt;/CODE&gt; is a variable that does not exist in any of the scopes we keep
track of, and it is not part of the environment, and it's on the left
hand side of an assignment, we add it to the top scope.&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;NOTE:&lt;/STRONG&gt; This current version is buggy. Not every variable potentially
occurring on the left hand side should be added, just the one being assigned to.&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
        &lt;SPAN class="keyword"&gt;elsif&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;in_lambda&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;sc&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;first&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;delete&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;env&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;&lt;&lt;&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;n&lt;/SPAN&gt;
        &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
        
&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;If it &lt;EM&gt;is&lt;/EM&gt; in a scope &lt;EM&gt;and&lt;/EM&gt; we're in a lambda, then the variable needs
to be moved (deleted) from that scope and added to the closure environment.
&lt;/P&gt;
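&lt;P&gt;To make the "move" concrete, here is a minimal sketch under some assumptions: scopes are modelled as plain &lt;CODE&gt;Set&lt;/CODE&gt; objects, the environment as an array, and both &lt;CODE&gt;scopes_containing&lt;/CODE&gt; and &lt;CODE&gt;lift_into_env&lt;/CODE&gt; are hypothetical names standing in for the compiler's own helpers. The duplicate check is an addition of this sketch:&lt;/P&gt;

```ruby
require 'set'

# Hypothetical stand-in for the compiler's in_scopes helper: returns the
# scopes (modelled here as plain Sets) that contain the variable.
def scopes_containing(scopes, var)
  scopes.select { |s| s.include?(var) }
end

# Move a variable out of the first scope that holds it and into the
# closure environment. Skipping duplicates is an assumption of this sketch.
def lift_into_env(scopes, env, var)
  found = scopes_containing(scopes, var)
  return env if found.empty?
  found.first.delete(var)
  env.push(var) unless env.include?(var)
  env
end

scopes = [Set.new([:a, :b]), Set.new([:c])]
env = []
lift_into_env(scopes, env, :b)

p scopes.map { |s| s.to_a }
p env
```

&lt;P&gt;After the call, &lt;CODE&gt;:b&lt;/CODE&gt; is no longer a stack-allocated local of the first scope; it lives only in the environment.&lt;/P&gt;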

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
      &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
    &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
    &lt;SPAN class="keyword"&gt;return&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;scopes&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[-&lt;/SPAN&gt;&lt;SPAN class="number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;].&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;to_a&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;
  &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P&gt;Finally, we return the innermost scope from this level, along with the environment.&lt;/P&gt;

&lt;H3&gt;rewrite_env_vars&lt;/H3&gt;

&lt;P&gt;Finally, let's look at &lt;CODE&gt;rewrite_env_vars&lt;/CODE&gt;. Once we've gathered all
the variables for the closure environment, we have an array, and any access
to those variables needs to be rewritten to &lt;CODE&gt;(index __env__ num)&lt;/CODE&gt;. That's
done like this, and I hope this one is simple enough not to need going through line
by line:&lt;/P&gt;

&lt;PRE class="ruby"&gt;&lt;CODE&gt;
    &lt;SPAN class="keyword"&gt;def &lt;/SPAN&gt;&lt;SPAN class="method"&gt;rewrite_env_vars&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;exp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
      &lt;SPAN class="ident"&gt;exp&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;depth_first&lt;/SPAN&gt; &lt;SPAN class="keyword"&gt;do&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;
        &lt;SPAN class="constant"&gt;STDERR&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;puts&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;inspect&lt;/SPAN&gt;
        &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;each_with_index&lt;/SPAN&gt; &lt;SPAN class="keyword"&gt;do&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;ex&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;i&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;|&lt;/SPAN&gt;
          &lt;SPAN class="ident"&gt;num&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;env&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;.&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;(&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;ex&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;)&lt;/SPAN&gt;
          &lt;SPAN class="keyword"&gt;if&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;num&lt;/SPAN&gt;
            &lt;SPAN class="ident"&gt;e&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="ident"&gt;i&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;=&lt;/SPAN&gt; &lt;SPAN class="punct"&gt;[&lt;/SPAN&gt;&lt;SPAN class="symbol"&gt;:index&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="symbol"&gt;:__env__&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;,&lt;/SPAN&gt; &lt;SPAN class="ident"&gt;num&lt;/SPAN&gt;&lt;SPAN class="punct"&gt;]&lt;/SPAN&gt;
          &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
        &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
      &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;
    &lt;SPAN class="keyword"&gt;end&lt;/SPAN&gt;

&lt;/CODE&gt;&lt;/PRE&gt;
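&lt;P&gt;If you want to try the rewrite outside the compiler, the following self-contained sketch does the same kind of replacement. It is an approximation rather than the real method: &lt;CODE&gt;rewrite_env_vars_sketch&lt;/CODE&gt; is a made-up name, and it uses plain recursion instead of the &lt;CODE&gt;depth_first&lt;/CODE&gt; helper:&lt;/P&gt;

```ruby
# Recursive sketch: replace any element found in env with
# [:index, :__env__, position]. Plain recursion stands in for the
# compiler's depth_first helper, which is not reproduced here.
def rewrite_env_vars_sketch(exp, env)
  exp.each_with_index do |ex, i|
    if ex.is_a?(Array)
      rewrite_env_vars_sketch(ex, env)
    elsif (num = env.index(ex))
      exp[i] = [:index, :__env__, num]
    end
  end
  exp
end

exp = [:do, [:assign, :x, [:add, :x, :y]], [:call, :puts, [:y]]]
rewrite_env_vars_sketch(exp, [:x, :y])
p exp
```

&lt;P&gt;Every occurrence of &lt;CODE&gt;:x&lt;/CODE&gt; becomes an access to slot 0 of the environment and every &lt;CODE&gt;:y&lt;/CODE&gt; an access to slot 1, while symbols that are not in the environment (such as &lt;CODE&gt;:puts&lt;/CODE&gt;) are left alone.&lt;/P&gt;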


&lt;H2&gt;Next steps&lt;/H2&gt;

&lt;P&gt;We still haven't actually added support for &lt;CODE&gt;define_method&lt;/CODE&gt;, but we now finally
have most of the plumbing. That's the next step.&lt;/P&gt;

&lt;P&gt;Then I want to start alternating between two things: first, something more practical, namely making the
compiler compile early/simple versions of itself and letting it "eat its own tail", so to speak; and second, refactoring and cleaning it up.&lt;/P&gt;</description>
      <category>compiler in Ruby bottom up</category>
      <category>compiler</category>
      <category>ruby</category>
      <category>scope</category>
      <category>closures</category>
      <pubDate>Sat, 12 Jun 2010 17:21:29 -0400</pubDate>
      <dc:date>2010-06-12T17:21:29-04:00</dc:date>
    <feedburner:origLink>http://www.hokstad.com/writing-a-compiler-in-ruby-bottom-up-step-25.html</feedburner:origLink></item>
    <item>
      <title>Minimig</title>
      <link>http://feedproxy.google.com/~r/VidarHokstad/~3/uGOmx9bjB3I/minimig.html</link>
      <description>I just got my &lt;a href="http://en.wikipedia.org/wiki/Minimig"&gt;Minimig&lt;/a&gt; Amiga 500 re-implementation... 

&lt;a href="http://en.wikipedia.org/wiki/Lotus_(computer_games)"&gt;Lotus Esprit Turbo Challenge&lt;/a&gt; on my new 42" LED backlit LCD: (Minimig in the cabinet with the red led...)

&lt;img src="/static/lotus-minimig.jpg"&gt;

That is all.</description>
      <category>lotus</category>
      <category>minimig</category>
      <category>amiga</category>
      <category>nostalgia</category>
      <category>games</category>
      <pubDate>Tue, 02 Mar 2010 19:31:58 -0500</pubDate>
      <dc:date>2010-03-02T19:31:58-05:00</dc:date>
    <feedburner:origLink>http://www.hokstad.com/minimig.html</feedburner:origLink></item>
    <item>
      <title>Writing a (Ruby) compiler in Ruby bottom up - step 24</title>
      <link>http://feedproxy.google.com/~r/VidarHokstad/~3/XVNQVNs0Bh0/writing-a-compiler-in-ruby-bottom-up-step-24.html</link>
      <description>&lt;span style="color: red; "&gt;This is &lt;a href="http://www.hokstad.com/compiler"&gt;part of a series&lt;/a&gt; I started in March 2008 - you may want to go back and look at older parts if you're new to this series.&lt;/span&gt;

Apologies for the delay... This part was ready before Christmas, but for various reasons I never got around to posting it. And to make matters worse, I managed to post the wrong part yesterday. Sigh.

&lt;h1&gt;The Outer Scope&lt;/h1&gt;

&lt;p&gt;Stepping back from the &lt;code&gt;attr_*&lt;/code&gt; debacle for a bit... I wanted to look
at something else, mostly because it affects all my ad-hoc testing.&lt;/p&gt;

&lt;p&gt;One of the downsides of starting this without a specific design in mind
is that after I decided to go down the Ruby path we've been dragging
along quite a bit of cruft.&lt;/p&gt;

&lt;p&gt;But that is the cost of experimentation.&lt;/p&gt;

&lt;p&gt;In particular, the compiler has a weird distinction between functions
and methods: Define something with &lt;code&gt;def&lt;/code&gt; outside of a class and it
becomes a function, inside it becomes a method.&lt;/p&gt;

&lt;p&gt;We &lt;em&gt;need&lt;/em&gt; support for functions to make implementation reasonably
easy: It makes bootstrapping the object model far easier, as well as
integration with the outside C-based world.&lt;/p&gt;

&lt;p&gt;But Ruby doesn't &lt;em&gt;have&lt;/em&gt; functions.&lt;/p&gt;

&lt;p&gt;So what to do?&lt;/p&gt;

&lt;p&gt;Recently I wrote about hiding the "low level plumbing" and the
change I'm about to show you gets us a bit closer to that while at the
same time starting to clean up the function vs. method mess.&lt;/p&gt;

&lt;p&gt;In Ruby, &lt;code&gt;self&lt;/code&gt; outside of a class is the &lt;code&gt;main&lt;/code&gt; object - an instance
of &lt;code&gt;Object&lt;/code&gt;. Logically then, for a &lt;code&gt;def&lt;/code&gt; outside of a class, the method
will be defined on &lt;code&gt;Object&lt;/code&gt;. So, we'll create that object.&lt;/p&gt;
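&lt;p&gt;You can check this behaviour in regular Ruby. The snippet below is only a small demonstration, assuming it is run as a script with the standard interpreter; &lt;code&gt;top_level_greeting&lt;/code&gt; is just an example name:&lt;/p&gt;

```ruby
# At the top level of a script, self is the special "main" object,
# which is an instance of Object.
puts self.to_s
puts self.class

# A method defined at the top level becomes a (private) method of
# Object, so it can be reached from any other object as well.
def top_level_greeting
  "hello from main"
end

puts Object.private_method_defined?(:top_level_greeting)
puts "some string".send(:top_level_greeting)
```

&lt;p&gt;That is the behaviour the compiler needs to mimic once &lt;code&gt;def&lt;/code&gt; outside of a class stops producing a plain function.&lt;/p&gt;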

&lt;h2&gt;What about function calls?&lt;/h2&gt;

&lt;p&gt;Defining and calling functions will still be possible, but only using
the s-expression syntax. That reasonably cleanly "hides" the plumbing
we use functions for (in fact, it means we could stop the annoying convention
I started of prefixing the names with "__" since they effectively now
live in their own namespace, though I haven't changed that yet).&lt;/p&gt;

&lt;h2&gt;Some preparations&lt;/h2&gt;

&lt;p&gt;First we must rewrite a few functions fully as s-expressions. This is
straightforward. For example:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="punct"&gt;-&lt;/span&gt;&lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;__get_string&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;str&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="punct"&gt;-&lt;/span&gt;  &lt;span class="ident"&gt;s&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;String&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
    &lt;span class="punct"&gt;-&lt;/span&gt;  &lt;span class="punct"&gt;%s(&lt;/span&gt;&lt;span class="symbol"&gt;callm s __set_raw (str&lt;/span&gt;&lt;span class="punct"&gt;))&lt;/span&gt;
    &lt;span class="punct"&gt;+%s(&lt;/span&gt;&lt;span class="symbol"&gt;defun __get_string (str&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;let&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;s&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="punct"&gt;+&lt;/span&gt;  &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;assign&lt;/span&gt; &lt;span class="ident"&gt;s&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;callm&lt;/span&gt; &lt;span class="constant"&gt;String&lt;/span&gt; &lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;))&lt;/span&gt;
    &lt;span class="punct"&gt;+&lt;/span&gt;  &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;callm&lt;/span&gt; &lt;span class="ident"&gt;s&lt;/span&gt; &lt;span class="ident"&gt;__set_raw&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;str&lt;/span&gt;&lt;span class="punct"&gt;))&lt;/span&gt;
       &lt;span class="ident"&gt;s&lt;/span&gt;
    &lt;span class="punct"&gt;-&lt;/span&gt;&lt;span class="keyword"&gt;end&lt;/span&gt;
    &lt;span class="punct"&gt;+))&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Otherwise the upcoming changes will turn them into methods, which is
not what we want.&lt;/p&gt;

&lt;p&gt;I'm not going to present all of them. The biggest one is probably
&lt;code&gt;__new_class_object&lt;/code&gt;, but all of these are straightforward translations.&lt;/p&gt;

&lt;p&gt;These changes were done on the master branch prior to starting this
feature proper.&lt;/p&gt;

&lt;h2&gt;Splitting up "defun"&lt;/h2&gt;

&lt;p&gt;You can follow these remaining changes &lt;a href="http://github.com/vidarh/writing-a-compiler-in-ruby/tree/main-object"&gt;on the main-object branch&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So far &lt;code&gt;defun&lt;/code&gt; has been made up of two mostly different branches: if it
occurs inside a class, it's compiled as a method; if not, it's compiled as
a function.&lt;/p&gt;

&lt;p&gt;This is both messy and not very logical.&lt;/p&gt;

&lt;p&gt;Going forward we'll separate it into &lt;code&gt;defm&lt;/code&gt; that defines a Ruby style
method, and &lt;code&gt;defun&lt;/code&gt;, which, as before, defines a function. &lt;code&gt;defun&lt;/code&gt;
will be completely hidden from normal Ruby code - you'll have to dip
down into the s-expression syntax to access it.&lt;/p&gt;

&lt;p&gt;First our list of &lt;code&gt;compile_*&lt;/code&gt; methods in compiler.rb:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="punct"&gt;-&lt;/span&gt;                   &lt;span class="symbol"&gt;:do&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:class&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:defun&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:if&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:lambda&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt;
    &lt;span class="punct"&gt;+&lt;/span&gt;                   &lt;span class="symbol"&gt;:do&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:class&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:defun&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:defm&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:if&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:lambda&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;The new, simplified &lt;code&gt;compile_defun&lt;/code&gt;:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="comment"&gt;# Compiles a function definition.&lt;/span&gt;
    &lt;span class="comment"&gt;# Takes the current scope, in which the function is defined,&lt;/span&gt;
    &lt;span class="comment"&gt;# the name of the function, its arguments as well as the body-expression&lt;/span&gt;
    &lt;span class="comment"&gt;# that holds the actual code for the function's body.&lt;/span&gt;
    &lt;span class="comment"&gt;#&lt;/span&gt;
    &lt;span class="comment"&gt;# Note that compile_defun is now only accessed via s-expressions&lt;/span&gt;
    &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;compile_defun&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;scope&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;name&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;args&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;body&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
      &lt;span class="ident"&gt;f&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Function&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;args&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;body&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt;&lt;span class="ident"&gt;scope&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
      &lt;span class="ident"&gt;name&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;clean_method_name&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;name&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    
      &lt;span class="comment"&gt;# add function to the global list of functions defined so far&lt;/span&gt;
      &lt;span class="attribute"&gt;@global_functions&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="ident"&gt;name&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;f&lt;/span&gt;
    
       &lt;span class="comment"&gt;# a function is referenced by its name (in assembly this is a label).&lt;/span&gt;
       &lt;span class="comment"&gt;# wherever we encounter that name, we really need the adress of the label.&lt;/span&gt;
       &lt;span class="comment"&gt;# so we mark the function with an adress type.&lt;/span&gt;
       &lt;span class="keyword"&gt;return&lt;/span&gt; &lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="symbol"&gt;:addr&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;clean_method_name&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;name&lt;/span&gt;&lt;span class="punct"&gt;)]&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;This is really more or less just tearing out the method part, and brings it
roughly back to how it was before we added the method calling bit.&lt;/p&gt;

&lt;p&gt;But &lt;code&gt;compile_defm&lt;/code&gt; is also somewhat simpler than the old method part:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="comment"&gt;# Compiles a method definition and updates the&lt;/span&gt;
    &lt;span class="comment"&gt;# class vtable.&lt;/span&gt;
    &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;compile_defm&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;scope&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;name&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;args&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;body&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
      &lt;span class="ident"&gt;scope&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;scope&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;class_scope&lt;/span&gt;
    
      &lt;span class="ident"&gt;f&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Function&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;([&lt;/span&gt;&lt;span class="symbol"&gt;:self&lt;/span&gt;&lt;span class="punct"&gt;]+&lt;/span&gt;&lt;span class="ident"&gt;args&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;body&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;scope&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="comment"&gt;# "self" is "faked" as an argument to class methods.&lt;/span&gt;
    
      &lt;span class="attribute"&gt;@e&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;comment&lt;/span&gt;&lt;span class="punct"&gt;("&lt;/span&gt;&lt;span class="string"&gt;method &lt;span class="expr"&gt;#{name}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;")&lt;/span&gt;
    
      &lt;span class="ident"&gt;body&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;depth_first&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;exp&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
        &lt;span class="ident"&gt;exp&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;each&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;n&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt; 
          &lt;span class="ident"&gt;scope&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;add_ivar&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;n&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="ident"&gt;n&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;is_a?&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="constant"&gt;Symbol&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="keyword"&gt;and&lt;/span&gt; &lt;span class="ident"&gt;n&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;to_s&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="number"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt; &lt;span class="punct"&gt;==&lt;/span&gt; &lt;span class="char"&gt;?@&lt;/span&gt; &lt;span class="punct"&gt;&amp;&amp;&lt;/span&gt; &lt;span class="ident"&gt;n&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;to_s&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt; &lt;span class="punct"&gt;!=&lt;/span&gt; &lt;span class="char"&gt;?@&lt;/span&gt;
        &lt;span class="keyword"&gt;end&lt;/span&gt;
      &lt;span class="keyword"&gt;end&lt;/span&gt;
    
      &lt;span class="ident"&gt;cleaned&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;clean_method_name&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;name&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
      &lt;span class="ident"&gt;fname&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;"&lt;/span&gt;&lt;span class="string"&gt;__method_&lt;span class="expr"&gt;#{scope.name}&lt;/span&gt;_&lt;span class="expr"&gt;#{cleaned}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;"&lt;/span&gt;
      &lt;span class="ident"&gt;scope&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;set_vtable_entry&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;name&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;fname&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;f&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    
      &lt;span class="comment"&gt;# Save to the vtable.&lt;/span&gt;
      &lt;span class="ident"&gt;v&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;scope&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;vtable&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="ident"&gt;name&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;
      &lt;span class="ident"&gt;compile_eval_arg&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;scope&lt;/span&gt;&lt;span class="punct"&gt;,[&lt;/span&gt;&lt;span class="symbol"&gt;:sexp&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="symbol"&gt;:call&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:__set_vtable&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="symbol"&gt;:self&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt;&lt;span class="ident"&gt;v&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;offset&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;fname&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;to_sym&lt;/span&gt;&lt;span class="punct"&gt;]]])&lt;/span&gt;
        
      &lt;span class="comment"&gt;# add the method to the global list of functions defined so far&lt;/span&gt;
      &lt;span class="comment"&gt;# with its "munged" name.&lt;/span&gt;
      &lt;span class="attribute"&gt;@global_functions&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="ident"&gt;fname&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;f&lt;/span&gt;
        
      &lt;span class="comment"&gt;# This is taken from compile_defun - it does not necessarily make sense for defm&lt;/span&gt;
      &lt;span class="keyword"&gt;return&lt;/span&gt; &lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="symbol"&gt;:addr&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;clean_method_name&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;fname&lt;/span&gt;&lt;span class="punct"&gt;)]&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;The most important changes?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The call to &lt;code&gt;scope.class_scope&lt;/code&gt; at the top.&lt;/li&gt;
&lt;li&gt;The change to call &lt;code&gt;__set_vtable&lt;/code&gt; near the bottom instead of hardcoding the asm to set the vtable entry.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;So, why those changes?&lt;/p&gt;

&lt;p&gt;If you look at &lt;code&gt;GlobalScope&lt;/code&gt;, we've added a &lt;code&gt;@class_scope&lt;/code&gt; instance variable
that is initialized to this:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="attribute"&gt;@class_scope&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;ClassScope&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="constant"&gt;self&lt;/span&gt;&lt;span class="punct"&gt;,"&lt;/span&gt;&lt;span class="string"&gt;Object&lt;/span&gt;&lt;span class="punct"&gt;",&lt;/span&gt;&lt;span class="attribute"&gt;@vtableoffsets&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;This means that if you have a &lt;code&gt;:defm&lt;/code&gt; occurring in the global scope (we'll get to that)
it will be compiled in that class scope, as a method of class Object.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Poof&lt;/em&gt; and our functions are (almost) gone.&lt;/p&gt;

&lt;p&gt;(In &lt;code&gt;ClassScope&lt;/code&gt; we just return &lt;code&gt;self&lt;/code&gt; from the new &lt;code&gt;class_scope&lt;/code&gt; method.)&lt;/p&gt;

&lt;p&gt;For the second change, adding &lt;code&gt;__set_vtable&lt;/code&gt; is just the purist in me
wanting to expel as much assembler as possible. Here it is (from lib/core/class.rb):&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="comment"&gt;# FIXME: This only works correctly for the initial&lt;/span&gt;
    &lt;span class="comment"&gt;# class definition. On subsequent re-opens of the class&lt;/span&gt;
    &lt;span class="comment"&gt;# it will fail to correctly propagate vtable changes &lt;/span&gt;
    &lt;span class="comment"&gt;# downwards in the class hierarchy if the class has&lt;/span&gt;
    &lt;span class="comment"&gt;# since been overloaded.&lt;/span&gt;
    &lt;span class="punct"&gt;%s(&lt;/span&gt;&lt;span class="symbol"&gt;defun __set_vtable (vtable off ptr&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
      &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;assign&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;index&lt;/span&gt; &lt;span class="ident"&gt;vtable&lt;/span&gt; &lt;span class="ident"&gt;off&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="ident"&gt;ptr&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="punct"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Ok, so not just the purist in me. In fact, this makes re-opening classes
"cleaner" in that this function will actually get reasonably complex as we
fix this issue, and also later handle &lt;code&gt;define_method&lt;/code&gt; and otherwise deal
with cases where no vtable slot is actually available.&lt;/p&gt;

&lt;p&gt;Anything else? A few pieces of cleanups to deal with the scope changes,
and this little gem / turd depending on your point of view, in lib/core/core.rb:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="punct"&gt;+&lt;/span&gt;
    &lt;span class="punct"&gt;+&lt;/span&gt;&lt;span class="comment"&gt;# OK, so perhaps this is a bit ugly...&lt;/span&gt;
    &lt;span class="punct"&gt;+&lt;/span&gt;&lt;span class="constant"&gt;self&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;Object&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;
    &lt;span class="punct"&gt;+&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Ehm. We probably should put that in the compiler, or at least prevent
arbitrary assigning to self in user code. But it works for now, and we've
done worse in the past. To reiterate: Get things working &lt;em&gt;first&lt;/em&gt;. Make
them fast/pretty later.&lt;/p&gt;

&lt;p&gt;The last vital change is this, in &lt;code&gt;Parser#parse_def&lt;/code&gt;:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="punct"&gt;-&lt;/span&gt;    &lt;span class="keyword"&gt;return&lt;/span&gt; &lt;span class="constant"&gt;E&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="ident"&gt;pos&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:defun&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;name&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;args&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;exps&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;
    &lt;span class="punct"&gt;+&lt;/span&gt;    &lt;span class="keyword"&gt;return&lt;/span&gt; &lt;span class="constant"&gt;E&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="ident"&gt;pos&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:defm&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;name&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;args&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;exps&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt;
       &lt;span class="keyword"&gt;end&lt;/span&gt;
       
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;It does what it looks like: it makes the parser for the Ruby code spit out
only &lt;code&gt;:defm&lt;/code&gt; nodes instead of &lt;code&gt;:defun&lt;/code&gt; nodes. Our code is now free of
functions anywhere outside of s-expressions.&lt;/p&gt;

&lt;h2&gt;What's the effect?&lt;/h2&gt;

&lt;p&gt;These changes make this work as expected:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;Foo&lt;/span&gt;
      &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;hello&lt;/span&gt;
        &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;"&lt;/span&gt;&lt;span class="string"&gt;hello&lt;/span&gt;&lt;span class="punct"&gt;"&lt;/span&gt;
      &lt;span class="keyword"&gt;end&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
    
    &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;hello_world&lt;/span&gt;
      &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;"&lt;/span&gt;&lt;span class="string"&gt;hello world!&lt;/span&gt;&lt;span class="punct"&gt;"&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
    
    &lt;span class="punct"&gt;%s(&lt;/span&gt;&lt;span class="symbol"&gt;puts "foo"&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="comment"&gt;# Calls the C function&lt;/span&gt;
    &lt;span class="constant"&gt;Foo&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;hello&lt;/span&gt;  &lt;span class="comment"&gt;# Calls Foo#hello which calls Object#puts&lt;/span&gt;
    &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;"&lt;/span&gt;&lt;span class="string"&gt;world&lt;/span&gt;&lt;span class="punct"&gt;"&lt;/span&gt;   &lt;span class="comment"&gt;# Calls Object#puts (via the "main" instance)&lt;/span&gt;
    
    &lt;span class="ident"&gt;hello_world&lt;/span&gt;    &lt;span class="comment"&gt;# Calls Object#hello_world (via the "main" instance)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Adding this at the end, on the other hand, still fails (though it works in
MRI) because &lt;code&gt;__set_vtable&lt;/code&gt; doesn't propagate the vtable update
downwards to subclasses of &lt;code&gt;Object&lt;/code&gt;:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="constant"&gt;Foo&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;hello_world&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;When re-opening a class, new methods need to be added "properly" by
processing all child classes and updating their vtables.&lt;/p&gt;
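&lt;p&gt;A minimal sketch of that propagation, in plain Ruby rather than the compiler's s-expression runtime. The &lt;code&gt;VClass&lt;/code&gt; model, its fields and this &lt;code&gt;set_vtable&lt;/code&gt; method are hypothetical names made up for illustration; they only mimic the idea of a per-class array of slots:&lt;/p&gt;

```ruby
# Hypothetical model of a class with a vtable (an array of slots).
# None of these names come from the compiler; they only illustrate the idea.
class VClass
  attr_reader :vtable, :subclasses

  def initialize(superclass = nil, slots = 4)
    @superclass = superclass
    # A new class starts with a copy of the vtable of its superclass.
    @vtable = superclass ? superclass.vtable.dup : Array.new(slots)
    @subclasses = []
    superclass.subclasses.push(self) if superclass
  end

  # Store ptr in slot off, then push the change down to every subclass
  # whose slot still holds the inherited value (i.e. was not overridden).
  def set_vtable(off, ptr)
    old = @vtable[off]
    @vtable[off] = ptr
    @subclasses.each do |sub|
      sub.set_vtable(off, ptr) if sub.vtable[off].equal?(old)
    end
  end
end

object = VClass.new
foo    = VClass.new(object)          # like "class Foo", inheriting from Object
bar    = VClass.new(foo)
bar.set_vtable(0, :bar_hello)        # Bar overrides slot 0
object.set_vtable(0, :object_hello)  # Object is re-opened later
object.set_vtable(1, :hello_world)

p foo.vtable[0]  # slot inherited from Object
p bar.vtable[0]  # the override in Bar is left alone
p foo.vtable[1]
```

&lt;p&gt;The cost is paid at definition time: every re-open walks the subclass tree, but calls stay a plain slot lookup.&lt;/p&gt;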

&lt;p&gt;The alternative (which moves the cost from the point of defining a
method to the point of calling the method), is to implement a proper
default method missing that ascends the class hierarchy to look for an
implementation before giving an error.&lt;/p&gt;
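&lt;p&gt;As a rough sketch of that alternative, again in plain Ruby with made-up names (&lt;code&gt;MClass&lt;/code&gt;, &lt;code&gt;defs&lt;/code&gt; and &lt;code&gt;lookup&lt;/code&gt; are assumptions, not part of the compiler), the fallback simply walks up the superclass chain:&lt;/p&gt;

```ruby
# Hypothetical model: each class only records the methods defined directly
# on it, and a lookup helper walks up the superclass chain on a miss.
class MClass
  attr_reader :name, :superclass, :defs

  def initialize(name, superclass = nil)
    @name = name
    @superclass = superclass
    @defs = {}
  end

  # Called when the fast (vtable) path found nothing: ascend the hierarchy
  # and only give up once the root class has been checked too.
  def lookup(sym)
    klass = self
    while klass
      impl = klass.defs[sym]
      return impl if impl
      klass = klass.superclass
    end
    raise NoMethodError, "undefined method #{sym} for #{name}"
  end
end

object = MClass.new("Object")
foo    = MClass.new("Foo", object)

# Defined on Object after Foo was created; no vtable update is needed.
object.defs[:hello_world] = lambda { "hello world!" }

puts foo.lookup(:hello_world).call
```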

&lt;p&gt;We really should do both, since we eventually need to handle methods
without vtable entries.&lt;/p&gt;

&lt;p&gt;But that's for a future installment.&lt;/p&gt;</description>
      <category>compiler in Ruby bottom up</category>
      <category> compiler</category>
      <category>ruby</category>
      <category> scope</category>
      <pubDate>Mon, 22 Feb 2010 12:35:05 -0500</pubDate>
      <dc:date>2010-02-22T12:35:05-05:00</dc:date>
    <feedburner:origLink>http://www.hokstad.com/writing-a-compiler-in-ruby-bottom-up-step-24.html</feedburner:origLink></item>
    <item>
      <title>How to implement closures</title>
      <link>http://feedproxy.google.com/~r/VidarHokstad/~3/44KBvsuafCU/how-to-implement-closures.html</link>
      <description>&lt;p&gt;This is a sort-of interlude to my regular &lt;a href="http://www.hokstad.com/compiler"&gt;compiler series&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The goal is to give a brief overview of some techniques for implementing &lt;a href="http://en.wikipedia.org/wiki/Closure_(computer_science)"&gt;closures&lt;/a&gt; in a programming language. I will use C for my examples, mostly because it's low level enough that a further translation to assembler etc. is straightforward, and many compilers target C directly anyway.&lt;/p&gt;

&lt;p&gt;Before we start, most of the code examples in this post are available &lt;a href="http://gist.github.com/259462"&gt;in this Gist&lt;/a&gt;
so you don't need to cut and paste bits and pieces (which won't work anyway, as the text below
will omit details such as header includes)&lt;/p&gt;

&lt;p&gt;A &lt;a href="http://en.wikipedia.org/wiki/Closure_(computer_science)"&gt;closure&lt;/a&gt; is in its simplest form
a block of code that can be passed around as a value, and that can reference variables in the
scope it was created in even after exiting from that scope.&lt;/p&gt;

&lt;p&gt;(For a more formal description look at the Wikipedia page linked above)&lt;/p&gt;

&lt;p&gt;A Ruby block forms a closure, for example:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;foo&lt;/span&gt; &lt;span class="ident"&gt;x&lt;/span&gt;
      &lt;span class="ident"&gt;printf&lt;/span&gt; &lt;span class="punct"&gt;"&lt;/span&gt;&lt;span class="string"&gt;x is %d&lt;span class="escape"&gt;\n&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;",&lt;/span&gt;&lt;span class="ident"&gt;x&lt;/span&gt;

      &lt;span class="ident"&gt;lambda&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt;
        &lt;span class="ident"&gt;x&lt;/span&gt; &lt;span class="punct"&gt;+=&lt;/span&gt; &lt;span class="number"&gt;1&lt;/span&gt;
        &lt;span class="ident"&gt;printf&lt;/span&gt; &lt;span class="punct"&gt;"&lt;/span&gt;&lt;span class="string"&gt;block: x is %d&lt;span class="escape"&gt;\n&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;",&lt;/span&gt;&lt;span class="ident"&gt;x&lt;/span&gt;

      &lt;span class="keyword"&gt;end&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
    
    &lt;span class="ident"&gt;c&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;foo&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;5&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
    &lt;span class="ident"&gt;c&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;call&lt;/span&gt;

    &lt;span class="ident"&gt;c&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;call&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Every time &lt;code&gt;foo&lt;/code&gt; is called it will return a closure that has access to the &lt;code&gt;x&lt;/code&gt; that was
passed in to &lt;code&gt;foo&lt;/code&gt;. In the example, &lt;code&gt;x&lt;/code&gt; will start out at 5 and get incremented to 7.&lt;/p&gt;

&lt;p&gt;The most important implication of this is that any variables used in the closure must be
guaranteed to live as long as the closure does.&lt;/p&gt;

&lt;p&gt;We'll get back to this example throughout this article.&lt;/p&gt;

&lt;p&gt;There are a number of ways of handling this, for example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Instead of a traditional stack, put activation frames (arguments and local variables)
for function/method calls on the heap, as a linked list. When creating a closure, you
just keep a reference to it. When returning from a function you unlink the frame from
the ones below. Presto, the gc handles the remaining dirty details. Some variations of this
are called a &lt;a href="http://en.wikipedia.org/wiki/Spaghetti_stack"&gt;spaghetti stack&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;Rewriting. You can rewrite any function that may create a closure to create a separate
heap allocated environment, and to copy/locate all arguments and variables that may be
reused into it. The environment can safely be returned with the closure.&lt;/li&gt;

&lt;li&gt;Copy. You can copy the current activation frame into a separate closure environment when
returning the closure, and ensure the closure accesses the variables via a reference that
is replaced with a reference to the environment copy.&lt;/li&gt;
&lt;li&gt;More? Probably...&lt;/li&gt;
&lt;/ul&gt;
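&lt;p&gt;To make the first option a bit more concrete, here is a hand-written Ruby sketch of heap-allocated activation frames. The &lt;code&gt;Frame&lt;/code&gt; class and the calling convention (passing the frame explicitly) are assumptions for illustration only:&lt;/p&gt;

```ruby
# Hypothetical activation frame: a heap object holding the variables of one
# call plus a link to the enclosing frame. Names are illustrative only.
class Frame
  attr_reader :vars, :parent

  def initialize(parent = nil)
    @parent = parent
    @vars = {}
  end
end

# Hand-compiled version of the earlier foo: the "closure" is just a pair of
# code and the frame it was created in.
def foo(x, caller_frame = nil)
  frame = Frame.new(caller_frame)
  frame.vars[:x] = x
  body = lambda do |env|
    env.vars[:x] += 1
    env.vars[:x]
  end
  [body, frame]  # the frame stays alive because the closure references it
end

code, env = foo(5)
p code.call(env)  # 6
p code.call(env)  # 7
```

&lt;p&gt;Nothing is copied on return: the frame simply outlives the call for as long as something still points at it.&lt;/p&gt;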


&lt;p&gt;Each of these has advantages and disadvantages.&lt;/p&gt;

&lt;p&gt;We're going to look at the rewriting method mainly, though most of what you find below also
applies to the copying method.&lt;/p&gt;

&lt;p&gt;These methods are pretty similar -  they both involve returning a pointer to the function
implementing the code in the closure combined with a pointer to the data in the closure.
The difference lies in how that data is accessed.&lt;/p&gt;

&lt;p&gt;The rewriting approach creates the closure environment as early as possible, and changes the
surrounding function to refer to that environment. It incurs the closure creation cost whenever
the closure generating function is called, but since local variables are initialized straight
into the environment it avoids later copying.&lt;/p&gt;

&lt;p&gt;The copy approach delays the creation of the closure environment as much as possible, and then
copies the data. It can avoid unnecessary creation cost if there are paths that don't lead
to creation of a closure, but when the creation happens, it needs to handle full copying
of any objects or object references involved, which may be more expensive.&lt;/p&gt;

&lt;p&gt;For both of these methods, care must be taken to create a single environment for any set of
closures created during the same execution of the function that creates the closures.&lt;/p&gt;
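&lt;p&gt;Ruby itself shows why the single shared environment matters. In this small illustration (&lt;code&gt;make_counter&lt;/code&gt; is just an example name), both lambdas must see the same &lt;code&gt;x&lt;/code&gt;:&lt;/p&gt;

```ruby
# Two closures created in the same call must see the same x.
def make_counter(x)
  inc = lambda { x += 1 }
  get = lambda { x }
  [inc, get]
end

inc, get = make_counter(5)
inc.call
inc.call
p get.call   # 7: both lambdas share one environment

# A second call gets a fresh, independent environment.
_inc2, get2 = make_counter(100)
p get2.call  # 100
```

&lt;p&gt;If each lambda had been given its own copy of &lt;code&gt;x&lt;/code&gt;, the first &lt;code&gt;get.call&lt;/code&gt; would still return 5.&lt;/p&gt;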

&lt;h2&gt;"Fat pointers", objects and thunks&lt;/h2&gt;

&lt;p&gt;When returning a closure we also need to be able to pass along the environment.&lt;/p&gt;

&lt;p&gt;As it turns out, there are a number of ways of handling this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We can create a "fat pointer". I.e. instead of passing around only the address to the
   code, we also pass around a pointer to the environment, and it is the caller's
   responsibility to load that address onto the stack or into a register so the code
   can get at it.&lt;/li&gt;
&lt;li&gt;We can turn the closure into an object, like Ruby's &lt;code&gt;Proc&lt;/code&gt;, and simply treat the
   variables used as instance variables of that object.&lt;/li&gt;
&lt;li&gt;We can create our own half-assed almost object by storing the function pointer in the
   environment.&lt;/li&gt;
&lt;li&gt;We can create a "thunk" on the fly - a small piece of code that will load the address
   of the environment and jump straight into the real code - and return that instead of
   a pointer to the real code of the closure. The major downside here is that it
   requires turning off protection against executing code from the heap.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The "fat pointer" approach is simple but involves more work at &lt;em&gt;every&lt;/em&gt; call site, and
doubles the size of the data that is passed around.&lt;/p&gt;

&lt;p&gt;Turning it into an object is simple, and for my Ruby compiler it's often even necessary,
since most of the time when you handle blocks in Ruby, you'll actually get a
&lt;code&gt;Proc&lt;/code&gt; object. But it has the full overhead of method dispatch.&lt;/p&gt;

&lt;p&gt;The "half-assed object" approach still requires each call site to do a little bit more
work, but less than the fat pointer approach. It also doesn't require additional data to
be passed around (the function pointer is stored in the environment instead of copied around).&lt;/p&gt;

&lt;p&gt;Creating a "thunk" also has overhead, but isn't as scary as it may sound - the code to
create it is very simple, and really it consists of copying a few bytes around.&lt;/p&gt;

&lt;p&gt;We'll start with a "sort of" object, and then take a look at the thunk approach.&lt;/p&gt;

&lt;h2&gt;Simulating the rewriting method in C&lt;/h2&gt;

&lt;p&gt;&lt;a href="http://gist.github.com/259462#file_closures_basic.c"&gt;closures-basic.c&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;C lacks pretty much everything that could make this convenient and easy, so it really lays
the implementation bare, for better or worse.&lt;/p&gt;

&lt;p&gt;First, let's create a structure to hold the function pointer and environment:&lt;/p&gt;

&lt;pre class="c"&gt;&lt;code&gt;

    struct closure {
      void (* call)(struct closure *);
      int x;
    };

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;If we had more local variables in &lt;code&gt;foo&lt;/code&gt;, we'd add them to this structure.&lt;/p&gt;

&lt;p&gt;Then we need to create a function with the code for the lambda block:&lt;/p&gt;

&lt;pre class="c"&gt;&lt;code&gt;
    void block(struct closure * env) {
      env-&gt;x += 1;
      printf ("block: x is %d\n", env-&gt;x);
    }


&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Finally we can implement foo:&lt;/p&gt;

&lt;pre class="c"&gt;&lt;code&gt;
    struct closure * foo(int x)
    {
      struct closure * closure = (struct closure *)malloc(sizeof(struct closure));
      closure-&gt;x = x;
      printf ("x is %d\n",closure-&gt;x);
      closure-&gt;call = &amp;block;
      return closure;
    }

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Ewww..&lt;/p&gt;

&lt;p&gt;A couple of observations: If we want to be able to return multiple
closures (say, an array of them), the variables need to be accessed
via one more indirection, which makes this even more disgustingly
convoluted.&lt;/p&gt;

&lt;p&gt;It's also annoying, because it means lots of extra overhead in order
to handle a situation that might very well never arise.&lt;/p&gt;

&lt;p&gt;If the language requires supporting multiple closures (like Ruby), a
compiler could support both approaches to optimize - adding the cost
of the extra indirection only:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;when more than one closure can potentially be returned at the
same time.&lt;/li&gt;
&lt;li&gt;when those closures need access to the same variables.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt; For an example of these restrictions, consider:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
     &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;foo&lt;/span&gt; &lt;span class="ident"&gt;x&lt;/span&gt;
       &lt;span class="ident"&gt;a&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="number"&gt;1&lt;/span&gt;

       &lt;span class="ident"&gt;b&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="number"&gt;2&lt;/span&gt;
       &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="ident"&gt;x&lt;/span&gt;
         &lt;span class="keyword"&gt;return&lt;/span&gt; &lt;span class="ident"&gt;lambda&lt;/span&gt; &lt;span class="punct"&gt;{&lt;/span&gt; &lt;span class="ident"&gt;a&lt;/span&gt; &lt;span class="punct"&gt;+=&lt;/span&gt; &lt;span class="number"&gt;1&lt;/span&gt; &lt;span class="punct"&gt;}&lt;/span&gt;

       &lt;span class="keyword"&gt;else&lt;/span&gt;
         &lt;span class="keyword"&gt;return&lt;/span&gt; &lt;span class="ident"&gt;lambda&lt;/span&gt; &lt;span class="punct"&gt;{&lt;/span&gt; &lt;span class="ident"&gt;b&lt;/span&gt;&lt;span class="punct"&gt;+=&lt;/span&gt; &lt;span class="number"&gt;1&lt;/span&gt; &lt;span class="punct"&gt;}&lt;/span&gt;

       &lt;span class="keyword"&gt;end&lt;/span&gt;
     &lt;span class="keyword"&gt;end&lt;/span&gt;
     
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;First of all, this function is guaranteed to only return one lambda at a
time, so applying that rule, we can just assign the appropriate function
pointer to the &lt;code&gt;call&lt;/code&gt; member variable, and do away with the extra indirection.&lt;/p&gt;

&lt;p&gt;Secondly, even &lt;em&gt;if&lt;/em&gt; we decide to return both of them at the same time, they
are guaranteed to never access the same variables, so we could instead create
two distinct closure environments, and still avoid the extra indirection.&lt;/p&gt;

&lt;p&gt;But let's take a look at the complete example &lt;em&gt;with&lt;/em&gt; the extra indirection
anyway, as a worst case scenario:&lt;/p&gt;

&lt;h2&gt;Indirection hell&lt;/h2&gt;

&lt;p&gt;&lt;a href="http://gist.github.com/259462#file_closures_indirection.c"&gt;closures-indirection.c&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;First we create an environment for the free variables we wish to "capture":&lt;/p&gt;

&lt;pre class="c"&gt;&lt;code&gt;

    struct env {
      int x;
    };

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Our modified closure structure holds a pointer to it, instead of holding the variables:&lt;/p&gt;

&lt;pre class="c"&gt;&lt;code&gt;
    struct closure {
      void (* call)(struct env *);
      struct env * env;
    };

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;The block takes a pointer to the environment:&lt;/p&gt;

&lt;pre class="c"&gt;&lt;code&gt;
    void block(struct env * env) {
      env-&gt;x += 1;
      printf ("block: x is %d\n", env-&gt;x);
    }

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;And the closure function itself needs to first allocate the environment,
and then the closure:&lt;/p&gt;

&lt;pre class="c"&gt;&lt;code&gt;
    struct closure * foo(int x) {
      struct env * env = (struct env *)malloc(sizeof(struct env));
      env-&gt;x = x;
    
      printf ("x is %d\n",env-&gt;x);
      
      struct closure * closure = (struct closure *)malloc(sizeof(struct closure));
      closure-&gt;env = env;
      closure-&gt;call = block;
    
      return closure;
    }


&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Finally we call it with the env:&lt;/p&gt;

&lt;pre class="c"&gt;&lt;code&gt;
    int main() {
      struct closure * c = foo(5);
    
      c-&gt;call(c-&gt;env);
      c-&gt;call(c-&gt;env);
    }
&lt;/code&gt;&lt;/pre&gt;


&lt;h2&gt;Fat pointers&lt;/h2&gt;

&lt;p&gt;The example above is actually really simple to use to illustrate the
fat pointer approach. Instead of returning a pointer to the closure
object, we simply return the object itself:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://gist.github.com/259462#file_closures_fatptr.c"&gt;closures-fatptr.c&lt;/a&gt;&lt;/p&gt;

&lt;pre class="c"&gt;&lt;code&gt;
    struct closure foo(int x)
    {
      struct env * env = (struct env *)malloc(sizeof(struct env));
      env-&gt;x = x;
    
      printf ("x is %d\n",env-&gt;x);
      
      struct closure closure;
      closure.env = env;
      closure.call = block;
    
      return closure;
    }
    
    int main() {
      struct closure c = foo(5);
    
      c.call(c.env);
      c.call(c.env);
    }

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;As you can see it makes some things simpler (no need for the second
memory allocation step).&lt;/p&gt;

&lt;h2&gt;Variable substitution&lt;/h2&gt;

&lt;p&gt;An optimization worth keeping in mind is variable substitution. In cases
where it can be guaranteed that the variables in question keep the same
value, the need for a separate environment may go away if variables are
substituted for their values in the closure's body.&lt;/p&gt;

&lt;p&gt;Furthermore, if the variables can be guaranteed not to change in any
closure returned from the function, then whether or not the variable
changes in the function &lt;em&gt;outside&lt;/em&gt; the closures, the closures may keep
separate environments (and hence do the optimization to avoid indirection)
with copies of the variables in question.&lt;/p&gt;

&lt;p&gt;Of course this would require the function to carry out any updates once
for each generated closure.&lt;/p&gt;
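&lt;p&gt;A small sketch of that trade-off, with the environments written out by hand so the copying is visible. &lt;code&gt;Env&lt;/code&gt; and the convention of passing it as an explicit argument are assumptions for illustration, not how Ruby or my compiler lays things out:&lt;/p&gt;

```ruby
# Explicit environments, to make the copying visible. Env and the lambdas
# taking an env argument are an illustration, not the compiler's layout.
Env = Struct.new(:x)

def foo(x)
  # Neither block assigns to x, so each one can get a private copy
  # instead of a pointer to a shared environment.
  env_a = Env.new(x)
  env_b = Env.new(x)
  double = lambda { |env| env.x * 2 }
  square = lambda { |env| env.x * env.x }

  # The price: an update made by foo itself after the closures exist
  # has to be written once per copy.
  x += 1
  env_a.x = x
  env_b.x = x

  [[double, env_a], [square, env_b]]
end

(double, env_a), (square, env_b) = foo(5)
p double.call(env_a)  # 12
p square.call(env_b)  # 36
```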

&lt;p&gt;(You might have spotted here one of the advantages for functional
languages with little or no mutation of variables - they have a lot
fewer issues to worry about with respect to sharing of variable state.)&lt;/p&gt;

&lt;h2&gt;Moving on to thunks&lt;/h2&gt;

&lt;p&gt;&lt;a href="http://gist.github.com/259462#file_closures_thunks.c"&gt;closures-thunks.c&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now that we have the basics down, how do we return just a "plain"
function pointer, so we can simplify the call sites?&lt;/p&gt;

&lt;p&gt;What we want to create is something like the code below.&lt;/p&gt;

&lt;p&gt;Note that we take a shortcut and don't create a proper stack frame -
this kind of thing can easily confuse gdb and other debuggers, which is not
necessarily very nice.&lt;/p&gt;

&lt;p&gt;The thunk below also does not directly allow passing any arguments to the block -
if we wanted to do that it gets hairier, since we'd need to manipulate
the parameters passed, while the caller doesn't know we've altered the
size of the parameter space set aside. In a compiler this would likely
be solved by making either the caller or callee aware that it's a closure
call, and adjusting the stack accordingly separate from the thunk.&lt;/p&gt;

&lt;p&gt;Of course, the downside of using asm here is that it's architecture
specific, and the code to generate the thunk will need to be modified
accordingly to port the code, but then if you do this in a compiler
that is outputting native assembly, you have to do that anyway.&lt;/p&gt;

&lt;pre class="asm"&gt;&lt;code&gt;
    my_closure_instance:
        pushl $my_environment # Push the environment onto the stack as the first arg
        call  the_block       # Go to the real code
        addl  $4,%esp         # Throw the environment pointer away
        ret                   # Return to the caller
        
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Putting some arbitrary values in there, and assembling it with gcc/gas, and then doing
&lt;code&gt;objdump -D&lt;/code&gt; to the resulting binary gives this (and other bits and pieces
I've cut - use the &lt;code&gt;-nostdlib&lt;/code&gt; option to avoid dealing with a bunch of initialization
code):&lt;/p&gt;

&lt;pre class="asm"&gt;&lt;code&gt;
    08048055 &amp;lt;my_closure_instance&amp;gt;:
    8048055:       68 00 08 af 2f          push   $0x2faf0800
    804805a:       e8 00 00 00 00          call   804805f &amp;lt;my_closure_instance&amp;gt;
    804805f:       83 c4 04                add    $0x4,%esp
    8048062:       c3                      ret  
    
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;This shows us the values to put in our thunk. A couple of observations: The push opcode is
simply followed by the address itself, but the call uses an offset from the first byte
of the following instruction.&lt;/p&gt;
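&lt;p&gt;The arithmetic is easy to check outside of C. The following Ruby sketch (the helper name &lt;code&gt;thunk_bytes&lt;/code&gt; and its integer "addresses" are made up for illustration; it executes nothing) encodes the same 14 bytes, using the addresses from the disassembly above:&lt;/p&gt;

```ruby
# Encode the 14-byte x86-32 thunk from the disassembly above.
# thunk_addr, env_addr and code_addr are plain integers standing in for
# real addresses; this only does the arithmetic, it executes nothing.
def thunk_bytes(thunk_addr, env_addr, code_addr)
  # push imm32 is 5 bytes, call rel32 is 5 bytes, so the instruction
  # following the call starts 10 bytes into the thunk.
  after_call = thunk_addr + 10
  rel32 = (code_addr - after_call) % 2**32  # two's complement wrap
  # C = one byte, V = 32-bit little-endian
  [0x68, env_addr, 0xe8, rel32, 0x83, 0xc4, 0x04, 0xc3].pack("CVCVCCCC")
end

bytes = thunk_bytes(0x8048055, 0x2faf0800, 0x804805f)
puts bytes.unpack("C*").map { |b| format("%02x", b) }.join(" ")
# prints: 68 00 08 af 2f e8 00 00 00 00 83 c4 04 c3
```

&lt;p&gt;The call offset comes out as zero here only because the arbitrary call target in the disassembly happens to be the instruction right after the call.&lt;/p&gt;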

&lt;p&gt;This approach of using structs to generate the thunk, I've blatantly stolen from
&lt;a href="http://timetobleed.com/hot-patching-inlined-functions-with-x86_64-asm-metaprogramming/"&gt;Joe Damato&lt;/a&gt;

because I like how it makes the code that manipulates the thunk more readable:&lt;/p&gt;

&lt;pre class="c"&gt;&lt;code&gt;
    struct __attribute__((packed)) thunk {
      unsigned char push_op;
      void * env_addr;
      unsigned char call_op;
      signed long call_offset;
      unsigned char add_esp_ops[3];
      unsigned char ret_op;
    };
    
    struct thunk default_thunk = {0x68, 0, 0xe8, 0, {0x83, 0xc4, 0x04}, 0xc3};

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;(the __attribute__ stuff is the gcc specific way of avoiding padding for
alignment)&lt;/p&gt;

&lt;p&gt;We change the rest like this:&lt;/p&gt;

&lt;pre class="c"&gt;&lt;code&gt;
    typedef void (* cfunc)();
    
    cfunc foo (int x) {
      struct env * env = (struct env *)malloc(sizeof(struct env));
      env-&gt;x = x;
    
      printf ("x is %d\n",env-&gt;x);
    
      struct thunk * thunk = (struct thunk *)mmap(0,sizeof(struct thunk), PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      *thunk = default_thunk;
      thunk-&gt;env_addr = env;
      thunk-&gt;call_offset = (void *)&amp;block - (void *)&amp;thunk-&gt;add_esp_ops[0]; // Pretty!
      mprotect(thunk,sizeof(struct thunk), PROT_EXEC);
      return (cfunc)thunk;
    }


&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;The typedef is a workaround for C's absolutely atrocious ptr-to-function
declarations...&lt;/p&gt;

&lt;p&gt;The interesting bit is at the end, where we allocate the thunk and fill in the addresses.&lt;/p&gt;

&lt;p&gt;Then we cast the thunk data structure to a function pointer. That ought to make
you feel dirty, and a bit queasy. It's ok, though.&lt;/p&gt;

&lt;p&gt;We use mmap to avoid problems on systems with executable heaps turned off
(execshield etc., which will cause a segmentation fault if you try to execute code
in malloc()'d memory or on the stack), and the mprotect() turns off write access
to the page after we're done. For a production approach you may want a dedicated
allocation function to properly manage this and perhaps avoid doing a separate
mmap for every thunk created.&lt;/p&gt;

&lt;p&gt;All of this could really be wrapped up into a nice generic function. Something
like this:&lt;/p&gt;

&lt;pre class="c"&gt;&lt;code&gt;
    struct thunk * make_thunk(struct env * env, void * code) {
      struct thunk * thunk = (struct thunk *)mmap(0,sizeof(struct thunk), PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      *thunk = default_thunk;
      thunk-&gt;env_addr = env;
      thunk-&gt;call_offset = code - (void *)&amp;thunk-&gt;add_esp_ops[0]; // Pretty!
      mprotect(thunk,sizeof(struct thunk), PROT_EXEC);
      return thunk;
    }

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Finally, this makes our main function look like this:&lt;/p&gt;

&lt;pre class="c"&gt;&lt;code&gt;
    int main() {
      cfunc c = foo(5);
      
      c();
      c();
    }


&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Now that's nicer...&lt;/p&gt;

&lt;p&gt;Since this is already gcc/Linux/x86-32 specific, it can be made even nicer. gcc
supports inner functions, and with a tiny bit of restructuring and a couple
of macros I did this, mostly for fun to see how close to a "natural" syntax for closures
I could get in C without changing the compiler:&lt;/p&gt;

&lt;pre class="c"&gt;&lt;code&gt;
    #define initenv(__vars__) struct env { __vars__ ; } * env = (struct env *)malloc(sizeof(struct env));
    #define new_closure(__block__) (closure)make_thunk(env,&amp;__block__)
    
    closure foo (int x)
    {
      initenv(int x)
      env-&gt;x = x;
        
      printf ("x is %d\n",env-&gt;x);
    
      void block (struct env * env) {  
         env-&gt;x += 1;
         printf ("block: x is %d\n", env-&gt;x);
      } 
    
      return new_closure(block);
    }


&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;I'll still stick to Ruby...&lt;/p&gt;

&lt;h2&gt;Some parting comments&lt;/h2&gt;

&lt;p&gt;I wrote this while exploring various approaches to add closure support to my
Ruby compiler. I haven't quite made up my mind yet. The object approach is
tempting because Ruby already has the Proc class, but I'll probably go for
an environment + fat pointer approach that will be converted into a Proc
object if assigned to anything (as opposed to just used for &lt;code&gt;yield&lt;/code&gt;).&lt;/p&gt;
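&lt;p&gt;As a rough model of what "environment + fat pointer" means, here is a sketch in plain
Ruby. The names &lt;code&gt;FatPointer&lt;/code&gt; and &lt;code&gt;call_fat&lt;/code&gt; are made up for illustration, and
the compiler would of course do this at a much lower level:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;# Hypothetical model: a "fat pointer" is a code reference plus the
# environment that code should run against.
FatPointer = Struct.new(:code, :env)

# The call site passes the environment along explicitly.
def call_fat(fp)
  fp.code.call(fp.env)
end

block = lambda do |env|
  env[:x] += 1
  puts "block: x is #{env[:x]}"
end

fp = FatPointer.new(block, { x: 5 })
call_fat(fp)
call_fat(fp)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The environment outlives each call, so the second call sees the value the first one left behind.&lt;/p&gt;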

&lt;p&gt;The thunk approach is somewhat appealing too, but if turned into a Proc object it may be as much or
&lt;em&gt;more&lt;/em&gt; overhead than the fat pointer approach, and there will only be one
call site: in &lt;code&gt;Proc#call&lt;/code&gt;. The approach I'm leaning towards would also mean the fat pointer
wouldn't be passed around much - typical usage would be passing a block to
a method and then &lt;code&gt;yield&lt;/code&gt;'ing to it, where the &lt;code&gt;yield&lt;/code&gt; would be the only
call site.&lt;/p&gt;&lt;div class="feedflare"&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/VidarHokstad/~4/44KBvsuafCU" height="1" width="1"/&gt;</description>
      <category>ruby</category>
      <category> c</category>
      <category> closure</category>
      <category> compiler</category>
      <category> writing a compiler in Ruby</category>
      <pubDate>Mon, 21 Dec 2009 11:45:15 -0500</pubDate>
      <dc:date>2009-12-21T11:45:15-05:00</dc:date>
    <feedburner:origLink>http://www.hokstad.com/how-to-implement-closures.html</feedburner:origLink></item>
    <item>
      <title>Writing a (Ruby) compiler in Ruby bottom up - step 23</title>
      <link>http://feedproxy.google.com/~r/VidarHokstad/~3/eSAW5UdXdhQ/writing-a-compiler-in-ruby-bottom-up-step-23.html</link>
      <description>&lt;span style="color: red; "&gt;This is &lt;a href="http://www.hokstad.com/compiler"&gt;part of a series&lt;/a&gt; I started in March 2008 - you may want to go back and look at older parts if you're new to this series.&lt;/span&gt;

&lt;h2&gt;Continuing down the rabbit hole: String&lt;/h2&gt;

&lt;p&gt;A couple of parts ago we established some of the problems with supporting even the
seemingly simple &lt;code&gt;attr_reader&lt;/code&gt;, &lt;code&gt;attr_writer&lt;/code&gt; and &lt;code&gt;attr_accessor&lt;/code&gt;. To reiterate,
a naive implementation looks like this:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;def attr_accessor sym
  attr_reader sym
  attr_writer sym
end

def attr_reader sym
  define_method sym do
     %s(ivar self sym)
  end
end

def attr_writer sym
  define_method "#{sym.to_s}=".to_sym do |val|
    %s(assign (ivar self sym) val)
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We resolved the issue of having a working &lt;code&gt;sym.to_s&lt;/code&gt;, but that still leaves a number
of things to address. For example, we need at least the start of a "proper" String
class that we can add &lt;code&gt;#to_sym&lt;/code&gt; to. That's what we'll look at this time.&lt;/p&gt;

&lt;p&gt;The first bit is very similar to what we did for &lt;code&gt;Symbol&lt;/code&gt;: We won't type tag, and we'll
create String objects for literal strings by calling a low level function.&lt;/p&gt;

&lt;h2&gt;Quick sidebar: Hiding the low level implementation&lt;/h2&gt;

&lt;p&gt;MRI "hides" the gory details of String and other basic classes by not making them real
Ruby objects per se. There's data there that is simply not accessible without dipping into
the C API. We may eventually want to take a similar tack - add a mechanism for allocating
space for, setting and retrieving low level variables via the s-expression syntax that
are completely invisible to the higher level API.&lt;/p&gt;

&lt;p&gt;Currently we don't. The "raw" string used for Symbol and for String objects is a normal
instance variable, so it's perfectly possible to expose it, and accessing it as if
it were a real object will very easily cause nasty crashes.&lt;/p&gt;

&lt;p&gt;In general, we would eventually benefit from creating a mechanism to "hide" the low
level plumbing from accidental access by normal Ruby. For now we'll retain the ability
to shoot ourselves in the foot easily.&lt;/p&gt;

&lt;h2&gt;The basics&lt;/h2&gt;

&lt;p&gt;The String class will for now contain just a pointer to a C-style string. It will be
immutable in its first incarnation. This gets us a decent step further very easily -
we can just keep the address of a C-style string constant.&lt;/p&gt;

&lt;p&gt;As it turns out, this will be an easy foundation to build on:&lt;/p&gt;

&lt;p&gt;MRI uses a number of flags for strings, for example to do "copy on write". Once we
need to make strings mutable, we'll do the same. Strings created from constants will
start out marked as "copy on write", and when doing an update, the string buffer will
be copied into freshly allocated space first. The hope is that many string constants
will never be updated.&lt;/p&gt;
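&lt;p&gt;A minimal sketch of the copy-on-write idea, in ordinary Ruby rather than in the compiler's
runtime. The class name &lt;code&gt;CowString&lt;/code&gt; and its methods are hypothetical and only
illustrate the flag-then-copy behaviour described above:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;# Hypothetical sketch; not the compiler's String class.
class CowString
  def initialize(raw)
    @buffer = raw  # shared with the constant it was created from
    @cow = true    # marked "copy on write"
  end

  def append(other)
    if @cow
      @buffer = @buffer.dup  # copy into freshly allocated space first
      @cow = false
    end
    @buffer.concat(other)
    self
  end

  def to_s
    @buffer
  end
end

CONSTANT = "abc".freeze
s = CowString.new(CONSTANT)
s.append("def")
puts s.to_s    # the private copy was updated
puts CONSTANT  # the constant is untouched
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A string that is never updated never pays for the copy, which is the whole point.&lt;/p&gt;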

&lt;p&gt;We also need to implement &lt;code&gt;String#to_sym&lt;/code&gt;. This is thankfully also reasonably easy:
We need to access the low level function we added that will retrieve a symbol based
on a "raw" C-style string.&lt;/p&gt;

&lt;h2&gt;The code&lt;/h2&gt;

&lt;p&gt;As the last couple of times, this was unfortunately committed a bit piecemeal, so
the commits aren't easy to follow. But let's go through this step by step.&lt;/p&gt;

&lt;p&gt;First of all, we're going to rewrite the tree, to alter a literal string into&lt;/p&gt;
&lt;pre class="ruby"&gt;&lt;code&gt;%s(call __get_string ConstantReferringToTheRawString)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;So, we alter &lt;code&gt;compile&lt;/code&gt; to add a call to &lt;code&gt;rewrite_strconst&lt;/code&gt;:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;#
# Re-write string constants outside %s() to
# %s(call __get_string [original string constant])
def rewrite_strconst(exp)
  exp.depth_first do |e|
    next :skip if e[0] == :sexp
    is_call = e[0] == :call
    e.each_with_index do |s,i|
      if s.is_a?(String)
        lab = @string_constants[s]
        if !lab
          lab = @e.get_local
          @string_constants[s] = lab
        end
        e[i] = [:sexp, [:call, :__get_string, lab.to_sym]]
        # FIXME: This is a horrible workaround to deal
        # with a parser inconsistency that leaves calls
        # with a single argument with the argument "bare"
        # if it's not an array, which breaks with this rewrite.
        e[i] = [e[i]] if is_call &amp;&amp; i &gt; 1 
      end
    end
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Effectively this visits all nodes that are not s-expressions,
and if it finds a string, it will add a string constant. It will
then use the label that is allocated to rewrite the expression like this:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;[:sexp, [:call, :__get_string, lab.to_sym]]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;As the FIXME comment in the code says, the final assignment is a nasty workaround, and it will be torn out as soon
as I get a parser, so just ignore that... pretend it's not even there ;)&lt;/p&gt;

&lt;p&gt;Now, the reason we wrap the call to &lt;code&gt;__get_string&lt;/code&gt; in &lt;code&gt;:sexp&lt;/code&gt; is that otherwise it will be
rewritten to a method call, because there really is no such thing as function calls
in Ruby, but for some of our low-level plumbing we need actual C-style function calls
(such as for calling out to C code).&lt;/p&gt;
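&lt;p&gt;To make the rewrite concrete, here is a simplified, stand-alone version that works on plain
nested arrays. The function name &lt;code&gt;rewrite_strings&lt;/code&gt; and the label format are made up;
the real code uses the compiler's own tree traversal and label allocation, and this sketch
leaves out the single-argument workaround:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;# Hypothetical, simplified stand-in for rewrite_strconst.
def rewrite_strings(exp, constants)
  return exp if exp[0] == :sexp  # leave s-expressions alone
  exp.each_with_index do |node, i|
    if node.is_a?(String)
      # Reuse the label if this string has been seen before.
      label = constants[node] ||= ".L#{constants.size}"
      exp[i] = [:sexp, [:call, :__get_string, label.to_sym]]
    elsif node.is_a?(Array)
      rewrite_strings(node, constants)
    end
  end
  exp
end

constants = {}
tree = [:call, :puts, ["hello"]]
p rewrite_strings(tree, constants)
p constants
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The string literal is replaced by a wrapped call to &lt;code&gt;__get_string&lt;/code&gt;, and the
constants table remembers which label belongs to which string.&lt;/p&gt;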

&lt;p&gt;We're also changing &lt;code&gt;strconst&lt;/code&gt;, the method used to get the "value" of a string constant
as follows:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;-  def strconst str
-    lab = @string_constants[str]
-    return lab if lab
-    lab = @e.get_local
-    @string_constants[str] = lab
-    return lab
+  def strconst(a)
+    lab = @string_constants[a]
+    if !lab # For any constants in s-expressions
+      lab = @e.get_local
+      @string_constants[a] = lab
+    end
+    return [:addr,lab]
   end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;What's left? Not much, really. You can &lt;a href="http://github.com/vidarh/writing-a-compiler-in-ruby/blob/master/lib/core/string.rb"&gt;see the basic String class&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Here's the important bit:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;class String
   def __set_raw(str)
     @buffer = str
   end
end

def __get_string(str)
   s = String.new
   %s(callm s __set_raw (str))
   s
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It's worthwhile reading the comments at the link above if you want to see more about
where this is heading.&lt;/p&gt;

&lt;p&gt;We'll flesh out the String class in coming parts, but this groundwork now means we're
working on a "real" object instead of on a raw pointer to a C-style string, which means
it's mostly a runtime library issue instead of requiring much in terms of compiler changes.&lt;/p&gt;

&lt;p&gt;One of the things worth mentioning that is covered in one of the comments in lib/core/string.rb
is that the whole &lt;code&gt;__get_string&lt;/code&gt; and &lt;code&gt;__get_symbol&lt;/code&gt; thing is a bit ugly. As I mentioned
earlier in this part, hiding the low level implementation would be nice, and that includes
preventing direct calls to &lt;code&gt;__get_string&lt;/code&gt; and &lt;code&gt;__get_symbol&lt;/code&gt;. That can be done trivially
by simply ensuring the Ruby code can't directly do function calls (as opposed to method
calls). But that doesn't solve &lt;code&gt;String#__set_raw&lt;/code&gt;. To fix that, I'll need to either remove
the need for the method, or make it possible to "hide" certain methods entirely from
the Ruby code...&lt;/p&gt;

&lt;p&gt;It's not a priority now, though - first we get things working, then we make
it pretty.&lt;/p&gt;

&lt;p&gt;What about &lt;code&gt;String#to_sym&lt;/code&gt;? Well, this ought to do the trick:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;class String
  def to_sym
    __get_symbol(@buffer)
  end
end
&lt;/code&gt;&lt;/pre&gt;
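&lt;p&gt;For intuition, here is how this could look in ordinary Ruby, assuming that
&lt;code&gt;__get_symbol&lt;/code&gt; hands back the same symbol object every time it sees the same raw string.
That is an assumption on my part for this sketch, and &lt;code&gt;MySymbol&lt;/code&gt;, &lt;code&gt;MyString&lt;/code&gt;
and &lt;code&gt;get_symbol&lt;/code&gt; are made-up stand-ins rather than the runtime's real names:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;# Hypothetical stand-ins for the low level pieces.
class MySymbol
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

SYMBOLS = {}

# Look the raw string up in a table; create the symbol on first use.
def get_symbol(raw)
  SYMBOLS[raw] ||= MySymbol.new(raw)
end

class MyString
  def initialize(raw)
    @buffer = raw
  end

  def to_sym
    get_symbol(@buffer)
  end
end

a = MyString.new("foo").to_sym
b = MyString.new("foo").to_sym
puts a.equal?(b)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Two separate strings with the same contents end up as the very same symbol object.&lt;/p&gt;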

&lt;p&gt;So, that leaves us needing a working &lt;code&gt;define_method&lt;/code&gt; and %s(lambda) with support for variables
to get &lt;code&gt;attr_reader&lt;/code&gt;, &lt;code&gt;attr_writer&lt;/code&gt; and &lt;code&gt;attr_accessor&lt;/code&gt; working... We're heading down that
road soon.&lt;/p&gt;&lt;div class="feedflare"&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/VidarHokstad/~4/eSAW5UdXdhQ" height="1" width="1"/&gt;</description>
      <category>compiler in Ruby bottom up</category>
      <category> compiler</category>
      <category>ruby</category>
      <category> String</category>
      <pubDate>Wed, 16 Dec 2009 13:26:46 -0500</pubDate>
      <dc:date>2009-12-16T13:26:46-05:00</dc:date>
    <feedburner:origLink>http://www.hokstad.com/writing-a-compiler-in-ruby-bottom-up-step-23.html</feedburner:origLink></item>
    <item>
      <title>Ruby gets a spec</title>
      <link>http://feedproxy.google.com/~r/VidarHokstad/~3/E-ImaeFS7bA/ruby-gets-a-spec.html</link>
      <description>&lt;p&gt;A week or so ago, the &lt;a href="http://ruby-std.netlab.jp/"&gt;Ruby Draft specification made the rounds&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Yes... Ruby is finally getting a standard. While &lt;a href="http://rubyspec.org/"&gt;RubySpec&lt;/a&gt;
has been around for a while, and is great, it is an executable specification
that tells you &lt;em&gt;what&lt;/em&gt;, not &lt;em&gt;why&lt;/em&gt;, and it aims to be "complete" while the
new project aims to define a shared core suitable for implementation by
all the different Ruby implementations that are springing up.&lt;/p&gt;

&lt;p&gt;I think they're complementary enough that the Ruby Draft Specification
could be very beneficial.&lt;/p&gt;

&lt;p&gt;For &lt;a href="http://www.hokstad.com/compiler"&gt;my compiler project&lt;/a&gt; this means
there's now a compelling reason to revisit some things (such as the
parser) to look at conformance, and a clear roadmap for some others
(classes) to ensure it at least meets the spec.&lt;/p&gt;

&lt;p&gt;Of course, in many ways MRI will still be the benchmark, but the standard
will provide a minimal level of conformance that will likely be easier
to achieve.&lt;/p&gt;

&lt;p&gt;In particular, the spec includes a formal grammar for Ruby that doesn't
involve walking through thousands of lines of code written for another
implementation to verify that you're doing things right. I'm particularly
looking forward to going through my parser and aligning it more closely
with the draft spec... (I'm sure I'll be swearing a lot while doing it,
though)&lt;/p&gt;

&lt;p&gt;It also includes descriptions of expected semantics for things like the
&lt;a href="http://www.hokstad.com/ruby-object-model.html"&gt;object model&lt;/a&gt; that have
previously been something a lot of people (including me) have spent
time reverse engineering to figure out.&lt;/p&gt;

&lt;p&gt;Combined with all the Ruby implementations in progress, this is a clear
indication that Ruby is growing up and becoming a contender in areas
where it would previously not be acceptable.&lt;/p&gt;

&lt;p&gt;Keep an eye on the Ruby standard project...&lt;/p&gt;&lt;div class="feedflare"&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/VidarHokstad/~4/E-ImaeFS7bA" height="1" width="1"/&gt;</description>
      <category>ruby</category>
      <category> spec</category>
      <category> grammar</category>
      <category> compiler</category>
      <category> standard</category>
      <pubDate>Thu, 10 Dec 2009 13:53:39 -0500</pubDate>
      <dc:date>2009-12-10T13:53:39-05:00</dc:date>
    <feedburner:origLink>http://www.hokstad.com/ruby-gets-a-spec.html</feedburner:origLink></item>
    <item>
      <title>Virgin Media, or how to make your customers hate you</title>
      <link>http://feedproxy.google.com/~r/VidarHokstad/~3/OmwJ6vFF5cg/virgin-media-how-to-make-your-customers-hate-you.html</link>
      <description>&lt;strong style="color:red"&gt;
UPDATE: I e-mailed most of the below to Neil Berkett, CEO at Virgin Media, earlier today. Neil had a guy called Peter call me to apologise and to confirm that they will arrange to have my account cancelled right away, as I had requested. Thanks to both of you - it goes some way to improving my impression and shows that at least some  of you do care about customer impression. Just get the message through to your front line staff as well.
&lt;/strong&gt;

I've posted this elsewhere, but it really belongs on my own blog too.

If you follow me on Twitter you've probably seen me complain loudly about Virgin, and briefly about BT too. The difference being that BT listened (following me on Twitter, getting back to me and fixing my problem within hours of becoming aware of it) while Virgin's behaviour so far has just gotten worse and worse.

Here's the background:

A couple of weeks ago my internet, TV and phone (though we don't use the Virgin phone service, so the latter didn't matter) went down. Called Virgin, and after lots of waiting and extremely unhelpful people we were given &lt;strong&gt;two&lt;/strong&gt; different times for two different people to come look at our broadband and TV, even though both failing at the same time pretty much guaranteed it was the cable. Whatever, I thought, I can understand they don't want frontline staff making judgements like that.

Sure enough, the two people came at different times, both said it was the cable. They promised it'd be fixed by 5pm the same day. Nobody came or called or told us anything. So around 7pm we called. Was put through the same round of questions before anyone would even listen to the complaint about lack of follow up.

Was then finally told that someone would fix it by 6pm the following night. 6pm next evening came and went. No word from Virgin.

Called them again, and was told "Oh? Nobody told you? They need to schedule repair for December 8th." Two weeks away. After I'd already waited several days. At which point I lost it. It was bad enough that their estimate was so long, but I &lt;em&gt;might&lt;/em&gt; have accepted that, had I not by then had to endure hours of waiting in call queues, and repeated broken promises about when the problem was supposed to be fixed or when I would be called back (for that matter this is not the first time I've had to deal with Virgin's poor excuse for customer service).

So I told them they had until the following morning to get someone to call me and try to find another slot.

Needless to say nobody called.

So I looked around, and Sky turned out to be &lt;strong&gt;half the price&lt;/strong&gt; of what I was paying for my broadband and TV, with more channels. Downside (or so I thought)? Having to deal with BT over ADSL.

Install date? December 3rd and 4th. Not great, but contrary to Virgin they had the excuse that they were signing up a new customer, not fixing a problem for someone who had already been a loyal paying customer (paying through the nose for their top of the line packages) for years.

So a few days later I wait for 40 minutes on a call centre line to cancel Virgin. Get through to some very unhelpful woman that wants me to call back to the &lt;strong&gt;same number&lt;/strong&gt; I've just waited 40 minutes to get through on, because she can't put me through to the same people. Despite the fact their menu system doesn't work.

I say no. Not my problem. I've had enough. I'm notifying them that I am cancelling the service. It is not my responsibility to work around their broken calling centre system when they could take a message and pass it on later.

(Newsflash, Virgin: Companies with decent customer service &lt;em&gt;do this&lt;/em&gt;. When I deal with my bank, or my other utilities, if something takes too long, I ask to get called back or for someone to pass on a message, and surprise, surprise, they actually do.)

After ten minutes of arguing, she finally just puts me through to a supervisor without asking me first. I explain to the supervisor, argue with him, point out I'll just cancel my direct debit and that since I've already notified them that I'm cancelling I'll sue if they try to keep charging me (threatening to sue has an amazing effect in most cases). I'm finally promised that someone will call the next morning to confirm my cancellation.

Nobody calls.

Yeah, I'm not surprised, because if they'd called back they'd have broken a pattern of excessively bad service.

So I fill out a complaint form on their website.

&lt;strong&gt;Later that same day, on the 3rd, a Virgin repair guy shows up with no warning to dig up our front garden and carry out the repair that was impossible to do before December 8th.&lt;/strong&gt; Eh. If they'd treated me properly first time around instead of jerking me around for 3 days, and then had told me they could repair things by the 3rd, and called me back when they promised to, I'd probably have still been a customer (and paid over the odds for the privilege but not particularly cared). So of course we told him no, we're no longer Virgin customers.

Same day BT enabled our Sky ADSL. Or so we thought. Line is broken (as in, our landline stops working). I file a fault, get told December 7th, get pissed off and figure this is the BT I've learned to loathe.

Complain on Twitter the following morning. @Btcare on Twitter gets involved. 4 hours later an engineer is at our house, and the line is fixed shortly thereafter.

At the same time Sky installs our new TV service, and the installation guys laugh about the familiar story when my wife recounts our woes to them.

Now we're up to this morning. I receive two phone calls. First one:

&lt;strong&gt;Virgin Media complaints department:&lt;/strong&gt; A rude woman starts arguing with me over whether or not I've cancelled and proceeds to talk over me and then turn around to repeatedly interrupt me to complain about me interrupting her, forcing me to gradually raise my voice to even be able to explain the situation without having her constantly talk me down.

After minutes of that I'm finally given a chance to speak long enough to point out that after dealing with their dreadful call centre (and her!) there's just no way I'm dealing with another one after I've notified them both by phone and by e-mail, and had her confirm to me on the phone, that they've received my cancellation. It's not as if they can claim in any shape or form not to have received it. Argument goes on and on, until she just hangs up on me.

&lt;strong&gt;Way to deal with complaints, the Virgin Media way:&lt;/strong&gt; Have your biggest asshole call up the customer and talk over them and repeatedly tell them they're wrong and practically taunt them into hating you and your company.

&lt;strong&gt;Then BT calls:&lt;/strong&gt; (pay attention to this, Virgin) They call to apologise profusely over not dealing with our problem sooner (time from initial fault report to problem was fixed: 24 hours; time from complaint over handling to problem fixed: 6 hours; time from complaint over handling to engineer was at our door: 4 hours); they want to make sure the engineer fixed our line properly, and to find out what might have gone wrong in the first place to make us upset.

Truly the only problem was that I got a long-ish repair estimate (four days; still minuscule compared to Virgin), which was annoying but to be honest I probably wouldn't even have bothered to complain about it if the treatment Virgin have given me over the last few weeks hadn't made my fuse incredibly short to begin with.

Overall, in BT's case their fault reporting on their website needs to improve a bit (indicate that the problem will be fixed "no later than"; inform of how to contact Btcare; make it clearer that you'll only get regular updates if you sign up for SMS, not e-mail too), but once someone got wind of me being dissatisfied the problem was fixed &lt;strong&gt;quickly&lt;/strong&gt;.

With Virgin, on the other hand? Insultingly rude and arrogant calls, and apparently they still think I'm their customer.

I never thought I'd see the day when I did this:

BT actually did &lt;strong&gt;well&lt;/strong&gt;. Problems happen, I accept that - I work with technology all day. What matters to me is how a company responds when they do. BT responded properly, and while I was a bit angry initially I'm &lt;em&gt;very happy with how it turned out&lt;/em&gt;. &lt;strong&gt;The call this morning was a very nice touch.&lt;/strong&gt;

&lt;strong&gt;Virgin on the other hand managed to turn me from a devoted customer who wouldn't even consider looking around (I honestly had no idea how much over the odds we were paying for Virgin, because price didn't really matter to me as long as things worked), to detesting them beyond what words can convey in two short weeks.&lt;/strong&gt;&lt;div class="feedflare"&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/VidarHokstad/~4/OmwJ6vFF5cg" height="1" width="1"/&gt;</description>
      <category>virgin</category>
      <category>virgin media</category>
      <category> internet</category>
      <category> rants</category>
      <category>complaints</category>
      <category> customer service </category>
      <pubDate>Sat, 05 Dec 2009 11:45:38 -0500</pubDate>
      <dc:date>2009-12-05T11:45:38-05:00</dc:date>
    <feedburner:origLink>http://www.hokstad.com/virgin-media-how-to-make-your-customers-hate-you.html</feedburner:origLink></item>
    <item>
      <title>Writing a (Ruby) compiler in Ruby bottom up - step 22</title>
      <link>http://feedproxy.google.com/~r/VidarHokstad/~3/sISf4w-mWug/writing-a-compiler-in-ruby-bottom-up-step-22.html</link>
      <description>&lt;span style="color: red; "&gt;This is &lt;a href="http://www.hokstad.com/compiler"&gt;part of a series&lt;/a&gt; I started in March 2008 - you may want to go back and look at older parts if you're new to this series.&lt;/span&gt;

&lt;h2&gt;A diversion into Method missing&lt;/h2&gt;

&lt;p&gt;So far the method_missing implementation has just printed a notice and quit.&lt;/p&gt;

&lt;p&gt;During the &lt;a href="http://www.hokstad.com/writing-a-compiler-in-ruby-bottom-up-step-21.html"&gt;trip down the rabbit hole&lt;/a&gt;
that is &lt;code&gt;attr_accessor&lt;/code&gt; and friends, that became a major annoyance.&lt;/p&gt;

&lt;p&gt;The problem is that this notice has not included a stack backtrace or any way
to figure out &lt;em&gt;where&lt;/em&gt; it occurred. It's also been impossible to override it
and actually figure out &lt;em&gt;what&lt;/em&gt; method was being called because we currently
get to &lt;code&gt;method_missing&lt;/code&gt; by jumping straight into the vtable.&lt;/p&gt;

&lt;p&gt;We get &lt;code&gt;method_missing&lt;/code&gt; when we hit the pointer that's installed there as
default when nothing has overridden it.&lt;/p&gt;

&lt;p&gt;So how do we get better debug output? And how do we support users overriding
&lt;code&gt;method_missing&lt;/code&gt; and actually getting a symbol to use?&lt;/p&gt;

&lt;h2&gt;Thunks&lt;/h2&gt;

&lt;p&gt;A &lt;a href="http://en.wikipedia.org/wiki/Thunk#Thunks_in_object-oriented_programming"&gt;"thunk"&lt;/a&gt; in
terms of object-oriented languages is generally a small piece of compiler-generated
code that gets inserted to "adjust" a function or method call.&lt;/p&gt;

&lt;p&gt;In this specific case, we will generate a separate thunk for each vtable entry.&lt;/p&gt;

&lt;p&gt;Instead of inserting a pointer to &lt;code&gt;method_missing&lt;/code&gt; directly, we will insert the
address of a small thunk. The thunk will not even create a full stack frame,
but simply add the address of the &lt;code&gt;Symbol&lt;/code&gt; corresponding to the vtable slot
as the first argument on the stack, and then jump straight into &lt;code&gt;method_missing&lt;/code&gt;,
thereby simulating a "direct" call to method_missing with the symbol as the
first argument.&lt;/p&gt;

&lt;p&gt;It's actually very simple - we just need to pop the real return address off
the stack, push the symbol onto the stack, and then push the real return address
back on.&lt;/p&gt;
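
&lt;p&gt;To make the stack juggling concrete, here is a toy model in plain Ruby. It is only a sketch: the array stands in for the machine stack (the last element is the top), and &lt;code&gt;thunk_adjust&lt;/code&gt; and the placeholder values are made-up names for illustration, not part of the compiler.&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    # Toy model of the thunk. The last array element is the top of the stack.
    # thunk_adjust is a hypothetical helper, for illustration only.
    def thunk_adjust(stack, symbol)
      return_address = stack.pop   # popl %edx: take the real return address off
      stack.push(symbol)           # pushl %eax: the symbol becomes the first argument
      stack.push(return_address)   # pushl %edx: put the return address back on top
      stack
    end

    # The caller pushed its arguments and then the return address.
    stack = [:arg2, :arg1, :return_address]
    p thunk_adjust(stack, :to_yaml)
    # prints [:arg2, :arg1, :to_yaml, :return_address]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Seen from &lt;code&gt;method_missing&lt;/code&gt;, the stack now looks as if the caller had passed the symbol as an extra first argument.&lt;/p&gt;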

&lt;p&gt;Ok, so we still cheat a bit. Eventually we need to make our &lt;code&gt;method_missing&lt;/code&gt;
into a real method, but for now it's a function. Here's the code I've added
to create a "base" vtable that is used to initialize the vtable slots of &lt;code&gt;Object&lt;/code&gt;:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;output_vtable_thunks&lt;/span&gt;
      &lt;span class="attribute"&gt;@vtableoffsets&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;vtable&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;each&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;name&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt;&lt;span class="ident"&gt;_&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
        &lt;span class="attribute"&gt;@e&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;label&lt;/span&gt;&lt;span class="punct"&gt;("&lt;/span&gt;&lt;span class="string"&gt;__vtable_missing_thunk_&lt;span class="expr"&gt;#{clean_method_name(name)}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;")&lt;/span&gt;
        &lt;span class="comment"&gt;# FIXME: Call get_symbol for these during initialization &lt;/span&gt;
        &lt;span class="comment"&gt;# and then load them from a table instead.  &lt;/span&gt;
        &lt;span class="ident"&gt;compile_eval_arg&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="constant"&gt;GlobalScope&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;"&lt;/span&gt;&lt;span class="string"&gt;:&lt;span class="expr"&gt;#{name.to_s}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;".&lt;/span&gt;&lt;span class="ident"&gt;to_sym&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
        &lt;span class="attribute"&gt;@e&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;popl&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="symbol"&gt;:edx&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="comment"&gt;# The return address &lt;/span&gt;
        &lt;span class="attribute"&gt;@e&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;pushl&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="symbol"&gt;:eax&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
        &lt;span class="attribute"&gt;@e&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;pushl&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="symbol"&gt;:edx&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
        &lt;span class="attribute"&gt;@e&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;jmp&lt;/span&gt;&lt;span class="punct"&gt;("&lt;/span&gt;&lt;span class="string"&gt;__method_missing&lt;/span&gt;&lt;span class="punct"&gt;")&lt;/span&gt;
      &lt;span class="keyword"&gt;end&lt;/span&gt;
      &lt;span class="attribute"&gt;@e&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;label&lt;/span&gt;&lt;span class="punct"&gt;("&lt;/span&gt;&lt;span class="string"&gt;__base_vtable&lt;/span&gt;&lt;span class="punct"&gt;")&lt;/span&gt;
      &lt;span class="comment"&gt;# For ease of implementation of __new_class_object we&lt;/span&gt;
      &lt;span class="comment"&gt;# pad this with the number of class ivar slots so that the&lt;/span&gt;
      &lt;span class="comment"&gt;# vtable layout is identical as for a normal class &lt;/span&gt;
      &lt;span class="constant"&gt;ClassScope&lt;/span&gt;&lt;span class="punct"&gt;::&lt;/span&gt;&lt;span class="constant"&gt;CLASS_IVAR_NUM&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;times&lt;/span&gt; &lt;span class="punct"&gt;{&lt;/span&gt; &lt;span class="attribute"&gt;@e&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;long&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="punct"&gt;}&lt;/span&gt;
      &lt;span class="attribute"&gt;@vtableoffsets&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;vtable&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;to_a&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;sort_by&lt;/span&gt; &lt;span class="punct"&gt;{|&lt;/span&gt;&lt;span class="ident"&gt;e&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt; &lt;span class="ident"&gt;e&lt;/span&gt;&lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;]&lt;/span&gt; &lt;span class="punct"&gt;}.&lt;/span&gt;&lt;span class="ident"&gt;each&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;e&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
        &lt;span class="attribute"&gt;@e&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;long&lt;/span&gt;&lt;span class="punct"&gt;("&lt;/span&gt;&lt;span class="string"&gt;__vtable_missing_thunk_&lt;span class="expr"&gt;#{clean_method_name(e[0])}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;")&lt;/span&gt;
      &lt;span class="keyword"&gt;end&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
  
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;I hope it's reasonably easy to follow. First it generates a number of
functions that will look like this:&lt;/p&gt;

&lt;pre class="asm"&gt;&lt;code&gt;
    __vtable_missing_thunk_to_yaml:
        subl    $4, %esp
        movl    $1, %ebx
        movl    $.L110, (%esp)
        movl    $__get_symbol, %eax
        call    *%eax
        addl    $4, %esp
        popl    %edx
        pushl   %eax
        pushl   %edx
        jmp __method_missing
        
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;This can be optimized a lot, but if you've followed this series, you know
getting something working is higher priority. In this case we generate a
call to &lt;code&gt;__get_symbol&lt;/code&gt;, which was introduced in the last part, and we pass
the string corresponding to the name:&lt;/p&gt;

&lt;pre class="asm"&gt;&lt;code&gt;
    .L110:
        .string "to_yaml"

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Then we adjust the stack as mentioned above.&lt;/p&gt;

&lt;p&gt;The next step is to create the &lt;code&gt;__base_vtable&lt;/code&gt;. Here's an excerpt:&lt;/p&gt;

&lt;pre class="asm"&gt;&lt;code&gt;
    __base_vtable:
        .long 0
        .long 0
        .long __vtable_missing_thunk_new
        .long __vtable_missing_thunk___send__
        .long __vtable_missing_thunk___get_symbol
        .long __vtable_missing_thunk___method_missing
        .long __vtable_missing_thunk_array
        .long __vtable_missing_thunk___new_class_object
        .long __vtable_missing_thunk_define_method
        ...

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Then we need to modify &lt;code&gt;__new_class_object&lt;/code&gt; to assign entries from
&lt;code&gt;__base_vtable&lt;/code&gt; instead of just blindly assigning a pointer to &lt;code&gt;__method_missing&lt;/code&gt;:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="comment"&gt;# ssize &lt;= size *always* or something is severely wrong.&lt;/span&gt;
    &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;__new_class_object&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;size&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt;&lt;span class="ident"&gt;superclass&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt;&lt;span class="ident"&gt;ssize&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
      &lt;span class="ident"&gt;ob&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="number"&gt;0&lt;/span&gt;
      &lt;span class="punct"&gt;%s(&lt;/span&gt;&lt;span class="symbol"&gt;assign ob (malloc (mul size 4&lt;/span&gt;&lt;span class="punct"&gt;)))&lt;/span&gt; &lt;span class="comment"&gt;# Assumes 32 bit                                                                                                                  &lt;/span&gt;
      &lt;span class="ident"&gt;i&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="number"&gt;1&lt;/span&gt;
      &lt;span class="punct"&gt;%s(&lt;/span&gt;&lt;span class="symbol"&gt;while (lt i ssize&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="keyword"&gt;do&lt;/span&gt;
          &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;assign&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;index&lt;/span&gt; &lt;span class="ident"&gt;ob&lt;/span&gt; &lt;span class="ident"&gt;i&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;index&lt;/span&gt; &lt;span class="ident"&gt;superclass&lt;/span&gt; &lt;span class="ident"&gt;i&lt;/span&gt;&lt;span class="punct"&gt;))&lt;/span&gt;
          &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;assign&lt;/span&gt; &lt;span class="ident"&gt;i&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;add&lt;/span&gt; &lt;span class="ident"&gt;i&lt;/span&gt; &lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;))&lt;/span&gt;
      &lt;span class="punct"&gt;))&lt;/span&gt;
      &lt;span class="punct"&gt;%s(&lt;/span&gt;&lt;span class="symbol"&gt;while (lt i size&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="keyword"&gt;do&lt;/span&gt;
           &lt;span class="comment"&gt;# Installing a pointer to a thunk to method_missing                                                                                                              &lt;/span&gt;
           &lt;span class="comment"&gt;# that adds a symbol matching the vtable entry as the                                                                                                            &lt;/span&gt;
           &lt;span class="comment"&gt;# first argument and then jumps straight into __method_missing                                                                                                   &lt;/span&gt;
           &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;assign&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;index&lt;/span&gt; &lt;span class="ident"&gt;ob&lt;/span&gt; &lt;span class="ident"&gt;i&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;index&lt;/span&gt; &lt;span class="ident"&gt;__base_vtable&lt;/span&gt; &lt;span class="ident"&gt;i&lt;/span&gt;&lt;span class="punct"&gt;))&lt;/span&gt;
           &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;assign&lt;/span&gt; &lt;span class="ident"&gt;i&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;add&lt;/span&gt; &lt;span class="ident"&gt;i&lt;/span&gt; &lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;))&lt;/span&gt;
      &lt;span class="punct"&gt;))&lt;/span&gt;
      &lt;span class="punct"&gt;%s(&lt;/span&gt;&lt;span class="symbol"&gt;assign (index ob 0&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="constant"&gt;Class&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
      &lt;span class="ident"&gt;ob&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
    
&lt;/code&gt;&lt;/pre&gt;
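
&lt;p&gt;If the s-expressions are hard to read, the same slot-filling logic can be modelled in plain Ruby. This is a sketch under assumptions: arrays stand in for the raw memory, and &lt;code&gt;model_new_class_object&lt;/code&gt; and the placeholder slot values are hypothetical names that do not exist in the compiler.&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    # Plain-Ruby model of __new_class_object. Arrays stand in for raw memory;
    # every name and value below is a hypothetical placeholder.
    def model_new_class_object(size, superclass, ssize, base_vtable)
      ob = Array.new(size, 0)
      # Slots the superclass already has are inherited as they are.
      (1...ssize).each { |i| ob[i] = superclass[i] }
      # Any remaining slots get the matching thunk from the base vtable.
      first_new = [ssize, 1].max
      (first_new...size).each { |i| ob[i] = base_vtable[i] }
      ob[0] = :Class
      ob
    end

    base_vtable = [0, 0, :thunk_new, :thunk_to_s, :thunk_bar]
    superclass  = [:Class, 0, :Object_new, :Object_to_s]
    p model_new_class_object(5, superclass, 4, base_vtable)
    # prints [:Class, 0, :Object_new, :Object_to_s, :thunk_bar]
&lt;/code&gt;&lt;/pre&gt;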


&lt;p&gt;Finally we make &lt;code&gt;__method_missing&lt;/code&gt; output the symbol, instead of just
spitting out "Method missing":&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;__method_missing&lt;/span&gt; &lt;span class="ident"&gt;sym&lt;/span&gt;
      &lt;span class="punct"&gt;%s(&lt;/span&gt;&lt;span class="symbol"&gt;printf "Method missing: %s\n" (callm sym to_s&lt;/span&gt;&lt;span class="punct"&gt;))&lt;/span&gt;
      &lt;span class="punct"&gt;%s(&lt;/span&gt;&lt;span class="symbol"&gt;exit 1&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
      &lt;span class="number"&gt;0&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;</description>
      <category>compiler in Ruby bottom up</category>
      <category> compiler</category>
      <category>ruby</category>
      <category> method_missing</category>
      <category> thunk</category>
      <category> vtable</category>
      <pubDate>Tue, 10 Nov 2009 09:20:49 -0500</pubDate>
      <dc:date>2009-11-10T09:20:49-05:00</dc:date>
    <feedburner:origLink>http://www.hokstad.com/writing-a-compiler-in-ruby-bottom-up-step-22.html</feedburner:origLink></item>
    <item>
      <title>A pitfall of the Ruby Range class</title>
      <link>http://feedproxy.google.com/~r/VidarHokstad/~3/b2UfbTBFYkg/a-pitfall-of-the-ruby-range-class.html</link>
      <description>&lt;p&gt;I tweeted about this, but figured it deserved a more lasting treatment.&lt;/p&gt;

&lt;p&gt;If you've ever used &lt;code&gt;Range#min&lt;/code&gt; or &lt;code&gt;Range#max&lt;/code&gt; you may have inadvertently slowed
your code significantly.&lt;/p&gt;

&lt;p&gt;Both of those ought to be O(1) - constant time. After all, a Range in Ruby
consists of two values, and though you can't be sure which of the first and
last values is the smaller one, the obvious implementation
is this:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;min&lt;/span&gt;
      &lt;span class="ident"&gt;first&lt;/span&gt; &lt;span class="punct"&gt;&lt;=&lt;/span&gt; &lt;span class="ident"&gt;last&lt;/span&gt; &lt;span class="punct"&gt;?&lt;/span&gt; &lt;span class="ident"&gt;first&lt;/span&gt; &lt;span class="punct"&gt;:&lt;/span&gt; &lt;span class="ident"&gt;last&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;.. and the equivalent one for &lt;code&gt;Range#max&lt;/code&gt;. Except that's not what happens,
as you can easily convince yourself by doing:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="global"&gt;$ &lt;/span&gt;&lt;span class="ident"&gt;irb&lt;/span&gt;
    &lt;span class="ident"&gt;irb&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;main&lt;/span&gt;&lt;span class="punct"&gt;):&lt;/span&gt;&lt;span class="number"&gt;001&lt;/span&gt;&lt;span class="punct"&gt;:&lt;/span&gt;&lt;span class="number"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;&gt;&lt;/span&gt; &lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="number"&gt;1&lt;/span&gt;&lt;span class="punct"&gt;..&lt;/span&gt;&lt;span class="number"&gt;10000000&lt;/span&gt;&lt;span class="punct"&gt;).&lt;/span&gt;&lt;span class="ident"&gt;max&lt;/span&gt;
    &lt;span class="punct"&gt;=&gt;&lt;/span&gt; &lt;span class="number"&gt;10000000&lt;/span&gt;
    &lt;span class="ident"&gt;irb&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;main&lt;/span&gt;&lt;span class="punct"&gt;):&lt;/span&gt;&lt;span class="number"&gt;002&lt;/span&gt;&lt;span class="punct"&gt;:&lt;/span&gt;&lt;span class="number"&gt;0&lt;/span&gt;&lt;span class="punct"&gt;&gt;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;... and see how slow it is. Eww. The explanation is that min/max are only provided in generic versions
that iterate over the full Range (so that the same implementation also works
on other collections).&lt;/p&gt;
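
&lt;p&gt;A small way to see the generic behaviour in action is to wrap a range in a class that counts how many values it hands out. This is a sketch: &lt;code&gt;CountingRange&lt;/code&gt; is a made-up helper, and it simply relies on &lt;code&gt;Enumerable#max&lt;/code&gt; being built on top of &lt;code&gt;each&lt;/code&gt;.&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    # Hypothetical helper: counts how many values the generic max looks at.
    class CountingRange
      include Enumerable
      attr_reader :yielded

      def initialize(range)
        @range = range
        @yielded = 0
      end

      def each
        @range.each do |value|
          @yielded += 1
          yield value
        end
      end
    end

    counter = CountingRange.new(1..1000)
    p counter.max       # prints 1000
    p counter.yielded   # prints 1000: every single value was visited
&lt;/code&gt;&lt;/pre&gt;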

&lt;p&gt;If your app, like mine, frequently needs the smallest or greatest value in
a Range, it may be time to monkey patch:&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    &lt;span class="keyword"&gt;class &lt;/span&gt;&lt;span class="class"&gt;Range&lt;/span&gt;
      &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;min&lt;/span&gt;
        &lt;span class="ident"&gt;first&lt;/span&gt; &lt;span class="punct"&gt;&lt;=&lt;/span&gt; &lt;span class="ident"&gt;last&lt;/span&gt; &lt;span class="punct"&gt;?&lt;/span&gt; &lt;span class="ident"&gt;first&lt;/span&gt; &lt;span class="punct"&gt;:&lt;/span&gt; &lt;span class="ident"&gt;last&lt;/span&gt;
      &lt;span class="keyword"&gt;end&lt;/span&gt;
          
      &lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;max&lt;/span&gt;
        &lt;span class="ident"&gt;first&lt;/span&gt; &lt;span class="punct"&gt;&gt;=&lt;/span&gt; &lt;span class="ident"&gt;last&lt;/span&gt; &lt;span class="punct"&gt;?&lt;/span&gt; &lt;span class="ident"&gt;first&lt;/span&gt; &lt;span class="punct"&gt;:&lt;/span&gt; &lt;span class="ident"&gt;last&lt;/span&gt;
      &lt;span class="keyword"&gt;end&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
    
&lt;/code&gt;&lt;/pre&gt;
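
&lt;p&gt;One caveat worth checking before using the patch as it stands: it does not give the same answers as the generic versions for every range. For a descending range such as &lt;code&gt;(10..1)&lt;/code&gt; the generic &lt;code&gt;min&lt;/code&gt; and &lt;code&gt;max&lt;/code&gt; return &lt;code&gt;nil&lt;/code&gt;, because nothing is ever yielded, and for an exclusive range such as &lt;code&gt;(1...10)&lt;/code&gt; the generic &lt;code&gt;max&lt;/code&gt; is 9, not 10. Below is a sketch of a variant that keeps those answers. It assumes Integer endpoints, and the names &lt;code&gt;fast_min&lt;/code&gt; and &lt;code&gt;fast_max&lt;/code&gt; are hypothetical, chosen so that nothing built in gets overridden.&lt;/p&gt;

&lt;pre class="ruby"&gt;&lt;code&gt;
    # Sketch only: assumes Integer endpoints. fast_min and fast_max are
    # hypothetical names, used here to avoid replacing the built-in methods.
    class Range
      def fast_min
        # first(1) stops after at most one element, so this check is cheap.
        first(1).empty? ? nil : self.begin
      end

      def fast_max
        return nil if first(1).empty?
        # An exclusive range (a...b) stops one short of its end value.
        exclude_end? ? self.end - 1 : self.end
      end
    end

    p (1..10_000_000).fast_max   # prints 10000000
    p (1...10).fast_max          # prints 9
    p (10..1).fast_min           # prints nil
&lt;/code&gt;&lt;/pre&gt;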


&lt;p&gt;For the app that made me notice this problem, adding the above monkey patch
caused a &lt;strong&gt;30% speedup&lt;/strong&gt;. Of course, if most of your ranges are small, or you
don't use &lt;code&gt;Range#min&lt;/code&gt; or &lt;code&gt;Range#max&lt;/code&gt; anywhere, you may not notice any difference
at all.&lt;/p&gt;</description>
      <category>ruby</category>
      <category> range</category>
      <category> programming</category>
      <category> monkey patch</category>
      <pubDate>Thu, 05 Nov 2009 08:05:14 -0500</pubDate>
      <dc:date>2009-11-05T08:05:14-05:00</dc:date>
    <feedburner:origLink>http://www.hokstad.com/a-pitfall-of-the-ruby-range-class.html</feedburner:origLink></item>
    <item>
      <title>Writing a (Ruby) compiler in Ruby bottom up - step 21</title>
      <link>http://feedproxy.google.com/~r/VidarHokstad/~3/Bxp5RUYrU14/writing-a-compiler-in-ruby-bottom-up-step-21.html</link>
      <description>&lt;span style="color: red; "&gt;This is &lt;a href="http://www.hokstad.com/compiler"&gt;part of a series&lt;/a&gt; I started in March 2008 - you may want to go back and look at older parts if you're new to this series.&lt;/span&gt;

&lt;p&gt;I've been lazy lately... Well, not really, I've been extremely busy, but I
ought to have fit this in earlier. It's gotten harder and harder to get done
too, because I've had to go back and figure out a lot of
the reasons for what I'd done.&lt;/p&gt;


&lt;p&gt;Anyway, finally a new part, though short.&lt;/p&gt;


&lt;h2&gt;Down the rabbit hole: attr_(reader|writer|accessor)&lt;/h2&gt;


&lt;p&gt;Adding attr_reader / attr_writer / attr_accessor should be easy, right?
After all, all they do is allow reading, writing, or both, of instance variables.&lt;/p&gt;


&lt;p&gt;Trouble is you can't &lt;em&gt;know&lt;/em&gt; that in advance.&lt;/p&gt;


&lt;pre class="ruby"&gt;
    class Class
      def attr_reader foo
        puts "Hah!"
      end
    end

    class Foo
      attr_reader :bar
    end

    foo = Foo.new
    p foo.bar
&lt;/pre&gt;


&lt;p&gt;Ouch. This is part of what makes Ruby exceptionally painful to compile.&lt;/p&gt;


&lt;p&gt;It doesn't mean we can't make some assumptions, though, as long as we can &lt;em&gt;handle&lt;/em&gt;
the worst case where someone does something &lt;em&gt;stupid&lt;/em&gt; (later we may want to add an
option to make it assume you're not being stupid, and enable additional optimizations).&lt;/p&gt;


&lt;p&gt;So how do we do this then?&lt;/p&gt;


&lt;p&gt;Well, the obvious answer is to implement it in Ruby. Here are naive initial
implementations (that only handle a single symbol, not an array like the real thing):&lt;/p&gt;


&lt;pre class="ruby"&gt;
    def attr_accessor sym
      attr_reader sym
      attr_writer sym
    end

    def attr_reader sym
      define_method sym do
         %s(ivar self sym)
      end
    end

    def attr_writer sym
      define_method "#{sym.to_s}=".to_sym do |val|
        %s(assign (ivar self sym) val)
      end
    end
&lt;/pre&gt;
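
&lt;p&gt;For comparison, here is roughly the same idea written so that it runs on a stock Ruby interpreter. It is a sketch, not the compiler's code: the &lt;tt&gt;my_&lt;/tt&gt; prefixed names are hypothetical (so the real methods are left alone), and &lt;tt&gt;instance_variable_get&lt;/tt&gt; / &lt;tt&gt;instance_variable_set&lt;/tt&gt; stand in for the "ivar" primitive.&lt;/p&gt;

&lt;pre class="ruby"&gt;
    # Sketch for a stock Ruby interpreter. The my_ prefixed names are
    # hypothetical; instance_variable_get/set stand in for the ivar primitive.
    class Class
      def my_attr_reader(sym)
        define_method(sym) { instance_variable_get("@#{sym}") }
      end

      def my_attr_writer(sym)
        define_method("#{sym}=") { |val| instance_variable_set("@#{sym}", val) }
      end

      def my_attr_accessor(sym)
        my_attr_reader(sym)
        my_attr_writer(sym)
      end
    end

    class Foo
      my_attr_accessor :bar
    end

    foo = Foo.new
    foo.bar = 42
    p foo.bar   # prints 42
&lt;/pre&gt;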


&lt;p&gt;Note the s-expressions that rely on a new "ivar" primitive (that's not been added yet).
That part is simple enough to add, but what else does the above require to be added to
the compiler?&lt;/p&gt;


&lt;p&gt;That's the ugly part. Here's a (possibly incomplete) list:&lt;/p&gt;


&lt;ul&gt;
&lt;li&gt;define_method&lt;/li&gt;
&lt;li&gt;Real Symbol class&lt;/li&gt;
&lt;li&gt;Symbol#to_s&lt;/li&gt;
&lt;li&gt;Real String class&lt;/li&gt;
&lt;li&gt;String#to_sym&lt;/li&gt;
&lt;li&gt;Indirectly the :lambda s-expression construct needs to have support for variables&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;This is part of the reason I picked attr_* as the next thing to implement: It's an important,
frequently used piece of functionality that snowballs and helps drive implementation of
functionality that is actually likely to be used.&lt;/p&gt;


&lt;h2&gt;Baby steps&lt;/h2&gt;


&lt;p&gt;First take a look at &lt;a href="http://github.com/vidarh/writing-a-compiler-in-ruby/commit/6e7edd638265503c129fdc0a7b44a301cc3ca4f7"&gt;this commit&lt;/a&gt;&lt;/p&gt;


&lt;p&gt;This is one of our assumptions (in between some other cruft): If you use attr_accessor
etc., you &lt;em&gt;are&lt;/em&gt; likely to need a vtable entry for that method.&lt;/p&gt;


&lt;p&gt;Remember, we still don't implement a proper "fallback" in the form of a hash table for
cases where the vtable gets "too big" (whatever we decide too big is) so for now we &lt;em&gt;need&lt;/em&gt; this,
but even later it'll be a useful optimization for many cases.&lt;/p&gt;


&lt;p&gt;Worst case? We waste a vtable slot.&lt;/p&gt;


&lt;p&gt;&lt;a href="http://github.com/vidarh/writing-a-compiler-in-ruby/commit/7fc8ee47332115bede8531f749543136a79833f3#diff-0"&gt;Next is where we actually add the basic implementations shown above&lt;/a&gt;, plus some debug statements, and stub for "define_method".&lt;/p&gt;


&lt;p&gt;The comments bring us to an important question:&lt;/p&gt;


&lt;h2&gt;To type-tag or not to type-tag?&lt;/h2&gt;


&lt;p&gt;&lt;a href="http://www.hokstad.com/ruby-object-model.html"&gt;In MRI Symbol objects are "type tagged" integers&lt;/a&gt;.
That is, they are not real objects at all, rather each symbol is represented
by a specific 32 bit value, and those values can be identified as symbols
by looking for a specific bit-pattern in the least significant byte.&lt;/p&gt;


&lt;p&gt;This has the advantage of saving space - no actual instances need to be
constructed. In this instance, however, it creates a lot of complication,
by requiring the type tags to be checked on each and every method call.&lt;/p&gt;


&lt;p&gt;For this reason we will, at least for now, avoid it.&lt;/p&gt;


&lt;p&gt;(For Fixnum we will run into this problem again, and it will be even more
tricky - for Symbol the number of objects can be expected to be reasonably
small, but what about Fixnum? Ugh... Let's think about that later.)&lt;/p&gt;


&lt;p&gt;Instead we will keep a hash table of allocated symbols, which we will
use to return the same object for the same symbol literal.&lt;/p&gt;
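
&lt;p&gt;As a sketch of what such a table amounts to, here is a version that runs on a stock Ruby interpreter. Note the assumptions: &lt;tt&gt;InternedSymbol&lt;/tt&gt; and &lt;tt&gt;get_symbol&lt;/tt&gt; are hypothetical names, and the example leans on the interpreter's built-in Hash, which the compiler does not have yet.&lt;/p&gt;

&lt;pre class="ruby"&gt;
    # Sketch only: InternedSymbol is a hypothetical stand-in for Symbol, and
    # the table relies on the host interpreter's built-in Hash.
    class InternedSymbol
      @symbols = {}   # class instance variable, maps a name to its object

      def self.get_symbol(name)
        # Create the object the first time a name is seen, then reuse it.
        @symbols[name] ||= new(name)
      end

      def initialize(name)
        @name = name
      end

      def to_s
        @name
      end
    end

    a = InternedSymbol.get_symbol("foo")
    b = InternedSymbol.get_symbol("foo")
    c = InternedSymbol.get_symbol("bar")
    p a.equal?(b)   # prints true: same name, same object
    p a.equal?(c)   # prints false
&lt;/pre&gt;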


&lt;p&gt;I'm going to skip over most of the commits here, and show you the current
state of the Symbol class and its associated compiler changes, since
I must admit my commits have been quite messy and all over the place
while putting in place these changes, and it's not very conducive to
explaining the actual changes.&lt;/p&gt;


&lt;pre class="ruby"&gt;
    class Symbol
      # Using class instance var instead of class var
      # because the latter is not properly implemented yet,
      # though in this case it may not make a difference
      #  @symbols = {} # FIXME: Adding values to a class ivar like this is broken
      
      # FIXME: Should be private, but we don't support that yet
      def initialize(name)
        @name = name
      end
    
      def to_s
        @name
      end

      # FIXME
      # The compiler should turn ":foo" into Symbol.__get_symbol("foo").
      # Alternatively, the compiler can do this _once_ at the start for 
      # any symbol encountered in the source text, and store the result.
    #  def self.__get_symbol(name)
    #    Symbol.new(name)       
    #    sym = @symbols[name]  
    #    if !sym               
    #      sym = Symbol.new(name)
    #    end
    #    sym
    #  end
    end

    def __get_symbol(name)
      Symbol.new(name)
    end
&lt;/pre&gt;


&lt;p&gt;Uh, yes. I started out with actually having it turn symbols from text into
an object at runtime, but I quickly realized this makes no sense for the case
where a literal symbol is present in the program.&lt;/p&gt;


&lt;p&gt;Note that we will still need a proper Symbol.__get_symbol later, like the one
commented out above, because it is necessary to handle things like String#to_sym.
However, for now I've skipped it for a simple reason:&lt;/p&gt;


&lt;p&gt;It requires implementing a hash table. It's not that hash tables are hard,
but implementing Hash in pure Ruby will currently likely require features that
are not in place yet... So, let's sort out the literal Symbols first, and then
work our way up.&lt;/p&gt;


&lt;p&gt;The trivial version above is well and good, but &lt;a href="http://github.com/vidarh/writing-a-compiler-in-ruby/commit/f3a38a8d9085178709428107a9a13d6b484c5c99"&gt;we need to make the compiler
call __get_symbol&lt;/a&gt;.&lt;/p&gt;


&lt;p&gt;So we replace Compiler#intern with this:&lt;/p&gt;


&lt;pre class="ruby"&gt;
    # Allocate a symbol
    def intern(scope,sym)
      # FIXME: Do this once, and add an :assign to a global var, and use that for any
      # later static occurrences of symbols.
      args = get_arg(scope,sym.to_s.rest)
      get_arg(scope,[:sexp,[:call,:__get_symbol, sym.to_s]])
    end
&lt;/pre&gt;


&lt;p&gt;As you can see from my comment, it really makes no sense to call &lt;tt&gt;__get_symbol&lt;/tt&gt; over
and over - it's inefficient. But it works for now. We'll go back and fix that later.&lt;/p&gt;
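

&lt;p&gt;One way the "do this once" idea from the FIXME could work is for the compiler to remember which symbols it has already seen and hand back a reference to a global instead of emitting a fresh call. The sketch below is standalone and hypothetical: the SymbolPool class, the __sym_ naming scheme and the exact shape of the emitted nodes are assumptions made for this example, not what the compiler currently does:&lt;/p&gt;


&lt;pre class="ruby"&gt;
    class SymbolPool
      def initialize
        @globals = {}   # symbol name to the name of the global holding it
        @inits = []     # one initialisation node per distinct symbol
      end

      attr_reader :inits

      # Returns the global to use for this symbol, creating it on first sight.
      def intern(name)
        unless @globals.key?(name)
          global = "__sym_#{name}".to_sym
          @globals[name] = global
          @inits.push([:assign, global, [:sexp, [:call, :__get_symbol, name]]])
        end
        @globals[name]
      end
    end

    pool = SymbolPool.new
    pool.intern("foo")
    pool.intern("bar")
    pool.intern("foo")
    p pool.inits.length   # 2
&lt;/pre&gt;


&lt;p&gt;With something like this, __get_symbol would run once per distinct literal at startup rather than every time the literal is evaluated.&lt;/p&gt;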


&lt;p&gt;Also note that this doesn't exactly match the above commit, but also incorporates a change
from a later commit: We do [:sexp ..] there because :sexp nodes don't get rewritten to a
:callm, and &lt;tt&gt;__get_symbol&lt;/tt&gt; here is indeed a function, not a method. This distinction
doesn't really exist in Ruby, but it exists at the implementation level in our compiler,
because it eases interoperability with C. This may change whenever I get to the point
where it makes sense to implement &lt;a href="http://kenai.com/projects/ruby-ffi"&gt;FFI support&lt;/a&gt;.&lt;/p&gt;
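

&lt;p&gt;To illustrate the distinction, here is a hypothetical rewrite pass in plain Ruby. It is a sketch under assumed node shapes, not the compiler's actual transformation: it turns [:call, name, args] into [:callm, :self, name, args], but leaves anything under a :sexp node alone:&lt;/p&gt;


&lt;pre class="ruby"&gt;
    def rewrite_calls(node)
      # Leaves (symbols, strings, numbers) are returned unchanged.
      return node unless node.is_a?(Array)
      # :sexp marks a subtree that must keep its plain function calls.
      return node if node[0] == :sexp

      children = node.map { |child| rewrite_calls(child) }
      if children[0] == :call
        [:callm, :self] + children[1..-1]
      else
        children
      end
    end

    p rewrite_calls([:call, :puts, ["hi"]])
    # [:callm, :self, :puts, ["hi"]]
    p rewrite_calls([:sexp, [:call, :__get_symbol, "foo"]])
    # [:sexp, [:call, :__get_symbol, "foo"]]
&lt;/pre&gt;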


&lt;p&gt;The other changes are simply there to reflect the fact that &lt;tt&gt;intern&lt;/tt&gt; now returns
the result of a &lt;tt&gt;get_arg&lt;/tt&gt; instead of some arbitrary integer:&lt;/p&gt;


&lt;pre class="ruby"&gt;
    -      return [:int,intern(name.rest)] if name[0] == ?:
    +      return intern(scope,name.rest) if name[0] == ?:
&lt;/pre&gt;


&lt;h2&gt;That's it for now, folks&lt;/h2&gt;


&lt;p&gt;I promise it won't be nearly as long until the next part. I need to untangle my changes for
splat operator support, method_missing debug improvements, and work on the Array and
String classes. I'll likely give each of them a short part before I get fully
"back on track" - I intend to write the future parts in parallel with actually
making the code changes.&lt;/p&gt;</description>
      <category>ruby</category>
      <category>compiler</category>
      <category>programming</category>
      <category>compiler in Ruby bottom up</category>
      <category>tutorial</category>
      <category>symbol</category>
      <pubDate>Sun, 01 Nov 2009 17:38:38 -0500</pubDate>
      <dc:date>2009-11-01T17:38:38-05:00</dc:date>
    <feedburner:origLink>http://www.hokstad.com/writing-a-compiler-in-ruby-bottom-up-step-21.html</feedburner:origLink></item>
    <dc:date>2010-06-12T17:21:29-04:00</dc:date>
  </channel>
</rss>

