<?xml version="1.0" encoding="UTF-8"?><feed xmlns="http://www.w3.org/2005/Atom"><title>Pat Shaughnessy</title><id>http://patshaughnessy.net</id><updated>2025-11-20T15:06:52Z</updated><author><name>Pat Shaughnessy</name></author><entry><title>Compiling Ruby To Machine Language</title><link href="https://patshaughnessy.net/2025/11/17/compiling-ruby-to-machine-language" rel="alternate"></link><id href="https://patshaughnessy.net/2025/11/17/compiling-ruby-to-machine-language" rel="alternate"></id><published>2025-11-17T00:00:00Z</published><updated>2025-11-17T00:00:00Z</updated><category>Updating Ruby Under a Microscope</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;p&gt;I've started working on a new edition of &lt;a
href=&quot;https://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.&lt;/p&gt;
&lt;p&gt;Here’s an excerpt</summary><content type="html">&lt;p&gt;I've started working on a new edition of &lt;a
href=&quot;https://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.&lt;/p&gt;
&lt;p&gt;Here’s an excerpt from the completely new content for Chapter 4, about YJIT and
ZJIT. I’m still finishing this up… so this content is fresh off the page! It’s
been a lot of fun for me to learn about how JIT compilers work and to brush up
on my Rust skills as well. And it’s very exciting to see all the impressive work
the Ruby team at Shopify and other contributors have done to improve Ruby’s
runtime performance.&lt;/p&gt;
&lt;h2&gt;Chapter 4: Compiling Ruby To Machine Language&lt;/h2&gt;
&lt;div style=&quot;font-size: small&quot;&gt;
&lt;table id=&quot;toc&quot;&gt;
	&lt;tr&gt;
		&lt;td&gt;Interpreting vs. Compiling Ruby Code&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Yet Another JIT (YJIT)&lt;/td&gt;&lt;td&gt;6&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Virtual Machines and Actual Machines&lt;/td&gt;&lt;td&gt;6&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Counting Method and Block Calls&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;YJIT Blocks&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;YJIT Branch Stubs&lt;/td&gt;&lt;td&gt;10&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Executing YJIT Blocks and Branches&lt;/td&gt;&lt;td&gt;11&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Deferred Compilation&lt;/td&gt;&lt;td&gt;12&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Regenerating a YJIT Branch&lt;/td&gt;&lt;td&gt;12&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;YJIT Guards&lt;/td&gt;&lt;td&gt;14&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Adding Two Integers Using Machine Language&lt;/td&gt;&lt;td&gt;15&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 4-1: Which Code Does YJIT Optimize?&lt;/td&gt;&lt;td&gt;18&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How YJIT Recompiles Code&lt;/td&gt;&lt;td&gt;22&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Finding a Block Version&lt;/td&gt;&lt;td&gt;22&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Saving Multiple Block Versions&lt;/td&gt;&lt;td&gt;24&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;ZJIT, Ruby’s Next Generation JIT&lt;/td&gt;&lt;td&gt;26&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Counting Method and Block Calls&lt;/td&gt;&lt;td&gt;27&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;ZJIT Blocks&lt;/td&gt;&lt;td&gt;29&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Method Based JIT&lt;/td&gt;&lt;td&gt;31&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Rust Inside of Ruby&lt;/td&gt;&lt;td&gt;33&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 4-2: Reading  ZJIT HIR and LIR &lt;/td&gt;&lt;td&gt;35&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Summary&lt;/td&gt;&lt;td&gt;37&lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;h2&gt;Counting Method and Block Calls&lt;/h2&gt;
&lt;p&gt;To find hot spots, YJIT counts how many times your program calls each function
or block. When this count reaches a certain threshold, YJIT stops your program
and converts that section of code into machine language. Later Ruby will execute
the machine language version instead of the original YARV instructions.&lt;/p&gt;
&lt;p&gt;To keep track of these counts, YJIT saves an internal counter nearby the YARV
instruction sequence for each function or block.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;40%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/17/Figure-4-5.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 4-5: YJIT saves information adjacent to each set of YARV instructions
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Figure 4-5 shows the YARV instruction sequence the main Ruby compiler created
for the &lt;span class=&quot;code&quot;&gt;sum += i&lt;/span&gt; block at (3) in Listing 4-1. At the
top, above the YARV instructions, Figure 4-5 shows two YJIT related values:
&lt;span class=&quot;code&quot;&gt;jit_entry&lt;/span&gt; and &lt;span
class=&quot;code&quot;&gt;jit_entry_calls&lt;/span&gt;. As we’ll see in a moment, &lt;span
class=&quot;code&quot;&gt;jit_entry&lt;/span&gt; starts as a null value but will later hold a
pointer to the machine language instructions YJIT produces for this Ruby block.
Below &lt;span class=&quot;code&quot;&gt;jit_entry&lt;/span&gt;, Figure 4-5 also shows &lt;span
class=&quot;code&quot;&gt;jit_entry_calls&lt;/span&gt;, YJIT’s internal counter.&lt;/p&gt;
&lt;p&gt;Each time the program in Listing 4-1 calls this block, YJIT increments the value
of &lt;span class=&quot;code&quot;&gt;jit_entry_calls&lt;/span&gt;. Since the range at (1) in Listing
4-1 spans from 1 through 40, this counter will start at zero and increase by 1
each time &lt;span class=&quot;code&quot;&gt;Range#each&lt;/span&gt; calls the block at (3).&lt;/p&gt;
&lt;p&gt;When the &lt;span class=&quot;code&quot;&gt;jit_entry_calls&lt;/span&gt; reaches a particular
threshold, YJIT will compile the YARV instructions into machine language. By
default for small Ruby programs YJIT in Ruby 3.5 uses a threshold of 30. Larger
programs, like Ruby on Rails web applications, will use a larger threshold value
of 120. (You can also change the threshold by passing &lt;span
class=&quot;code&quot;&gt;—yjit-call-threshold&lt;/span&gt; when you run your Ruby program.)&lt;/p&gt;
&lt;h2&gt;YJIT Blocks&lt;/h2&gt;
&lt;p&gt;While compiling your Ruby program, YJIT saves the machine language instructions
it creates into &lt;em&gt;YJIT blocks&lt;/em&gt;. YJIT blocks, which are distinct from Ruby blocks,
each contain a sequence of machine language instructions for a range of
corresponding YARV instructions. By grouping YARV instructions and compiling
each group into a YJIT block, YJIT can produce more optimized code that is
tailored to your program’s behavior and avoid compiling code that your program
doesn’t need.&lt;/p&gt;
&lt;p&gt;As we’ll see next, a single YJIT block doesn’t correspond to a Ruby function or
block. YJIT blocks instead represent smaller sections of code: individual YARV
instructions or a small range of YARV instructions. Each Ruby function or block
typically consists of several YJIT blocks.&lt;/p&gt;
&lt;p&gt;Let’s see how this works for our example. After the program in Listing 4-1
executes the Ruby block at (3) 29 times, YJIT will increment the &lt;span
class=&quot;code&quot;&gt;jit_entry_calls&lt;/span&gt; counter again, just before Ruby runs the
block for the 30th time. Since &lt;span class=&quot;code&quot;&gt;jit_entry_calls&lt;/span&gt; reaches
the threshold value of 30, YJIT triggers the compilation process.&lt;/p&gt;
&lt;p&gt;YJIT compiles the first YARV instruction &lt;span class=&quot;code&quot;&gt;getlocal_WC_1&lt;/span&gt;
and saves machine language instructions that perform the same work as &lt;span
class=&quot;code&quot;&gt;getlocal_WC_1&lt;/span&gt; into a new YJIT block:&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;75%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/17/Figure-4-6.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 4-6: Creating a YJIT block
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;On the left side, Figure 4-6 shows the YARV instructions for the &lt;span
class=&quot;code&quot;&gt;sum += i&lt;/span&gt; Ruby block. On the right, Figure 4-6 shows the new
YJIT block corresponding to &lt;span class=&quot;code&quot;&gt;getlocal_WC_1&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;Next, the YJIT compiler continues and compiles the second YARV instruction from
the left side of Figure 4-7: &lt;span class=&quot;code&quot;&gt;getlocal_WC_0&lt;/span&gt; at index 2.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;75%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/17/Figure-4-7.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 4-7: Appending to a YJIT block
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;On the left side, Figure 4-7 shows the same YARV instructions for the &lt;span
class=&quot;code&quot;&gt;sum += i&lt;/span&gt; Ruby block that we saw above in Figure 4-6. But now
the two dotted arrows indicate that the YJIT block on the right contains the
machine language instructions equivalent to both &lt;span
class=&quot;code&quot;&gt;getlocal_WC_1&lt;/span&gt; and &lt;span class=&quot;code&quot;&gt;getlocal_WC_0&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;Let’s take a look inside this new block. YJIT compiles or translates the Ruby
YARV instructions into machine language instructions. In this example, running
on my Mac laptop, YJIT writes the following machine language instructions into
this new block:&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;75%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/17/Figure-4-8.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 4-8: The contents of one YJIT block
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Figure 4-8 shows a closer view of the new YJIT block that appeared on the right
side of Figures 4-6 and 4-7. Inside the block, Figure 4-8 shows the assembly
language acronyms corresponding to the ARM64 machine language instructions that
YJIT generated for the two YARV instructions shown on the left. The YARV
instructions on the left are: &lt;span class=&quot;code&quot;&gt;getlocal_WC_1&lt;/span&gt;, which
loads a value from a local variable located in the previous stack frame and
saves it on the YARV stack, and &lt;span class=&quot;code&quot;&gt;getlocal_WC_0&lt;/span&gt;, which
loads a local variable from the current stack from and also saves it on the YARV
stack. The machine language instructions on the right side of Figure 4-8 perform
the same task, loading these values into registers on my M1 microprocessor:
&lt;span class=&quot;code&quot;&gt;x1&lt;/span&gt; and &lt;span class=&quot;code&quot;&gt;x9&lt;/span&gt;. If you’re curious
and would like to learn more about what the machine language instructions mean
and how they work, the section “Adding Two Integers Using Machine Language”
discusses the instructions for this example in more detail.&lt;/p&gt;
&lt;h2&gt;YJIT Branch Stubs&lt;/h2&gt;
&lt;p&gt;Next, YJIT continues down the sequence of YARV instructions and compiles the
&lt;span class=&quot;code&quot;&gt;opt_plus&lt;/span&gt; YARV instruction at index 4 in Figures 4-6
and 4-7. But this time, YJIT runs into a problem: It doesn’t know the type of
the addition arguments. That is, will &lt;span class=&quot;code&quot;&gt;opt_plus&lt;/span&gt; add two
integers? Or two strings, floating point numbers, or some other types?&lt;/p&gt;
&lt;p&gt;Machine language is very specific. To add two 64-bit integers on an M1
microprocessor, YJIT could use the &lt;span class=&quot;code&quot;&gt;adds&lt;/span&gt; assembly
language instruction. But adding two floating point numbers would require
different instructions. And, of course, adding or concatenating two strings is
an entirely different operation altogether.&lt;/p&gt;
&lt;p&gt;In order for YJIT to know which machine language instructions to save into the
YJIT block for &lt;span class=&quot;code&quot;&gt;opt_plus&lt;/span&gt;, YJIT needs to know exactly
what type of values the Ruby program might ever add at (3) in Listing 4-1. You
and I can tell by reading Listing 4-1 that the Ruby code is adding integers. We
know right away that the &lt;span class=&quot;code&quot;&gt;sum += 1&lt;/span&gt; block at (3) is
always adding one integer to another. But YJIT doesn’t know this.&lt;/p&gt;
&lt;p&gt;YJIT uses a clever trick to solve this problem. Instead of analyzing the entire
program ahead of time to determine all of the possible types of values the &lt;span
class=&quot;code&quot;&gt;opt_plus&lt;/span&gt; YARV instruction might ever need to add, YJIT
simply waits until the block runs and observes which types the program actually
passes in.&lt;/p&gt;
&lt;p&gt;YJIT uses &lt;em&gt;branch stubs&lt;/em&gt; to achieve this wait-and-see compile behavior, as shown
in Figure 4-9.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;75%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/17/Figure-4-9.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 4-9: A YJIT block, branch and stub
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Figure 4-9 shows the YARV instructions on the left, and the YJIT block for
indexes 0000-0002 on the right. But note the bottom right corner of Figure 4-7,
which shows an arrow pointing down from the block to a box labeled stub. This
arrow represents a YJIT branch. Since this new branch doesn’t point to a block
yet, YJIT sets up the branch to point to a branch stub instead.&lt;/p&gt;
</content></entry><entry><title>YARV’s Internal Stack and Your Ruby Stack</title><link href="https://patshaughnessy.net/2025/11/10/yarvs-internal-stack-and-your-ruby-stack" rel="alternate"></link><id href="https://patshaughnessy.net/2025/11/10/yarvs-internal-stack-and-your-ruby-stack" rel="alternate"></id><published>2025-11-10T00:00:00Z</published><updated>2025-11-10T00:00:00Z</updated><category>Updating Ruby Under a Microscope</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;p&gt;I've started working on a new edition of &lt;a
href=&quot;https://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.&lt;/p&gt;
&lt;p&gt;The content of Chap</summary><content type="html">&lt;p&gt;I've started working on a new edition of &lt;a
href=&quot;https://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.&lt;/p&gt;
&lt;p&gt;The content of Chapter 3, about the YARV virtual machine, hasn't changed much
since 2014.  However, I did update all of the diagrams to account for some new
values YARV now saves inside of each stack frame. And some of the common YARV
instructions were renamed as well. I also moved some content that was previously
part of Chapter 4 here into Chapter 3. Right now I'm rewriting Chapter 4 from
scratch, describing Ruby's new JIT compilers.&lt;/p&gt;
&lt;h2&gt;Chapter 3: How Ruby Executes Your Code&lt;/h2&gt;
&lt;div style=&quot;font-size: small&quot;&gt;
&lt;table id=&quot;toc&quot;&gt;
	&lt;tr&gt;
		&lt;td&gt;YARV’s Internal Stack and Your Ruby Stack&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Stepping Through How Ruby Executes a Simple Script&lt;/td&gt;&lt;td&gt;5&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Executing a Call to a Block&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Taking a Close Look at a YARV Instruction&lt;/td&gt;&lt;td&gt;10&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Local and Dynamic Access of Ruby Variables&lt;/td&gt;&lt;td&gt;12&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Local Variable Access&lt;/td&gt;&lt;td&gt;12&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Method Arguments Are Treated Like Local Variables&lt;/td&gt;&lt;td&gt;15&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Dynamic Variable Access&lt;/td&gt;&lt;td&gt;16&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Climbing The Environment Pointer Ladder In C&lt;/td&gt;&lt;td&gt;22&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 3-1: Exploring Special Variables&lt;/td&gt;&lt;td&gt;23&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Controlling the Flow of Execution&lt;/td&gt;&lt;td&gt;27&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How Ruby Executes an if Statement&lt;/td&gt;&lt;td&gt;27&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Jumping from One Scope to Another&lt;/td&gt;&lt;td&gt;29&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Catch Tables&lt;/td&gt;&lt;td&gt;30&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Other Uses for Catch Tables&lt;/td&gt;&lt;td&gt;33&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 3-2: Testing How Ruby Implements
For Loops Internally&lt;/td&gt;&lt;td&gt;34&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Summary&lt;/td&gt;&lt;td&gt;35&lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;h2&gt;YARV’s Internal Stack and Your Ruby Stack&lt;/h2&gt;
&lt;p&gt;As we’ll see in a moment, YARV uses a stack internally to track intermediate values, arguments, and return values. YARV is a stack-oriented virtual machine.&lt;/p&gt;
&lt;p&gt;In addition to its own internal stack, YARV keeps track of your Ruby program’s
&lt;em&gt;call stack&lt;/em&gt;, recording which methods call which other methods, functions, blocks,
lambdas, and so on. In fact, YARV is not just a stack machine—it’s a
double-stack machine! It has to track the arguments and return values not only
for its own internal instructions but also for your Ruby program.&lt;/p&gt;
&lt;p&gt;Figure 3-1 shows YARV’s basic registers and internal stack.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;100%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/10/Figure-3-1.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 3-1: Some of YARV’s internal registers, including the program counter and stack pointer
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;YARV’s internal stack is on the left. The SP label is the stack pointer, or the
location of the top of the stack. On the right are the instructions that YARV is
executing. PC is the program counter, or the location of the current
instruction. &lt;/p&gt;
&lt;p&gt;You can see the YARV instructions that Ruby compiled from the &lt;span
class=&quot;code&quot;&gt;puts 2+2&lt;/span&gt; example on the right side of Figure 3-1. YARV
stores both the SP and PC registers in a C structure called &lt;span
class=&quot;code&quot;&gt;rb_control_frame_t&lt;/span&gt;, along with the current value of Ruby’s
&lt;span class=&quot;code&quot;&gt;self&lt;/span&gt; variable and some other values not shown here. &lt;/p&gt;
&lt;p&gt;At the same time, YARV maintains another stack of these &lt;span
class=&quot;code&quot;&gt;rb_control_frame_t&lt;/span&gt; structures, as shown in Figure 3-2.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;40%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/10/Figure-3-2.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 3-2: YARV keeps track of your Ruby call stack using a series of rb_control_frame_t structures.
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;This second stack of &lt;span class=&quot;code&quot;&gt;rb_control_frame_t&lt;/span&gt; structures
represents the path that YARV has taken through your Ruby program, and YARV’s
current location. In other words, this is your Ruby call stack—what you would
see if you ran &lt;span class=&quot;code&quot;&gt;puts caller&lt;/span&gt;. &lt;/p&gt;
&lt;p&gt;The CFP pointer indicates the current frame pointer. Each stack frame in your
Ruby program stack contains, in turn, a different value for the self, PC, and SP
registers, as shown in Figure 3-1. Ruby also keeps track of type of code running
at each level in your Ruby call stack, indicated by the “[BLOCK]”, “[METHOD]”
notation in Figure 3-2.&lt;/p&gt;
&lt;h2&gt;Stepping Through How Ruby Executes a Simple Script&lt;/h2&gt;
&lt;p&gt;In order to help you understand this a bit better, here are a couple of
examples. I’ll begin with the simple 2+2 example from Chapters 1 and 2, shown
again in Listing 3-1.&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;puts &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;+&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;/pre&gt;

&lt;div style=&quot;font-style: italic; font-size: small; margin: -20px 0 20px 0&quot;&gt;
 Listing 3-1: A one-line Ruby program that we’ll execute as an example
&lt;/div&gt;
&lt;p&gt;This one-line Ruby script doesn’t have a Ruby call stack, so I’ll focus on the
internal YARV stack for now. Figure 3-3 shows how YARV will execute this script,
beginning with the first instruction, &lt;span class=&quot;code&quot;&gt;putself&lt;/span&gt;.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;100%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/10/Figure-3-3.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 3-3: On the left is YARV’s internal stack, and on the right is the compiled version of my puts 2+2 program.
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;As you can see in Figure 3-3, YARV starts the program counter (PC) at the first
instruction, and initially the stack is empty. Now YARV executes the &lt;span
class=&quot;code&quot;&gt;putself&lt;/span&gt; instruction, and pushes the current value of self
onto the stack, as shown in Figure 3-4.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;100%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/10/Figure-3-4.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 3-4: putself pushes the top self value onto the stack
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Because this simple script contains no Ruby objects or classes, the self pointer
is set to the default top self object. This is an instance of the &lt;span
class=&quot;code&quot;&gt;Object&lt;/span&gt; class that Ruby automatically creates when YARV
starts. It serves as the receiver for method calls and the container for
instance variables in the top-level scope.  The top self object contains a
single, predefined &lt;span class=&quot;code&quot;&gt;to_s&lt;/span&gt; method, which returns the
string “main.” You can call this method by running the following command in the
console: &lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;$ ruby &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;e &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;puts self&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;#39;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;=&amp;gt; main&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;YARV will use this self value on the stack when it executes the
&lt;span class=&quot;code&quot;&gt;opt_send_without_block&lt;/span&gt; instruction: self is the
receiver of the &lt;span class=&quot;code&quot;&gt;puts&lt;/span&gt; method because I didn’t specify a
receiver for this method call. &lt;/p&gt;
&lt;p&gt;Next, YARV executes &lt;span class=&quot;code&quot;&gt;putobject 2&lt;/span&gt;. It pushes the numeric
value 2 onto the stack and increments the PC again, as shown in Figure 3-5.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;100%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/10/Figure-3-5.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 3-5: Ruby pushes the value 2 onto the stack, the receiver of the + method.
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;This is the first step of the receiver (arguments) operation pattern described
in “How Ruby Compiles a Simple Script” on page 34. First, Ruby pushes the
receiver onto the internal YARV stack. In this example, the &lt;span
class=&quot;code&quot;&gt;Fixnum&lt;/span&gt; object 2 is the receiver of the message/method &lt;span
class=&quot;code&quot;&gt;+&lt;/span&gt;, which takes a single argument, also a 2.  Next, Ruby
pushes the argument 2, as shown in Figure 3-6.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;100%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/10/Figure-3-6.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 3-6: Ruby pushes another value 2 onto the stack, the argument of the + method.
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Finally, Ruby executes the + operation. In this case, &lt;span
class=&quot;code&quot;&gt;opt_plus&lt;/span&gt; is an optimized instruction that will add two
values: the receiver and the argument, as shown in Figure 3-7.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;100%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/10/Figure-3-7.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 3-7: Figure 3-7: The opt_plus instruction calculates 2 + 2 = 4.
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;As you can see in Figure 3-7, the &lt;span class=&quot;code&quot;&gt;opt_plus&lt;/span&gt; instruction
leaves the result, 4, at the top of the stack. Now Ruby is perfectly positioned
to execute the &lt;span class=&quot;code&quot;&gt;puts&lt;/span&gt; function call: The receiver &lt;span
class=&quot;code&quot;&gt;self&lt;/span&gt; is first on the stack, and the single argument, 4, is
at the top of the stack.  (I’ll describe how method lookup works in Chapter 6.) &lt;/p&gt;
&lt;p&gt;Next, Figure 3-8 shows what happens when Ruby executes the &lt;span
class=&quot;code&quot;&gt;puts&lt;/span&gt; method call. As you can see, the &lt;span
class=&quot;code&quot;&gt;opt_send_without_block&lt;/span&gt; instruction leaves the return value,
&lt;span class=&quot;code&quot;&gt;nil&lt;/span&gt;, at the top of the stack. Finally, Ruby executes
the last instruction, &lt;span class=&quot;code&quot;&gt;leave&lt;/span&gt;, which finishes the
execution of our simple, one-line Ruby program. Of course, when Ruby executes
the &lt;span class=&quot;code&quot;&gt;puts&lt;/span&gt; call, the C code implementing the &lt;span
class=&quot;code&quot;&gt;puts&lt;/span&gt; function will actually display the value 4 in the
console output. &lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;100%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/10/Figure-3-8.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 3-8: Ruby calls the puts method on the top self object.
&lt;/span&gt;
&lt;/div&gt;
</content></entry><entry><title>Compiling a Call to a Block</title><link href="https://patshaughnessy.net/2025/11/3/compiling-a-call-to-a-block" rel="alternate"></link><id href="https://patshaughnessy.net/2025/11/3/compiling-a-call-to-a-block" rel="alternate"></id><published>2025-11-03T00:00:00Z</published><updated>2025-11-03T00:00:00Z</updated><category>Updating Ruby Under a Microscope</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;p&gt;I've started working on a new edition of &lt;a
href=&quot;https://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.&lt;/p&gt;
&lt;p&gt;This week's excerpt</summary><content type="html">&lt;p&gt;I've started working on a new edition of &lt;a
href=&quot;https://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.&lt;/p&gt;
&lt;p&gt;This week's excerpt is from Chapter 2, about Ruby's compiler. Whenever I think
about it, I'm always suprised that Ruby has a compiler like C, Java or any other
programming language. The only difference is that we don't normally interact
with Ruby's compiler directly.&lt;/p&gt;
&lt;p&gt;The developers who contributed Ruby's new parser, Prism, also had to rewrite
the Ruby compiler because Prism now produces a completely different, redesigned
abstract syntax tree (AST). Chapter 2's outline is more or less the same as it
was in 2014, but I redrew all of the diagrams and updated much of the text to
match the new AST nodes and other changes for Prism.&lt;/p&gt;
&lt;h2&gt;Chapter 2: Compilation&lt;/h2&gt;
&lt;div style=&quot;font-size: small&quot;&gt;
&lt;table id=&quot;toc&quot;&gt;
	&lt;tr&gt;
		&lt;td&gt;The Ruby Compiler&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Ruby 3.2 Introduces a Just-In-Time (JIT) Compiler &lt;/td&gt;&lt;td&gt;5&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How Ruby Compiles a Simple Script&lt;/td&gt;&lt;td&gt;6&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Scope AST Nodes&lt;/td&gt;&lt;td&gt;7&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Compiling a Simple AST&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Compiling a Call to a Block&lt;/td&gt;&lt;td&gt;12&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How Ruby Iterates Through the AST&lt;/td&gt;&lt;td&gt;16&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 2-1: Displaying YARV Instructions&lt;/td&gt;&lt;td&gt;19&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;The Local Table&lt;/td&gt;&lt;td&gt;21&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Compiling Optional Arguments&lt;/td&gt;&lt;td&gt;23&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Compiling Keyword Arguments&lt;/td&gt;&lt;td&gt;24&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Unnamed Local Variables&lt;/td&gt;&lt;td&gt;25&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 2-2: Displaying the Local Table&lt;/td&gt;&lt;td&gt;28&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Summary&lt;/td&gt;&lt;td&gt;30&lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;h2&gt;Compiling a Call to a Block&lt;/h2&gt;
&lt;p&gt;Next, let’s compile my &lt;span class=&quot;code&quot;&gt;10.times&lt;/span&gt; do example from
Listing 1-1 in Chapter 1 (see Listing 2-2).&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#d08770;&quot;&gt;10&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.times &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;n&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  puts n
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;div style=&quot;font-style: italic; font-size: small; margin: -20px 0 20px 0&quot;&gt;
 Listing 2-2: A simple script that calls a block (repeated from Listing 1-1)
&lt;/div&gt;
&lt;p&gt;Notice that this example contains a block parameter to the &lt;span
class=&quot;code&quot;&gt;times&lt;/span&gt; method. This is interesting because it will give us a
chance to see how the Ruby compiler handles blocks. Figure 2-13 shows the AST
for the &lt;span class=&quot;code&quot;&gt;10.times do&lt;/span&gt; example again.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;100%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/3/Figure-2-13.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 2-13: A simplified view of the AST for the call to 10.times, passing a block
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;The left side of Figure 2-13 shows the AST for the &lt;span
class=&quot;code&quot;&gt;10.times&lt;/span&gt; function call: the call node and the receiver 10,
represented by integer node. On the right, Figure 2-13 shows the beginning of
the AST for the block: &lt;span class=&quot;code&quot;&gt;do |n| puts n end&lt;/span&gt;, represented
by the block node. You can see Ruby has added a scope node on both sides, since
there are two lexical scopes in Listing 2-2: the top level and the block.  Let’s
break down how Ruby compiles the main portion of the script shown on the left of
Figure 2-13. As before, Ruby starts with the first &lt;span
class=&quot;code&quot;&gt;PM_NODE_SCOPE&lt;/span&gt; and creates a new snippet of YARV instructions,
as shown in Figure 2-14.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;75%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/3/Figure-2-14.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 2-14: Each PM_SCOPE_NODE is compiled into a new snippet of YARV instructions.
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Next, Ruby steps down the AST nodes to &lt;span class=&quot;code&quot;&gt;PM_CALL_NODE,&lt;/span&gt;
as shown in Figure 2-15.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;75%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/3/Figure-2-15.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 2-15: Ruby stepping through an AST
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;At this point, there is still no code generated, but notice in Figure 2-13 that
two arrows lead from &lt;span class=&quot;code&quot;&gt;PM_CALL_NODE&lt;/span&gt;: one to &lt;span
class=&quot;code&quot;&gt;PM_INTEGER_NODE&lt;/span&gt;, which represents the 10 in the &lt;span
class=&quot;code&quot;&gt;10.times&lt;/span&gt; call, and another to the inner block. Ruby will
first continue down the AST to the integer node and compile the &lt;span
class=&quot;code&quot;&gt;10.times&lt;/span&gt; method call.  The resulting YARV code, following
the same receiver-arguments-message pattern we saw in Figures 2-7 through 2-11,
is shown in Figure 2-16.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;75%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/3/Figure-2-16.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 2-16: Ruby compiles the 10.times method call.
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Notice that the new YARV instructions shown in Figure 2-16 push the receiver
(the integer object 10) onto the stack first, after which Ruby generates an
instruction to execute the &lt;span class=&quot;code&quot;&gt;times&lt;/span&gt; method call. But
notice, too, the &lt;span class=&quot;code&quot;&gt;block in &amp;lt;main&amp;gt;&lt;/span&gt; argument in the
&lt;span class=&quot;code&quot;&gt;send&lt;/span&gt; instruction. This indicates that the method call
also contains a block argument: &lt;span class=&quot;code&quot;&gt;do |n| puts n end&lt;/span&gt;. In
this example, the arrow from &lt;span class=&quot;code&quot;&gt;PM_CALL_NODE&lt;/span&gt; to the
second &lt;span class=&quot;code&quot;&gt;PM_SCOPE_NODE&lt;/span&gt; has caused the Ruby compiler to
include this block argument.  Ruby continues by compiling the inner block,
beginning with the second &lt;span class=&quot;code&quot;&gt;PM_CALL_NODE&lt;/span&gt; shown at right
in Figure 2-13.  Figure 2-17 shows what the AST for that inner block looks like.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;100%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/3/Figure-2-17.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 2-17: The branch of the AST for the contents of the block
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Notice Ruby inserted a scope node at the top of this branch of the AST also.
Figure 2-17 shows the scope node contains two values: &lt;span
class=&quot;code&quot;&gt;argc=1&lt;/span&gt; and &lt;span class=&quot;code&quot;&gt;locals: [n]&lt;/span&gt;.  These
values were empty in the parent scope node, but Ruby set them here to indicate
the presence of the block parameter &lt;span class=&quot;code&quot;&gt;n&lt;/span&gt;.  From a
relatively high level, Figure 2-18 shows how Ruby compiles the inner block.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;75%&quot; src=&quot;https://patshaughnessy.net/assets/2025/11/3/Figure-2-18.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
Figure 2-18: How Ruby compiles a call to a block
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;You can see the parent &lt;span class=&quot;code&quot;&gt;PM_NODE_SCOPE&lt;/span&gt; at the top, along
with the YARV code from Figure 2-16. And below that Figure 2-18 shows the the
inner scope node for the block, along with the YARV instructions for the block’s
call to &lt;span class=&quot;code&quot;&gt;puts n&lt;/span&gt;. Later in this chapter we’ll learn how
Ruby handles parameters and local variables, like &lt;span class=&quot;code&quot;&gt;n&lt;/span&gt; in
this example; why Ruby generates these instructions for &lt;span class=&quot;code&quot;&gt;puts
n&lt;/span&gt;.  The key point for now is that Ruby compiles each distinct scope in
your Ruby program—methods, blocks, classes, or modules, for example—into a
separate snippet of YARV instructions.&lt;/p&gt;
</content></entry><entry><title>Parsing: How Ruby Understands Your Code</title><link href="https://patshaughnessy.net/2025/10/27/parsing-how-ruby-understands-your-code" rel="alternate"></link><id href="https://patshaughnessy.net/2025/10/27/parsing-how-ruby-understands-your-code" rel="alternate"></id><published>2025-10-27T00:00:00Z</published><updated>2025-10-27T00:00:00Z</updated><category>Updating Ruby Under a Microscope</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;p&gt;I've started working on a new edition of &lt;a
href=&quot;http://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/stro</summary><content type="html">&lt;p&gt;I've started working on a new edition of &lt;a
href=&quot;http://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: I’ve made a lot of progress so far this year. I had time to
completely rewrite Chapters 1 and 2, which cover Ruby’s new Prism parser and the Ruby
compiler which now handles the Prism AST. I also updated Chapter 3 about YARV
and right now I’m working on rewriting Chapter 4 which will cover YJIT and
possibly other Ruby JIT compilers.&lt;/p&gt;
&lt;p&gt;Here’s an excerpt from the new version of Chapter 1. Many thanks to Kevin
Newton, who reviewed the content about Prism and had a number of corrections and
great suggestions. Also thanks to Douglas Eichelberger who had some great
feedback as well.&lt;/p&gt;
&lt;p&gt;I’ll post more excerpts from Chapters 2, 3 and 4 in the coming weeks. Thanks for
everyone’s interest in Ruby Under a Microscope! &lt;/p&gt;
&lt;h2&gt;Chapter 1: Tokenization And Parsing&lt;/h2&gt;
&lt;div style=&quot;font-size: small&quot;&gt;
&lt;table id=&quot;toc&quot;&gt;
	&lt;tr&gt;
		&lt;td&gt;Tokens: The Words That Make Up the Ruby Language&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Which Words Are Reserved Words?&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 1-1: Using Prism to Tokenize Different Ruby Scripts&lt;/td&gt;&lt;td&gt;10&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Parsing: How Ruby Understands Your Code&lt;/td&gt;&lt;td&gt;13&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Identifying Tokens&lt;/td&gt;&lt;td&gt;14&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Parsing Subexpressions Recursively&lt;/td&gt;&lt;td&gt;16&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Comparing Tokens&lt;/td&gt;&lt;td&gt;17&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Operator Precedence&lt;/td&gt;&lt;td&gt;19&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Left and Right Associative Operators&lt;/td&gt;&lt;td&gt;24&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Binding Powers&lt;/td&gt;&lt;td&gt;28&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 1-2: Using Prism to Parse Different Ruby Scripts&lt;/td&gt;&lt;td&gt;30&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Summary&lt;/td&gt;&lt;td&gt;33&lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;h2&gt;Parsing: How Ruby Understands Your Code&lt;/h2&gt;
&lt;p&gt;Once Ruby converts your code into a series of tokens, what does it do next? How
does it actually understand and run your program? Does Ruby simply step through
the tokens and execute each one in order?&lt;/p&gt;
&lt;p&gt;No. Your code still has a long way to go before Ruby can run it. The next step
on its journey through Ruby is called &lt;em&gt;parsing&lt;/em&gt;, where words or tokens are grouped
into sentences or phrases that make sense to Ruby. When parsing, Ruby takes into
account the order of operations, methods, blocks, and other larger code
structures.&lt;/p&gt;
&lt;p&gt;Ruby’s parsing engine defines Ruby’s syntax rules. Reading in tokens, Ruby
matches the token types and the order the tokens appear with a large series of
patterns. These patterns, indeed, are the heart and soul of the Ruby language.
How we write a function call, how we define a method using the &lt;span
class=&quot;code&quot;&gt;def&lt;/span&gt; keyword, how we write classes and modules - the patterns
Ruby looks for define the language.&lt;/p&gt;
&lt;p&gt;Ruby’s parse algorithm has three high level steps:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Identify&lt;/strong&gt;: First, Ruby identifies what the next token represents. Ruby does
this by comparing the token’s type - and possibly the types of the following
tokens - with a large series of patterns. If one pattern matches, Ruby
understands what your code means. If not, Ruby emits a syntax error.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Recurse&lt;/strong&gt;: Secondly, Ruby calls itself. Each value in one of the syntax
patterns can itself be a subexpression - a smaller program that Ruby needs to
parse. To do this, Ruby calls itself recursively.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Compare&lt;/strong&gt;: Third, Ruby compares the current token with the next token to
determine which has a higher precedence. This comparison leads Ruby down a
specific path, processing the tokens in a certain order.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Let’s break down these ideas further, by following Ruby through the “Hello
World” program. Afterwards, we’ll look at a second, slightly more complicated
example.&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;puts &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Hello World&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/pre&gt;

&lt;div style=&quot;font-style: italic; font-size: small; margin: -20px 0 20px 0&quot;&gt;
  Listing 1-11: Hello World in Ruby
&lt;/div&gt;
&lt;p&gt;As we saw in the previous section, Ruby first converts the text in this code
file into tokens. For Hello World, Ruby’s tokenizer produces these five tokens:&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;75%&quot; src=&quot;https://patshaughnessy.net/assets/2025/10/27/Figure-1-14.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 1-14: Hello World Tokenized
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;To make the following diagrams simpler, let’s redraw these tokens in a more
compact format:&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;40%&quot; src=&quot;https://patshaughnessy.net/assets/2025/10/27/Figure-1-15.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 1-15: Hello World tokens in a more compact format
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Using a single gray line of text, Figure 1-15 shows the five tokens from Figure
1-14 in a more compact format. First, &lt;span
class=&quot;code&quot;&gt;PM_TOKEN_IDENTIFIER&lt;/span&gt; represents the word “puts” from the
beginning of the program. Next, three tokens make up the string literal value:
&lt;span class=&quot;code&quot;&gt;PM_TOKEN_STRING_BEGIN&lt;/span&gt; for the first double quote,
followed by &lt;span class=&quot;code&quot;&gt;PM_TOKEN_STRING_VALUE&lt;/span&gt; for the words Hello
and World, and &lt;span class=&quot;code&quot;&gt;PM_TOKEN_STRING_END&lt;/span&gt; represents the
second quote. Finally, the program ends with &lt;span
class=&quot;code&quot;&gt;PM_TOKEN_EOF&lt;/span&gt; to mark the end of the source code file.&lt;/p&gt;
&lt;p&gt;Now let’s follow Ruby as it processes the Hello World example using the three
steps: identify, recurse and compare.&lt;/p&gt;
&lt;h2&gt;Identifying Tokens&lt;/h2&gt;
&lt;p&gt;First, &lt;em&gt;identify&lt;/em&gt;. How does Ruby understand what the first token,
&lt;span class=&quot;code&quot;&gt;PM_TOKEN_IDENTIFIER&lt;/span&gt;, means?&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;40%&quot; src=&quot;https://patshaughnessy.net/assets/2025/10/27/Figure-1-16.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 1-16: Parsing the first token
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Figure 1-16 represents the state of Ruby’s parser when it starts to parse this
code. At this moment, Ruby is just getting started by inspecting the &lt;span
class=&quot;code&quot;&gt;puts&lt;/span&gt; identifier. One of the patterns Ruby looks for matches
the identifier; but what does this identifier mean? Ruby knows &lt;span class=&quot;code&quot;&gt;puts&lt;/span&gt; could be a
local variable, or it could be the name of a function to call. Since there are
no local variables defined in this program, Ruby determines that the &lt;span class=&quot;code&quot;&gt;puts&lt;/span&gt;
identifier represents a function the program is calling. (It’s also possible
that the program is about to create a new local variable like this: &lt;span class=&quot;code&quot;&gt;puts =
&amp;quot;Hello World&amp;quot;&lt;/span&gt;. If that were the case, Ruby would see the assignment operator
next and parse things differently.)&lt;/p&gt;
&lt;p&gt;What happens next? After matching the token to the function call pattern, Ruby
records this match in a data structure called an abstract syntax tree (AST).
Ruby and most other programming languages use ASTs to record the results of
parsing tokens like this. As we’ll see, the AST’s tree structure is well suited
for holding the nested, recursive structure of computer programs.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;25%&quot; src=&quot;https://patshaughnessy.net/assets/2025/10/27/Figure-1-17.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 1-17: The first AST node
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Figure 1-17 shows the first node Ruby saves in the AST tree. In a moment, Ruby
will begin to add more nodes to the AST.&lt;/p&gt;
&lt;p&gt;Before proceeding to the next token, let’s imagine the syntax pattern for a
function call:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;function&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;name ( argument1, argument2, argument3, etc. )&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Although in Ruby the parentheses are optional, so this pattern also applies:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;function&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;name argument1, argument2, argument3, etc.&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt;&lt;br/&gt;
&lt;i&gt;
The original version of the Ruby parser used patterns or grammar rules like this
directly with a tool called a parser generator. However, starting with Ruby 3.3,
Ruby uses a new parser called Prism, which detects these patterns directly using
hand written C code.
&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;After parsing the first token, Ruby inspects the second token. According to the
function call pattern, Ruby knows the second token might represent the first
argument to the function call. But, how many arguments are there? And what is
each argument? The program in Listing 1-11 is very simple, but it could have
instead printed a very complex expression - the arguments to &lt;span
class=&quot;code&quot;&gt;puts&lt;/span&gt; could have run on for many lines and used hundreds of
tokens.&lt;/p&gt;
&lt;h2&gt;Parsing Subexpressions&lt;/h2&gt;
&lt;p&gt;Second, &lt;em&gt;recurse&lt;/em&gt;. To parse each of the arguments to puts, Ruby has to call itself.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;40%&quot; src=&quot;https://patshaughnessy.net/assets/2025/10/27/Figure-1-18.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 1-18: Parsing the second token
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Figure 1-18 shows two levels of the Ruby parser’s call stack; the top line shows
Ruby parsing the puts identifier token, and matching the function call pattern.
The second line shows how Ruby called itself to parse the second token,
&lt;span class=&quot;code&quot;&gt;PM_TOKEN_STRING_BEGIN&lt;/span&gt;, the leading quote of the string
literal. Think of these lines as the backtrace of the Ruby parser.&lt;/p&gt;
&lt;p&gt;Figure 1-18 also shows a value 14 on the right side. While calling itself
recursively, Ruby passes in a numeric value called the &lt;em&gt;binding power&lt;/em&gt;. We’ll
return to this later.&lt;/p&gt;
&lt;p&gt;Now that Ruby has called itself, Ruby starts the 3-step process all over again:
identify, recurse and compare. This time, Ruby has to identify what the &lt;span
class=&quot;code&quot;&gt;PM_TOKEN_STRING_BEGIN&lt;/span&gt; token means. This token always
indicates the start of a string value. In this example &lt;span
class=&quot;code&quot;&gt;PM_TOKEN_STRING_BEGIN&lt;/span&gt; represents the double quote that
appears after &lt;span class=&quot;code&quot;&gt;puts&lt;/span&gt;. But the same token might represent
a single quote or one of the other ways you can write a string in Ruby, for
example using &lt;span class=&quot;code&quot;&gt;%Q&lt;/span&gt; or &lt;span class=&quot;code&quot;&gt;%q&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;Ruby’s new parser, Prism, next parses the string contents directly by processing
the following two tokens:&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;40%&quot; src=&quot;https://patshaughnessy.net/assets/2025/10/27/Figure-1-19.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 1-19: Parsing the third and fourth tokens
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;In this example, Ruby’s parser is done after finding the &lt;span
class=&quot;code&quot;&gt;PM_TOKEN_STRING_END&lt;/span&gt; token and can continue to the next step.
More complex strings - strings that contain interpolated values using &lt;span
class=&quot;code&quot;&gt;#{}&lt;/span&gt; for example - might have required Ruby to call itself
yet again to process more nested expressions. But for the simple &lt;span
class=&quot;code&quot;&gt;&amp;quot;Hello World&amp;quot;&lt;/span&gt; string Ruby is done.&lt;/p&gt;
&lt;p&gt;To record the string value, Ruby creates a new AST node called &lt;span
class=&quot;code&quot;&gt;PM_STRING_NODE&lt;/span&gt;.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;60%&quot; src=&quot;https://patshaughnessy.net/assets/2025/10/27/Figure-1-20.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 1-20: Two AST nodes
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Figure 1-20 shows two AST nodes Ruby has created so far: the call node created
earlier, and now a new string node.&lt;/p&gt;
&lt;p&gt;Ruby’s parser is a &lt;em&gt;recursive descent parser&lt;/em&gt;. This Computer Science term
describes parsers that resemble the grammar or syntax rules of the programs they
parse, and call themselves recursively in a top-down manner as they process
nested structures. Many modern programming languages today use this general
approach.&lt;/p&gt;
</content></entry><entry><title>Write Barriers</title><link href="https://patshaughnessy.net/2025/2/18/write-barriers" rel="alternate"></link><id href="https://patshaughnessy.net/2025/2/18/write-barriers" rel="alternate"></id><published>2025-02-18T00:00:00Z</published><updated>2025-02-18T00:00:00Z</updated><category>Updating Ruby Under a Microscope</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;p&gt;I've started working on a new edition of &lt;a
href=&quot;http://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.&lt;/p&gt;
&lt;p&gt;Ruby’s garbage col</summary><content type="html">&lt;p&gt;I've started working on a new edition of &lt;a
href=&quot;http://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.&lt;/p&gt;
&lt;p&gt;Ruby’s garbage collection implementation is complex and confusing - yet powerful
and beautiful at the same time. Chapter 13 summarizes and explains advanced
aspects of Ruby’s garbage collector. I love learning about the difficult bits at
a deep enough level to be able to explain them to someone else. In RUM I often
show snippets from Ruby's C source code in gray callouts. These sections are for
readers who already understand C syntax, or who would like to learn it.  These
can give you sign posts in case you ever decide to read Ruby’s source code for
yourself.&lt;/p&gt;
&lt;h2&gt;Chapter 13: Incremental and Generational Garbage Collection&lt;/h2&gt;
&lt;div style=&quot;font-size: small&quot;&gt;
&lt;table id=&quot;toc&quot;&gt;
	&lt;tr&gt;
		&lt;td&gt;Incremental GC&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Marking in More Detail&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Stop The World Garbage Collection&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Incremental Garbage Collection&lt;/td&gt;&lt;td&gt;10&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How Ruby Implements Incremental GC&lt;/td&gt;&lt;td&gt;11&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Write Barriers&lt;/td&gt;&lt;td&gt;12&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;A Write Barrier in Action&lt;/td&gt;&lt;td&gt;14&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 13-1: What happens if GC can’t free enough slots?&lt;/td&gt;&lt;td&gt;15&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Generational GC&lt;/td&gt;&lt;td&gt;22&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Keeping Track of an Object’s Age&lt;/td&gt;&lt;td&gt;23&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How Marking and Sweeping Work Together&lt;/td&gt;&lt;td&gt;24&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Seeing Generational GC In Action&lt;/td&gt;&lt;td&gt;26&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Write Barriers Again&lt;/td&gt;&lt;td&gt;29&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;The Remember Set and Unprotected Values&lt;/td&gt;&lt;td&gt;30&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Garbage Collection Bitmaps&lt;/td&gt;&lt;td&gt;31&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Major and Minor GCs&lt;/td&gt;&lt;td&gt;34&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 13-2: Major and Minor GCs&lt;/td&gt;&lt;td&gt;35&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Summary&lt;/td&gt;&lt;td&gt;40&lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;h2&gt;Write Barriers&lt;/h2&gt;
&lt;p&gt;Write barriers warn Ruby’s garbage collection algorithm whenever your program
creates a new object that might have to be marked. You can think of the
barriers as small boxes that surround arrays, hashes and other data structures
you might have in your program. For example, in Listing 13-1 Ruby places a write
barrier around &lt;span class=&quot;code&quot;&gt;arr&lt;/span&gt;, the array that contains all of the
new objects:&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;50%&quot; src=&quot;https://patshaughnessy.net/assets/2025/2/18/Figure-13-9.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 13-9: A Write Barrier
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Figure 13-9 repeats Figure 13-6, which showed the array &lt;span
class=&quot;code&quot;&gt;arr&lt;/span&gt;, just after Ruby removed it from the mark stack, along
with all of the elements of &lt;span class=&quot;code&quot;&gt;arr&lt;/span&gt;, still on the stack.
However, on the left the dotted rectangle represents a writer barrier around
this array. (In reality, there’s nothing special about this array; Ruby uses
write barriers for all arrays, hashes and other data structures.)&lt;/p&gt;
&lt;p&gt;The write barrier, as the name implies, jumps into action whenever your program
writes to the array inside. In this example, the barrier will be triggered when
the program runs the line of code at (3) in Listing 13-1 inside the loop:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;3&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) arr.push(&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Object&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;new&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;If your program was running between incremental GC steps when Ruby added a new
element to the array, Ruby would move the array back on to the mark stack:&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;50%&quot; src=&quot;https://patshaughnessy.net/assets/2025/2/18/Figure-13-10.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 13-10: Triggering a Write Barrier Pushes the Array Back on the Mark Stack
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Figure 13-10 shows the array &lt;span class=&quot;code&quot;&gt;arr&lt;/span&gt; on the left, inside
of a write barrier. The line of code just above, &lt;span
class=&quot;code&quot;&gt;arr.push&lt;/span&gt;, triggers the write barrier because it writes to
the array’s contents. This triggers the write barrier, which moves the array
&lt;span class=&quot;code&quot;&gt;arr&lt;/span&gt; back on to the mark stack, shown on the right. Now
during the next GC step, Ruby will process the array’s children again, even if
it had processed them earlier. This allows Ruby to find and mark the new object
just added to the array.&lt;/p&gt;
&lt;p&gt;The mark stack is how Ruby remembers its place inside of the GC algorithm,
between one incremental GC and the next. Whenever an incremental GC step starts,
Ruby continues to mark the objects present on the mark stack that were pending
from last time &lt;em&gt;or that were modified in the meantime&lt;/em&gt;.&lt;/p&gt;
&lt;div class=&quot;c-code&quot;&gt;
&lt;h2&gt;A Write Barrier in Action&lt;/h2&gt;
&lt;p&gt;Whenever your program modifies an array, hash or other value protected by write
barriers, Ruby calls this C code internally. &lt;code&gt;rb_gc_impl_writebarrier_remember&lt;/code&gt; at
(1) in Listing 13-2 pushes a modified object back on to the mark stack, as shown
above in Figure 13-10:&lt;/p&gt;
&lt;pre&gt;(1) void rb_gc_impl_writebarrier_remember(void *objspace_ptr, VALUE obj)
    {
        rb_objspace_t *objspace = objspace_ptr;
    
        gc_report(1, objspace, &quot;rb_gc_writebarrier_remember: %s\n&quot;, rb_obj_info(obj));
    
(2)     if (is_incremental_marking(objspace)) {
(3)         if (RVALUE_BLACK_P(objspace, obj)) {
(4)             gc_grey(objspace, obj);
            }
        }
        else {
(5)         if (RVALUE_OLD_P(objspace, obj)) {
                rgengc_remember(objspace, obj);
            }
        }
    }&lt;/pre&gt;
&lt;div style=&quot;font-style: italic; margin: 0 0 20px 0&quot;&gt;
  Listing 13-2: The rb_gc_writebarrier_remember function
&lt;/div&gt;
&lt;p&gt;Ruby first checks at (2) whether it is currently in the midst of an incremental
garbage collection. If it is, Ruby continues to the line at (3), and checks
whether the given object was already completely processed: whether Ruby marked
it and all of its children. (Recall the non-inclusive term “black” from the
tricolor GC algorithm.) For example, in Figure 13-9 the &lt;code&gt;arr&lt;/code&gt; value was
completed processed, since Ruby marked it and also moved all of its children on
to the mark stack, removing &lt;code&gt;arr&lt;/code&gt; from the mark stack. &lt;/p&gt;
&lt;p&gt;If Ruby did already mark the value, and if Ruby already pushed the value’s
children on to the mark stack, then Ruby knows it needs to process the value
again - since your program modified it. In this case, Ruby continues on to the
line at (4) and calls &lt;code&gt;gc_grey&lt;/code&gt;, which pushes the value back on to the mark
stack.  Later Ruby will iterate over its children again, pushing each of them
back on to the mark stack.&lt;/p&gt;
&lt;p&gt;Looking ahead to the next section, Generational GC on page 22, write barriers
use the code in the else clause at (5) to handle “old” values. We’ll cover
Generational GC next. But first, Experiment 13-1 will take a look at incremental
GC in action.&lt;/p&gt;
&lt;/div&gt;
</content></entry><entry><title>Using Different Size Pools</title><link href="https://patshaughnessy.net/2025/2/11/using-different-size-pools" rel="alternate"></link><id href="https://patshaughnessy.net/2025/2/11/using-different-size-pools" rel="alternate"></id><published>2025-02-11T00:00:00Z</published><updated>2025-02-11T00:00:00Z</updated><category>Updating Ruby Under a Microscope</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;p&gt;
I've started working on a new edition of &lt;a
href=&quot;http://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.
&lt;p&gt;The Ruby team has done </summary><content type="html">&lt;p&gt;
I've started working on a new edition of &lt;a
href=&quot;http://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.
&lt;p&gt;The Ruby team has done a tremendous amount of work over the past decade on
Ruby's garbage collection (GC) implementation. In fact, Ruby's new GC is one of
the key reasons Ruby 3 is so much faster than Ruby 2. To bring all of this work
to light, I decided to rewrite Chapter 12 from scratch, covering garbage
collection in Ruby more accurately and in more depth.  But then, after a few
months, I realized I had gotten carried away and wrote too much material for one
chapter. So the updated book will have two new chapters on garbage collection:
one on garbage collection basics and a second new chapter on incremental and
generational garbage collection. Here's a small excerpt.&lt;/p&gt;
&lt;h2&gt;Chapter 12: Garbage Collection Basics&lt;/h2&gt;
&lt;div style=&quot;font-size: small&quot;&gt;
&lt;table id=&quot;toc&quot;&gt;
	&lt;tr&gt;
		&lt;td&gt;Where Do Ruby Values Live?&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;The RVALUE Structure&lt;/td&gt;&lt;td&gt;7&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;RVALUE Written in C&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;The Free List&lt;/td&gt;&lt;td&gt;9&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Embedded Values&lt;/td&gt;&lt;td&gt;10&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Size Pools&lt;/td&gt;&lt;td&gt;12&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 12-1: Using Different Size Pools&lt;/td&gt;&lt;td&gt;14&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Cleaning Up Unused Values&lt;/td&gt;&lt;td&gt;19&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Filling Up a Page&lt;/td&gt;&lt;td&gt;19&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Marking&lt;/td&gt;&lt;td&gt;22&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Sweeping&lt;/td&gt;&lt;td&gt;24&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How Ruby Frees An Object&lt;/td&gt;&lt;td&gt;28&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 12-2: When Does Ruby Collect Your Garbage?&lt;/td&gt;&lt;td&gt;29&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Summary&lt;/td&gt;&lt;td&gt;25&lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;div style=&quot;float: left; padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img id=&quot;microscope&quot; src=&quot;https://patshaughnessy.net/assets/2025/2/4/experiment.png&quot;&gt;&lt;br/&gt;
&lt;/div&gt;
&lt;h2&gt;Experiment 12-1: Using Different Size Pools&lt;/h2&gt;
&lt;p&gt;Ruby 3.2 and later provides a way to see statistics about size pools, the
&lt;span class=&quot;code&quot;&gt;GC.stat_heap&lt;/span&gt; class method. &lt;span
class=&quot;code&quot;&gt;GC.stat_heap&lt;/span&gt; returns a hash as shown in Listing 12-5.&lt;/p&gt;
&lt;div style=&quot;clear: left&quot;&gt;&lt;/div&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#795da3;&quot;&gt;require &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;pp&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;#39;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;pp &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;GC&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.stat_heap
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;{&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0 &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;=&amp;gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  {&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;slot_size: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;40&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;,
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;   &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;heap_eden_pages: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;10&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;,
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;   &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;heap_eden_slots: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;16374&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;,
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;   &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;total_allocated_pages: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;10&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;,
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;   &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;force_major_gc_count: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;,
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;   &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;force_incremental_marking_finish_count: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;,
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;   &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;total_allocated_objects: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;26378&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;,
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;   &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;total_freed_objects: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;10231&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;},
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1 &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;=&amp;gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  {&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;slot_size: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;80&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;,
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Etc…&lt;/span&gt;&lt;/pre&gt;

&lt;div style=&quot;font-style: italic; font-size: small; margin: -20px 0 20px 0&quot;&gt;
  Listing 12-5: Calling GC.stat_heap
&lt;/div&gt;
&lt;p&gt;Listing 12-5 shows the output of &lt;span class=&quot;code&quot;&gt;GC.stat_heap&lt;/span&gt; for Size
Pool 0, which includes the slot size (&lt;span class=&quot;code&quot;&gt;:slot_size=&amp;gt;40&lt;/span&gt;),
the number of pages Ruby has allocated so far for this size (&lt;span
class=&quot;code&quot;&gt;:heap_eden_pages=&amp;gt;10&lt;/span&gt;) and the total number of slots
allocated across all of these pages (&lt;span
class=&quot;code&quot;&gt;:heap_eden_slots=&amp;gt;16374&lt;/span&gt;). The output of &lt;span
class=&quot;code&quot;&gt;GC.stat_heap&lt;/span&gt; continues on to show the same statistics for
the other size pools.&lt;/p&gt;
&lt;p&gt;We can use &lt;span class=&quot;code&quot;&gt;GC.stat_heap&lt;/span&gt; to investigate how Ruby uses
size pools as we allocate more and more objects. Listing 12-6 shows a Ruby
program that allocates arrays of varying sizes, and then records the output from
&lt;span class=&quot;code&quot;&gt;GC.stat_heap.&lt;/span&gt;&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) CAPACITY_ITER &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;100
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    ALLOCATE_ITER &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;10000
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) all &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[]
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;3&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;CAPACITY_ITER&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.times &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;capa&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;4&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)   &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;ALLOCATE_ITER&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.times &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;i&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        all &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;lt;&amp;lt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Array.&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;new&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(capa)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;5&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)   total_slots_by_size_pool &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[]
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;GC&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.stat_heap.each &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;size_pool, stats&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        total_slots_by_size_pool[size_pool] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; stats[&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;:heap_eden_slots&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      print &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;#{capa}&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      print &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;#{total_slots_by_size_pool[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]}&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      print &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;#{total_slots_by_size_pool[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]}&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      print &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;#{total_slots_by_size_pool[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]}&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      print &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;#{total_slots_by_size_pool[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;3&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]}&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      puts  &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;#{total_slots_by_size_pool[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;4&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;div style=&quot;font-style: italic; font-size: small; margin: -20px 0 20px 0&quot;&gt;
  Listing 12-6: Detecting Which Size Pool Ruby Uses for Arrays of Varying Sizes
&lt;/div&gt;
&lt;p&gt;This program contains an inner loop and an outer loop. The outer loop at (3) in
Listing 12-6 iterates over arrays of different capacities, from 1 up to 100
(&lt;span class=&quot;code&quot;&gt;CAPACITY_ITER&lt;/span&gt;). For each array capacity, the program
creates 10,000 (&lt;span class=&quot;code&quot;&gt;ALLOCATE_ITER&lt;/span&gt;) array objects of that
size using the inner loop (4). Note the program saves all of the new arrays into
a single array called &lt;span class=&quot;code&quot;&gt;all&lt;/span&gt;, created at (2). This
insures that Ruby doesn’t free all of our new arrays by running a garbage
collection.&lt;/p&gt;
&lt;p&gt;After creating 10,000 arrays of the given capacity, the program saves the &lt;span
class=&quot;code&quot;&gt;heap_eden_slots&lt;/span&gt; value from the return value of &lt;span
class=&quot;code&quot;&gt;GC.stat_heap&lt;/span&gt; for all of the size pools at (5), and then
prints out the results at (6).&lt;/p&gt;
&lt;p&gt;Running the code in Listing 12-6 produces this output:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#d08770;&quot;&gt;0&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;22923&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;6548&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2861&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;611&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;306
&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;34385&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;6548&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2861&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;611&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;306
&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;44209&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;6548&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2861&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;611&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;306
&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;3&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;54033&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;6548&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2861&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;611&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;306
&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;4&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;54033&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;13914&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2861&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;611&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;306
&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;5&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;54033&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;23735&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2861&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;611&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;306
&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;6&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;54033&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;34374&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2861&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;611&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;306
&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;7&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;54033&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;44195&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2861&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;611&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;306
&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;8&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;54033&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;54016&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2861&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;611&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;306
&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;9&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;54033&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;54016&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;11446&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;611&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;306
&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;10&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;54033&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;54016&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;21257&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;611&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;306
&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;11&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;54033&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;54016&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;31477&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;611&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;306
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Etc…&lt;/span&gt;&lt;/pre&gt;

&lt;div style=&quot;font-style: italic; font-size: small; margin: -20px 0 20px 0&quot;&gt;
  Listing 12-7: The Output From Running the Program in Listing 12-6
&lt;/div&gt;
&lt;p&gt;The output in Listing 12-7 shows how many slots Ruby has allocated in total
after each iteration of the outer, array capacity loop. For example:&lt;/p&gt;
&lt;pre&gt;0, 22923, 6548, 2861, 611, 306&lt;/pre&gt;
&lt;p&gt;… shows that after allocating 10,000 empty arrays, Ruby has now uses a total of
22923 slots for Size Pool 0, 6548 for Size Pool One, 2861 for Size Pool Two,
etc. If you try running this, you will see slightly different values.&lt;/p&gt;
&lt;p&gt;Plotting these values, we can see which pool Ruby uses for the new arrays of
various capacities:&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2025/2/11/Figure-12-10.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 12-10: Allocating Slots for Arrays of Various Sizes - Size Pools 0, 1 and 2, Ruby 3.4.1
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Each bar in Figure 12-10 represents values from a line of output from Listing
12-7. For example, the first bar on the far left plots this output:&lt;/p&gt;
&lt;pre&gt;0, 22923, 6548, 2861, 611, 306&lt;/pre&gt;
&lt;p&gt;The first value, 0, is the position of each bar on the x-axis, while the bar’s
color segments display the following three values: The dark grey bar at the
bottom left corner represents Size Pool 0 (22923), the lighter bar above it
shows Size Pool 1 (6548), and the lightest, top bar shows Size Pool 2 (2861).&lt;/p&gt;
&lt;p&gt;Moving to the right, each successive bar shows the values for different array
capacities. Looking over the entire graph, we can see the following pattern in
Figure 12-10:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The dark grey bars for Size Pool 0 at the bottom of Figure 12-10 increase in
size from capacity 0 through capacity 3, and then remain the same height after
that for capacities 4 and greater.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The medium grey bars for Size Pool 1 have the same height from capacity 0
through 3, but then increase from capacity 4 through 8. From capacity 9 and
onward, the medium grey bars have the same height again.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The light grey bars at the top for Size Pool 2 remain small for capacities 0
through 8, and then increase in size steadily for capacities 9 through 18. Then
the remain unchanged after that.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Figure 12-10 shows Ruby saves new arrays using the following size pools:&lt;/p&gt;
&lt;pre&gt;Capacity Range	Size Pool
0-3			0
4-8			1
9-18			2&lt;/pre&gt;
&lt;p&gt;Plotting the entire output from Listing 12-7 up to capacity=100, we see the
following:&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2025/2/11/Figure-12-11.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 12-11: Complete Output from Listing 12-7, Ruby 3.4.1
&lt;/p&gt;
&lt;/i&gt;
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Figure 12-11 shows an interesting pattern. Ruby uses size pools 0 through 4 to
save arrays with capacities from 0 up to 78 in a similar way:&lt;/p&gt;
&lt;pre&gt;Capacity Range	Size Pool
0-3			0
4-8			1
9-18			2
19-38			3
39-78			4&lt;/pre&gt;
&lt;p&gt;For each capacity range, the length of the corresponding bars grows steadily,
and then stops growing.&lt;/p&gt;
&lt;p&gt;However, once we started to save large arrays with capacities of 79 or more
elements, Ruby saved them in the original Size Pool Zero again. This indicates
that Ruby stopped embedding the array elements in the size pool entirely, and
instead allocated a new, separate memory segment to save the elements. For these
large arrays, small 40 byte &lt;span class=&quot;code&quot;&gt;RVALUE&lt;/span&gt; slots in Size Pool
Zero were sufficient, because they each contained a pointer to the array data,
and not the embedded array data itself.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2025/2/11/Figure-12-12.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 12-12: A Large Array Saving Its Elements In A Separate Memory Segment
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Figure 12-12 shows how large arrays, arrays which contain 79 or more elements,
do not save their elements inside of the &lt;span class=&quot;code&quot;&gt;Array&lt;/span&gt;
structure, but instead save a pointer (ptr) which contains the location of a
separate memory segment that holds the array elements.&lt;/p&gt;
&lt;p&gt;One key detail of this experiment was in Listing 12-6 at (2): the &lt;span
class=&quot;code&quot;&gt;all&lt;/span&gt; array. The inner loop just below in Listing 12-6 at (4)
saved each new array into the &lt;span class=&quot;code&quot;&gt;all&lt;/span&gt; array. This meant
all the new arrays were in fact still being used and Ruby’s garbage collector
could not reclaim their memory. Without this line of code, we would not have
seen the total number of slots continually increase, preventing us from
discovering which slots Ruby saved the arrays into.&lt;/p&gt;
&lt;p&gt;But how did Ruby’s garbage collector know this, exactly? How does Ruby identify
which values are used and unused by our programs? And how does it reclaim memory
from the unused values? Let’s take a look at Ruby’s garbage collection process
next.&lt;/p&gt;
</content></entry><entry><title>Inserting One New Element into Hashes of Varying Sizes</title><link href="https://patshaughnessy.net/2025/2/4/inserting-one-new-element-into-hashes-of-varying-sizes" rel="alternate"></link><id href="https://patshaughnessy.net/2025/2/4/inserting-one-new-element-into-hashes-of-varying-sizes" rel="alternate"></id><published>2025-02-04T00:00:00Z</published><updated>2025-02-04T00:00:00Z</updated><category>Updating Ruby Under a Microscope</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;p&gt;I've started working on a new edition of &lt;a
href=&quot;http://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.&lt;/p&gt;
&lt;p&gt;RUM includes a serie</summary><content type="html">&lt;p&gt;I've started working on a new edition of &lt;a
href=&quot;http://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.&lt;/p&gt;
&lt;p&gt;RUM includes a series of “experiments:” simple code snippets that show evidence
the book’s explanations are accurate. One of the first experiments I wrote back
in 2013, Experiment 7-2 is a fun way to see exactly when Ruby increases the
number of bins in a hash table. The experiments in RUM are a great way to see
for yourself how Ruby works. They also keep me honest; in fact, I ran this code
again recently using Ruby 3.4.1 and saw different results than what I expected!&lt;/p&gt;
&lt;h2&gt;Chapter 7: The Hash Table: The Workhorse Of Ruby Internals&lt;/h2&gt;
&lt;div style=&quot;font-size: small&quot;&gt;
&lt;table id=&quot;toc&quot;&gt;
	&lt;tr&gt;
		&lt;td&gt;Hash Tables in Ruby&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Saving a Value in a Hash Table&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Retrieving a Value from a Hash Table&lt;/td&gt;&lt;td&gt;5&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 7-1: Retrieving a Value from Hashes of Varying Sizes&lt;/td&gt;&lt;td&gt;7&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How Hash Tables Expand to Accommodate More Values&lt;/td&gt;&lt;td&gt;9&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Hash Collisions and Open Addressing&lt;/td&gt;&lt;td&gt;9&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Searching For an Element Using Open Addressing&lt;/td&gt;&lt;td&gt;11&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Expanding a Hash Table&lt;/td&gt;&lt;td&gt;14&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How Does Ruby Decide How Many Entries And Bins To Use?&lt;/td&gt;&lt;td&gt;15&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 7-2: Inserting One New Element into Hashes of Varying Sizes&lt;/td&gt;&lt;td&gt;17&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Optimization for Small Hashes&lt;/td&gt;&lt;td&gt;20&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How Does Ruby Convert A Packed Hash Into A Hash Table?&lt;/td&gt;&lt;td&gt;22&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How Ruby Implements Hash Functions&lt;/td&gt;&lt;td&gt;23&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 7-3: Using Objects as Keys in a Hash&lt;/td&gt;&lt;td&gt;25&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Summary&lt;/td&gt;&lt;td&gt;30&lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;div style=&quot;float: left; padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img id=&quot;microscope&quot; src=&quot;https://patshaughnessy.net/assets/2025/2/4/experiment.png&quot;&gt;&lt;br/&gt;
&lt;/div&gt;
&lt;h2&gt;Experiment 7-2: Inserting One New Element into Hashes of Varying Sizes&lt;/h2&gt;
&lt;p&gt;One way to test whether this rehashing, or redistribution, of entries really
occurs when Ruby expands a hash is to measure the amount of time Ruby takes to
save one new element into existing hashes of different sizes. As we add more
elements to the same hash, we should eventually see evidence that Ruby is taking
extra time to rehash the elements. &lt;/p&gt;
&lt;div style=&quot;clear: left&quot;&gt;&lt;/div&gt;
&lt;p&gt;The code for this experiment is shown in Listing 7-3.&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#795da3;&quot;&gt;require &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;benchmark&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;#39;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;100&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.times &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;size&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    hashes &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[]
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;10000&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.times &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      hash &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;{}
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      (&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;..&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;size).each &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        hash[rand] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;rand
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end 
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      hashes &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;lt;&amp;lt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;hash
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end 
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;GC&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.disable
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Benchmark&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.bm &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;bench&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      bench.report(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;adding element number &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;#{size&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;+&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;10000&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.times &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;n&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;| 
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;3&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)       hashes[n][size] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;rand
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end 
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end 
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end 
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;GC&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.enable
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;div style=&quot;font-style: italic; margin: 0 0 20px 0&quot;&gt;
  Listing 7-3: Adding one more element to hashes of different sizes  
&lt;/div&gt;
&lt;p&gt;At (1) the outer loop iterates over hash sizes from 0 to 100, and at (2) the
inner loop creates 10,000 hashes of the given size. After disabling garbage
collection, this experiment uses the benchmark library to measure how long it
takes Ruby to insert a single new value at (3) into all 10,000 hashes of the given
size. The results are surprising! Figure 7-13 shows the results for Ruby 3.4.1. &lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;100%&quot; src=&quot;https://patshaughnessy.net/assets/2025/2/4/Figure-7-3.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 7-13: Time to add 10,000 key/value pairs vs. hash size (Ruby 3.4.1) 
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Interpreting these data values from left to right, we see the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;It takes about 0.6 ms to insert the first element into an empty hash (10,000
times). &lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;As the hash size increases from 2 to 8, the amount of time required to insert
a new element is about the same: 0.6ms more or less.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Inserting the 9th key/value pair takes much longer, about 2ms for 10,000
hashes! &lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Next, as the hash size increases from 10 up to 16, once again Ruby can insert
new elements quickly, between 0.6ms and 0.7ms (10,000 times).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A huge spike! It takes almost 3.1ms to insert the 17th element.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;And then once again starting with the 18th element, the time to insert each
element reduced to around 0.7ms-0.8.ms.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A 3rd spike appears when Ruby inserts the 33rd element: almost 5ms.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The graph once again flattens and returns to around 0.7-0.8ms per element (10,000
times).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;And a 4th spike appears when Ruby inserts the 65th element: almost 6ms.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What’s going on here? Well, Ruby spends the extra time required to insert that
17th key/value pair expanding the hash table: reallocating the entries array from
16 to 32 entries, and the bin array from 32 to 64 bins, and then reassigning
the &lt;code&gt;st_table_entry&lt;/code&gt; structures to the new bin array. Ruby expands the entries and
bins arrays a second time again after inserting the 33rd entry, this from from 32
to 64 entries and 64 to 128 bins. (Recall the &lt;code&gt;st_features&lt;/code&gt; table, shown on page
15, used powers of 2 to determine these array sizes.)&lt;/p&gt;
&lt;p&gt;The smaller spike on the 9th insert in this figure is curious.  While not as
pronounced as the spike at the 17th element, this smaller spike is nonetheless
noticeable. As it turns out, Ruby contains another optimization that speeds up
hash access even more for small hashes that contain less than 9 elements.&lt;/p&gt;
</content></entry><entry><title>Updating Ruby Under a Microscope</title><link href="https://patshaughnessy.net/2025/1/28/updating-ruby-under-a-microscope" rel="alternate"></link><id href="https://patshaughnessy.net/2025/1/28/updating-ruby-under-a-microscope" rel="alternate"></id><published>2025-01-28T00:00:00Z</published><updated>2025-01-28T00:00:00Z</updated><category>Updating Ruby Under a Microscope</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;div style=&quot;float: left; padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img id=&quot;microscope&quot; src=&quot;https://patshaughnessy.net/assets/2014/12/17/microscope.png&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Ruby stores much of its own internal data in hash tables.
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;I've started working on a new edition of &lt;a
href=&quot;http://patshaughnessy.net/ruby-unde</summary><content type="html">&lt;div style=&quot;float: left; padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img id=&quot;microscope&quot; src=&quot;https://patshaughnessy.net/assets/2014/12/17/microscope.png&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Ruby stores much of its own internal data in hash tables.
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;I've started working on a new edition of &lt;a
href=&quot;http://patshaughnessy.net/ruby-under-a-microscope&quot;&gt;Ruby Under a
Microscope&lt;/a&gt; that covers Ruby 3.x. I'm working on this in my spare time, so it
will take a while to finish. Leave a comment or &lt;a
href=&quot;mailto:pat@patshaughnessy.net?subject=Ruby Under a Microscope Update&quot;&gt;drop
me a line&lt;/a&gt; and I'll email you when it's finished.&lt;/p&gt;
&lt;p&gt;In the meantime, here’s an excerpt from a rewrite of Chapter 7 about hash
tables. Vladimir Makarov and the Ruby team redesigned Ruby’s hash table
implementation back in 2015 to use open addressing, shortly after I published
the first edition of RUM.  This seemed like a good place to start.&lt;/p&gt;
&lt;div style=&quot;clear: left&quot;&gt;&lt;/div&gt;
&lt;h2&gt;Chapter 7: The Hash Table: The Workhorse Of Ruby Internals&lt;/h2&gt;
&lt;div style=&quot;font-size: small&quot;&gt;
&lt;table id=&quot;toc&quot;&gt;
	&lt;tr&gt;
		&lt;td&gt;Hash Tables in Ruby&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Saving a Value in a Hash Table&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Retrieving a Value from a Hash Table&lt;/td&gt;&lt;td&gt;5&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 7-1: Retrieving a Value from Hashes of Varying Sizes&lt;/td&gt;&lt;td&gt;7&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How Hash Tables Expand to Accommodate More Values&lt;/td&gt;&lt;td&gt;9&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Hash Collisions and Open Addressing&lt;/td&gt;&lt;td&gt;9&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Searching For an Element Using Open Addressing&lt;/td&gt;&lt;td&gt;11&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Expanding a Hash Table&lt;/td&gt;&lt;td&gt;14&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How Does Ruby Decide How Many Entries And Bins To Use?&lt;/td&gt;&lt;td&gt;15&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 7-2: Inserting One New Element into Hashes of Varying Sizes&lt;/td&gt;&lt;td&gt;17&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Optimization for Small Hashes&lt;/td&gt;&lt;td&gt;20&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How Does Ruby Convert A Packed Hash Into A Hash Table?&lt;/td&gt;&lt;td&gt;22&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;How Ruby Implements Hash Functions&lt;/td&gt;&lt;td&gt;23&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Experiment 7-3: Using Objects as Keys in a Hash&lt;/td&gt;&lt;td&gt;25&lt;/td&gt;
	&lt;/tr&gt;
	&lt;tr&gt;
		&lt;td&gt;Summary&lt;/td&gt;&lt;td&gt;30&lt;/td&gt;
	&lt;/tr&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;h2&gt;Hash Tables in Ruby&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Hash tables&lt;/em&gt; are a commonly used, well-known, age-old concept in computer
science. They organize values into groups, or &lt;em&gt;bins&lt;/em&gt;, based on an integer value
calculated from each value — a &lt;em&gt;hash&lt;/em&gt;. When you need to find a value, you can
figure out which bin it’s in by recalculating its hash value, thus speeding up
the search. &lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2025/1/28/bins.png&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Every time you write a method, Ruby creates an entry in a hash table. 
&lt;/span&gt;
&lt;/div&gt;
&lt;div style=&quot;clear: right&quot;&gt;&lt;/div&gt;
&lt;h2&gt;Saving a Value in a Hash Table&lt;/h2&gt;
&lt;p&gt;Figure 7-1 shows a single hash object and its hash table.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;100%&quot; src=&quot;https://patshaughnessy.net/assets/2025/1/28/Figure-7-1.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 7-1: A Ruby hash object with an empty hash table 
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;On the left is the &lt;code&gt;RHash&lt;/code&gt; (short for &lt;em&gt;Ruby hash&lt;/em&gt;) structure. On the right, you see
the hash table used by this hash, represented by the &lt;code&gt;st_table&lt;/code&gt; structure. This C
structure contains the basic information about the hash table, including the
number of entries saved in the table and pointers to the entries and bins. Each
&lt;code&gt;RHash&lt;/code&gt; structure contains a pointer to a corresponding &lt;code&gt;st_table&lt;/code&gt; structure. For
many hashes, Ruby initially creates 32 entries and 64 bins. (Hashes with 8 or
fewer entries work somewhat differently; see “Optimization for Small Hashes” on
page 20.) The best way to understand how a hash table works is by stepping
through an example. Suppose I add a new key/value to a hash called &lt;code&gt;my_hash&lt;/code&gt;: &lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;my_hash[&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;:key&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;value&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;While executing this line of code, Ruby saves the key and value into the first
entry, as shown in Figure 7-2.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;100%&quot; src=&quot;https://patshaughnessy.net/assets/2025/1/28/Figure-7-2.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 7-2: A Ruby hash object containing a single value
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Here you can see the new key/value pair inside the first entry slot, called an
&lt;code&gt;st_table_entry&lt;/code&gt; in Ruby’s C source code. Ruby saves the keys and values in the
entries array in the order you add them. This makes it easy for Ruby to return
them back to you in the same order. Also see that Ruby saved an entry index of 0
in the third bin, number 2. Ruby did this by taking the given key — in this
example, the symbol &lt;code&gt;:key&lt;/code&gt; — and passing it to an internal hash function that
returns a pseudorandom integer: &lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;some_value &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; internal_hash_function(&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;:key&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Next, Ruby takes the hash value — in this example, &lt;code&gt;some_value&lt;/code&gt; — and calculates the modulus by the number of bins, which is the remainder after dividing by the number of bins.&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;some_value &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;% &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;64 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;/pre&gt;

&lt;div style=&quot;margin-bottom: 2.5rem; font-style: italic&quot;&gt;
NOTE: In Figure 7-2, I assume that the actual hash value for &lt;span
class=&quot;code&quot;&gt;:key&lt;/span&gt; divided by 64 leaves a remainder of 2. Later in this
chapter, I’ll explore in more detail the hash functions that Ruby actually uses.
We’ll see how using 64 bins (a power of 2) speeds up this calculation.  &lt;/div&gt;
&lt;p&gt;Now let’s add a second element to the hash:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;my_hash[&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;:key2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;value2&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;This time let’s imagine that the hash value of &lt;code&gt;:key2&lt;/code&gt; divided by 64 yields a
remainder of 5. &lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;internal_hash_function(&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;:key2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;% &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;64 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;5&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Figure 7-3 shows that Ruby fills in a second &lt;code&gt;st_table_entry&lt;/code&gt; structure in the
entries array, and the entry index 1 in bin number 5, the sixth bin.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;100%&quot; src=&quot;https://patshaughnessy.net/assets/2025/1/28/Figure-7-3.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 7-3: A Ruby hash object containing two values
&lt;/span&gt;
&lt;/div&gt;
&lt;h2&gt;Retrieving a Value from a Hash Table&lt;/h2&gt;
&lt;p&gt;The benefit of using a hash table becomes clear when you ask Ruby to retrieve
the value for a given key. For example: &lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;p my_hash[&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;:key&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; =&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;value&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;If Ruby had saved all of the keys and values in an array or linked list, it
would have to iterate over all the elements in that array or list, looking for
:key. This might take a very long time, depending on the number of elements. But
using a hash table, Ruby can jump straight to the key it needs to find by
recalculating the hash value for that key.  To recalculate the hash value for a
particular key, Ruby simply calls the hash function again: &lt;code&gt;some_value = internal_hash_function(:key)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Then, it redivides the hash value by the number of bins to get the remainder, or
the modulus. &lt;code&gt;some_value % 64 = 2&lt;/code&gt; At this point, Ruby knows to look in bin
number 2 and finds the entry index 0, and in turn finds the entry with the key
of &lt;code&gt;:key&lt;/code&gt; in entry number 0 or the first entry slot.  Ruby can later find the
value for &lt;code&gt;:key2&lt;/code&gt; by repeating the same hash calculation
&lt;code&gt;internal_hash_function(:key2) % 64 = 5&lt;/code&gt;.&lt;/p&gt;
&lt;div style=&quot;padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
&lt;img width=&quot;100%&quot; src=&quot;https://patshaughnessy.net/assets/2025/1/28/Figure-7-4.svg&quot;&gt;&lt;br/&gt;
&lt;span style=&quot;font-style: italic; font-size: small&quot;&gt;
  Figure 7-4: Finding Values in a Hash Table
&lt;/span&gt;
&lt;/div&gt;
&lt;p&gt;Figure 7-4 explains how Ruby would find these two values in the hash table: On
the left side, the first arrow starts from the third bin (bin index 2 =
&lt;code&gt;internal_hash_function(:key) % 64&lt;/code&gt;) and points to the first key/value pair, at
index 0. To the right, the second arrow starts from the sixth bin (bin index 5 =
&lt;code&gt;internal_hash_function(:key2) % 64&lt;/code&gt;) and points to the second key/value pair, at
index 1.&lt;/p&gt;
&lt;span style=&quot;font-style: italic&quot;&gt;
NOTE: The C library used by Ruby to implement hash tables was written in the
1980s by Peter Moore from the University of California, Berkeley. Later in 2015
Vladimir Makarov rewrote the hash table code, using a data structure which saves
the entry and bin arrays in contiguous memory segments. By saving all the
entries and bins nearby each other in memory, modern CPUs are able to cache all
of the data in the hash table more efficiently, speeding up the overall process.
You can find Makarov’s hash table code in the C code files st.c and
include/ruby/st.h. All of the function and structure names in that code use the
naming convention st_. The definition of the RHash structure that represents
every Ruby Hash object is in the include/ruby/ruby.h file. Along with RHash,
this file contains all of the other primary object structures used in the Ruby
source code: RString, RArray, RValue, and so on. 
&lt;/span&gt;
</content></entry><entry><title>LLVM IR: The Esperanto of Computer Languages</title><link href="https://patshaughnessy.net/2022/2/19/llvm-ir-the-esperanto-of-computer-languages" rel="alternate"></link><id href="https://patshaughnessy.net/2022/2/19/llvm-ir-the-esperanto-of-computer-languages" rel="alternate"></id><published>2022-02-19T00:00:00Z</published><updated>2022-02-19T00:00:00Z</updated><category>Crystal</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;div style=&quot;float: left; padding: 8px 30px 0px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/esperanto.png&quot;&gt;&lt;br/&gt;
  &lt;i&gt; Esperanto grammar is logical and self&lt;br/&gt;
consistent, designed to be easy to learn. &lt;br/&gt;
  &lt;small&gt; &lt;a title=&quot;Renatoeo, CC BY-SA 4.0 &amp;lt;https://creativecommons.org/licenses/by-sa/4.0&amp;gt;, via Wikimedia Commons&quot; href=&quot;https:/</summary><content type="html">&lt;div style=&quot;float: left; padding: 8px 30px 0px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/esperanto.png&quot;&gt;&lt;br/&gt;
  &lt;i&gt; Esperanto grammar is logical and self&lt;br/&gt;
consistent, designed to be easy to learn. &lt;br/&gt;
  &lt;small&gt; &lt;a title=&quot;Renatoeo, CC BY-SA 4.0 &amp;lt;https://creativecommons.org/licenses/by-sa/4.0&amp;gt;, via Wikimedia Commons&quot; href=&quot;https://commons.wikimedia.org/wiki/File:CARD_GRAM%C3%81TICA_ESPERANTO.png&quot;&gt;via Wikimedia Commons&lt;/a&gt;&lt;/small&gt; &lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;I empathize for people who have to learn English as a foreign language. English
grammar is inconsistent, arbitrary and hard to master. English spelling is even
worse. I sometimes find myself apologizing for my language’s shortcomings. But
learning any foreign language as an adult is very difficult.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Esperanto&quot;&gt;Esperanto&lt;/a&gt;, an “artificial language,”
is different. Invented by Ludwik Zamenhof in 1873, Esperanto has a vocabulary
and grammar that are logical and consistent, designed to be easier to learn.
Zamenhof intended Esperanto to become the universal second language.&lt;/p&gt;
&lt;p&gt;Computers have to learn foreign languages too. Every time you compile and run
a program, your compiler translates your code into a foreign language: the
native machine language that runs on your target platform. Compilers should
have been called translators. And compilers struggle with the same things we
do: inconsistent grammar and vocabulary, and other peculiarities of the target
platform.&lt;/p&gt;
&lt;p&gt;Recently, however, more and more compilers translate your code to an
artificial machine language. They produce a simpler, more consistent, more
powerful machine language that doesn’t actually run on any machine. This
artificial machine language, LLVM IR, makes writing compilers simpler and
reading the code compilers produce simpler too.&lt;/p&gt;
&lt;p&gt;LLVM IR is becoming the universal second language for compilers.&lt;/p&gt;
&lt;h2&gt;One Line of LLVM IR&lt;/h2&gt;
&lt;p&gt;The &lt;a href=&quot;https://llvm.org&quot;&gt;Low Level Virtual Machine&lt;/a&gt; (LLVM) project had the novel
idea of inventing a virtual machine that was easy for compiler engineers to use
as a target platform. The LLVM team designed a special instruction set called
&lt;a href=&quot;https://llvm.org/docs/LangRef.html&quot;&gt;intermediate representation&lt;/a&gt; (IR). New,
modern languages such as Rust, Swift, Clang-based versions of C and many
others, first translate your code to LLVM IR. Then they use the LLVM framework
to convert the IR into actual machine language for any target platform LLVM
supports:&lt;/p&gt;
&lt;img style=&quot;width: 500px; margin-bottom: 20px&quot; src=&quot;https://patshaughnessy.net/assets/2022/2/19/platforms.svg&quot;&gt;
&lt;p&gt;LLVM is great for compilers. Compiler engineers don’t have to worry about the
detailed instruction set of each platform, and LLVM optimizes your code for
whatever platform you choose automatically. And LLVM is also great for people
like me who are interested in what machine language instructions look like and
how CPUs execute them. LLVM instructions are much easier to follow than real
machine instructions. Let’s take a look at one!&lt;/p&gt;
&lt;p&gt;Here’s a line of LLVM IR I generated from a simple
&lt;a href=&quot;https://crystal-lang.org&quot;&gt;Crystal&lt;/a&gt; program:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;%57 = call %&quot;Array(Int32)&quot;* @&quot;*Array(Int32)@Array(T)::unsafe_build&lt;Int32&gt;:Array(Int32)&quot;(i32 610, i32 2), !dbg !89&lt;/pre&gt;
&lt;p&gt;Wait a minute! This isn’t simple or easy to follow at all! What am I talking
about here? At first glance, this does look confusing. But as we’ll see, most
of the confusing syntax is related to Crystal, not LLVM. Studying this line of
code will reveal more about Crystal than it will about LLVM.&lt;/p&gt;
&lt;p&gt;The rest of this article will unpack and explain what this line of code means.
It looks complex, but is actually quite simple.&lt;/p&gt;
&lt;h2&gt;The Call Instruction&lt;/h2&gt;
&lt;p&gt;The instruction above is a function call in LLVM IR. To produce this code, I
wrote a small Crystal program and then translated it using this command:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;$ crystal build array_example.cr --emit llvm-ir&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;--emit&lt;/code&gt; option directed Crystal to generate a file called array_example.ll,
which contains the line above along with thousands of other lines. We’ll get to
the Crystal code in a minute. But for now, how do I get started understanding
what the LLVM code means?&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://llvm.org/docs/LangRef.html&quot;&gt;LLVM Language Reference Manual&lt;/a&gt; has
documentation for &lt;code&gt;call&lt;/code&gt; and all of the other LLVM IR instructions. Here’s the
syntax for &lt;code&gt;call&lt;/code&gt;:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;&amp;lt;result&gt; = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(&amp;lt;num&gt;)]
         &amp;lt;ty&gt;|&amp;lt;fnty&gt; &amp;lt;fnptrval&gt;(&amp;lt;function args&gt;) [fn attrs] [ operand bundles ]&lt;/pre&gt;
&lt;p&gt;My example &lt;code&gt;call&lt;/code&gt; instruction doesn’t use many of these options. Removing the
unused options, I can see the actual, basic syntax of &lt;code&gt;call&lt;/code&gt;:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;&amp;lt;result&gt; = call &amp;lt;ty&gt; &amp;lt;fnptrval&gt;(&amp;lt;function args&gt;)&lt;/pre&gt;
&lt;p&gt;In order from left to right, these values are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;&amp;lt;result&amp;gt;&lt;/code&gt; which register to save the result in&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;&amp;lt;ty&amp;gt;&lt;/code&gt; the type of the return value&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;&amp;lt;fnptrval&amp;gt;&lt;/code&gt; a pointer to the function to call&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;&amp;lt;function args&amp;gt;&lt;/code&gt; the arguments to pass to that function&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What does all of this mean, exactly? Let’s find out!&lt;/p&gt;
&lt;h2&gt;A CPU With Infinite Registers&lt;/h2&gt;
&lt;p&gt;Starting on the left and moving right, let’s step through the &lt;code&gt;call&lt;/code&gt; instruction:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/result.svg&quot;&gt;
&lt;p&gt;The token &lt;code&gt;%57&lt;/code&gt; to the left of the equals sign tells LLVM where to save the
return value of the function call that follows. This isn’t a normal variable;
&lt;code&gt;%57&lt;/code&gt; is an LLVM “register.”&lt;/p&gt;
&lt;p&gt;Registers are physical circuits located on microprocessor chips used to save
intermediate values. Saving a value in a CPU register is much faster than
saving a value in memory, since the register is located on the same chip as the
rest of the microprocessor. Saving a value in RAM memory, on the other hand,
requires transmitting that value from one chip to another and is much slower,
relatively speaking. Unfortunately, each CPU has a limited number of registers
available, and so compilers have to decide which values are used frequently
enough to warrant saving in nearby registers, and which other values can be
moved out to more distant memory.&lt;/p&gt;
&lt;p&gt;Unlike the limited number of registers available on a real CPU, the imaginary
LLVM microprocessor has an infinite number of them. Because of this, compilers
that target LLVM can simply save values to a register whenever they would like.
There’s no need to find an available register, or to move an existing value out
of a register first before using it for something else. Busy work that normal
machine language code can’t avoid.&lt;/p&gt;
&lt;p&gt;In this program, the Crystal compiler had already saved 56 other values in
“registers” and so for this line of LLVM IR, Crystal simply used the next
register, number 57.&lt;/p&gt;
&lt;h2&gt;LLVM Structure Types&lt;/h2&gt;
&lt;p&gt;Moving left to right, LLVM &lt;code&gt;call&lt;/code&gt; instructions next indicate the type of the
function call’s return value:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/type.svg&quot;&gt;
&lt;p&gt;This name of this type, &lt;code&gt;Array(Int32)&lt;/code&gt;, is generated by the Crystal compiler, not
by LLVM. That is, this is a type from my Crystal program. It could have been
anything, and indeed other compilers that target LLVM will generate completely
different type names.&lt;/p&gt;
&lt;p&gt;The example Crystal program I used to generate this LLVM code was:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;puts arr[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;When I compiled this program, Crystal generated the &lt;code&gt;call&lt;/code&gt; instruction above,
which returns a pointer to the new array, &lt;code&gt;arr&lt;/code&gt;. Since &lt;code&gt;arr&lt;/code&gt; is an array
containing integers, Crystal uses a generic type &lt;code&gt;Array(Int32).&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Machine languages that target real machines only support hardware types that
machine supports.  For example, Intel x86 assembly language allows you to save
integers of different widths, 16, 32 or 64 bits for example, and an Intel x86
CPU has registers designed to hold values of each of these sizes.&lt;/p&gt;
&lt;p&gt;LLVM IR is more powerful. It supports “structure types,” similar to a C
structure or an object in a language like Crystal or Swift. Here the &lt;code&gt;%&amp;quot;…&amp;quot;&lt;/code&gt;
syntax indicates the name inside the quotes is the name of a structure type.
And the asterisk which follows, like in C, indicates the type of the return
value of my function call is a pointer to this structure.&lt;/p&gt;
&lt;p&gt;My example LLVM program defines the type &lt;code&gt;Array(Int32)&lt;/code&gt; like this:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;%&quot;Array(Int32)&quot; = type { i32, i32, i32, i32, i32* }&lt;/pre&gt;
&lt;p&gt;Structure types allow LLVM IR programs to create pointers to structures or
objects, and to access any of the values inside each object. That makes writing
a compiler much easier. In my example, the call instruction returns a pointer
to an object which contains 4 32-bit integer values, followed by a pointer to
other 32 integer values. But what are all of these integer values? Above I said
this function call was returning a new array - how can that be the case?&lt;/p&gt;
&lt;p&gt;LLVM itself has no idea, and no opinion on the matter. To understand what these
values are, and what they have to do with the array in my program, we need to
learn more about the Crystal compiler that generated this LLVM IR code.&lt;/p&gt;
&lt;p&gt;Reading the &lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/array.cr#L48&quot;&gt;Crystal standard
library&lt;/a&gt;,
we can see Crystal implements arrays like this:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;class &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Array&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(T)
&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;include &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Indexable&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;::Mutable(T)
&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;include &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Comparable(Array)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Size of an Array that we consider small to do linear scans or other optimizations.
&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;private &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;SMALL_ARRAY_SIZE &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;16 
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# The size of this array.
&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;size &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Int32
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# The capacity of `@buffer`.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Note that, because `@buffer` moves on shift, the actual
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# capacity (the allocated memory) starts at `@buffer - @offset_to_buffer`.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# The actual capacity is also given by the `remaining_capacity` internal method.
&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;capacity &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Int32
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Offset to the buffer that was originally allocated, and which needs to
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# be reallocated on resize. On shift this value gets increased, together with
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# `@buffer`. To reach the root buffer you have to do `@buffer - @offset_to_buffer`,
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# and this is also provided by the `root_buffer` internal method.
&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;offset_to_buffer &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Int32 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# The buffer where elements start.
&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;buffer &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Pointer(T)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# In 64 bits the Array is composed then by:
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# - type_id            : Int32   # 4 bytes -|
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# - size               : Int32   # 4 bytes  |- packed as 8 bytes
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# - capacity           : Int32   # 4 bytes -|
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# - offset_to_buffer   : Int32   # 4 bytes  |- packed as 8 bytes
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# - buffer             : Pointer # 8 bytes  |- another 8 bytes&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;The comments above are very illustrative and complete - the Crystal team took
the time to document their standard library and explain not only how to use
each class, like &lt;code&gt;Array(T)&lt;/code&gt;, but how they are implemented internally.&lt;/p&gt;
&lt;p&gt;In this case, we can see the four &lt;code&gt;i32&lt;/code&gt; values inside the &lt;code&gt;Array(Int32)&lt;/code&gt; LLVM
structure type hold the size and capacity off the array, among other things.
And the &lt;code&gt;i32*&lt;/code&gt; value is a pointer to the actual contents of the array.&lt;/p&gt;
&lt;h2&gt;Functions&lt;/h2&gt;
&lt;p&gt;The target of the call instruction appears next, after the return type:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/function.svg&quot;&gt;
&lt;p&gt;This is quite a mouthful! What sort of function is this?&lt;/p&gt;
&lt;p&gt;There are two steps to understanding this: First, the &lt;code&gt;@&amp;quot;…&amp;quot;&lt;/code&gt; syntax. This is
simply a global identifier in this LLVM program. So my &lt;code&gt;call&lt;/code&gt; instruction is just
calling a global function. In LLVM programs, all functions are global; there is
no concept of a class, module or similar groupings of code.&lt;/p&gt;
&lt;p&gt;But what in the world does that crazy identifier mean?&lt;/p&gt;
&lt;p&gt;LLVM ignores this complex name. For LLVM this is just a name like &lt;code&gt;foo&lt;/code&gt; or &lt;code&gt;bar&lt;/code&gt;.
But for Crystal, the name has much more significance. Crystal encoded a lot of
information into this one name. Crystal can do this because the LLVM code isn’t
intended for anyone to read directly. Crystal has created a “mangled name,”
meaning the original version of the function to call is there but it’s been
mangled or rewritten in a confusing manner.&lt;/p&gt;
&lt;p&gt;Crystal rewrites function names to ensure they are unique. In Crystal, like in
many other statically typed languages, functions with different argument types
or return value types are actually different functions. So in Crystal if I
write:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;foo&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(a : Int32)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;puts &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Int: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;#{a}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;foo&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(a : String)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;puts &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;String: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;#{a}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;foo(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;123&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#=&amp;gt; Int: 123
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;foo(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;123&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#=&amp;gt; String: 123&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;…I have two separate, different functions both called &lt;code&gt;foo&lt;/code&gt;. The type of the
parameter &lt;code&gt;a&lt;/code&gt; distinguishes one from the other.&lt;/p&gt;
&lt;p&gt;Crystal generates unique function names by encoding the arguments, return value
and type of the receiver into the into the function name string, making it
quite complex. Let’s break it down:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/mangled.svg&quot;&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;Array(Int32)@Array(T)&lt;/code&gt; - this is the type of the receiver. That means the
&lt;code&gt;unsafe_build&lt;/code&gt; function is actually a method on the &lt;code&gt;Array(T)&lt;/code&gt; generic class.
And in this case, the receiver is an array holding 32 bit integers, the
&lt;code&gt;Array(Int32)&lt;/code&gt; class. Crystal includes both names in the mangled function name.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;unsafe_build&lt;/code&gt; - this is the function Crystal is calling.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;Int32&lt;/code&gt; - these are the function’s parameter types. In this case, Crystal is
passing in a single integer, so we just see one &lt;code&gt;Int32&lt;/code&gt; type.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;Array(Int32)&lt;/code&gt; - this is the return value type, a new array containing integers.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As I discussed in &lt;a href=&quot;https://patshaughnessy.net/2022/1/22/visiting-an-abstract-syntax-tree&quot;&gt;my last
post&lt;/a&gt;,
the Crystal compiler internally rewrites my array literal expression &lt;code&gt;[12345, 67890]&lt;/code&gt; into code that creates and initializes a new array object:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;__temp_621 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;::Array(typeof(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)).unsafe_build(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_622 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; __temp_621.to_unsafe
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_622[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_622[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_621&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;In this expanded code, Crystal calls &lt;code&gt;unsafe_build&lt;/code&gt; and passes in &lt;code&gt;2&lt;/code&gt;, the
required capacity of the new array. And to distinguish this use of
&lt;code&gt;unsafe_build&lt;/code&gt; from other &lt;code&gt;unsafe_build&lt;/code&gt; functions that might exist in my
program, the compiler generated the mangled name we see above. &lt;/p&gt;
&lt;h2&gt;Arguments&lt;/h2&gt;
&lt;p&gt;Finally, after the function name the LLVM IR instruction shows the arguments
for the function call:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/args.svg&quot;&gt;
&lt;p&gt;LLVM IR uses parentheses, like most languages, to enclose the arguments to a
function call. And the types precede each value: &lt;code&gt;610&lt;/code&gt; is a 32-bit integer and
&lt;code&gt;2&lt;/code&gt; is also a 32-bit integer.&lt;/p&gt;
&lt;p&gt;But wait a minute! We saw just above the expanded Crystal code for generating
the array literal passes a single value, &lt;code&gt;2&lt;/code&gt;, into the call to &lt;code&gt;unsafe_build&lt;/code&gt;.
And looking at the mangled function name above, we also see there is a single
&lt;code&gt;i32&lt;/code&gt; parameter to the function call.&lt;/p&gt;
&lt;p&gt;But reading the LLVM IR code we can see a second value is also passed in:
&lt;code&gt;610&lt;/code&gt;. What in the world does &lt;code&gt;610&lt;/code&gt; mean? I don’t have 610 elements in my new
array, and 610 is not one of the array elements. So what is going on here?&lt;/p&gt;
&lt;p&gt;Crystal is an object oriented language, meaning that each function is
optionally associated with a class. In OOP parlance, we say that we are
“sending a message” to a “receiver.” In this case, &lt;code&gt;unsafe_build&lt;/code&gt; is the message,
and &lt;code&gt;::Array(typeof(12345, 67890))&lt;/code&gt; is the receiver. In fact, this function is
really a class method. We are calling &lt;code&gt;unsafe_build&lt;/code&gt; on the &lt;code&gt;Array(Int32)&lt;/code&gt; class,
not on an instance of one array.&lt;/p&gt;
&lt;p&gt;Regardless, LLVM IR does’t support classes or instance methods or class
methods. In LLVM IR, we only have simple, global functions. And indeed, the
LLVM virtual machine doesn’t care what these arguments are or what they mean.
LLVM doesn’t encode the meaning or purpose of each argument; it just does what
the Crystal compiler tells it to do.&lt;/p&gt;
&lt;p&gt;But Crystal, on the other hand, has to implement object oriented behavior
somehow. Specifically, the &lt;code&gt;unsafe_build&lt;/code&gt; function needs to behave differently
depending on which class it was called for, depending on what the receiver is.
For example:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;::Array(typeof(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)).unsafe_build(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;… has to return an array of two integers. While:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;::Array(typeof(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;abc&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;def&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)).unsafe_build(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;…has to return an array of two strings. How does this work in the LLVM IR code?&lt;/p&gt;
&lt;p&gt;To implement object oriented behavior, Crystal passes the receiver as a hidden,
special argument to the function call:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/args2.svg&quot;&gt;
&lt;p&gt;This receiver argument is a reference or pointer to the receiver’s object, and
is normally known as &lt;code&gt;self&lt;/code&gt;. Here &lt;code&gt;610&lt;/code&gt; is a reference or tag corresponding to
the &lt;code&gt;Array(Int32)&lt;/code&gt; class, the receiver. And &lt;code&gt;2&lt;/code&gt; is the actual argument to the
&lt;code&gt;unsafe_build&lt;/code&gt; method.&lt;/p&gt;
&lt;p&gt;Reading the LLVM IR code, we’ve learned that Crystal secretly passes a hidden
&lt;code&gt;self&lt;/code&gt; argument to every method call to an object. Then inside each method, the
code has access to &lt;code&gt;self&lt;/code&gt;, to the object instance that code is running for. Some
languages, like Rust, require us to pass &lt;code&gt;self&lt;/code&gt; explicitly in each method call;
in Crystal this behavior is automatic and hidden.&lt;/p&gt;
&lt;h2&gt;Learning How Compilers Work&lt;/h2&gt;
&lt;p&gt;LLVM IR is a simple language designed for compiler engineers. I think of it
like a blank slate for them to write on. Most LLVM instructions are quite
simple and easy to understand; as we saw above, understanding the basic syntax
of the call instruction wasn’t hard at all.&lt;/p&gt;
&lt;p&gt;The hard part was understanding how the Crystal compiler, which targets LLVM
IR, generates code. The LLVM syntax itself was easy to follow; it was the
Crystal language’s implementation that was harder to understand.&lt;/p&gt;
&lt;p&gt;And this is the real reason to learn about LLVM IR syntax. If you take the time
to learn how LLVM instructions work, then you can start to read the code your
favorite language’s compiler generates. And once you can do that, you can learn
more about how your favorite compiler works, and what your programs actually do
when you run them.&lt;/p&gt;
</content></entry><entry><title>Visiting an Abstract Syntax Tree</title><link href="https://patshaughnessy.net/2022/1/22/visiting-an-abstract-syntax-tree" rel="alternate"></link><id href="https://patshaughnessy.net/2022/1/22/visiting-an-abstract-syntax-tree" rel="alternate"></id><published>2022-01-22T00:00:00Z</published><updated>2022-01-22T00:00:00Z</updated><category>Crystal</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;div style=&quot;float: left; padding: 8px 30px 0px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2022/1/22/visit-tree.jpg&quot;&gt;&lt;br/&gt;
  &lt;i&gt;Joshua Tree National Park
  &lt;small&gt;(via: &lt;a href=&quot;https://commons.wikimedia.org/wiki/File:Backpacker_at_Sunset_(22849298523).jpg&quot;&gt;Wikimedia Commons&lt;/a&gt;)&lt;/small&gt;
  &lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;In my &lt;a href=&quot;https://patshaughnessy.net/2021/1</summary><content type="html">&lt;div style=&quot;float: left; padding: 8px 30px 0px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2022/1/22/visit-tree.jpg&quot;&gt;&lt;br/&gt;
  &lt;i&gt;Joshua Tree National Park
  &lt;small&gt;(via: &lt;a href=&quot;https://commons.wikimedia.org/wiki/File:Backpacker_at_Sunset_(22849298523).jpg&quot;&gt;Wikimedia Commons&lt;/a&gt;)&lt;/small&gt;
  &lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;In my &lt;a href=&quot;https://patshaughnessy.net/2021/12/22/reading-code-like-a-compiler&quot;&gt;last
post&lt;/a&gt;, I
explored how &lt;a href=&quot;https://crystal-lang.org&quot;&gt;Crystal&lt;/a&gt; parsed a simple program and
produced a data structure called an &lt;a href=&quot;https://en.wikipedia.org/wiki/Abstract_syntax_tree&quot;&gt;abstract syntax
tree&lt;/a&gt; (AST). But what does
Crystal do with the AST? Why bother going to such lengths to create it?&lt;/p&gt;
&lt;p&gt;After Crystal parses my code, it repeatedly steps through all the entries or
nodes in the AST and builds up a description of the intended meaning and
behavior of my code. This process is known as &lt;em&gt;semantic analysis&lt;/em&gt;. Later,
Crystal will use this description to convert my program into a machine language
executable.&lt;/p&gt;
&lt;p&gt;But what does this description contain? What does it really mean for a compiler
to &lt;em&gt;understand&lt;/em&gt; anything? Let’s pack our bags and visit an abstract syntax tree
with Crystal to find out.&lt;/p&gt;
&lt;div style=&quot;clear: both&quot;&gt;&lt;/div&gt;
&lt;h2&gt;The Visitor Pattern&lt;/h2&gt;
&lt;p&gt;Imagine several tourists visiting a famous tree: Each of them sees the same
tree in a different way. The tree doesn’t change, but the perspective of each
person looking at it is different. They each take a different photo, or
remember different details.&lt;/p&gt;
&lt;p&gt;In Computer Science this separation of the data structure (the tree) from the
algorithms using it (the tourists) is known as the &lt;a href=&quot;https://en.wikipedia.org/wiki/Visitor_pattern&quot;&gt;visitor
pattern&lt;/a&gt;. This technique allows
compilers and other programs to run multiple algorithms on the same data
structure without making a mess.&lt;/p&gt;
&lt;p&gt;The visitor pattern calls for two functions: &lt;code&gt;accept&lt;/code&gt; and &lt;code&gt;visit&lt;/code&gt;. First, a
node in the data structure “accepts” a visitor:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;400px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/visitor1.svg&quot;&gt;
&lt;p&gt;After accepting a visitor, the node turns around and calls the &lt;code&gt;visit&lt;/code&gt; method on
&lt;code&gt;Visitor&lt;/code&gt;:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;400px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/visitor2.svg&quot;&gt;
&lt;p&gt;The &lt;code&gt;visit&lt;/code&gt; method implements whatever algorithm that visitor is interested in.&lt;/p&gt;
&lt;p&gt;This seems kind of pointless… why use &lt;code&gt;accept&lt;/code&gt; at all? We could just call
&lt;code&gt;visit&lt;/code&gt; directly. The key is that, after calling the visitor and passing
itself, the node passes the visitor to each of its children, recursively:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;400px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/visitor3.svg&quot;&gt;
&lt;p&gt;And then the visitor can visit each of the child nodes also. The &lt;code&gt;Visitor&lt;/code&gt;
class doesn’t necessarily need to know anything about how to navigate the node
data structure. And more and more visitor classes can implement new algorithms
without changing the underlying data structure and breaking each other.&lt;/p&gt;
&lt;h2&gt;The Visitor Pattern in the Crystal Compiler&lt;/h2&gt;
&lt;p&gt;In order to understand what my code means, Crystal reads through my program’s
AST over and over again using different visitors. Each algorithm looks for
certain syntax, records information about the types and objects my code uses or
possibly even transforms my code into a different form.&lt;/p&gt;
&lt;div style=&quot;float: right; padding: 8px 0px 0px 30px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2022/1/22/angel-oak.jpg&quot;&gt;&lt;br/&gt;
  &lt;i&gt;A photo I took in 2018 of &lt;a href=&quot;https://en.wikipedia.org/wiki/Angel_Oak&quot;&gt;Angel Oak&lt;/a&gt;,&lt;br/&gt; a 400-500 year old tree in South Carolina.&lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;Crystal implements the basics of the visitor pattern in
&lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/compiler/crystal/syntax/visitor.cr#L24&quot;&gt;visitor.cr&lt;/a&gt;,
inside the superclass of all AST nodes:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;class &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;ASTNode
&lt;/span&gt;&lt;span style=&quot;color:#343d46;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;accept&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(visitor)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; visitor.visit_any self
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; visitor.visit self
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        accept_children visitor
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      visitor.end_visit self
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      visitor.end_visit_any self
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Each subclass of &lt;code&gt;ASTNode&lt;/code&gt; implements its own version of &lt;code&gt;accept_children&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;My Tiny Crystal Program&lt;/h2&gt;
&lt;p&gt;To get a sense of how the visitor pattern works inside of Crystal, let’s look
at one line of code from my last post:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;As I explained last month, the Crystal parser generates this AST tree fragment:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;400px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/ast1.svg&quot;&gt;
&lt;p&gt;Once the parser is finished and has created this small tree, the Crystal
compiler steps through it a number of different times, looking for classes,
variables, type declarations, etc. Each of these passes through the AST is
performed by a different visitor class: &lt;code&gt;TopLevelVisitor&lt;/code&gt;,
&lt;code&gt;InstanceVarsInitializerVisitor&lt;/code&gt; or &lt;code&gt;ClassVarsInitializerVisitor&lt;/code&gt; among many
others.&lt;/p&gt;
&lt;p&gt;The most important visitor class the Crystal compiler uses is called simply
&lt;code&gt;MainVisitor&lt;/code&gt;. You can find the code for &lt;code&gt;MainVisitor&lt;/code&gt; in
&lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/compiler/crystal/semantic/main_visitor.cr#L26&quot;&gt;main_visitor.cr&lt;/a&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a7adba;&quot;&gt;# This is the main visitor of the program, ran after types have been declared
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# and their type declarations (like `@x : Int32`) have been processed.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# This visits the &amp;quot;main&amp;quot; code of the program and resolves calls, instantiates
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# methods and visits them, recursively, with other MainVisitors.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# The visitor keeps track of a method&amp;#39;s variables (or the main program, split into
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# several files, in case of top-level code). It keeps track both of the type of a
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# variable at a single point (stored in @vars) and the combined type of all assignments
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# to it (in @meta_vars).
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Call resolution logic is in `Call#recalculate`, where method lookup is done.
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;class &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;MainVisitor &lt;/span&gt;&lt;span style=&quot;color:#343d46;&quot;&gt;&amp;lt; &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;SemanticVisitor&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Since Crystal supports typed parameters and method overloading, the visitor
class implements a different &lt;code&gt;visit&lt;/code&gt; method for each type of node that it visits,
for example:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;class &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;MainVisitor &lt;/span&gt;&lt;span style=&quot;color:#343d46;&quot;&gt;&amp;lt; &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;SemanticVisitor
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;visit&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(node : Assign)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;visit&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(node : Var)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;visit&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(node : ArrayLiteral)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;visit&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(node : NumberLiteral)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Etc…&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Now let’s look at three examples of what the &lt;code&gt;MainVisitor&lt;/code&gt; class does with my
code: identifying variables, assigning types and expanding array literals. The
Crystal compiler is much too complex to describe in a single blog post, but
hopefully I can give you glimpse into the sort of work Crystal does during
semantic analysis.&lt;/p&gt;
&lt;h2&gt;Identifying Variables&lt;/h2&gt;
&lt;p&gt;Obviously, my example code creates and initializes a variable called &lt;code&gt;arr&lt;/code&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;But how does Crystal identify this variable’s name and value? What does it do
with &lt;code&gt;arr&lt;/code&gt;?&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;MainVisitor&lt;/code&gt; class starts to process my code by visiting the root node of
this branch of my AST, the &lt;code&gt;Assign&lt;/code&gt; node:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;375px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/visit-assign1.svg&quot;&gt;
&lt;p&gt;As you can see, earlier during the parsing phrase Crystal had saved the target
variable and value of this assign statement in the child AST nodes. The target
variable, &lt;code&gt;arr&lt;/code&gt;, appears in the &lt;code&gt;Var&lt;/code&gt; node, and the value to assign is an
&lt;code&gt;ArrayLiteral&lt;/code&gt; node. The &lt;code&gt;MainVisitor&lt;/code&gt; now knows I declared a new variable, called
&lt;code&gt;arr&lt;/code&gt;, in the current lexical scope. Since my program has no classes, methods or
any other lexical scopes, Crystal saves this variable in a table of variables
for the top level program:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;300px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/table.svg&quot;&gt;
&lt;p&gt;Actually, to be more accurate, there will always be many other variables in
this table along with &lt;code&gt;arr&lt;/code&gt;. All Crystal programs automatically include the
standard library, so Crystal also saves all of the top level variables from the
standard library in this table.&lt;/p&gt;
&lt;p&gt;In a more normal program, there will be many lexical scopes for different
method and class or module definitions, and &lt;code&gt;MainVisitor&lt;/code&gt; will save each
variable in the corresponding table.&lt;/p&gt;
&lt;h2&gt;Assigning Types&lt;/h2&gt;
&lt;p&gt;Probably the most important function of &lt;code&gt;MainVisitor&lt;/code&gt; is to assign a type to each
value in my program. The simplest example of this is when &lt;code&gt;MainVisitor&lt;/code&gt; visits a
&lt;code&gt;NumberLiteral&lt;/code&gt; node:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;300px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/visit-number-literal.svg&quot;&gt;
&lt;p&gt;Looking at the size of the numeric value, Crystal determines the type should be
&lt;code&gt;Int32&lt;/code&gt;. Crystal then saves this type right inside of the &lt;code&gt;NumberLiteral&lt;/code&gt; node:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;114px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/updated-number-literal.svg&quot;&gt;
&lt;p&gt;Strictly speaking, this violates the visitor pattern because the visitors
shouldn’t be modifying the data structure they visit. But the type of each
node, the type of each programming construct in my program, is really an
integral part of that node. In this case the &lt;code&gt;MainVisitor&lt;/code&gt; is really just
completing each node. It’s not changing the structure of the AST in this case…
although as we’ll see in a minute the &lt;code&gt;MainVisitor&lt;/code&gt; does this for other nodes!&lt;/p&gt;
&lt;h2&gt;Type Inference&lt;/h2&gt;
&lt;p&gt;Sometimes type values can’t be determined from the intrinsic value of an AST
node. Often the type of a node is determined by other nodes in the AST.&lt;/p&gt;
&lt;p&gt;Recall my example line of code is:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Here Crystal automatically sets the type of the arr variable to the type of the
array literal expression: &lt;code&gt;Array(Int32)&lt;/code&gt;. In Computer Science, this is known as
&lt;em&gt;type inference&lt;/em&gt;. Because Crystal can automatically determine the type of
&lt;code&gt;arr&lt;/code&gt;, I don’t need to declare it explicitly by writing something more
complicated like this:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; uninitialized Array(Int32)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Type inference allows me to write concise, clean code with fewer type
annotations. Most modern, statically typed languages such as Swift, Rust,
Julia, Kotlin, etc., use type inference in the same way as Crystal. Even newer
versions of Java or C++ use type inference.&lt;/p&gt;
&lt;p&gt;The Crystal compiler implements type inference when the MainVisitor encounters
an &lt;code&gt;Assign&lt;/code&gt; AST node, what we saw above.&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;375px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/visit-assign1.svg&quot;&gt;
&lt;p&gt;After encountering the &lt;code&gt;Assign&lt;/code&gt; node, Crystal recursively processes one of the
two child nodes, the &lt;code&gt;ArrayLiteral&lt;/code&gt; value, and its child nodes. When this process
finishes, Crystal knows the type of the &lt;code&gt;ArrayLiteral&lt;/code&gt; node is &lt;code&gt;Array(Int32)&lt;/code&gt;:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;425px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/set-type.svg&quot;&gt;
&lt;p&gt;I’ll take a closer look at how Crystal processes the &lt;code&gt;ArrayLiteral&lt;/code&gt; node next.
But for now, once Crystal has the type of the &lt;code&gt;ArrayLiteral&lt;/code&gt; node it copies that
type over to the &lt;code&gt;Var&lt;/code&gt; node and sets its type also:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;425px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/set-type2.svg&quot;&gt;
&lt;p&gt;But Crystal does something else interesting here: It sets up a dependency
between the two AST nodes: it “binds” the variable to the value:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;325px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/bind.svg&quot;&gt;
&lt;p&gt;This binding dependency allows Crystal to later update the type of the &lt;code&gt;arr&lt;/code&gt;
variable whenever necessary. In this case the value &lt;code&gt;[12345, 67890]&lt;/code&gt; will always
have the same type, but I suspect that sometimes the Crystal compiler can
update types during semantic analysis. In this way if the Crystal compiler ever
changed its mind about the type of some value, it can easy update the types of
any dependent values. I also suspect Crystal uses these type dependency
connections to produce error messages whenever you pass an incorrect type to
some function, for example. These are just guesses, however; if anyone from the
Crystal team knows exactly what these type bindings are used for let me know.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Update:&lt;/b&gt; Ary Borenszweig explained that sometimes the Crystal compiler
updates the type of variables based on how they are used. He posted an
interesting example on &lt;a href=&quot;https://forum.crystal-lang.org/t/visiting-an-abstract-syntax-tree/4304&quot;&gt;The Crystal Programming Language
Forum&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Expanding an Array Literal&lt;/h2&gt;
&lt;p&gt;So far we’ve seen Crystal set the type of the &lt;code&gt;NumberLiteral&lt;/code&gt; node to &lt;code&gt;Int32&lt;/code&gt;,
and we’ve seen Crystal assign &lt;code&gt;arr&lt;/code&gt; a type of &lt;code&gt;Array(Int32)&lt;/code&gt;. But how did Crystal
determine the type of the array literal &lt;code&gt;[12345, 67890]&lt;/code&gt;?&lt;/p&gt;
&lt;p&gt;This is where things get even more complicated. Sometimes during semantic
analysis the Crystal compiler completely rewrites parts of your code, replacing
it with something else. This happens even with my simple example. When visiting
the &lt;code&gt;ArrayLiteral&lt;/code&gt; node, the &lt;code&gt;MainVisitor&lt;/code&gt; expands this simple line of code into
something more complex:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;__temp_621 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;::Array(typeof(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)).unsafe_build(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_622 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; __temp_621.to_unsafe
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_622[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_622[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_621&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Reading this, you can see how later my compiled program will create the new
array. First Crystal creates an empty array with a capacity of 2, and an
element type of &lt;code&gt;Int32&lt;/code&gt;. &lt;code&gt;typeof(12345, 67890)&lt;/code&gt; returns the type (or multiple
types inside a union type) found in the given set of values, in this case just
&lt;code&gt;Int32&lt;/code&gt;. Later Crystal sets the two elements in the array just by assigning
them.&lt;/p&gt;
&lt;p&gt;Crystal achieves this by replacing part of my program’s AST with a new branch:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;375px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/expanded-ast.svg&quot;&gt;
&lt;p&gt;For clarity, I’m not drawing the AST nodes for the inner assign operations,
only the first line:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;__temp_621 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;::Array(typeof(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)).unsafe_build(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)&lt;/span&gt;&lt;/pre&gt;

&lt;h2&gt;Putting It All Together&lt;/h2&gt;
&lt;p&gt;With this new, updated AST we can see exactly how Crystal determines the type
of my variable &lt;code&gt;arr&lt;/code&gt;. Starting at the root of my AST, &lt;code&gt;MainVisitor&lt;/code&gt; visits all of
the AST nodes in this order in a series of recursive calls:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;114px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/call-recurse.svg&quot;&gt;
&lt;p&gt;And it determines the types of each of these nodes as it returns from the
recursive calls:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;240px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/return-recurse.svg&quot;&gt;
&lt;p&gt;Some interesting details here that I don’t understand completely or have space
to explain here:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The &lt;code&gt;TypeOf&lt;/code&gt; node calculates a common union type using a type formula. In this
example, it just returns &lt;code&gt;Int32&lt;/code&gt; because both elements of my array, &lt;code&gt;12345&lt;/code&gt; and
&lt;code&gt;67890&lt;/code&gt;, are simple 32 bit integers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;I believe the &lt;code&gt;Generic&lt;/code&gt; node refers to a Crystal generic class via the &lt;code&gt;Path&lt;/code&gt; node
shown above, in this example &lt;code&gt;Array(T)&lt;/code&gt;. When &lt;code&gt;MainVisitor&lt;/code&gt; processes the &lt;code&gt;Generic&lt;/code&gt;
node, it sets &lt;code&gt;T&lt;/code&gt; to the type &lt;code&gt;Int32&lt;/code&gt;, arriving at the type &lt;code&gt;Array(Int32).class&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The &lt;code&gt;Call&lt;/code&gt; node looks up the method my code is calling (&lt;code&gt;unsafe_build&lt;/code&gt;) and
uses the type from that method’s return value. I didn’t have time to explore
how method lookup works in Crystal, however, so I’m not sure about this.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Scratching the Surface&lt;/h2&gt;
&lt;p&gt;Today we looked at a tiny piece of what the Crystal compiler can do. There are
many more types of AST nodes, each of which the &lt;code&gt;MainVisitor&lt;/code&gt; class handles
differently. And there are many different visitor classes also, beyond
&lt;code&gt;MainVisitor&lt;/code&gt;. When analyzing a more complex program Crystal has to understand
class and module definitions, instance and class variables, type annotations,
different lexical scopes, macros, and much, much more. Crystal will need all of
this information later, during the code generation phase, the next step that
follows semantic analysis.&lt;/p&gt;
&lt;p&gt;But I hope this article gave you a sense of what sort of work a compiler has to
do in order to understand your code. As you can see, for a statically typed
language like Crystal the compiler spends much of its time identifying all of
the types in your code, and determining which programming constructs or AST
nodes have which types.&lt;/p&gt;
&lt;p&gt;Next time I’ll look at code generation: Now that Crystal has identified the
variables, function calls and types in my code it is ready to generate the
machine language code needed to execute my program. To do that, it will
leverage the LLVM framework.&lt;/p&gt;
</content></entry></feed>