<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.3.3">Jekyll</generator><link href="https://relistan.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://relistan.com/" rel="alternate" type="text/html" /><updated>2026-03-18T15:00:51-04:00</updated><id>https://relistan.com/feed.xml</id><title type="html">Repeatable Systems</title><subtitle>More than a decade of technical ramblings</subtitle><author><name>Karl Matthias</name></author><entry><title type="html">A Mermaid Planning Tool for AI</title><link href="https://relistan.com/mermaid-tool-for-ai" rel="alternate" type="text/html" title="A Mermaid Planning Tool for AI" /><published>2026-03-16T00:00:00-04:00</published><updated>2026-03-16T00:00:00-04:00</updated><id>https://relistan.com/mermaid-tool-for-ai</id><content type="html" xml:base="https://relistan.com/mermaid-tool-for-ai"><![CDATA[<p><img src="/images/mermaid-about.png" alt="Header image" /></p>

<p>The practice of writing code has been changing fast. Really fast. I, and a large
portion of the industry, write code in a completely different way from a year
ago. And by that I mean that I largely manage a sophisticated coding machine
that does the work. Writing code is now a process very similar to some of my
previous experience as VP of Architecture at
<a href="https://community.com">Community</a>: I help co-design an architecture that will
support the desired implementation, and then I help supervise the plan to do
so. I review PRs, and suggest changes. I ensure docs and patterns are up to
date. It’s a familiar role.</p>

<p>As I’ve become more adept at coding with AI, I’ve found that what worked well
for teams of developers works well for AI. One of those things is that
starting with a diagram-focused discussion makes the work a lot easier. In person this
often starts at a whiteboard. In remote teams it’s often some cobbled-together
combination of other tools. Some years ago now, thanks to Community.com
being 100% remote, my brother <a href="https://github.com/idlehands">Jeffrey Matthias</a>
introduced me to <a href="https://mermaid-js.github.io/mermaid/#/">Mermaid</a> and its
implementation of sequence diagrams. I was (and remain) impressed by the
simplicity of that diagram and how easily the most important interactions are
made clear.</p>
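<p>To give a sense of how little text carries how much meaning, here is a minimal
sequence diagram of the kind Mermaid makes cheap to write (the participants and
calls are invented for this illustration, not from any real system):</p>

```mermaid
sequenceDiagram
    participant Client
    participant API
    participant EventBus
    Client->>API: POST /posts
    API->>EventBus: publish(PostCreated)
    EventBus-->>API: ack
    API-->>Client: 201 Created
```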

<h2 id="leveling-up">Leveling Up</h2>

<p>Some months back, <a href="https://github.com/patoms">Tom Patterer</a> and I started
asking Claude Code to generate Mermaid diagrams when exploring existing
code and when creating implementation plans. It also turned out to be a
fantastic debugging tool.</p>

<p>This evolved into flipping the order around: on most projects we just <em>start
with the diagram</em>. That diagram can codify an awful lot of thinking: it’s a
really dense carrier of meaning. Once you have that, and if you have good rules
in your <code class="language-plaintext highlighter-rouge">AGENTS.md/CLAUDE.md</code> file, you will get a very good plan, and likely a
clean implementation. We learned to always include the diagram in the plan, so
that it is not lost in the context of the planning session. When I get a plan
from the agent, I usually clear the context and have it compare the plan to the
diagram a second time to catch any errors or oversights. I rarely catch major
issues in plans or implementation now.</p>

<h2 id="pain-point">Pain Point</h2>

<p>But now you have another problem: collaborating on a diagram is somewhat
painful. Certain IDEs enable collaboration on a file with auto-reloading, but
that requires that you are editing in an IDE and it has the necessary plugins.
The few of you reading this who know me may also know that I am a 35-year <code class="language-plaintext highlighter-rouge">vi</code>
and <code class="language-plaintext highlighter-rouge">vim</code> user and not a fan of most IDEs. I co-wrote an O’Reilly book (with
<a href="https://github.com/spkane">Sean Kane</a>) <em>in</em> <code class="language-plaintext highlighter-rouge">vim</code>. I don’t want to use an IDE;
<code class="language-plaintext highlighter-rouge">neovim</code> + Claude Code is enough. Furthermore, in most of the IDEs I’ve seen,
and in the <a href="https://mermaid.live">mermaid.live</a> tool, looking at an actual
diagram of any size is not great. If you have this tooling and prefer it,
great.</p>

<p>But, to suit my preferences, I decided to write a tool that would allow us to
collaborate on a diagram in a way that is more natural to AI, and more natural
to programmers. It does include an editor, so that I can directly
edit the diagram, but for the most part it’s the ability to push and pull
diagrams from the agent that is the biggest win. It has both an MCP server and
a CLI tool to interact with it, so take your pick. Both allow the agent to
directly push updates; the CLI uses fewer tokens. The tool supports going
entirely full screen with the diagram, too.</p>

<p>Most of the rest of the <a href="https://mozi.app">Mozi</a> team are now using it, and I
am pretty sold on this workflow at the moment. It’s also fun how it harkens
back to the way we all talked about designing software in the aughts, back when
we thought we might largely generate code from diagrams, as <a href="https://github.com/toddsundsted">Todd
Sundsted</a> pointed out to me a while back. The
code generator is just a lot more sophisticated these days.</p>

<h2 id="the-app">The App</h2>

<p>It’s a pretty simple app, but I tried to polish it up enough to make it nice to
use. Other co-workers have now contributed to it as well. I’m a light-mode guy,
but not all of my co-workers are, so there is a dark mode. <code class="language-plaintext highlighter-rouge">vim</code> bindings are
supported but not required.</p>

<p>On Linux and Windows, it’s a server and opens the browser to the diagram. On
macOS, it’s a desktop app that opens the diagram. It can also open <code class="language-plaintext highlighter-rouge">.mmd</code> files
in the event you want to review previous work.</p>

<p>If you are serious about improving your AI workflow, I encourage you to try it
out. <a href="https://github.com/mozi-app/mermAId">You can get it here</a>.</p>

<p><img src="/images/editor-ui-light.png" alt="Editor UI" /></p>

<p><img src="/images/claude-code-interaction.png" alt="Claude Code interaction" /></p>]]></content><author><name>Karl Matthias</name></author><category term="articles" /><category term="Mermaid" /><category term="AI" /><category term="Planning" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Calling Go from Elixir with a CNode (in Crystal!)</title><link href="https://relistan.com/calling-go-from-elixir-with-a-cnode" rel="alternate" type="text/html" title="Calling Go from Elixir with a CNode (in Crystal!)" /><published>2025-05-16T00:00:00-04:00</published><updated>2025-05-16T00:00:00-04:00</updated><id>https://relistan.com/calling-go-from-elixir-with-a-cnode</id><content type="html" xml:base="https://relistan.com/calling-go-from-elixir-with-a-cnode"><![CDATA[<p><img src="/images/go-elixir-cnode.png" alt="Header image" /></p>

<p>At <a href="https://mozi.app">Mozi</a>, we needed to connect a new Elixir Phoenix LiveView
app to an existing Go backend. This is how we did it.</p>

<h2 id="background">Background</h2>

<p>We have a backend built in Go, which is fully-evented along the lines of the
patterns we used at Community, described in a <a href="/event-sourcing-and-event-bus">previous
post</a>. In order to support all of that, we have
some hefty internal libraries and existing patterns, and get a lot for free by
building on top of them.</p>

<p>Previously the only frontend to our application was an iOS app and that app is
<em>also</em> event-sourced and evented. And now <a href="https://github.com/patoms">Tom
Patterer</a> and I wanted to add a webapp to the mix,
in order to support scenarios outside of the iOS app, either because they work
better on the web, or so we can support Android and desktop browsers to a
limited extent (for now) as well. We chose Phoenix LiveView for the web frontend
because it is a great fit for this kind of web app, the main backend developers
at Mozi already know Elixir well, and the comparable Go live implementation is
not as robust or complete.</p>

<p>However, we <em>really</em> didn’t want to have to rewrite/duplicate a lot of the
code that handles events in our existing Go services, or maintain two different
stacks. It would be great if the Elixir app could just call the Go code.</p>

<h2 id="solutions-we-didnt-use">Solutions We Didn’t Use</h2>

<p>I have actually done this before and it works: you can compile the Go code into
a C ABI library and then call it from Elixir via NIFs (Native Implemented
Functions). If you aren’t familiar with the BEAM ecosystem (Erlang VM on which
Elixir runs), NIFs are foreign function interface glue that allows you to call
C code from code running on the BEAM. There are some problems with this
approach (not in order of importance):</p>

<ol>
  <li>
    <p>You have two runtimes with complex internal schedulers running in the same
process, potentially competing with each other for resources.</p>
  </li>
  <li>
    <p>One of the great things about the BEAM is that it is super robust and with
OTP applications (OTP is the framework built originally to run phone
switches), you get a lot of fault tolerance built in. But now you have some C
code in the BEAM and that has to be exactly right or your Elixir app will
crash.</p>
  </li>
  <li>
    <p>Compiles and builds become a mess because your build for your Elixir app is
either linked against the C (Go) library, or you create an Elixir lib that
wraps the C library. In either case you end up with a painful build somewhere.</p>
  </li>
</ol>

<p>There are probably some other issues I haven’t mentioned. It’s not a great
solution.</p>

<p>Ports are another option. A port is a sub-process running on the other end of a
pipe that you control from the BEAM process. This is a bit better because you have a
separate OS process, so there is no worry about two schedulers in the same process.
However, the overhead is higher and you are still not fully decoupled because the
BEAM process has to manage the running port and sub-process. This was the most
viable option other than the one we chose, which allows even more decoupling.</p>
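<p>For reference, opening a port from Elixir looks roughly like this (the binary
path, packet framing, and message are illustrative assumptions, not our actual
setup):</p>

```elixir
# Spawn the external program as an OS process; the BEAM owns the pipe and
# is notified if the sub-process exits.
port = Port.open({:spawn_executable, "/usr/local/bin/events-sidecar"},
                 [:binary, {:packet, 4}, :exit_status])

# Messages go out through the pipe...
send(port, {self(), {:command, "ping"}})
# ...and replies arrive in this process's mailbox as {^port, {:data, data}}.
```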

<h2 id="c-nodes-to-the-rescue">C Nodes to the Rescue</h2>

<p>The option we actually chose was to implement what is called a “C Node” in the
Erlang ecosystem. There is a library that ships with the Erlang distribution
called
<a href="https://www.erlang.org/doc/apps/erl_interface/ei_users_guide.html"><code class="language-plaintext highlighter-rouge">erl_interface</code></a>
that allows you to implement a BEAM distribution node in C. What that means is
that you can write C code that will talk to the Elixir/Erlang application over
the native distribution protocol used to connect nodes together running the
BEAM.</p>

<p>This is a great option because, while it does introduce more overhead than the
NIFs, it allows you to fully decouple the codebases from each other at both
compile and runtime. All you need to do is write a lightweight wrapper library
on the Elixir side that makes it easy to call the remote node using native
Elixir functions like <code class="language-plaintext highlighter-rouge">send/2</code>. And on the C side you use the library to decode
the distribution messages and process them as needed, then call the library to
return data back, or make other remote calls. If the nodes are connected,
sending to the remote node feels just like making a normal function call from
the Elixir side.</p>
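<p>In practice, the Elixir side can look like this sketch (the registered name
<code class="language-plaintext highlighter-rouge">:events_sidecar</code> and the message shape are assumptions for
illustration, not our real protocol):</p>

```elixir
# Find the hidden C node and message a process registered on it, exactly
# as if it were a local named process.
[sidecar | _] = Node.list(:hidden)
payload = :erlang.term_to_binary({:post_created, "abc123"})
send({:events_sidecar, sidecar}, {self(), {:notify_publish_ready, payload}})
```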

<p>What we did was build the Go code as a C ABI library. We then wrote a small C
wrapper that processes some CLI args and environment variables, and starts
looping on the inbound messages. It calls the Go code as needed from the C
code. In this setup, <code class="language-plaintext highlighter-rouge">main()</code> is in the C code and the Go code is initialized
and called from there.</p>

<p>The way it works is that the C code starts up and calls to the Elixir app on a
well-known local address. This then begins distribution with the BEAM running
the Elixir app. You can tell on the Elixir side if the C node is connected by
calling <code class="language-plaintext highlighter-rouge">Node.list(:hidden)</code>. On the C side the call either succeeds or fails,
so you can easily manage retries as needed. In our case, embracing the “let it
crash” philosophy, the process exits cleanly, shutting down the Go code. Then
it is restarted by <a href="https://skarnet.org/software/s6-linux-init/">S6</a> running
inside the container. Because we use a well-known name for the C node, it’s
easy to tell if it is connected or not.</p>

<p>Discovering the remote node is handled in <code class="language-plaintext highlighter-rouge">application.ex</code> like this:</p>

<figure class="highlight"><pre><code class="language-elixir" data-lang="elixir"><span class="c1"># This is used when finding the events-sidecar node, overridden in tests</span>
<span class="no">Application</span><span class="o">.</span><span class="n">put_env</span><span class="p">(</span><span class="ss">:elmozi</span><span class="p">,</span> <span class="ss">:find_events_sidecar</span><span class="p">,</span> <span class="k">fn</span> <span class="o">-&gt;</span>
  <span class="n">node</span> <span class="o">=</span>
    <span class="no">Node</span><span class="o">.</span><span class="n">list</span><span class="p">(</span><span class="ss">:hidden</span><span class="p">)</span>
    <span class="o">|&gt;</span> <span class="no">Enum</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="k">fn</span> <span class="n">n</span> <span class="o">-&gt;</span> <span class="no">String</span><span class="o">.</span><span class="n">contains?</span><span class="p">(</span><span class="no">Atom</span><span class="o">.</span><span class="n">to_string</span><span class="p">(</span><span class="n">n</span><span class="p">),</span> <span class="s2">"events_sidecar"</span><span class="p">)</span> <span class="k">end</span><span class="p">)</span>
    <span class="o">|&gt;</span> <span class="n">hd</span></code></pre></figure>

<p>As the comment says, this then makes it easy to override the sidecar in tests,
using a mock or stub as needed. Note that we did not implement
<code class="language-plaintext highlighter-rouge">Node.ping/1</code>, and doing so is not necessary.</p>
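<p>That override is then a one-liner in test setup, something like this sketch
(the stub node name is purely illustrative):</p>

```elixir
# Point the finder at a stub so tests never need a live sidecar node.
Application.put_env(:elmozi, :find_events_sidecar, fn -> :"stub@127.0.0.1" end)
```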

<h2 id="a-short-interlude">A Short Interlude</h2>

<p>A few of you (of the five who got this far) may now be asking: why use C when
there is a Go implementation of the BEAM distribution
protocol? I had been watching this implementation, called
<a href="https://github.com/ergo-services/ergo">Ergo</a>, for a while. I wrote some simple
stuff using it. There was always a bit of an issue with making sure it
supported the latest OTP version. In the past I steered away from it because we
couldn’t reliably be sure that we would be able to upgrade Elixir and not
suffer issues talking to Go. As of last fall, in a major departure, that
project no longer supports the native distribution protocol. Instead, you must
run a separate implementation on the BEAM side. And it’s now a commercial
offering. Fair enough, the developer deserves to make some money, but I am glad
we didn’t build anything serious on it.</p>

<h2 id="crystal-upgrade">Crystal Upgrade</h2>

<p>Back to our C node. While I can write and maintain C code, I’m one of the few
people here who can. So in order to improve the maintainability of the
codebase, I decided to rewrite the C code in
<a href="https://crystal-lang.org">Crystal</a>, a strongly typed language that looks and
feels a lot like Ruby but is compiled to native machine code. This was not a
major effort. It took some work to build the wrappers for the <code class="language-plaintext highlighter-rouge">erl_interface</code>
library, but it wasn’t too bad. We <em>did</em> get this for free in C, but the
Crystal wrappers are thin, and in the end the total line count (as a basic
measure of complexity) of the Crystal code is still less than the C code. The
result is a three language mash-up that actually feels pretty slick and fairly
natural. It is certainly nicer to work on than the C code was.</p>
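<p>The wrappers themselves are ordinary Crystal <code class="language-plaintext highlighter-rouge">lib</code>
bindings; a fragment might look like this (signatures abridged from
<code class="language-plaintext highlighter-rouge">erl_interface</code>, and the linker name is an assumption):</p>

```crystal
# Thin FFI declarations over erl_interface; Crystal calls straight into
# the C library, so the wrapper layer stays very small.
@[Link("ei")]
lib Erlang
  fun ei_decode_version(buf : LibC::Char*, index : LibC::Int*, version : LibC::Int*) : LibC::Int
  fun ei_decode_pid(buf : LibC::Char*, index : LibC::Int*, pid : Void*) : LibC::Int
end
```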

<p>To make life a little easier, we exposed some additional functions from Go to
allow the Crystal code to log using our same Go logging setup and a few other
basic infrastructure bits that we then don’t have to duplicate in Crystal.
There is a performance penalty every time you cross the C/Go boundary, but for
our use case, it’s not a big deal.</p>

<p>This is roughly what it looks like to work with the <code class="language-plaintext highlighter-rouge">erl_interface</code> library.
Note that <code class="language-plaintext highlighter-rouge">Elgo</code> is the Crystal module wrapping the Go functions.</p>

<figure class="highlight"><pre><code class="language-crystal" data-lang="crystal"><span class="k">def</span> <span class="nf">handle_message</span><span class="p">(</span><span class="n">xbuf</span> <span class="p">:</span> <span class="no">Erlang</span><span class="o">::</span><span class="no">EiXBuff</span><span class="p">)</span>
  <span class="c1"># We predeclare these because we need to pass pointers to them to the Erlang code</span>
  <span class="n">index</span> <span class="o">=</span> <span class="mi">0</span>
  <span class="n">version</span> <span class="o">=</span> <span class="mi">0</span>
  <span class="n">arity</span> <span class="o">=</span> <span class="mi">0</span>
  <span class="n">pid</span> <span class="o">=</span> <span class="n">uninitialized</span> <span class="no">Erlang</span><span class="o">::</span><span class="no">ErlPid</span>
  <span class="n">atom_buf</span> <span class="o">=</span> <span class="no">StaticArray</span><span class="p">(</span><span class="no">UInt8</span><span class="p">,</span> <span class="no">COMMAND_ATOM_MAX_SIZE</span><span class="p">).</span><span class="nf">new</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>

  <span class="k">if</span> <span class="no">Erlang</span><span class="p">.</span><span class="nf">ei_decode_version</span><span class="p">(</span><span class="n">xbuf</span><span class="p">.</span><span class="nf">buff</span><span class="p">,</span> <span class="n">pointerof</span><span class="p">(</span><span class="n">index</span><span class="p">),</span> <span class="n">pointerof</span><span class="p">(</span><span class="n">version</span><span class="p">))</span> <span class="o">!=</span> <span class="mi">0</span> <span class="o">||</span>
     <span class="no">Erlang</span><span class="p">.</span><span class="nf">ei_decode_tuple_header</span><span class="p">(</span><span class="n">xbuf</span><span class="p">.</span><span class="nf">buff</span><span class="p">,</span> <span class="n">pointerof</span><span class="p">(</span><span class="n">index</span><span class="p">),</span> <span class="n">pointerof</span><span class="p">(</span><span class="n">arity</span><span class="p">))</span> <span class="o">!=</span> <span class="mi">0</span> <span class="o">||</span> <span class="n">arity</span> <span class="o">!=</span> <span class="mi">2</span> <span class="o">||</span>
     <span class="no">Erlang</span><span class="p">.</span><span class="nf">ei_decode_pid</span><span class="p">(</span><span class="n">xbuf</span><span class="p">.</span><span class="nf">buff</span><span class="p">,</span> <span class="n">pointerof</span><span class="p">(</span><span class="n">index</span><span class="p">),</span> <span class="n">pointerof</span><span class="p">(</span><span class="n">pid</span><span class="p">))</span> <span class="o">!=</span> <span class="mi">0</span>
    <span class="no">Elgo</span><span class="p">.</span><span class="nf">elgo_log_error</span><span class="p">(</span><span class="s2">"Invalid message format"</span><span class="p">.</span><span class="nf">to_unsafe</span><span class="p">)</span>
    <span class="k">return</span>
  <span class="k">end</span>

  <span class="k">if</span> <span class="no">Erlang</span><span class="p">.</span><span class="nf">ei_decode_tuple_header</span><span class="p">(</span><span class="n">xbuf</span><span class="p">.</span><span class="nf">buff</span><span class="p">,</span> <span class="n">pointerof</span><span class="p">(</span><span class="n">index</span><span class="p">),</span> <span class="n">pointerof</span><span class="p">(</span><span class="n">arity</span><span class="p">))</span> <span class="o">!=</span> <span class="mi">0</span> <span class="o">||</span> <span class="n">arity</span> <span class="o">!=</span> <span class="mi">2</span>
    <span class="no">Elgo</span><span class="p">.</span><span class="nf">elgo_log_error</span><span class="p">(</span><span class="s2">"Failed to decode command tuple header"</span><span class="p">.</span><span class="nf">to_unsafe</span><span class="p">)</span>
    <span class="k">return</span>
  <span class="k">end</span>

  <span class="k">if</span> <span class="no">Erlang</span><span class="p">.</span><span class="nf">ei_decode_atom</span><span class="p">(</span><span class="n">xbuf</span><span class="p">.</span><span class="nf">buff</span><span class="p">,</span> <span class="n">pointerof</span><span class="p">(</span><span class="n">index</span><span class="p">),</span> <span class="n">atom_buf</span><span class="p">.</span><span class="nf">to_unsafe</span><span class="p">)</span> <span class="o">!=</span> <span class="mi">0</span>
    <span class="no">Elgo</span><span class="p">.</span><span class="nf">elgo_log_error</span><span class="p">(</span><span class="s2">"Failed to decode command"</span><span class="p">.</span><span class="nf">to_unsafe</span><span class="p">)</span>
    <span class="k">return</span>
  <span class="k">end</span>

  <span class="n">command</span> <span class="o">=</span> <span class="no">String</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="n">atom_buf</span><span class="p">.</span><span class="nf">to_slice</span><span class="p">)</span>

  <span class="k">type</span> <span class="o">=</span> <span class="mi">0</span>
  <span class="n">size</span> <span class="o">=</span> <span class="mi">0</span>
  <span class="k">if</span> <span class="no">Erlang</span><span class="p">.</span><span class="nf">ei_get_type</span><span class="p">(</span><span class="n">xbuf</span><span class="p">.</span><span class="nf">buff</span><span class="p">,</span> <span class="n">pointerof</span><span class="p">(</span><span class="n">index</span><span class="p">),</span> <span class="n">pointerof</span><span class="p">(</span><span class="k">type</span><span class="p">),</span> <span class="n">pointerof</span><span class="p">(</span><span class="n">size</span><span class="p">))</span> <span class="o">!=</span> <span class="mi">0</span>
    <span class="no">Elgo</span><span class="p">.</span><span class="nf">elgo_log_error</span><span class="p">(</span><span class="s2">"Failed to get message type"</span><span class="p">.</span><span class="nf">to_unsafe</span><span class="p">)</span>
    <span class="k">return</span>
  <span class="k">end</span>

  <span class="k">case</span> <span class="k">type</span>
  <span class="k">when</span> <span class="no">ErlDefs</span><span class="o">::</span><span class="no">ERL_BINARY_EXT</span>
    <span class="n">bin_buf</span> <span class="o">=</span> <span class="no">Bytes</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="n">size</span><span class="p">)</span>
    <span class="n">bin_size</span> <span class="o">=</span> <span class="n">size</span><span class="p">.</span><span class="nf">to_i64</span>
    <span class="k">if</span> <span class="no">Erlang</span><span class="p">.</span><span class="nf">ei_decode_binary</span><span class="p">(</span><span class="n">xbuf</span><span class="p">.</span><span class="nf">buff</span><span class="p">,</span> <span class="n">pointerof</span><span class="p">(</span><span class="n">index</span><span class="p">),</span> <span class="n">bin_buf</span><span class="p">,</span> <span class="n">pointerof</span><span class="p">(</span><span class="n">bin_size</span><span class="p">))</span> <span class="o">==</span> <span class="mi">0</span>
      <span class="no">Elgo</span><span class="p">.</span><span class="nf">elgo_log_info</span><span class="p">(</span><span class="s2">"Received binary message"</span><span class="p">.</span><span class="nf">to_unsafe</span><span class="p">)</span>

      <span class="c1"># Main switch for the functions we support</span>
      <span class="k">case</span>
      <span class="k">when</span> <span class="n">command</span><span class="p">.</span><span class="nf">starts_with?</span><span class="p">(</span><span class="s2">"allow_publish_event"</span><span class="p">)</span>
        <span class="c1"># Call the Go code</span>
        <span class="no">Elgo</span><span class="p">.</span><span class="nf">elgo_add_allowed_publish_event</span><span class="p">(</span><span class="n">bin_buf</span><span class="p">)</span>
      <span class="k">when</span> <span class="n">command</span><span class="p">.</span><span class="nf">starts_with?</span><span class="p">(</span><span class="s2">"notify_publish_ready"</span><span class="p">)</span>
        <span class="no">Elgo</span><span class="p">.</span><span class="nf">elgo_notify_publish_ready</span><span class="p">(</span><span class="n">bin_buf</span><span class="p">)</span>
      <span class="k">else</span>
        <span class="no">Elgo</span><span class="p">.</span><span class="nf">elgo_log_error</span><span class="p">(</span><span class="s2">"Unsupported command: </span><span class="si">#{</span><span class="n">command</span><span class="si">}</span><span class="s2">"</span><span class="p">.</span><span class="nf">to_unsafe</span><span class="p">)</span>
      <span class="k">end</span>

    <span class="k">else</span>
      <span class="no">Elgo</span><span class="p">.</span><span class="nf">elgo_log_error</span><span class="p">(</span><span class="s2">"Failed to decode binary"</span><span class="p">.</span><span class="nf">to_unsafe</span><span class="p">)</span>
    <span class="k">end</span>
  <span class="k">else</span>
    <span class="no">Elgo</span><span class="p">.</span><span class="nf">elgo_log_error</span><span class="p">(</span><span class="s2">"Unsupported message type"</span><span class="p">.</span><span class="nf">to_unsafe</span><span class="p">)</span>
  <span class="k">end</span></code></pre></figure>

<h2 id="how-it-works-in-practice">How It Works In Practice</h2>

<p>It’s quite solid! We build and deploy the Crystal/Go code as a single Docker
container, running in the same Kubernetes pod as the Elixir app. They are
independently built and are only coupled by the deploy-time configuration that
specifies which version of each to deploy in the pod.</p>

<p>Because both Crystal and Go are able to build easily on both macOS and Linux,
we can develop locally on macOS and build and deploy on Linux. It took a bit of
fiddling to find the right Linux distribution to build on: one with easy support
for Crystal, and on which Go runs properly.</p>

<p>The only hard part here was a temporary issue: in the end I had to make a
custom build of the Go 1.24 compiler on Alpine Linux that includes a patch to
properly support MUSL libc when starting under Cgo. Soon this won’t be
necessary: <a href="https://go-review.googlesource.com/c/go/+/610837">the
patch</a> (which I did not write myself) has been
contributed to the Go project but has not yet shipped.</p>

<p>If there is enough interest, I will work to open source the Crystal wrapper we
wrote for <code class="language-plaintext highlighter-rouge">erl_interface</code> so others can use it as well. Hit me up <a href="https://mstdn.social/@relistan">on
Mastodon</a> to let me know you are interested!</p>]]></content><author><name>Karl Matthias</name></author><category term="articles" /><category term="go" /><category term="golang" /><category term="development" /><category term="events" /><category term="event-sourcing" /><category term="elixir" /><category term="crystal" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Parsing Protobuf Definitions with Tree-sitter</title><link href="https://relistan.com/parsing-protobuf-files-with-treesitter" rel="alternate" type="text/html" title="Parsing Protobuf Definitions with Tree-sitter" /><published>2024-07-20T00:00:00-04:00</published><updated>2024-07-20T00:00:00-04:00</updated><id>https://relistan.com/parsing-protobuf-files-with-treesitter</id><content type="html" xml:base="https://relistan.com/parsing-protobuf-files-with-treesitter"><![CDATA[<p><img style="float: right; display: block; margin-bottom: 1em; width: 20%; margin-left: 1em;" src="/images/tree-sitter-small.png" alt="Tree-sitter logo" /></p>

<p>If you work with Protocol Buffers (<code class="language-plaintext highlighter-rouge">protobuf</code>), you can really save time,
boredom, and headache by parsing your definitions to build tools and generate
code.</p>

<p>The usual tool for doing that is <code class="language-plaintext highlighter-rouge">protoc</code>. It supports plugins to generate
output of various kinds: language bindings, documentation, etc. But if you want
to do anything custom, you are faced with either using something limited like
<a href="https://github.com/moul/protoc-gen-gotemplate">protoc-gen-gotemplate</a> or
writing your own plugin. <code class="language-plaintext highlighter-rouge">protoc-gen-gotemplate</code> works well, but you can’t
build complex logic into the workflow. You are limited to what is possible in a
simple Go template.</p>

<p>It’s also possible to use <code class="language-plaintext highlighter-rouge">protoreflect</code> from Go to process the compiled results
at runtime. This is painful. Really painful.</p>

<p>So, at work, we had made limited use of the <code class="language-plaintext highlighter-rouge">protobuf</code> definitions other than
for their main purpose and for documentation and package configuration via
custom options (these are supported in <code class="language-plaintext highlighter-rouge">protobuf</code>). Writing the <code class="language-plaintext highlighter-rouge">protoreflect</code>
code to make that work is not something I want to repeat.</p>

<p>Then I recently revamped my editor setup and moved from
<a href="https://www.vim.org/">Vim</a> to <a href="https://neovim.io/">Neovim</a>. In the process I
realized how awesome the
<a href="https://tree-sitter.github.io/tree-sitter/">Tree-sitter</a> parsing library is
and that it probably was going to support extracting everything I wanted to get
from our <code class="language-plaintext highlighter-rouge">protobuf</code> definitions. Neovim uses Tree-sitter extensively.</p>

<h2 id="why-this-matters">Why This Matters</h2>

<p>Our evented and event-sourced backend at <a href="https://mozi.app">Mozi</a> relies on
<code class="language-plaintext highlighter-rouge">protobuf</code> for schema definitions and serialization of events. We use these
same schemas everywhere from the frontend all the way to the backend. This
means our whole system is working on the exact same entity definitions
throughout. Good Stuff™.</p>

<p>In Go, the bindings are not really native structs and require a lot of
<code class="language-plaintext highlighter-rouge">GetXYZ()</code> and <code class="language-plaintext highlighter-rouge">GetValue()</code> calls chained with nil checking to work around the
fact that <code class="language-plaintext highlighter-rouge">nil</code> and zero values are encoded the same way in Protobuf. You also
can’t use them in conjunction with anything that uses <code class="language-plaintext highlighter-rouge">struct</code> tags because you
can’t apply tags. I am told by the mobile devs that the Swift bindings are
similarly unfriendly.</p>

<p>We use a mapping layer to paper over this and to make these types easier to
work with in our Go code, in data stores, and with off-the-shelf libraries.</p>
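<p>To make that concrete, here is a minimal sketch of the shape of such a mapping layer. The types below are hand-written stand-ins for protoc-generated code, not our real bindings, so treat the names as hypothetical:</p>

```go
package main

import "fmt"

// StringValue is a hand-written stand-in for the generated wrapper type:
// a pointer-backed value distinguishes "unset" from the empty string.
type StringValue struct{ Value string }

func (s *StringValue) GetValue() string {
	if s == nil {
		return ""
	}
	return s.Value
}

// BlogPostProto mimics a generated message: every access goes through a
// nil-safe getter, and struct tags cannot be applied.
type BlogPostProto struct {
	Title *StringValue
	Body  *StringValue
}

func (p *BlogPostProto) GetTitle() *StringValue {
	if p == nil {
		return nil
	}
	return p.Title
}

func (p *BlogPostProto) GetBody() *StringValue {
	if p == nil {
		return nil
	}
	return p.Body
}

// BlogPost is the plain struct we would rather work with: taggable and
// friendly to data stores and off-the-shelf libraries.
type BlogPost struct {
	Title string `json:"title"`
	Body  string `json:"body"`
}

// mapBlogPost is the kind of hand-written mapping we want to automate.
func mapBlogPost(p *BlogPostProto) BlogPost {
	return BlogPost{
		Title: p.GetTitle().GetValue(),
		Body:  p.GetBody().GetValue(),
	}
}

func main() {
	out := mapBlogPost(&BlogPostProto{Title: &StringValue{Value: "Hello"}})
	fmt.Printf("%q %q\n", out.Title, out.Body) // → "Hello" ""
}
```

<p>Note how the chained <code class="language-plaintext highlighter-rouge">GetTitle().GetValue()</code> calls are what keep nil dereferences at bay; multiply that by every field of every message and the hand-maintenance cost becomes clear.</p>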

<p>We were maintaining custom mappings by hand. That’s a waste of time, and even
getting GPT to write the transformations back and forth is annoying and
invariably requires tweaking. So I wanted a solution that was much more
automatic and repeatable.</p>

<p>Here’s what I did.</p>

<h2 id="example-definition">Example Definition</h2>

<p>First we’ll have a look at one <code class="language-plaintext highlighter-rouge">protobuf</code> definition. Then we’ll talk about
extracting the information we want from it.</p>

<p>Imagine that we’re working with the following fairly typical <code class="language-plaintext highlighter-rouge">protobuf</code> message
definition. We want to be able to extract the name of the message, the enum
names and values, and the fields and their types. Here we are not particularly
interested in the field numbers, but you could also extract them, of course.</p>

<figure class="highlight"><pre><code class="language-protobuf" data-lang="protobuf"><span class="na">syntax</span> <span class="o">=</span> <span class="s">"proto3"</span><span class="p">;</span>
<span class="kn">package</span> <span class="nn">entities</span><span class="p">;</span>
<span class="k">import</span> <span class="s">"google/protobuf/wrappers.proto"</span><span class="p">;</span>

<span class="c1">// A BlogPost represents a single post on the blog.</span>
<span class="kd">message</span> <span class="nc">BlogPost</span> <span class="p">{</span>
  <span class="c1">// Some custom configuration options</span>
  <span class="c1">// ...</span>

  <span class="c1">// Each blog post is assigned a type so we can identify how it will be used</span>
  <span class="kd">enum</span> <span class="n">PostType</span> <span class="p">{</span>
    <span class="na">POST_TYPE_NOT_SET</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="na">POST_TYPE_ARTICLE</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
    <span class="na">POST_TYPE_PAGE</span> <span class="o">=</span> <span class="mi">2</span><span class="p">;</span>
    <span class="na">POST_TYPE_SPLASH</span> <span class="o">=</span> <span class="mi">3</span><span class="p">;</span>
  <span class="p">}</span>

  <span class="n">google.protobuf.StringValue</span> <span class="na">post_id</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span> <span class="c1">// The ID of the blog post</span>
  <span class="n">PostType</span> <span class="na">post_type</span> <span class="o">=</span> <span class="mi">2</span><span class="p">;</span> <span class="c1">// What kind of post is this?</span>
  <span class="n">google.protobuf.StringValue</span> <span class="na">title</span> <span class="o">=</span> <span class="mi">3</span><span class="p">;</span> <span class="c1">// The title of the post</span>
  <span class="n">google.protobuf.StringValue</span> <span class="na">body</span> <span class="o">=</span> <span class="mi">4</span><span class="p">;</span> <span class="c1">// The actual contents of the post</span>
<span class="p">}</span></code></pre></figure>

<p>This typical message contains a single enum type and four fields. Real-life
messages will contain many more fields, but this is enough for this post.
Looking at this, we <em>could</em> hack something to parse this fairly simple example
using regexes or other string matching. But it would end up being pretty
brittle. You could even break most trivial parsers by commenting out one or
more lines of valid code with <code class="language-plaintext highlighter-rouge">/* */</code> style. So let’s take a look at how we
could get the data we need using a real parser: Tree-sitter.</p>
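<p>To illustrate that brittleness, here is a strawman of my own (not anything from our real tooling): a line-oriented regex that happily matches a field that has been commented out with <code class="language-plaintext highlighter-rouge">/* */</code>:</p>

```go
package main

import (
	"fmt"
	"regexp"
)

// A naive field matcher of the kind a quick hack might use (my own
// strawman, not anything from real tooling).
var fieldRe = regexp.MustCompile(`(?m)^\s*([\w.]+)\s+(\w+)\s*=\s*\d+;`)

const src = `
  google.protobuf.StringValue title = 3;
  /*
  google.protobuf.StringValue body = 4;
  */
`

func main() {
	// The commented-out "body" field still matches: line-oriented string
	// matching has no idea it sits inside a block comment.
	for _, m := range fieldRe.FindAllStringSubmatch(src, -1) {
		fmt.Println(m[2]) // prints "title", then (wrongly) "body"
	}
}
```

<p>A real parser knows that the second field is inside a comment; the regex does not, and patching it up case by case is how these hacks become unmaintainable.</p>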

<h2 id="parsing-and-querying-the-document">Parsing and Querying the Document</h2>

<p>Tree-sitter has numerous bindings that enable parsing programming languages and
data formats, and Protobuf is supported. There are also good Go bindings for
Tree-sitter that make it possible to interact with all of this in a
straightforward way from Go code. We’ll use the
<a href="https://github.com/smacker/go-tree-sitter">github.com/smacker/go-tree-sitter</a>
package and the <a href="https://github.com/smacker/go-tree-sitter/protobuf">associated <code class="language-plaintext highlighter-rouge">protobuf</code>
bindings</a>.</p>

<p>The library supports various methods of access to the parsed tree, but the one
we’ll use here is a query expression that will extract only the data we care
about.</p>

<p>We can use an <a href="https://tree-sitter.github.io/tree-sitter/using-parsers#query-syntax">S-expression</a>
to query the parsed tree. But first we need to understand what the parsed tree looks
like before we can query it. How do we visualize what is in the AST? One way
would be to use the <a href="https://tree-sitter.github.io/tree-sitter/playground">online
playground</a>, but that
lacks support for Protobuf. Because I was already working in Neovim, I decided
to use the excellent built-in visualization and query tools!</p>

<p>Inside Neovim you can run <code class="language-plaintext highlighter-rouge">:InspectTree</code> on any open document where the bindings
are included, and see a nice tree. Here I am running the inspector on the
source code for this blog post. (See if you can spot the code error.)</p>

<p><a href="/images/Tree-sitter-inspect.jpg" caption="" data-lightbox="">
  <img class="image" src="/images/Tree-sitter-inspect.jpg" width="" alt="" />
</a></p>

<p>In <code class="language-plaintext highlighter-rouge">:InspectTree</code>, if I highlight things in the document, I see them reflected
in the tree, and vice versa. This is invaluable for working with the queries,
since we can identify what each element in the AST actually is in the document,
live.</p>

<p>We can do the same thing for our Protobuf document. Then, it’s a matter of
constructing a query to find and extract the parts of the document we want:</p>

<ol>
  <li>Message Name</li>
  <li>Enum names, keys, and values</li>
  <li>Field names and types</li>
</ol>

<p>Writing a query using the Neovim tools is also nice and straightforward. From
the <code class="language-plaintext highlighter-rouge">:InspectTree</code> panel, you can open the query editor by typing <code class="language-plaintext highlighter-rouge">:EditQuery</code>.
This brings up another pane where we can type queries and see them reflected in
the original document via highlighting and annotation.</p>

<p>This is what writing a query looks like in the Neovim query window:</p>

<p><a href="/images/Tree-sitter-query.jpg" caption="" data-lightbox="">
  <img class="image" src="/images/Tree-sitter-query.jpg" width="" alt="" />
</a></p>

<p>When I put the cursor over the named capture <code class="language-plaintext highlighter-rouge">@name</code> in the query, it
highlights any matched parts of the document. There are many ways to write the
queries that we might use here. You essentially just walk through the tree in
the viewer and mark the things you’d like to return as named captures.</p>

<p>The simplest query, shown in the screenshot, simply extracts the message name:</p>

<figure class="highlight"><pre><code class="language-common_lisp" data-lang="common_lisp"><span class="p">(</span><span class="nv">message_name</span> <span class="p">(</span><span class="nv">identifier</span><span class="p">))</span> <span class="nv">@name</span></code></pre></figure>

<p>Here we found, by inspecting the tree, that a <code class="language-plaintext highlighter-rouge">message_name</code> node always
contains an <code class="language-plaintext highlighter-rouge">identifier</code>. If we capture the <code class="language-plaintext highlighter-rouge">identifier</code> as <code class="language-plaintext highlighter-rouge">@name</code> we can
then refer to that capture when we want the message name. From there we can
build up the rest of the query.</p>

<p>Here you can see me traversing a query that I built, and how the editor
highlights the matches:</p>

<p><a href="/images/Tree-sitter-Neovim.mov"><img src="/images/Tree-sitter-Neovim.jpg" alt="" /></a></p>

<p>This is an example of a single query that will extract all of our required data
from the <code class="language-plaintext highlighter-rouge">protobuf</code> definition:</p>

<figure class="highlight"><pre><code class="language-common_lisp" data-lang="common_lisp"><span class="p">(</span><span class="nv">message_name</span> <span class="p">(</span><span class="nv">identifier</span><span class="p">))</span> <span class="nv">@message_name</span>
<span class="p">(</span><span class="nv">enum_name</span> <span class="p">(</span><span class="nv">identifier</span><span class="p">))</span> <span class="nv">@enum_name</span>
<span class="p">(</span><span class="nv">enum_field</span>
	<span class="p">(</span><span class="nv">identifier</span><span class="p">)</span> <span class="nv">@enum_key</span>
	<span class="p">(</span><span class="nv">int_lit</span> <span class="p">(</span><span class="nv">_</span><span class="p">)</span> <span class="nv">@enum_value</span><span class="p">)</span>
<span class="p">)</span>
<span class="p">(</span><span class="nv">field</span> <span class="p">(</span>
<span class="p">(</span><span class="k">type</span> <span class="p">(</span><span class="nv">message_or_enum_type</span><span class="p">))</span> <span class="nv">@field_type</span>
	<span class="p">)</span>
	<span class="p">(</span><span class="nv">identifier</span><span class="p">)</span> <span class="nv">@field_name</span>
<span class="p">)</span></code></pre></figure>

<p>Captures from the document will be returned by Tree-sitter in order. This is
very helpful: we can walk the results to build a structure that is easier to
reference in code. So let’s take a look at some Go code that interacts with
this document using the query we built.</p>

<h2 id="working-with-tree-sitter-from-go">Working with Tree-Sitter from Go</h2>

<p>We need to import the two packages mentioned earlier. This is truncated for
clarity: you will also need a few stdlib imports.</p>

<figure class="highlight"><pre><code class="language-go" data-lang="go"><span class="k">import</span> <span class="p">(</span>
	<span class="n">sitter</span> <span class="s">"github.com/smacker/go-tree-sitter"</span>   <span class="c">// Tree-sitter bindings</span>
	<span class="s">"github.com/smacker/go-tree-sitter/protobuf"</span> <span class="c">// Protobuf definitions</span>
<span class="p">)</span></code></pre></figure>

<p>We need some kind of data structure to store our parsed info in. The simplest
starting point is something like this:</p>

<figure class="highlight"><pre><code class="language-go" data-lang="go"><span class="c">// A Message represents a single Protobuf message definition</span>
<span class="k">type</span> <span class="n">Message</span> <span class="k">struct</span> <span class="p">{</span>
	<span class="n">Name</span>   <span class="kt">string</span>
	<span class="n">Fields</span> <span class="k">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="kt">string</span>
	<span class="n">Enums</span>  <span class="k">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="k">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="kt">int</span>
<span class="p">}</span></code></pre></figure>

<p>You could, of course, use a more structured type if that suits your purpose
better.</p>
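<p>For example, if field declaration order matters to you, a hypothetical alternative like this keeps the structure explicit:</p>

```go
package main

import "fmt"

// Field keeps the name/type pairing explicit and preserves declaration
// order, which a map[string]string cannot do.
type Field struct {
	Name string
	Type string
}

// Enum groups an enum name with its key/value pairs.
type Enum struct {
	Name   string
	Values map[string]int
}

// Message mirrors the map-based version but with ordered fields.
type Message struct {
	Name   string
	Fields []Field
	Enums  []Enum
}

var example = Message{
	Name: "BlogPost",
	Fields: []Field{
		{Name: "post_id", Type: "google.protobuf.StringValue"},
		{Name: "post_type", Type: "PostType"},
	},
}

func main() {
	fmt.Println(example.Name, len(example.Fields)) // → BlogPost 2
}
```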

<p>Then we need a function to read in the file and run it through the parser:</p>

<figure class="highlight"><pre><code class="language-go" data-lang="go"><span class="c">// ParseMessage parses the message file and returns a Message struct</span>
<span class="k">func</span> <span class="n">ParseMessage</span><span class="p">(</span><span class="n">filename</span> <span class="kt">string</span><span class="p">)</span> <span class="p">(</span><span class="o">*</span><span class="n">Message</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">content</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">os</span><span class="o">.</span><span class="n">ReadFile</span><span class="p">(</span><span class="n">filename</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
        <span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="n">fmt</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"failed to read file: %w"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
    <span class="p">}</span>

    <span class="c">// Create a new parser</span>
    <span class="n">parser</span> <span class="o">:=</span> <span class="n">sitter</span><span class="o">.</span><span class="n">NewParser</span><span class="p">()</span>
    <span class="n">parser</span><span class="o">.</span><span class="n">SetLanguage</span><span class="p">(</span><span class="n">protobuf</span><span class="o">.</span><span class="n">GetLanguage</span><span class="p">())</span>

    <span class="c">// Parse the content of the protobuf file</span>
    <span class="n">tree</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">parser</span><span class="o">.</span><span class="n">ParseCtx</span><span class="p">(</span><span class="n">context</span><span class="o">.</span><span class="n">Background</span><span class="p">(),</span> <span class="no">nil</span><span class="p">,</span> <span class="n">content</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
        <span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="n">fmt</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"failed to parse protobuf: %w"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
    <span class="p">}</span>

    <span class="n">fields</span><span class="p">,</span> <span class="n">enums</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">GetMessageFields</span><span class="p">(</span><span class="n">tree</span><span class="p">,</span> <span class="n">content</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
        <span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="n">err</span>
    <span class="p">}</span>

    <span class="c">// The fields map is keyed by message name (one message per file here)</span>
    <span class="k">var</span> <span class="n">name</span> <span class="kt">string</span>
    <span class="k">for</span> <span class="n">n</span> <span class="o">:=</span> <span class="k">range</span> <span class="n">fields</span> <span class="p">{</span>
        <span class="n">name</span> <span class="o">=</span> <span class="n">n</span>
    <span class="p">}</span>

    <span class="n">msg</span> <span class="o">:=</span> <span class="o">&amp;</span><span class="n">Message</span><span class="p">{</span>
        <span class="n">Name</span><span class="o">:</span>   <span class="n">name</span><span class="p">,</span>
        <span class="n">Fields</span><span class="o">:</span> <span class="n">fields</span><span class="p">[</span><span class="n">name</span><span class="p">],</span>
        <span class="n">Enums</span><span class="o">:</span>  <span class="n">enums</span><span class="p">,</span>
    <span class="p">}</span>

    <span class="k">return</span> <span class="n">msg</span><span class="p">,</span> <span class="no">nil</span>
<span class="p">}</span></code></pre></figure>

<p>You will note in the above that the majority of the hard work is being done by
a function we have not seen yet: <code class="language-plaintext highlighter-rouge">GetMessageFields()</code>. That should look
something like this:</p>

<figure class="highlight"><pre><code class="language-go" data-lang="go"><span class="c">// GetMessageFields runs the Tree-sitter query and returns the two maps</span>
<span class="k">func</span> <span class="n">GetMessageFields</span><span class="p">(</span><span class="n">tree</span> <span class="o">*</span><span class="n">sitter</span><span class="o">.</span><span class="n">Tree</span><span class="p">,</span> <span class="n">content</span> <span class="p">[]</span><span class="kt">byte</span><span class="p">)</span> <span class="p">(</span><span class="k">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="k">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="kt">string</span><span class="p">,</span> <span class="k">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="k">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="kt">int</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">query</span> <span class="o">:=</span> <span class="s">`
        (message_name (identifier)) @message_name
        (enum_name (identifier)) @enum_name
        (enum_field
            (identifier) @enum_key
            (int_lit (_) @enum_value)
        )
        (field (
        (type (message_or_enum_type)) @field_type
            )
            (identifier) @field_name
        )
  `</span>

    <span class="n">q</span><span class="p">,</span> <span class="n">qc</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">queryTree</span><span class="p">(</span><span class="n">tree</span><span class="p">,</span> <span class="n">query</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
        <span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="no">nil</span><span class="p">,</span> <span class="n">err</span>
    <span class="p">}</span>

    <span class="n">fields</span> <span class="o">:=</span> <span class="nb">make</span><span class="p">(</span><span class="k">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="k">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="kt">string</span><span class="p">)</span>
    <span class="n">enumFields</span> <span class="o">:=</span> <span class="nb">make</span><span class="p">(</span><span class="k">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="k">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="kt">int</span><span class="p">)</span>

    <span class="k">var</span> <span class="p">(</span>
        <span class="n">fieldName</span><span class="p">,</span> <span class="n">fieldType</span><span class="p">,</span> <span class="n">messageName</span> <span class="kt">string</span>
        <span class="n">enumName</span><span class="p">,</span> <span class="n">enumKey</span><span class="p">,</span> <span class="n">enumValue</span>      <span class="kt">string</span>
    <span class="p">)</span>

    <span class="c">// Iterate over the matches and print the field names and types</span>
    <span class="k">for</span> <span class="p">{</span>
        <span class="n">match</span><span class="p">,</span> <span class="n">ok</span> <span class="o">:=</span> <span class="n">qc</span><span class="o">.</span><span class="n">NextMatch</span><span class="p">()</span>
        <span class="k">if</span> <span class="o">!</span><span class="n">ok</span> <span class="p">{</span>
            <span class="k">break</span>
        <span class="p">}</span>

        <span class="k">for</span> <span class="n">_</span><span class="p">,</span> <span class="n">capture</span> <span class="o">:=</span> <span class="k">range</span> <span class="n">match</span><span class="o">.</span><span class="n">Captures</span> <span class="p">{</span>
            <span class="n">node</span> <span class="o">:=</span> <span class="n">capture</span><span class="o">.</span><span class="n">Node</span>
            <span class="n">captureName</span> <span class="o">:=</span> <span class="n">q</span><span class="o">.</span><span class="n">CaptureNameForId</span><span class="p">(</span><span class="n">capture</span><span class="o">.</span><span class="n">Index</span><span class="p">)</span>
            <span class="k">switch</span> <span class="n">captureName</span> <span class="p">{</span>
            <span class="k">case</span> <span class="s">"message_name"</span><span class="o">:</span>
                <span class="n">messageName</span> <span class="o">=</span> <span class="n">node</span><span class="o">.</span><span class="n">Content</span><span class="p">(</span><span class="n">content</span><span class="p">)</span>
                <span class="n">fields</span><span class="p">[</span><span class="n">messageName</span><span class="p">]</span> <span class="o">=</span> <span class="nb">make</span><span class="p">(</span><span class="k">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="kt">string</span><span class="p">)</span>
            <span class="k">case</span> <span class="s">"field_name"</span><span class="o">:</span>
                <span class="n">fieldName</span> <span class="o">=</span> <span class="n">node</span><span class="o">.</span><span class="n">Content</span><span class="p">(</span><span class="n">content</span><span class="p">)</span>
                <span class="n">fields</span><span class="p">[</span><span class="n">messageName</span><span class="p">][</span><span class="n">fieldName</span><span class="p">]</span> <span class="o">=</span> <span class="n">fieldType</span>
            <span class="k">case</span> <span class="s">"field_type"</span><span class="o">:</span>
                <span class="n">fieldType</span> <span class="o">=</span> <span class="n">node</span><span class="o">.</span><span class="n">Content</span><span class="p">(</span><span class="n">content</span><span class="p">)</span>
            <span class="k">case</span> <span class="s">"enum_name"</span><span class="o">:</span>
                <span class="n">enumName</span> <span class="o">=</span> <span class="n">node</span><span class="o">.</span><span class="n">Content</span><span class="p">(</span><span class="n">content</span><span class="p">)</span>
                <span class="n">enumFields</span><span class="p">[</span><span class="n">enumName</span><span class="p">]</span> <span class="o">=</span> <span class="nb">make</span><span class="p">(</span><span class="k">map</span><span class="p">[</span><span class="kt">string</span><span class="p">]</span><span class="kt">int</span><span class="p">)</span>
            <span class="k">case</span> <span class="s">"enum_key"</span><span class="o">:</span>
                <span class="n">enumKey</span> <span class="o">=</span> <span class="n">node</span><span class="o">.</span><span class="n">Content</span><span class="p">(</span><span class="n">content</span><span class="p">)</span>
            <span class="k">case</span> <span class="s">"enum_value"</span><span class="o">:</span>
                <span class="n">enumValue</span> <span class="o">=</span> <span class="n">node</span><span class="o">.</span><span class="n">Content</span><span class="p">(</span><span class="n">content</span><span class="p">)</span>
                <span class="c">// Treesitter thinks zeroes are octal, let's work around it</span>
                <span class="k">if</span> <span class="n">strings</span><span class="o">.</span><span class="n">HasPrefix</span><span class="p">(</span><span class="n">enumValue</span><span class="p">,</span> <span class="s">"0x"</span><span class="p">)</span> <span class="p">{</span>
                    <span class="n">enumFields</span><span class="p">[</span><span class="n">enumName</span><span class="p">][</span><span class="n">enumKey</span><span class="p">]</span> <span class="o">=</span> <span class="m">0</span>
                <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
                    <span class="n">enumFields</span><span class="p">[</span><span class="n">enumName</span><span class="p">][</span><span class="n">enumKey</span><span class="p">],</span> <span class="n">err</span> <span class="o">=</span> <span class="n">strconv</span><span class="o">.</span><span class="n">Atoi</span><span class="p">(</span><span class="n">enumValue</span><span class="p">)</span>
                    <span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
                        <span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="no">nil</span><span class="p">,</span> <span class="n">err</span>
                    <span class="p">}</span>
                <span class="p">}</span>
            <span class="k">default</span><span class="o">:</span>
                <span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="no">nil</span><span class="p">,</span> <span class="n">fmt</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"unexpected match type: %s"</span><span class="p">,</span> <span class="n">captureName</span><span class="p">)</span>
            <span class="p">}</span>
        <span class="p">}</span>
    <span class="p">}</span>

    <span class="k">return</span> <span class="n">fields</span><span class="p">,</span> <span class="n">enumFields</span><span class="p">,</span> <span class="no">nil</span>
<span class="p">}</span></code></pre></figure>

<p>Here we define the query, ask Tree-sitter to kick off the query, and then we
loop over the matches, inspecting their name and then building up the maps.</p>

<p>The last piece of code to show is the <code class="language-plaintext highlighter-rouge">queryTree()</code> function that kicks off the
query and cursor. It looks like this:</p>

<figure class="highlight"><pre><code class="language-go" data-lang="go"><span class="c">// queryTree runs a Treesitter query over a pre-existing tree</span>
<span class="k">func</span> <span class="n">queryTree</span><span class="p">(</span><span class="n">tree</span> <span class="o">*</span><span class="n">sitter</span><span class="o">.</span><span class="n">Tree</span><span class="p">,</span> <span class="n">query</span> <span class="kt">string</span><span class="p">)</span> <span class="p">(</span><span class="o">*</span><span class="n">sitter</span><span class="o">.</span><span class="n">Query</span><span class="p">,</span> <span class="o">*</span><span class="n">sitter</span><span class="o">.</span><span class="n">QueryCursor</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">q</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">sitter</span><span class="o">.</span><span class="n">NewQuery</span><span class="p">([]</span><span class="kt">byte</span><span class="p">(</span><span class="n">query</span><span class="p">),</span> <span class="n">protobuf</span><span class="o">.</span><span class="n">GetLanguage</span><span class="p">())</span>
    <span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
        <span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="no">nil</span><span class="p">,</span> <span class="n">fmt</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"failed to run query: %w"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
    <span class="p">}</span>

    <span class="c">// Execute the query</span>
    <span class="n">qc</span> <span class="o">:=</span> <span class="n">sitter</span><span class="o">.</span><span class="n">NewQueryCursor</span><span class="p">()</span>
    <span class="n">qc</span><span class="o">.</span><span class="n">Exec</span><span class="p">(</span><span class="n">q</span><span class="p">,</span> <span class="n">tree</span><span class="o">.</span><span class="n">RootNode</span><span class="p">())</span>

    <span class="k">return</span> <span class="n">q</span><span class="p">,</span> <span class="n">qc</span><span class="p">,</span> <span class="no">nil</span>
<span class="p">}</span></code></pre></figure>

<p>And that’s pretty much the meat of it. We can call <code class="language-plaintext highlighter-rouge">ParseMessage()</code> and we get back
a <code class="language-plaintext highlighter-rouge">Message{}</code> struct that is populated with our message name, fields, and enums. In
JSON representation, it would look something like this:</p>

<figure class="highlight"><pre><code class="language-json" data-lang="json"><span class="p">{</span><span class="w">
   </span><span class="nl">"Enums"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"PostType"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
         </span><span class="nl">"POST_TYPE_NOT_SET"</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w">
         </span><span class="nl">"POST_TYPE_ARTICLE"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w">
         </span><span class="nl">"POST_TYPE_PAGE"</span><span class="p">:</span><span class="w">    </span><span class="mi">2</span><span class="p">,</span><span class="w">
         </span><span class="nl">"POST_TYPE_SPLASH"</span><span class="p">:</span><span class="w">  </span><span class="mi">3</span><span class="w">
      </span><span class="p">}</span><span class="w">
   </span><span class="p">},</span><span class="w">
   </span><span class="nl">"Fields"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"post_id"</span><span class="p">:</span><span class="w">   </span><span class="s2">"google.protobuf.StringValue"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"post_type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"PostType"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"title"</span><span class="p">:</span><span class="w">     </span><span class="s2">"google.protobuf.StringValue"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"body"</span><span class="p">:</span><span class="w">      </span><span class="s2">"google.protobuf.StringValue"</span><span class="w">
   </span><span class="p">},</span><span class="w">
   </span><span class="nl">"Name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"BlogPost"</span><span class="w">
</span><span class="p">}</span></code></pre></figure>

<p>And that’s it! It’s up to you what you do with this, but that gets you started.
If you need to parse sub-types, you could design a query to do that. If you
want to parse RPC definitions, you could do that, too. We use this information
to generate our bindings (which include some logic).</p>

<h2 id="conclusion">Conclusion</h2>

<p>This setup has been a pretty good basis for tooling for us. I will undoubtedly
bring Tree-sitter and the Neovim tooling to bear on other problems in the
future. Hopefully this overview gives you a starting point.</p>

<p>This post covers how to get a personal Kubernetes (K8s) cluster running on an
instance with a public IP address, with HTTPS support, on the cheap. I’m
paying about €9/month to run mine; without VAT in the US you will pay less.
This post is not about how to build or run a large production system, but some
of the things you learn here could apply.</p>

<p>We’ll get a cloud instance, install Kubernetes, set up ingress, get some https
certs, and set the site up to serve traffic from a basic <code class="language-plaintext highlighter-rouge">nginx</code> installation.
You need to know some basic stuff to make this work, things like the Linux CLI,
getting a domain name, and setting up a DNS record. Those are not covered.</p>

<h2 id="choices">Choices</h2>

<p>Whenever you dive into something technical of any complexity, you need to make
a series of choices. There are an almost infinite set of combinations you could
choose for this setup. If you have strong feelings about using something else,
you should do that! But in this post, I have made the following choices:</p>

<ul>
  <li>Ubuntu Linux as the OS distro</li>
  <li><a href="https://microk8s.io/">microk8s</a> as the Kubernetes distro</li>
  <li><a href="https://projectcontour.io/">Contour</a> as the ingress controller</li>
  <li><a href="https://letsencrypt.org/">Letsencrypt certificates</a></li>
</ul>

<p>These are opinionated choices that will impact the rest of this narrative.
However, there are two other choices you need to make that are up to you.</p>

<h3 id="selecting-hosting">Selecting Hosting</h3>

<p>To run Kubernetes where your services are available on the public Internet, you
need to host it somewhere. If you want to run that at home on your own
hardware, you could do that. Having had a long career maintaining things, I try
to keep that to a minimum in my spare time. So I have chosen a cloud provider
to host mine.</p>

<p>You will need to make sure you get a fixed public IP address if you want to
serve traffic from your instance. Most clouds provide this, but some require
add-ons (and more money).</p>

<p>From my own experience, I can strongly recommend
<a href="https://www.hetzner.com/cloud">Hetzner</a> for this. They have a really good
price plan for instances with a good bit of CPU, memory, and disk. I’ve chosen
the CPX21 offering at the time of this writing.</p>

<p>This offering has the following characteristics:</p>

<ul>
  <li>3 vCPUs</li>
  <li>4GB RAM</li>
  <li>80GB disk</li>
  <li>20TB traffic</li>
  <li>IPv4 support (IPv6-only is cheaper)</li>
  <li>Available in US, Germany, and Finland</li>
  <li>€8.98/month with VAT included, less if you are in the US</li>
</ul>

<p>In any case, you should select a hosting provider and make sure your instance
is running the latest Long Term Support (LTS) Ubuntu Linux.</p>

<h3 id="selecting-a-domain-and-dns">Selecting A Domain And DNS</h3>

<p>Your Kubernetes host will need to be addressed by name. You should either use a
domain you already own, or register one for this purpose. A sub-domain from an
existing domain you own is also no problem. Once you have an instance up, you
will want to get the IP address provided and set it up with your registrar. You
could also consider fronting your instance with CloudFlare, in which case they
will host your DNS. I will leave this as an exercise to the reader. There are
lots of tutorials about how to do it.</p>
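<p>Before moving on, it’s worth confirming that the record actually resolves to your instance. A quick check, assuming a hypothetical hostname of <code class="language-plaintext highlighter-rouge">k8s.example.com</code>:</p>

```shell
# Ask DNS for the A record; this should print the public IP
# address of your instance once the record has propagated.
dig +short k8s.example.com
```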

<h2 id="installing-and-configuring-kubernetes">Installing and Configuring Kubernetes</h2>

<p><img style="float: right" src="/images/Kubernetes_logo_without_workmark.svg.png" alt="Kubernetes logo" />
This is actually quite easy to do with <code class="language-plaintext highlighter-rouge">microk8s</code>. This is a very self-contained
distro of Kubernetes that is managed by Canonical, the people who make Ubuntu
Linux, so it’s pretty native on the platform. It supports pretty much anything
you are going to want to do on Kubernetes right now.</p>

<p>On Ubuntu you can install <code class="language-plaintext highlighter-rouge">microk8s</code> with the <code class="language-plaintext highlighter-rouge">snap</code> package provider. This is
normally available on the latest Ubuntu installs, but on Hetzner the distro is
quite tiny so you will need to <code class="language-plaintext highlighter-rouge">sudo apt-get install snapd</code>. With that
complete, installing <code class="language-plaintext highlighter-rouge">microk8s</code> is:</p>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><span class="nv">$ </span><span class="nb">sudo </span>snap <span class="nb">install </span>microk8s <span class="nt">--classic</span></code></pre></figure>

<p>That will pick the latest version, which is what I recommend doing. You can
test that this has worked by running:</p>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><span class="nv">$ </span>microk8s kubectl get pods</code></pre></figure>

<p>That should complete successfully and return an empty pod list. Kubernetes is
installed!</p>

<p>That being said, we need to add some more stuff to it. <code class="language-plaintext highlighter-rouge">microk8s</code> makes some of
that quite easy. You should take the following and write it to a file named
“microk8s-plugins” on your hosted instance:</p>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><span class="c">#!/bin/sh</span>

<span class="c"># These are the plugins you need to get a basic cluster up and running</span>
<span class="nv">DNS_SERVER</span><span class="o">=</span>1.1.1.1
<span class="nv">HOST_PUBLIC_IP</span><span class="o">=</span><span class="s2">"&lt;put your IP here&gt;"</span>

microk8s disable ha-cluster <span class="nt">--force</span>
microk8s <span class="nb">enable </span>hostpath-storage
microk8s <span class="nb">enable </span>dns:<span class="s2">"</span><span class="nv">$DNS_SERVER</span><span class="s2">"</span>
microk8s <span class="nb">enable </span>cert-manager
microk8s <span class="nb">enable </span>rbac
microk8s <span class="nb">enable </span>metallb:<span class="s2">"</span><span class="k">${</span><span class="nv">HOST_PUBLIC_IP</span><span class="k">}</span><span class="s2">-</span><span class="k">${</span><span class="nv">HOST_PUBLIC_IP</span><span class="k">}</span><span class="s2">"</span></code></pre></figure>

<p>You should put the public IP address of your host in the placeholder at the
top of the script.</p>

<p>I have selected CloudFlare’s DNS here. This is used by services running in the
cluster when they need name resolution. You could alternatively pick another
DNS server you prefer. Another example would be Google’s <code class="language-plaintext highlighter-rouge">8.8.8.8</code> or <code class="language-plaintext highlighter-rouge">8.8.4.4</code>.</p>

<p>Note that the <code class="language-plaintext highlighter-rouge">disable ha-cluster</code> step is not <em>required</em>, and it will limit your
cluster to a single machine. However, it does free up memory on the instance,
and if you only plan to have one instance, it’s what I would do.</p>

<p>Now that you’ve written that to a file, make it executable and then run it.</p>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><span class="nv">$ </span><span class="nb">chmod </span>755 microk8s-plugins <span class="o">&amp;&amp;</span> ./microk8s-plugins</code></pre></figure>

<p>It’s a good idea to keep this script around in case you re-install or upgrade
later. You might also look around later to see which other add-ons you might
want. For now this is good enough.</p>

<p>We’re setting up:</p>

<ul>
  <li>
    <p><strong>hostpath storage</strong>: allows us to provide persistent volumes to pods written to
the actual disks of the system itself.</p>
  </li>
  <li>
    <p><strong>cluster wide DNS</strong>: configures the DNS server to use for lookups</p>
  </li>
  <li>
    <p><strong>cert manager</strong>: will support getting and managing certificates with Letsencrypt</p>
  </li>
  <li>
    <p><strong>rbac</strong>: support access control in the same way as most production systems</p>
  </li>
  <li>
    <p><strong><code class="language-plaintext highlighter-rouge">metallb</code></strong>: a basic on-host load balancer that allows us to expose services
from the Kubernetes cluster to the public IP address. This is installed with
a single public IP address.</p>
  </li>
</ul>
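<p>Once the script has run, you can confirm what ended up enabled. This is just a sanity check:</p>

```shell
# Shows cluster health plus each add-on under "enabled" or
# "disabled". dns, cert-manager, rbac, metallb, and
# hostpath-storage should all be listed as enabled.
microk8s status --wait-ready
```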

<h3 id="kubectl-options"><code class="language-plaintext highlighter-rouge">kubectl</code> Options</h3>

<p><code class="language-plaintext highlighter-rouge">microk8s</code> has <code class="language-plaintext highlighter-rouge">kubectl</code> already available. You used it earlier with <code class="language-plaintext highlighter-rouge">microk8s
kubectl</code>. This is a totally valid way to use it. You might want to set up an
alias like: <code class="language-plaintext highlighter-rouge">alias kubectl="microk8s kubectl"</code>. Alternatively you can install
<code class="language-plaintext highlighter-rouge">kubectl</code> with:</p>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><span class="nv">$ </span>snap <span class="nb">install </span>kubectl <span class="nt">--classic</span></code></pre></figure>

<p>Either is fine. If you like typing a lot, you don’t have to do either one!</p>

<h2 id="namespaces">Namespaces</h2>

<p>From this point on we’ll just work in the <code class="language-plaintext highlighter-rouge">default</code> namespace to keep things
simple. You should do this however you like. If you want to put things into
their own namespace, you can apply a namespace to all of the <code class="language-plaintext highlighter-rouge">kubectl</code>
commands.</p>

<p><code class="language-plaintext highlighter-rouge">cert-manager</code> and <code class="language-plaintext highlighter-rouge">contour</code> have their own namespaces already and are set up to
work that way.</p>

<h2 id="container-registry">Container Registry</h2>

<p><img style="float: left; width: 20%; padding-right: 1%;" src="/images/jfrog.svg" alt="JFrog Logo" />
You need somewhere to put containers you are going to run on your cluster. If
you already have that, just use what you have. If you are only going to run
public projects, then you don’t need one. If you want to run anything from a
private repository, you will need to sign up for one. Docker Hub allows a
single private repository on its free plan. There are lots of other options. I selected
<a href="https://jfrog.com/">JFrog</a> because they allow up to 2GB of private images,
regardless of how many you have.</p>

<p>If you are going to use a private registry, you need to give some credentials
to the Kubernetes cluster to tell it how to pull from the registry. You will
need to get these credentials from your provider.</p>

<p>A running Kubernetes cluster is entirely configured via the API. Using <code class="language-plaintext highlighter-rouge">kubectl</code> we can
post YAML files to it to affect configuration or, for simpler things, just
provide settings on the CLI. You need to give it registry creds by setting up a
Kubernetes secret like this:</p>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><span class="nv">$ </span>kubectl <span class="nt">--namespace</span> default  create secret docker-registry image-registry-credentials <span class="nt">--docker-username</span><span class="o">=</span><span class="s2">"&lt;your username here&gt;"</span> <span class="nt">--docker-password</span><span class="o">=</span><span class="s2">"&lt;your password here&gt;"</span> <span class="nt">--docker-server</span><span class="o">=</span>&lt;your server&gt;</code></pre></figure>

<p>For <code class="language-plaintext highlighter-rouge">jfrog.io</code> the server is <code class="language-plaintext highlighter-rouge">&lt;yourdomain&gt;.jfrog.io</code>. This should not contain a path.</p>

<p>You now have a secret called <code class="language-plaintext highlighter-rouge">image-registry-credentials</code>. You can verify this with
<code class="language-plaintext highlighter-rouge">kubectl get secrets</code>, which should return a list with one secret.</p>
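<p>If image pulls mysteriously fail later, it can help to double-check what actually landed in that secret. One way to inspect it, assuming the secret name used above:</p>

```shell
# Decode the docker config stored in the secret and eyeball the
# server, username, and password fields.
kubectl get secret image-registry-credentials \
  --output jsonpath='{.data.\.dockerconfigjson}' | base64 -d
```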

<h2 id="setting-up-ingress">Setting Up Ingress</h2>

<p>When stuff is running inside a Kubernetes cluster, you can’t just contact it
from outside. There are a lot of ways to do this, as with everything else in
Kubernetes.  For most, this is managed by an ingress controller. There are
again many options here. I’ve chosen Contour because it’s widely used, easily
supported, and runs on top of Envoy, which I think is a really good piece of
software.</p>

<p>So let’s set up Contour. This is super easy! Do the following:</p>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><span class="nv">$ </span>kubectl apply <span class="nt">-f</span> https://projectcontour.io/quickstart/contour.yaml</code></pre></figure>

<p>That grabs a manifest from the Internet and runs it on your cluster. As with
anything like this, caveat emptor. Check it over yourself if you want to be
sure what you are running. Now that it is installed, you can run:</p>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell">kubectl get pods <span class="nt">-n</span> projectcontour</code></pre></figure>

<p>Contour installs in its own namespace, hence the <code class="language-plaintext highlighter-rouge">-n</code>.</p>

<p>That should look <em>something</em> like this:</p>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell">NAME                       READY   STATUS    RESTARTS        AGE
contour-6d4545ff84-kptk2   1/1     Running   1 <span class="o">(</span>5d19h ago<span class="o">)</span>   6d3h
contour-6d4545ff84-qbggg   1/1     Running   1 <span class="o">(</span>5d19h ago<span class="o">)</span>   6d3h
envoy-l5sqg                2/2     Running   2 <span class="o">(</span>5d19h ago<span class="o">)</span>   6d3h</code></pre></figure>

<p>Obviously yours will have been running for much less time.</p>

<p>That’s all we need to do here.</p>

<h2 id="storage">Storage</h2>

<p>We installed the <code class="language-plaintext highlighter-rouge">hostpath</code> plugin so we can make persistent volumes available
to pods. But we don’t really have control over where it puts the directories
that contain each volume. We can control that by creating a <code class="language-plaintext highlighter-rouge">storageClass</code> that
specifies it, and then specifying that storage class when creating pods. This
is how you might do that. If you don’t care, just skip this step and don’t
specify the storage class later.</p>

<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="nn">---</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">StorageClass</span>
<span class="na">apiVersion</span><span class="pi">:</span> <span class="s">storage.k8s.io/v1</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">ssd-hostpath</span>
<span class="na">provisioner</span><span class="pi">:</span> <span class="s">microk8s.io/hostpath</span>
<span class="na">reclaimPolicy</span><span class="pi">:</span> <span class="s">Delete</span>
<span class="na">parameters</span><span class="pi">:</span>
  <span class="na">pvDir</span><span class="pi">:</span> <span class="s">/opt/k8s-storage</span>
<span class="na">volumeBindingMode</span><span class="pi">:</span> <span class="s">WaitForFirstConsumer</span></code></pre></figure>

<p>This will create a <code class="language-plaintext highlighter-rouge">storageClass</code> called <code class="language-plaintext highlighter-rouge">ssd-hostpath</code> and all created volumes
will live in <code class="language-plaintext highlighter-rouge">/opt/k8s-storage</code>. You can, of course, specify what you like here
instead. You might look into the <code class="language-plaintext highlighter-rouge">reclaimPolicy</code> if you want to do something
else as well.</p>
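<p>Like everything else, this manifest is applied through the API. Assuming you saved it to a file called <code class="language-plaintext highlighter-rouge">storage-class.yaml</code>:</p>

```shell
# Create the storage class, then confirm it is registered next to
# the default microk8s-hostpath class.
kubectl apply -f storage-class.yaml
kubectl get storageclass
```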

<h2 id="setting-up-certs">Setting Up Certs</h2>

<p>If you are going to expose stuff to the Internet you will probably want to use
HTTPS, which means you will need to have certificates. We installed
<code class="language-plaintext highlighter-rouge">cert-manager</code> earlier, but we need to set it up. We’ll use <code class="language-plaintext highlighter-rouge">letsencrypt</code> certs
so that means we’ll need a way to respond to the ACME challenges used to
authenticate our ownership of the website in question. There are a few ways to
do this, including DNS records, but we’ll use the HTTP method. <code class="language-plaintext highlighter-rouge">cert-manager</code>
will automatically start up an <code class="language-plaintext highlighter-rouge">nginx</code> instance and map the right ingress for
this to work! There are just a few things we need to do.
<img style="float: right; width: 20%; padding-right: 1%;" src="/images/letsencrypt.webp" alt="Letsencrypt logo" /></p>

<p>We’ll first use the staging environment for <code class="language-plaintext highlighter-rouge">letsencrypt</code> so that you
don’t run up against the rate limit for new certs while messing around.</p>

<p>You’ll want to install both of these, substituting your own email address for
the placeholder:</p>

<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="nn">---</span>
<span class="na">apiVersion</span><span class="pi">:</span> <span class="s">cert-manager.io/v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">ClusterIssuer</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">letsencrypt</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">acme</span><span class="pi">:</span>
    <span class="c1"># You must replace this email address with your own.</span>
    <span class="c1"># Let's Encrypt will use this to contact you about expiring</span>
    <span class="c1"># certificates, and issues related to your account.</span>
    <span class="na">email</span><span class="pi">:</span> <span class="s1">'</span><span class="s">&lt;your</span><span class="nv"> </span><span class="s">email</span><span class="nv"> </span><span class="s">here&gt;'</span>
    <span class="na">server</span><span class="pi">:</span> <span class="s">https://acme-v02.api.letsencrypt.org/directory</span>
    <span class="na">privateKeySecretRef</span><span class="pi">:</span>
      <span class="c1"># Secret resource that will be used to store the account's private key.</span>
      <span class="na">name</span><span class="pi">:</span> <span class="s">letsencrypt-account-key</span>
    <span class="c1"># Add a single challenge solver, HTTP01 using nginx</span>
    <span class="na">solvers</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">http01</span><span class="pi">:</span>
        <span class="na">ingress</span><span class="pi">:</span>
          <span class="na">class</span><span class="pi">:</span> <span class="s">contour</span></code></pre></figure>

<p>Write this to a file called <code class="language-plaintext highlighter-rouge">letsencrypt.yaml</code> and then run <code class="language-plaintext highlighter-rouge">kubectl apply -f
letsencrypt.yaml</code>.</p>

<p>Do the same for this file, naming it <code class="language-plaintext highlighter-rouge">letsencrypt-staging.yaml</code>.</p>

<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="na">apiVersion</span><span class="pi">:</span> <span class="s">cert-manager.io/v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">ClusterIssuer</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">letsencrypt-staging</span>
  <span class="na">namespace</span><span class="pi">:</span> <span class="s">cert-manager</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">acme</span><span class="pi">:</span>
    <span class="na">email</span><span class="pi">:</span> <span class="s1">'</span><span class="s">&lt;your</span><span class="nv"> </span><span class="s">email</span><span class="nv"> </span><span class="s">here&gt;'</span>
    <span class="na">privateKeySecretRef</span><span class="pi">:</span>
      <span class="na">name</span><span class="pi">:</span> <span class="s">letsencrypt-staging</span>
    <span class="na">server</span><span class="pi">:</span> <span class="s">https://acme-staging-v02.api.letsencrypt.org/directory</span>
    <span class="na">solvers</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">http01</span><span class="pi">:</span>
        <span class="na">ingress</span><span class="pi">:</span>
          <span class="na">class</span><span class="pi">:</span> <span class="s">contour</span></code></pre></figure>
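<p>With both files applied, you can check that the two issuers registered and are ready to hand out certs:</p>

```shell
# Both issuers should report True in the READY column once
# cert-manager has set up the ACME accounts.
kubectl get clusterissuers
```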

<p>At this point, we are finally ready to deploy something!</p>

<h2 id="deploying-nginx">Deploying Nginx</h2>

<p>To demonstrate how to deploy something and get certs, we’ll deploy an <code class="language-plaintext highlighter-rouge">nginx</code>
service to serve files from a persistent volume mounted from the host. I won’t
walk through this whole thing, it’s outside the scope of this post. But, you
can read through the comments to understand what is happening.</p>

<p>Write the following into a file called <code class="language-plaintext highlighter-rouge">nginx.yaml</code>.</p>

<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="nn">---</span>
<span class="c1"># We need a volume to be mounted on the ssd-hostpath we created earlier.</span>
<span class="c1"># You can add content here on the disk of the actual instance and it</span>
<span class="c1"># will be visible inside the pod.</span>
<span class="na">apiVersion</span><span class="pi">:</span> <span class="s">v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">PersistentVolumeClaim</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">nginx-static</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">storageClassName</span><span class="pi">:</span> <span class="s">ssd-hostpath</span>
  <span class="na">accessModes</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">ReadWriteMany</span><span class="pi">]</span>
  <span class="c1"># Configure the amount you want here. This is 1 gigabyte.</span>
  <span class="na">resources</span><span class="pi">:</span> <span class="pi">{</span> <span class="nv">requests</span><span class="pi">:</span> <span class="pi">{</span> <span class="nv">storage</span><span class="pi">:</span> <span class="nv">1Gi</span> <span class="pi">}</span> <span class="pi">}</span>
<span class="nn">---</span>
<span class="na">apiVersion</span><span class="pi">:</span> <span class="s">apps/v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">Deployment</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">nginx</span>
  <span class="na">labels</span><span class="pi">:</span>
    <span class="na">app</span><span class="pi">:</span> <span class="s">nginx</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">replicas</span><span class="pi">:</span> <span class="m">1</span>
  <span class="na">selector</span><span class="pi">:</span>
    <span class="na">matchLabels</span><span class="pi">:</span>
      <span class="na">app</span><span class="pi">:</span> <span class="s">nginx</span>
  <span class="na">template</span><span class="pi">:</span>
    <span class="na">metadata</span><span class="pi">:</span>
      <span class="na">labels</span><span class="pi">:</span>
        <span class="na">app</span><span class="pi">:</span> <span class="s">nginx</span>
    <span class="na">spec</span><span class="pi">:</span>
      <span class="c1"># This is not needed if you are using a public image like</span>
      <span class="c1"># nginx. It's here for reference for your own apps.</span>
      <span class="na">imagePullSecrets</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">image-registry-credentials</span>
      <span class="na">containers</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">nginx</span>
        <span class="na">image</span><span class="pi">:</span> <span class="s">nginx:latest</span>
        <span class="na">ports</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="na">containerPort</span><span class="pi">:</span> <span class="m">80</span>
        <span class="na">volumeMounts</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="na">mountPath</span><span class="pi">:</span> <span class="s2">"</span><span class="s">/usr/share/nginx/html"</span>
          <span class="na">name</span><span class="pi">:</span> <span class="s">nginx-static</span>
      <span class="na">volumes</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">nginx-static</span>
        <span class="na">persistentVolumeClaim</span><span class="pi">:</span>
          <span class="na">claimName</span><span class="pi">:</span> <span class="s">nginx-static</span>
<span class="nn">---</span>
<span class="na">apiVersion</span><span class="pi">:</span> <span class="s">v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">Service</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">nginx</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">selector</span><span class="pi">:</span>
    <span class="na">app</span><span class="pi">:</span> <span class="s">nginx</span>
  <span class="na">ports</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">protocol</span><span class="pi">:</span> <span class="s">TCP</span>
    <span class="na">port</span><span class="pi">:</span> <span class="m">80</span>
    <span class="na">targetPort</span><span class="pi">:</span> <span class="m">80</span>
  <span class="na">sessionAffinity</span><span class="pi">:</span> <span class="s">None</span>
  <span class="na">type</span><span class="pi">:</span> <span class="s">ClusterIP</span>
<span class="nn">---</span>
<span class="na">apiVersion</span><span class="pi">:</span> <span class="s">networking.k8s.io/v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">Ingress</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">nginx</span>
  <span class="na">labels</span><span class="pi">:</span>
    <span class="na">app</span><span class="pi">:</span> <span class="s">nginx</span>
  <span class="na">annotations</span><span class="pi">:</span>
    <span class="na">cert-manager.io/cluster-issuer</span><span class="pi">:</span> <span class="s">letsencrypt-staging</span>
    <span class="c1"># You will probably want to re-run this manifest with this</span>
    <span class="c1"># set to true later. For now it *must* be false, or the</span>
    <span class="c1"># ACME challenge will fail: you don't have a cert yet!</span>
    <span class="na">ingress.kubernetes.io/force-ssl-redirect</span><span class="pi">:</span> <span class="s2">"</span><span class="s">false"</span>
    <span class="c1"># This is important because it tells Contour to handle this</span>
    <span class="c1"># ingress.</span>
    <span class="na">kubernetes.io/ingress.class</span><span class="pi">:</span> <span class="s">contour</span>
    <span class="na">kubernetes.io/tls-acme</span><span class="pi">:</span> <span class="s2">"</span><span class="s">true"</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">tls</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">secretName</span><span class="pi">:</span> <span class="s1">'</span><span class="s">&lt;a</span><span class="nv"> </span><span class="s">meaningful</span><span class="nv"> </span><span class="s">name</span><span class="nv"> </span><span class="s">here&gt;'</span>
    <span class="na">hosts</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s1">'</span><span class="s">&lt;your</span><span class="nv"> </span><span class="s">hostname&gt;'</span>
  <span class="na">rules</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">host</span><span class="pi">:</span> <span class="s1">'</span><span class="s">&lt;your</span><span class="nv"> </span><span class="s">hostname&gt;'</span>
    <span class="na">http</span><span class="pi">:</span>
      <span class="na">paths</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="na">pathType</span><span class="pi">:</span> <span class="s">Prefix</span>
        <span class="na">path</span><span class="pi">:</span> <span class="s">/</span>
        <span class="na">backend</span><span class="pi">:</span>
          <span class="na">service</span><span class="pi">:</span>
            <span class="na">name</span><span class="pi">:</span> <span class="s">nginx</span>
            <span class="na">port</span><span class="pi">:</span>
              <span class="na">number</span><span class="pi">:</span> <span class="s">80</span></code></pre></figure>

<p>Read through this before you run it; there are a few things you will need to
change in the bottom section. There are a ton of other ways to deploy this, but
this is enough to get it running and show that your setup is working.</p>

<p>Now just run <code class="language-plaintext highlighter-rouge">kubectl apply -f nginx.yaml</code>. Following a successful application,
you should be able to run <code class="language-plaintext highlighter-rouge">kubectl get deployments,services,ingress</code> and get
something back like:</p>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell">NAME                   READY   UP-TO-DATE
deployment.apps/nginx  1/1     1

NAME           TYPE        CLUSTER-IP    EXTERNAL-IP   PORT<span class="o">(</span>S<span class="o">)</span>   AGE
service/nginx  ClusterIP   10.152.183.94 &lt;none&gt;        80/TCP    5d1h

NAME          CLASS    HOSTS      ADDRESS  PORTS AGE
ingress ...   &lt;none&gt;   some-host  1.2.3.4  80    5d1h</code></pre></figure>

<p>This has <em>also</em> created a certificate request and begun the ACME challenge. You
can see that a certificate has been requested and check status by running
<code class="language-plaintext highlighter-rouge">kubectl get certificates</code>.</p>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell">kubectl get certificates
NAME      READY   SECRET    AGE
my-cert   True    my-cert   4d6h</code></pre></figure>

<p>Once the ACME challenge has succeeded, you will see <code class="language-plaintext highlighter-rouge">Ready: True</code>. If something
goes wrong here, you can follow <a href="https://cert-manager.io/docs/troubleshooting/acme/">this very good troubleshooting
guide</a>.</p>

<p>You can test against your site with <code class="language-plaintext highlighter-rouge">curl --insecure</code> since the cert from the
staging environment is not legit.</p>

<h2 id="final-steps">Final Steps</h2>

<p>Once that all works, I recommend switching to the production
<code class="language-plaintext highlighter-rouge">letsencrypt</code> issuer by changing <code class="language-plaintext highlighter-rouge">nginx.yaml</code> to reference it and re-applying. You
can watch the ACME challenge by inspecting the pod that will be created to run
the ACME nginx.</p>

<p>That’s pretty much it. Kubernetes is huge and infinitely configurable, so
there are a million other things you could do. But it’s up and running now.</p>]]></content><author><name>Karl Matthias</name></author><category term="articles" /><category term="kubernetes" /><category term="devops" /><category term="infrastructure" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Events, Event Sourcing, and the Path Forward</title><link href="https://relistan.com/event-sourcing-and-event-bus" rel="alternate" type="text/html" title="Events, Event Sourcing, and the Path Forward" /><published>2022-01-12T00:00:00-05:00</published><updated>2022-01-12T00:00:00-05:00</updated><id>https://relistan.com/event-sourcing-and-event-bus</id><content type="html" xml:base="https://relistan.com/event-sourcing-and-event-bus"><![CDATA[<p><img src="/images/tin_can_telephone.png" alt="TinCanTelephone" /></p>

<p>Distributed systems pose all kinds of challenges. And we’ve built them in the
web age, when the tech of the wider Internet is what we use in microcosm to
build the underpinnings of our own systems. Our industry has done somersaults
to try to make these systems work well with synchronous calls built on top of
HTTP. This is working at scale for a number of companies and that’s just fine.
But if you were to start from scratch is that what you would build? A few years
ago, we had that opportunity and we decided, no, that’s not what we would
build. Instead, we built a distributed, asynchronous system centered around an
event bus, and it has been one of the best decisions we’ve made. Most of the
things that make a service architecture painful are either completely
alleviated or mostly so, with only a few tradeoffs. Here’s what we’ve built <a href="https://community.com">at
Community</a>, and some of what we have learned.</p>

<h2 id="beyond-the-monolith">Beyond The Monolith</h2>

<p>The advice du jour is that you should start with a monolith because it allows
you to iterate quickly, change <em>big</em> things more easily, and avoid any serious
consistency issues—assuming you back your service with an ACID-compliant DB.
That’s good advice that I also give to people, and we did that. That scales
pretty well, but becomes an issue as you grow the team and need to iterate on
independent work streams.</p>

<h3 id="queuing">Queuing</h3>

<p>But what’s the next step? For us, the next step was to stop building things in
the monolith and to build some async services that could work semi-autonomously
alongside it. Our platform is a messaging platform so this was already a
reasonably good fit: messages aren’t delivered synchronously and some parts of
the workflow can operate like a pipeline.</p>

<p>We needed at least a queuing system to do that, something that would buffer
calls between services and which would guarantee reliable delivery. We are
primarily an Elixir shop so we picked RabbitMQ because of the good driver
support on our stack: RabbitMQ is written in Erlang and Elixir runs on the same
VM and can leverage any Erlang libraries. This has turned out to be a really
good choice for the long term. RabbitMQ is super reliable, can have very good
throughput, is available in various hosted forms, and has a lot of different
topologies and functionality that make it a Swiss army knife of async systems.
We paid a 3rd party vendor to host it and began building on top of it.</p>

<p>Initially we used a very common pattern and just queued work for other
services, using JSON payloads. This was great for passing things between
services where fire-and-forget was adequate. Being able to rely on delivery
once RabbitMQ accepted the message from the publisher means you don’t deal with
retries on the sender side, you almost never lose messages, and the consumer of
the messages can determine how it wants retries to be handled. Deploys never
interrupt messaging. A service can treat each message as a transaction and only
ack the message once work has been completed successfully. All good stuff.</p>

<p>But the core data and its associated models were still locked up inside the
monolith. And we needed to access that data from other services fairly often.
The first pass was to just look at the messages passed from the monolith to
other services and do a local DB call to enrich them with required fields
before passing them on. That works for a few cases just fine.</p>

<h3 id="other-paradigms">Other Paradigms</h3>

<p>We built other kinds of async messaging between services on top of those same
ad hoc JSON messages, knowing full well that wasn’t what we wanted long term,
but learning about the interaction patterns, and getting our product into the
market.</p>

<p>But eventually you litter the code with DB enrichment calls and other
complexity. And with no fixed schemas, the JSON messaging quickly outgrows
your ability to reason about it. Instead, wouldn’t it be nice if a new service
could also get a copy of those same messages? And wouldn’t it be really great
if the people writing that new service didn’t have to talk to the people
emitting those messages to make sure the schema wouldn’t change on them? And
what if you could get a history of all the major things that ever happened in
the system in the form of messages? And maybe a new service could have access
to that history to bootstrap it?</p>

<h3 id="events">Events!</h3>

<p>Yes, to all of the above. That’s what a truly event-based system offers. And
that’s what we transformed this async system into.</p>

<p>Building the async system in the first place made this much easier and I want
to shout out to <a href="https://github.com/kocitomas">Tomas Koci</a>, <a href="https://github.com/idlehands">Jeffrey
Matthias</a>, and <a href="https://github.com/JoeMerriweather-Webb">Joe
Merriweather-Webb</a> who designed and
built most of that and who have made many contributions to the events system as
well. Once we were in the market with our product, we all agreed it was time
for the next phase.</p>

<p>In mid-2019, <a href="https://github.com/whatyouhide">Andrea Leopardi</a>, <a href="https://github.com/rolandtritsch">Roland
Tritsch</a>, and I met up in Dublin and plotted
the course for the next year or so. The plans from that meeting turned into the
structure of the events system we have now. A lot of people have contributed
since! This has been a big team effort from the folks in <a href="https://engineering.community.com/">Community
Engineering</a>. I have attempted to name
names here wherever possible, but there are a million contributions that have
been important.</p>

<p>Since building the bus, we’ve grown to about 146 services running in
production, of which 106 are core business services (61 Elixir, 20 Go, 7
Python, remainder 3rd party or other tech). Most core business logic lives in
Elixir with Go in a supporting role, and Python focused on data science. This
is nearly 2 services per engineer in the company. On most stacks that would be
an enormous burden. On our system, as described below, it’s fairly painless.
We still have the monolith, but it’s a lot smaller now. It gets smaller all the
time, thanks to the drive of <a href="https://github.com/idlehands">Jeffrey Matthias</a>, <a href="https://github.com/GeoffreyPS">Geoff
Smith</a>, <a href="https://github.com/lmarlow">Lee
Marlow</a>, and <a href="https://github.com/joeLepper">Joe
Lepper</a> among others.</p>

<p>So, back to the story…</p>

<p><img src="/images/envelope.png" alt="envelope" /></p>

<h2 id="the-public-event-bus">The Public Event Bus</h2>

<p>The next step was the move to a system with a public event bus. We made some
pretty good decisions but still live with a few small mistakes. So I’ll
describe a simpler version of what we have now and gloss over the iterations on
getting to this point, and I’ll call out mistakes later.</p>

<p>If you aren’t familiar with the idea of an event bus, it boils down to this:
any service can listen on the bus for events that happen anywhere in the
system, and then do whatever they need to do based on the event that <em>happened</em>
in the system. Services that do important things publish those events in a
defined schema so that anyone can use them as they need. You archive all of
those events and make them arbitrarily queryable and replayable. To be clear
here about what we mean when we say “events”: <code class="language-plaintext highlighter-rouge">Events</code> are something that
happened; <code class="language-plaintext highlighter-rouge">Commands</code> are a request for something to happen. We started with
only events, and that was a good choice. That approach allowed us to more
carefully iterate on the events side so that we didn’t make the same mistakes
in two places at once.</p>

<p>For other interactions between services we built an async RPC functionality
over RabbitMQ that essentially provides a very lightweight Protobuf wrapper
around arbitrary data. This enabled us to largely punt on <code class="language-plaintext highlighter-rouge">Commands</code>
implementation while we got good at events. It allowed us to identify sensible
practices in a private way between services before making those commands
public system-wide.</p>

<p>So let’s talk about the event bus and event store since that’s the core of the
system.</p>

<h3 id="system-overview">System Overview</h3>

<p>Our public bus is RabbitMQ, with a common top-level exchange. We separated it
into two clusters: one built around text messages (e.g. SMS) that we process
and the main one around events that are the real core of the system. This
allowed us to have very low latency on core data events while allowing more
latency on the high-throughput messaging cluster. You could run it on a
single cluster, but we have enough messaging throughput that separation was a
good choice. It also divides the fault domain along lines that make sense for
our system.</p>

<p>We publish events to the main bus (a <code class="language-plaintext highlighter-rouge">topic</code> exchange), using event type as the
routing key. Services that want to subscribe to them do so. Those services may
then event-source a projection into a DB of their own based on those events, or
simply take action as required. DB tech is whatever the service requires. We
have Postgres, Vitess (MySQL), Redis, Cassandra, ElasticSearch, etc. For
services that event-source, we have standardized a set of rules about how
they must interact with the bus, how they handle out-of-order events,
duplicates, etc. We have a
<a href="https://www.thoughtworks.com/radar/techniques/lightweight-architecture-decision-records">LADR</a>
that defines how this must work. The technical side of this is built into a
shared Elixir library that most of the services use. This wraps the excellent
<a href="https://github.com/dashbitco/broadway">Broadway library</a>, the <a href="https://github.com/dashbitco/broadway_rabbitmq">Broadway AMQP
integration</a>, and our generated
Protobuf library containing the schemas. It provides things like validation and
sane RabbitMQ bindings, and publishes an events manifest we can use to build
maps of who produces and consumes which events. <a href="https://github.com/dselans">Dan
Selans</a> worked for us back then, and built a
frontend that makes those manifests human consumable, and draws an events map.
This is very useful!</p>

<p>Because some of the services in our system are written in Go, we have built
some of the same logic in that language and leverage the
<a href="https://benthos.dev">Benthos</a> project (which we sponsor) for the work
pipelining, similar to how we use Broadway in Elixir. Benthos is an
unbelievable jack-of-all-trades that you should take a look at if you don’t
know it. We additionally build all the Protobuf in Python for use in data
science activities, but don’t have a full events library implementation… yet.</p>

<p>We archive everything that is published on the bus and put it into the event
store. This then enables replays.</p>

<h3 id="sourcing">Sourcing</h3>

<p>When we start a new service that needs to source events, we bootstrap it from a
replay of historical events and it writes them to its DB using the same code it
will use in production. Because our services must handle out of order events,
we can generally replay any part of the history necessary at any future point.
Simple rules about idempotency and a little bit of state keeping solve this out
of order handling for the 95% case. The remaining 5% tend to be one-off
solutions for each use case.</p>
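<p>The idempotency-plus-state-keeping rule can be sketched in a few lines. This is a hypothetical Python model, not the actual Community library: each entity in the projection remembers the timestamp of the last event it applied, so duplicates and stale out-of-order events become no-ops:</p>

```python
def apply_event(projection: dict, event: dict) -> dict:
    """Apply an event to a projection, ignoring stale or duplicate events.

    Each entity keeps the timestamp of the last event it applied; anything
    older (or equal, i.e. a duplicate) is a no-op, which makes replays safe.
    """
    entity_id = event["id"]
    current = projection.get(entity_id)
    if current and event["timestamp"] <= current["timestamp"]:
        return projection  # stale or duplicate: idempotent no-op
    projection[entity_id] = {"timestamp": event["timestamp"], **event["data"]}
    return projection

events = [
    {"id": "u1", "timestamp": 2, "data": {"name": "new"}},
    {"id": "u1", "timestamp": 1, "data": {"name": "old"}},  # out of order
    {"id": "u1", "timestamp": 2, "data": {"name": "new"}},  # duplicate
]
proj = {}
for e in events:
    apply_event(proj, e)
print(proj["u1"]["name"])  # new
```

<p>Because replaying the same history is a no-op, the daily replays described below are safe against projections built this way.</p>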

<h3 id="entropy">Entropy</h3>

<p>Services now have their own projection(s) of what things look like from the
events they have consumed. Because this is so distributed, even with RabbitMQ
ensuring deliverability, there are still many ways the system could introduce
drift.</p>

<p>Underpinning all of the other mechanisms is another simple rule: we have worked
hard to guarantee that one event type is published from one service. This
vastly reduces the complexity of working with them.</p>

<p>We handle anti-entropy by several means:</p>

<ol>
  <li>
    <p>We have a fallback for RabbitMQ. If it’s not available for publishing,
 services will fall back to Amazon SQS. This is an extremely rare occurrence,
 but ensures we don’t lose events. We can then play events from SQS into
 RabbitMQ when it comes back up. Thus, services don’t <em>subscribe</em> to SQS.</p>
  </li>
  <li>
    <p>Services <em>must</em> handle all events that are of the type they subscribe to.
 This means any failure to do so is an error that must be handled. This is
 pretty rare in production because it generally gets caught pretty early on,
 and Protobuf and our events library help guarantee correctness.</p>
  </li>
  <li>
    <p>We run <strong>daily replays of all of the previous day’s core events</strong>, on the
 production bus. We don’t replay message events, but all of the core system
 events are replayed. This means that a service has a maximum window of 24
 hours to have missed an event before it sees it again. We’ll be adding 1 hour
 replays or similar in the next few quarters.</p>
  </li>
  <li>
    <p>We run Spark jobs that compare drift on some event-sourced DBs against the
 event store. This allows us to track how widespread any issues may be. We have
 dashboards that let us see what this looks like. It’s extremely small drift,
 and is generally insignificant. Recent runs show that average drift is
 0.005%, which is already very good, but is better than it looks because it
 <em>also</em> reflects changes that happen while the run is in flight. For all
 practical purposes this then simply reflects eventual consistency and in
 absolute numbers is basically zero.</p>
  </li>
</ol>
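<p>The drift measurement itself is simple arithmetic. Here is a toy Python sketch of what the Spark jobs compute (the real jobs compare event-sourced DBs against the event store at much larger scale; the numbers here are made up):</p>

```python
def drift(store_ids: set, projected_ids: set) -> float:
    """Fraction of IDs that are missing from, or extra in, a projection
    relative to the event store."""
    mismatched = store_ids ^ projected_ids  # symmetric difference
    return len(mismatched) / max(len(store_ids), 1)

store = {f"evt-{i}" for i in range(100_000)}
projection = set(store)
projection.discard("evt-42")  # one missed event out of 100k
print(f"{drift(store, projection):.3%}")  # 0.001%
```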

<h3 id="consistency">Consistency</h3>

<p>We handle consistency by assuming eventual consistency whenever possible.
Where it’s not possible, we do allow querying of a remote service to get data.
Events related to that data should only be published by one service. So it’s
possible to get an authoritative source when strictly necessary. This is done
over a synchronous RPC implementation on top of RabbitMQ with Protobuf thanks
to <a href="https://github.com/patoms">Tom Patterer</a> and <a href="https://github.com/whatyouhide">Andrea
Leopardi</a>. Andrea <a href="https://andrealeopardi.com/posts/rpc-over-rabbitmq-with-elixir/">wrote about this
implementation</a>.</p>

<p>Many of the frontend calls go through the monolith. This then provides some
consistency for the models it contains. We sometimes use the monolith as a
<a href="https://samnewman.io/patterns/architectural/bff/">BFF</a> where necessary, to
provide some consistency. For all other cases we query directly against the
services themselves via an API gateway.</p>

<h3 id="commands">Commands</h3>

<p>Following on the two year success of the event bus, we introduced commands,
which work quite similarly but which express a different idea. Commands are a
request for something in the system to happen (intent). Services may or may not
take action as a result. If they do, they may or may not generate events.
Commands are also public, archived, and replayable. This is so far working
quite well, but runs alongside other interaction methods described below. We’ll
phase out much of the async RPC in favor of Commands now that we have them.</p>

<h3 id="current-state">Current State</h3>

<p>We continue to iterate and improve on this system. Some more improvements and
efficiencies are slated for this year already. We’ll write more about those
when we get there. If you are a person who wants MOAR detail, keep reading
below.  I’ll attempt to describe some of this in more technical detail.</p>

<p><img src="/images/moar.png" alt="lion moaring" /></p>

<h2 id="more-details">More Details</h2>

<p>That gives a pretty good overview. But if you want to know more, read on.</p>

<h3 id="routing-exchanges-queues">Routing, Exchanges, Queues</h3>

<p>We don’t leverage headers for much routing, but there are a few places in the
system that add them and use them (e.g. for a consistent-hash-exchange). But
most routing is by event type and that’s our key. We learned early on that each
service should attach its own exchange to the top-level exchange and hang its
queue off of that. Exchanges are really cheap for RabbitMQ and this allows
services to use a different exchange type if necessary—and it also prevents
rapid connects/disconnects from impacting the main bus. This happened when a
service once went nuts and was bound to the top level. That hasn’t happened
since.</p>

<p>Most services will bind a single durable queue to their exchange to make sure
that they receive events even when down, and the work is spread across the
instances of the service when up. Some services use ephemeral queues that go
away when they are not running. Others use different topologies. RabbitMQ is
very flexible here and has been the bedrock of our implementation.</p>
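<p>The routing itself is RabbitMQ’s standard topic matching on the event-type routing key. Here is a small Python model of the matching rules; RabbitMQ does this server-side, so this only illustrates how binding keys like <code class="language-plaintext highlighter-rouge">user.*</code> select event types:</p>

```python
def matches(binding: str, routing_key: str) -> bool:
    """RabbitMQ-style topic match: '*' = exactly one word, '#' = zero or more."""
    def match(b, k):
        if not b:
            return not k
        if b[0] == "#":
            return any(match(b[1:], k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return (b[0] == "*" or b[0] == k[0]) and match(b[1:], k[1:])
    return match(binding.split("."), routing_key.split("."))

print(matches("user.*", "user.created"))       # True
print(matches("user.*", "user.name.changed"))  # False
print(matches("user.#", "user.name.changed"))  # True
```

<p>A service's own exchange binds to the top-level exchange with keys like these, and its durable queue hangs off that exchange.</p>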

<h4 id="archiving">Archiving</h4>

<p>Once things are on the bus, we archive every single one. We have a service that
subscribes to every event type on the two buses, aggregates them in memory into
file blobs, then uploads them to Amazon S3 (the “event store”). This archiver
only acks them from Rabbit once they are uploaded, so we don’t lose events.
Those files on S3 are in a run-length-encoded raw Protobuf format, straight
from the wire, grouped into directories by timestamp and event class. Filenames
are generated in a consistent way, and include some hashing of the contents so
that we prevent overwrites.</p>
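<p>The filename scheme can be sketched like this; the exact key layout here is hypothetical, but it shows the idea of deriving names from the event class, a timestamp, and a hash of the contents so distinct blobs can never overwrite each other:</p>

```python
import hashlib

def blob_key(event_class: str, timestamp: str, payload: bytes) -> str:
    """Deterministic blob name: identical contents produce identical keys
    (duplicate uploads are harmless), while different contents can never
    collide and overwrite each other."""
    digest = hashlib.sha256(payload).hexdigest()[:16]
    return f"{event_class}/{timestamp}/{digest}.pb"

a = blob_key("user.created", "2022-01-11T10", b"payload")
b = blob_key("user.created", "2022-01-11T10", b"payload")
print(a == b)  # True: same contents, same key
```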

<p>Like the majority of the services operating on the events store and bus, this
service is written in Elixir and leverages all of the badassery of Erlang’s OTP
and supervision trees to be always working. It doesn’t break.</p>

<h3 id="event-store-iteration">Event Store Iteration</h3>

<p>The main event store is the archive of raw Protobuf that was sent on the wire,
encoded with a run length encoding into blob files that are posted to S3, as
mentioned. After the first year, <a href="https://github.com/patoms">Tom Patterer</a> and
<a href="https://github.com/andrijaa">Aki Colovic</a> here in Europe built Apache Spark
jobs and a Go service to transform those into Parquet files on S3 more or less
as they get written—so there is very little latency between the two. We can
then leverage AWS Athena (Presto) for ad hoc queries, monitoring of events, and
understanding what is there.</p>

<p>And, the ability to query all of history for debugging is an amazing thing that
I don’t think I’d ever want to live without again. It takes a lot of pressure
off of distributed tracing, although we do have that for complex flows on Otel
thanks to <a href="https://github.com/jadencodes">Jaden Grossman</a>, <a href="https://github.com/talpert">Tatsuro
Alpert</a>, and <a href="https://github.com/bradleyd">Bradley
Smith</a>.</p>

<h4 id="replays">Replays</h4>

<p>We built a replayer service that initially could only do replays by event type
and time range, played straight from S3 to a private copy of the bus (“the
replay bus”). Services can hook up to that bus to get the replayed data. We
usually do this with a one-off deploy of the same code. That got us a long
way: it was enough for the first year.</p>

<p>Later on, we built further integration with AWS Athena that allows us to
arbitrarily query events from the event store in Athena from Parquet files on
S3, and also allows for the results of a query to be used in an event replay.
This allows for very targeted bootstrapping of new services, repairing outages,
fixing state when a bug caused a service to behave badly, etc. The ability to
arbitrarily query all of history also helps when looking for any issues in the
event store or your service. Athena is pretty quick, even with several years of
data. Partitioning by event date and type really helps. We actually use our Go
libs from Spark and <a href="https://relistan.com/writing-spark-udfs-in-go">I wrote about how to do
that</a>.</p>

<h4 id="snapshots">Snapshots</h4>

<p>An additional step we took later (thanks to <a href="https://github.com/mosche">Moritz
Mack</a>) was to use Spark to build daily snapshots in
the form of Parquet tables of events… that can also then be replayed. This
also speeds up querying and consistency checking by vastly reducing the amount
of queried data. We currently rebuild those snapshots nightly from all of
history so that there is no drift. We will move to incremental snapshotting at
some point, but Spark and S3 are hugely fast, and we have enough individual
files to run in parallel nicely.</p>

<h2 id="events-and-event-schemas">Events and Event Schemas</h2>

<p>The best decision we made was to use Protobuf for messaging. Protobuf is one
of the only modern encoding systems that had good support across all three of
our core languages. There were two Protobuf libraries in Elixir at the time. We
picked one, then later switched to
<a href="https://github.com/elixir-protobuf/protobuf">elixir-protobuf</a>, to which we are
now major contributors thanks to <a href="https://github.com/whatyouhide">Andrea
Leopardi</a>, <a href="https://github.com/britto">João
Britto</a>, and <a href="https://github.com/ericmj">Eric
Meadows-Jönsson</a>. Using Protobuf means that we can
guarantee compatibility on schemas going forward, have deprecation ability, and
because it is widely supported, we have access to it in all of the toolsets
where we need it. It also converts back and forth to JSON nicely when needed.</p>

<p>Protobuf schemas, unlike Avro, for example, don’t accompany the data payload.
This means that you need to provide them to the software that needs them out of
band. Andrea Leopardi <a href="https://andrealeopardi.com/posts/sharing-protobuf-schemas-across-services/">wrote about how we do
that</a>
so I won’t detail it here. We took most of the pain out of this by designing
the schemas so that most services don’t always need the latest schemas. And
because Protobuf decodes all the fields of a payload that you <em>do</em> know
about, as long as we’re only adding fields we don’t have any issues.</p>

<p>To do this, we designed a schema where there is a common <code class="language-plaintext highlighter-rouge">Event</code> base and an
<code class="language-plaintext highlighter-rouge">Envelope</code> with a separate set of guaranteed fields for all events (e.g. <code class="language-plaintext highlighter-rouge">id</code>,
<code class="language-plaintext highlighter-rouge">timestamp</code>, and <code class="language-plaintext highlighter-rouge">type</code>). This allows systems to process events without always
having the very latest schemas unless they need to access the new event type or
new fields.</p>
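<p>Here is a sketch of why the envelope helps, using plain JSON dicts in Python rather than Protobuf so it stays self-contained: a consumer can route on the guaranteed fields while treating the payload as opaque. The field values are made up:</p>

```python
import json

def route(raw: bytes) -> str:
    """Read only the guaranteed envelope fields (id, timestamp, type);
    the payload stays opaque, so consumers built before a new event type
    existed keep working."""
    envelope = json.loads(raw)
    return envelope["type"]

raw = json.dumps({
    "id": "event-id-here",
    "timestamp": "2022-01-11T09:30:00Z",
    "type": "user.created",
    "payload": {"brand_new_field": 42},  # schema unknown to this consumer
}).encode()

print(route(raw))  # user.created
```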

<h2 id="other-communication-methods">Other Communication Methods</h2>

<p>Events make this all so much easier. But it’s hard to always use them (or
commands). It’s possible but there are just places where you need something
else. We have some get out of jail free cards. When we model new solutions in
our space we push the following hierarchy:</p>

<ol>
  <li>Events and Commands first. If it can be done with events/commands, we do
that.</li>
  <li>Async, stateless RPC over a queue. For things like queueing work to
yourself or another service. Private calls that don’t need archiving or
replays.</li>
  <li>Synchronous RPC, using queues for call-response</li>
  <li>HTTP as a last resort</li>
</ol>

<p>I believe that in the core system there are only two remaining places that
widely leverage HTTP for things other than serving APIs to frontends.</p>

<h2 id="things-we-learned">Things We Learned</h2>

<p>You don’t build an entire system without running into issues. Here are some
things we learned. They are not strictly ordered but somewhat in order of
importance.</p>

<ul>
  <li>
    <p>Putting events on S3 was a really good decision. It has all the S3
goodness, and also unlocks all of the “Big Data” tools like Spark, Athena,
etc.</p>
  </li>
  <li>
    <p>People will find every way possible to publish events that aren’t right
in one way or another. This is no surprise. But really, if you are deciding
where to allocate time, it’s worth all the effort you can muster to put
validation into the publishing side.</p>
  </li>
  <li>
    <p>Being able to throw away your datastore and start over at any time is
powerful. For example: we run ElasticSearch but we never re-index. We just
throw away the cluster and make a new one from a replay thanks to <a href="https://github.com/alecrubin/">Alec
Rubin</a>, Joe Lepper, and <a href="https://github.com/bkjones">Brian
Jones</a>. If you have to swap from one kind of
datastore to another (e.g. Redis -&gt; Postgres) you can just rehydrate the new
store from a replay using the same event sourcing code you would have to
write anyway. Very little migration code.</p>
  </li>
  <li>
    <p>Sometimes you want to represent events as things that occurred. Sometimes
you want to represent them as the current state of something after the
occurrence. Some people will tell you not to do the latter. Making this a
first class behavior and clearly identifying which pattern an event is using
has been super helpful. We do both.</p>
  </li>
  <li>
    <p>Use the wrapper types in Protobuf. One major shortcoming of Protobuf is that
you can’t tell if something is null or the zero value for its type. The
wrappers fix that.</p>
  </li>
  <li>
    <p>If you have to interact with a DB and publish an event, sometimes the
right pattern is to publish the event, and consume it yourself before
writing to your DB. This helps with consistency issues and allows you to
replay your own events to fix bugs. Lee Marlow, Geoff Smith, and Jeffrey
Matthias called this pattern
<a href="https://en.wikipedia.org/wiki/Eating_your_own_dog_food">“dogfooding”</a>.
Sometimes the right pattern is to send yourself a command instead.</p>
  </li>
  <li>
    <p>Protobuf supports custom annotations. Those can be really helpful for
encoding things like which events are allowed on which bus, which actions are
allowed on the event, etc. Especially helpful when building supporting
libraries in more than one language.</p>
  </li>
  <li>
    <p>Daily replays allow you both an anti-entropy method as well as a daily
stress test of the whole system. This has been great for hammering out
issues. It also guarantees that services can deal with out-of-order events.
They get them at least every day. The main gate to rolling it out was fixing
all of the places this wasn’t right. Now it stays right.</p>
  </li>
  <li>
    <p>Event-based systems make reporting so much easier! And data exports. And
external integrations. And webhooks, and a million other things.</p>
  </li>
  <li>
    <p>The event store is a fantastic audit trail.</p>
  </li>
  <li>
    <p>We sometimes rewrite our event store if we’ve messed something up. We save
up a list of the screw-ups and then do it in batch. We leave the old copy on
S3 and make a new root in the bucket (e.g. <code class="language-plaintext highlighter-rouge">v4/</code>). We use Spark to do it. We
don’t remove the old store so it’s there for auditing.</p>
  </li>
  <li>
    <p>Write some tools to make working with your events nicer. We have things to
listen on a bus, to download and decode files from the raw store, etc.</p>
  </li>
  <li>
    <p>Local development seeds are easy when you have events. Just seed the DB with
a local replay of staging/dev data.</p>
  </li>
</ul>

<h2 id="future">Future</h2>

<p>On the roadmap for the next year is to take our existing service cookie cutter
repo and enable it to maintain live event-sourced projections in the most
common format for a few very commonly used event types. We’ll snapshot those
nightly and when standing up a new service, we can start from the latest DB
snapshot and only replay since. This will make things even more efficient.</p>]]></content><author><name>Karl Matthias</name></author><category term="articles" /><category term="events" /><category term="event-sourcing" /><category term="community" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Coordination-free Database Query Sharding with PostgreSQL</title><link href="https://relistan.com/coordination-free-db-query-chunking" rel="alternate" type="text/html" title="Coordination-free Database Query Sharding with PostgreSQL" /><published>2021-07-20T00:00:00-04:00</published><updated>2021-07-20T00:00:00-04:00</updated><id>https://relistan.com/coordination-free-db-query-chunking</id><content type="html" xml:base="https://relistan.com/coordination-free-db-query-chunking"><![CDATA[<p><img src="/images/cherry-pie.png" alt="slice of pie" /></p>

<p>At <a href="https://community.com">Community.com</a>, we had a problem where a bunch of
workers needed to pick up and process a large amount of data from the same DB
table in a highly scalable manner, with high throughput. We wanted to be able to:</p>

<ol>
  <li>Not require any coordination between the workers</li>
  <li>Be able to arbitrarily change the amount each worker would pick up</li>
  <li>Not have to shard the table</li>
</ol>

<p>I came up with a solution that has been in production, at scale, for quite
a while now. While it was my design, other great engineers at Community deserve
the credit for the excellent implementation and for rounding off some of the
rough edges. It was a team effort! There are undoubtedly other ways to solve
this problem, but I thought this was pretty interesting and I myself have never
seen anyone else do it before. I am not claiming it’s entirely novel. But this
is what we did to nicely solve this problem.</p>

<h2 id="the-problem">The Problem</h2>

<p>We have workers that process outbound SMS campaigns. They need to be able to
take a single message and dispatch it to a million-plus phone numbers. And,
they need to do that at most once for each number. These workers have access
to a data store that maps some data related to the campaign to a set of
recipients. The workers don’t actually send the SMS messages, but they do the
heavy lifting: the expansion of one message to millions.</p>

<p>We wanted to be able to divide that audience up into chunks of a size large
enough to be efficient for querying and processing, and small enough that a
worker could shut down mid-campaign and not lose anything. Ideally each worker
would pick up an amount of work, crank through it, and then process another
piece of work, without checking in with anyone else.</p>

<p>You could, as we initially did, have a single process read in all the recipient
information from the DB and write batches of IDs into a queue for processing.
That is simple, and it worked fine way back in the early days. But it’s very
slow: single-threaded walking of a DB table is linear time, and as that line
gets longer it gets really bad. Further, our whole system uses UUID4 primary
keys, so it’s not trivial to walk a DB table or to provide ranges of IDs. If
you were to use auto-increment integers this would be easier, but then you
would have other problems to solve. I’d rather not have those problems.</p>

<h2 id="the-solution">The Solution</h2>

<p>We wanted a way for the single-threaded job to be able to rapidly enqueue work
without querying the DB, and for all the querying to happen in the workers, in
parallel, against multiple read replicas. Our whole backbone is built on
RabbitMQ, so this was the natural place to queue work. Our campaign workers are
written in Elixir, like much of the rest of the system, and they can leverage
the awesome Broadway library for processing. So all we needed was a way to
divide up the table in a consistent way, so that given a set of parameters, the
results of any query would be fairly evenly divisible into chunks without
knowing ahead of time exactly how many there would be or which specific rows
were assigned to the worker.</p>

<p>You might think, “partitioning!” but for various reasons that doesn’t make
sense. You could argue positives and negatives of the tech, but the main
blocker was that we need to query the table and get a result in chunks that
will be reasonably evenly distributed <em>no matter what other parameters we want
to query on</em>. So we’re not just ranging over IDs, we’re ranging over the
results of an SQL query that filters on various fields in the row in arbitrary
combinations. And while newer Postgres Hash partitions could maybe be made to
work here, we need way more chunks than you’d ever want partitions. And we want
to arbitrarily assign which workers take which ranges.</p>

<p>To be clear: we will run the same query on multiple workers, with overlapping
queries, delays, etc., without any further coordination between the workers, and
we’ll make sure that none of the results contain any overlapping data, without
hard partitioning in Postgres.</p>

<p>So, dividing a keyspace into buckets that can be addressed arbitrarily… that
sounds a lot like a hashing function. Like most problems with distributed
systems, hashing is part of the solution, if not the whole solution! What we
wanted was:</p>

<ol>
  <li>Assign a bucket to every row ahead of time</li>
  <li>Originating job publishes work covering bucket ranges</li>
  <li>Workers consume the work and run the query, supplying a bucket range</li>
  <li>Workers get results covering only those buckets, process them, return to 3</li>
</ol>

<p>The design was to have a fixed number of buckets. This necessarily needed to be
sized by the largest amount that made sense for a worker to process, given the
largest campaign size we expected to see. In the end we chose 1,000 buckets.
Armed with the knowledge of how many buckets there are and how many messages we
need to publish, the originating job could publish the right number of work
requests with the right number of bucket ranges in each. If your campaign only
has 1000 recipients, we’d queue 1 item into RabbitMQ, with a bucket range of
0-999. If you had 1 million recipients, we’d queue up 100 items covering 10
buckets each.</p>
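<p>That per-campaign arithmetic is easy to sketch. Here’s a rough Python
illustration; the 10,000-recipients-per-item target and the function name are
my own assumptions for the example, not the production code:</p>

```python
import math

N_BUCKETS = 1000          # fixed bucket count, as described above
TARGET_PER_ITEM = 10_000  # assumed recipients per work item (illustrative)

def plan_work_items(n_recipients):
    """Split the bucket space 0..999 into contiguous ranges, one per
    queued work item, based on the expected campaign size."""
    n_items = min(N_BUCKETS, max(1, math.ceil(n_recipients / TARGET_PER_ITEM)))
    per_item = N_BUCKETS // n_items
    items = []
    for i in range(n_items):
        lo = i * per_item
        # The last item absorbs any remainder so the full range is covered
        hi = N_BUCKETS - 1 if i == n_items - 1 else lo + per_item - 1
        items.append((lo, hi))
    return items

print(plan_work_items(1000))        # a single item covering buckets 0-999
print(len(plan_work_items(1_000_000)))  # 100 items of 10 buckets each
```

<p>Each tuple becomes one queued message; a worker dequeues it and supplies the
range to its query, so no two workers ever see the same rows.</p>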

<p>Distribution isn’t <em>perfect</em> but it’s darn good.</p>

<h2 id="the-details">The Details</h2>

<p>The implementation is surprisingly(?) simple.</p>

<p>How do you turn your DB table into buckets addressable by a hash? Do we store
it in the row? Nope. How about using PostgreSQL’s indexing on the result of a
function? On the insertion of every row, an index gets updated with the result
of the function as the key. Then, when you query with that function, you hit
the index instead, so all the expensive math is done on insert and what is
indexed is just an integer. There are undoubtedly other ways to make this work
in PostgreSQL. You could calculate the bucket in code and write it in a column.
But it’s pretty nice like this because Postgres manages all the annoying bits
and we don’t have to mess with the field anywhere since it exists only in the
index.</p>

<p>We need a hashing function with very good distribution. There are a number of
these, but most are not available inside Postgres. But MD5 is, so that’s what
we’re using.</p>
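<p>For illustration, the bucket assignment can be reproduced outside the
database too. This is a sketch in Python of the expression we index on,
mirroring the way Postgres reinterprets the first 64 bits of the MD5 digest as
a <em>signed</em> bigint:</p>

```python
import hashlib

N_BUCKETS = 1000

def bucket_for(row_id):
    """Mirror of mod(abs(('x'||substr(md5(id::text),1,16))::bit(64)::bigint), N)."""
    # The first 16 hex characters of the MD5 digest are 64 bits of hash
    h = int(hashlib.md5(str(row_id).encode()).hexdigest()[:16], 16)
    # Postgres's bit(64)::bigint cast is a signed reinterpretation
    if h >= 2**63:
        h -= 2**64
    return abs(h) % N_BUCKETS
```

<p>The even distribution over the 1,000 buckets is the whole point here; MD5 is
plenty good for that even though it is no longer fit for cryptographic use.</p>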

<p>This is our index:</p>

<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">CREATE</span> <span class="k">INDEX</span> <span class="n">CONCURRENTLY</span> <span class="o">&lt;</span><span class="k">table_name</span><span class="o">&gt;</span><span class="n">_sharding_md5_modulo_1000_index</span>
<span class="k">ON</span> <span class="o">&lt;</span><span class="k">table_name</span><span class="o">&gt;</span> <span class="p">(</span><span class="k">mod</span><span class="p">(</span><span class="k">abs</span><span class="p">((</span><span class="s1">'x'</span><span class="o">||</span><span class="n">substr</span><span class="p">(</span><span class="n">md5</span><span class="p">(</span><span class="n">id</span><span class="p">::</span><span class="nb">text</span><span class="p">),</span><span class="mi">1</span><span class="p">,</span><span class="mi">16</span><span class="p">))::</span><span class="nb">bit</span><span class="p">(</span><span class="mi">64</span><span class="p">)::</span><span class="nb">bigint</span><span class="p">),</span> <span class="mi">1000</span><span class="p">))</span></code></pre></figure>

<p>That is creating an index where the key to the index for this row is the result
of that function. When we want to query it, we do the following:</p>

<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">SELECT</span> <span class="n">id</span> <span class="k">FROM</span> <span class="o">&lt;</span><span class="k">table_name</span><span class="o">&gt;</span> <span class="k">WHERE</span> <span class="o">&lt;</span><span class="n">arbitrary</span> <span class="n">query</span> <span class="n">here</span><span class="o">&gt;</span>
<span class="k">AND</span> <span class="k">mod</span><span class="p">(</span><span class="k">abs</span><span class="p">((</span><span class="s1">'x'</span><span class="o">||</span><span class="n">substr</span><span class="p">(</span><span class="n">md5</span><span class="p">(</span><span class="n">id</span><span class="p">::</span><span class="nb">text</span><span class="p">),</span><span class="mi">1</span><span class="p">,</span><span class="mi">16</span><span class="p">))::</span><span class="nb">bit</span><span class="p">(</span><span class="mi">64</span><span class="p">)::</span><span class="nb">bigint</span><span class="p">),</span> <span class="mi">1000</span><span class="p">)</span> <span class="k">BETWEEN</span> <span class="o">?</span> <span class="k">AND</span> <span class="o">?</span></code></pre></figure>

<p>Now those workers can dequeue their work item, run their query, and then stream
messages into RabbitMQ while they process the results of their query. Despite
how that <code class="language-plaintext highlighter-rouge">SELECT</code> may look, no actual math is being done to hash each of those
rows. This is hitting the pre-calculated index. The only hash function math is
done at insert time.</p>
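<p>As a rough, hypothetical sketch (the table name and psycopg-style <code class="language-plaintext highlighter-rouge">%s</code>
placeholders are invented for the example), a worker might template its query
like this. The key detail is that the bucket expression must match the indexed
expression exactly, otherwise the planner falls back to computing the hash per
row:</p>

```python
# Must match the index definition character-for-character so the
# expression index can be used.
BUCKET_EXPR = "mod(abs(('x'||substr(md5(id::text),1,16))::bit(64)::bigint), 1000)"

def recipient_query(where_clause):
    """Build the worker-side SELECT. The bucket range endpoints are bound
    as query parameters; where_clause is the campaign's arbitrary filter."""
    return (
        "SELECT id FROM recipients "
        f"WHERE {where_clause} "
        f"AND {BUCKET_EXPR} BETWEEN %s AND %s"
    )
```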

<h2 id="conclusion">Conclusion</h2>

<p>So there you go: a super flexible way to break a table into chunks that can be
combined with any arbitrary query the table can support. I called it
“Coordination-free” in the title. Someone might object that the coordination is
really happening in the originating job. It’s true that some coordination
happens there. But it’s not happening in any high-throughput part of the
system. Those parts just crank away at scale.</p>

<p>We’ve been running this in production for a year, processing billions of
messages through it. Performance is excellent and while we initially viewed
this as a temporary “hack” to work around a performance problem, it has become
a trusted part of the toolbox. Maybe it fits in your toolbox, too.</p>]]></content><author><name>Karl Matthias</name></author><category term="articles" /><category term="databases" /><category term="community" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Writing Apache Spark UDFs in Go</title><link href="https://relistan.com/writing-spark-udfs-in-go" rel="alternate" type="text/html" title="Writing Apache Spark UDFs in Go" /><published>2020-12-07T00:00:00-05:00</published><updated>2020-12-07T00:00:00-05:00</updated><id>https://relistan.com/writing-spark-udfs-in-go</id><content type="html" xml:base="https://relistan.com/writing-spark-udfs-in-go"><![CDATA[<p><img src="/images/SparkGopher.png" alt="Spark Gopher" /></p>

<p>Apache Spark is a perfect fit for processing large amounts of data. It’s not, 
however, a perfect fit for our language stack at
<a href="https://community.com">Community</a>. We are largely an Elixir shop with a solid
amount of Go, while Spark’s native stack is Scala but also has a Python API.
We’re not JVM people so we use the Python API—via the Databricks platform. For
most of our work, that’s just fine. But, what happens when you want to write a
custom user defined function (UDF) to do some heavy lifting? We could write new
code in Python, or… we could use our existing Go libraries to do the job! This
means we have to wrap up our Go code into a jar file that can be loaded into the
classpath for the Spark job. This is how to do that.</p>

<p>(Note that everything below could equally apply to UDAFs—aggregate functions)</p>

<h2 id="why-did-you-do-this-crazy-thing">Why Did You Do This Crazy Thing?</h2>

<p>You might be wondering how well this works. To put that issue to rest: it works
well and has been effective for us. That being established, let’s talk about
our use case. We have a large amount of data stored in a binary container
format that wraps Protobuf records, stored on AWS S3. Spark is great with S3,
but cannot natively read these files. Complicating things further, schemas for
the Protobuf records need to be kept up to date for all the tools that process
this data in anything but the most trivial way.</p>

<p>Over time we have built a set of battle-tested Go libraries that work on this
data. Furthermore, we already maintain tooling to keep the Protobuf schemas and
Go libraries up to date. It seems natural to leverage all of that goodness in
our Spark jobs:</p>

<ol>
  <li>Battle-tested libs that have been in production for a while, with good tests.</li>
  <li>We already manage pipelines to keep the schemas up to date for the Go libs.</li>
</ol>

<p>Given the tradeoffs of reimplementing those libraries and the CI jobs, or
wrapping Go code up into jar files for use as Spark UDFs, we did the sane
thing!</p>

<h2 id="spark-udfs">Spark UDFs</h2>

<p>Spark is flexible and you can customize your jobs to handle just about any
scenario. One of the most re-usable ways to work with data in Spark is <a href="https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-udfs.html">user
defined functions
(UDFs)</a>.
These can be called directly inside Spark jobs or Spark SQL. In our case we use
a UDF to transform our custom binary input and Protobuf into something that
Spark more easily understands: JSON.</p>

<p>This means we can do something like this:</p>

<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">import</span> <span class="n">pyspark.sql.functions</span> <span class="k">as</span> <span class="n">F</span>
<span class="kn">import</span> <span class="n">pyspark.sql.types</span> <span class="k">as</span> <span class="n">T</span>

<span class="n">spark</span><span class="p">.</span><span class="n">udf</span><span class="p">.</span><span class="nf">registerJavaFunction</span><span class="p">(</span><span class="sh">"</span><span class="s">decodeContent</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">com.community.EventsUDF.Decoder</span><span class="sh">"</span><span class="p">)</span>

<span class="c1"># Define a schema for the JSON fields we want (you must customize)
</span><span class="n">event_schema</span> <span class="o">=</span> <span class="n">T</span><span class="p">.</span><span class="nc">StructType</span><span class="p">([</span>
  <span class="n">T</span><span class="p">.</span><span class="nc">StructField</span><span class="p">(</span><span class="sh">"</span><span class="s">wrapper</span><span class="sh">"</span><span class="p">,</span> <span class="n">T</span><span class="p">.</span><span class="nc">StructType</span><span class="p">([</span>
    <span class="n">T</span><span class="p">.</span><span class="nc">StructField</span><span class="p">(</span><span class="sh">"</span><span class="s">id</span><span class="sh">"</span><span class="p">,</span> <span class="n">T</span><span class="p">.</span><span class="nc">StringType</span><span class="p">()),</span>
    <span class="n">T</span><span class="p">.</span><span class="nc">StructField</span><span class="p">(</span><span class="sh">"</span><span class="s">somefield</span><span class="sh">"</span><span class="p">,</span> <span class="n">T</span><span class="p">.</span><span class="nc">StringType</span><span class="p">()),</span>
    <span class="n">T</span><span class="p">.</span><span class="nc">StructField</span><span class="p">(</span><span class="sh">"</span><span class="s">timestamp</span><span class="sh">"</span><span class="p">,</span> <span class="n">T</span><span class="p">.</span><span class="nc">StringType</span><span class="p">()),</span>
    <span class="n">T</span><span class="p">.</span><span class="nc">StructField</span><span class="p">(</span><span class="sh">"</span><span class="s">anotherfield</span><span class="sh">"</span><span class="p">,</span> <span class="n">T</span><span class="p">.</span><span class="nc">StringType</span><span class="p">())</span>
  <span class="p">]))</span>
<span class="p">])</span>

<span class="n">events_df</span> <span class="o">=</span> <span class="p">(</span>
  <span class="n">spark</span><span class="p">.</span><span class="n">read</span><span class="p">.</span><span class="nf">format</span><span class="p">(</span><span class="sh">"</span><span class="s">binaryFile</span><span class="sh">"</span><span class="p">)</span>
  <span class="p">.</span><span class="nf">load</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">s3a://path_to_data/*</span><span class="sh">"</span><span class="p">)</span>
  <span class="p">.</span><span class="nf">withColumn</span><span class="p">(</span><span class="sh">"</span><span class="s">content</span><span class="sh">"</span><span class="p">,</span> <span class="n">F</span><span class="p">.</span><span class="nf">expr</span><span class="p">(</span><span class="sh">"</span><span class="s">decodeContent(content)</span><span class="sh">"</span><span class="p">).</span><span class="nf">cast</span><span class="p">(</span><span class="n">T</span><span class="p">.</span><span class="nc">StringType</span><span class="p">()))</span>
<span class="p">)</span>

<span class="n">parsed_json_df</span> <span class="o">=</span> <span class="p">(</span>
  <span class="p">(</span><span class="n">events_df</span><span class="p">)</span>
  <span class="p">.</span><span class="nf">withColumn</span><span class="p">(</span><span class="sh">"</span><span class="s">json_data</span><span class="sh">"</span><span class="p">,</span> <span class="n">F</span><span class="p">.</span><span class="nf">from_json</span><span class="p">(</span><span class="sh">"</span><span class="s">content</span><span class="sh">"</span><span class="p">,</span> <span class="n">event_schema</span><span class="p">))</span>
  <span class="c1"># Break out part of the JSON into a column
</span>  <span class="p">.</span><span class="nf">withColumn</span><span class="p">(</span><span class="sh">"</span><span class="s">type</span><span class="sh">"</span><span class="p">,</span> <span class="n">F</span><span class="p">.</span><span class="nf">col</span><span class="p">(</span><span class="sh">"</span><span class="s">json_data.yourjsonfield.type</span><span class="sh">"</span><span class="p">))</span>
<span class="p">)</span>

<span class="c1"># Continue working with the dataframe</span></code></pre></figure>

<p>By transforming the data into JSON, we get to work on something that is much
more native to Spark than Protobuf. Because we use the libraries that are
already maintained with up to date bindings, we don’t have to manage that
inside Spark itself. We only have to update our UDF in the jar loaded by the
job.</p>

<h2 id="building-a-spark-udf-in-go">Building a Spark UDF in Go</h2>

<p>Before we look too hard at anything else, we need to get our Go code into a
wrapper that lets us call it from Java. Then we’ll need to write a little Java
glue to wrap the Go code into the right types to make Spark happy when calling
it. Luckily, for this use case it’s all pretty simple.</p>

<h3 id="wrapping-go-code-into-a-java-library">Wrapping Go Code Into a Java Library</h3>

<p>First, we need some Go code to wrap. The library we talked about was built for
one purpose, but we need a little wrapper function to do just the part we need
to call from Spark. In our case we’ll take a single value from Spark and return
a single value back. Both of those values will be of type <code class="language-plaintext highlighter-rouge">[]byte</code> which makes
things really simple going between Go and Java, where this type interchanges
easily. I made a file called <code class="language-plaintext highlighter-rouge">spark_udf.go</code> that looks like this:</p>

<figure class="highlight"><pre><code class="language-go" data-lang="go"><span class="k">package</span> <span class="n">eventdecoder</span>

<span class="k">import</span> <span class="p">(</span>
        <span class="s">"bytes"</span>
        <span class="s">"compress/gzip"</span>
        <span class="s">"io/ioutil"</span>

        <span class="s">"github.com/company/yourlib"</span>
<span class="p">)</span>

<span class="k">func</span> <span class="n">Decode</span><span class="p">(</span><span class="n">data</span> <span class="p">[]</span><span class="kt">byte</span><span class="p">)</span> <span class="p">[]</span><span class="kt">byte</span> <span class="p">{</span>
        <span class="n">zReader</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">gzip</span><span class="o">.</span><span class="n">NewReader</span><span class="p">(</span><span class="n">bytes</span><span class="o">.</span><span class="n">NewReader</span><span class="p">(</span><span class="n">data</span><span class="p">))</span>
        <span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
                <span class="c">// We return an error as the body!</span>
                <span class="k">return</span> <span class="p">[]</span><span class="kt">byte</span><span class="p">(</span><span class="n">err</span><span class="o">.</span><span class="n">Error</span><span class="p">())</span>
        <span class="p">}</span>

        <span class="n">uncompressedData</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">ioutil</span><span class="o">.</span><span class="n">ReadAll</span><span class="p">(</span><span class="n">zReader</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
                <span class="c">// We return an error as the body!</span>
                <span class="k">return</span> <span class="p">[]</span><span class="kt">byte</span><span class="p">(</span><span class="n">err</span><span class="o">.</span><span class="n">Error</span><span class="p">())</span>
        <span class="p">}</span>

        <span class="n">output</span><span class="p">,</span> <span class="n">_</span><span class="p">,</span> <span class="n">_</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">yourlib</span><span class="o">.</span><span class="n">DoSomeWork</span><span class="p">(</span><span class="n">uncompressedData</span><span class="p">)</span> 

        <span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
                <span class="c">// Handle errors, etc</span>
                <span class="k">return</span> <span class="p">[]</span><span class="kt">byte</span><span class="p">(</span><span class="n">err</span><span class="o">.</span><span class="n">Error</span><span class="p">())</span>
        <span class="p">}</span>

        <span class="k">return</span> <span class="n">bytes</span><span class="o">.</span><span class="n">Join</span><span class="p">(</span><span class="n">output</span><span class="p">,</span> <span class="p">[]</span><span class="kt">byte</span><span class="p">(</span><span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">))</span>
<span class="p">}</span></code></pre></figure>

<p>Now that we have that, we need to get it into a jar file with all the bindings
that will let Java call into the Go function. As part of the <code class="language-plaintext highlighter-rouge">gomobile</code> efforts
to integrate Go with Android, there are some tools that will handle two way
wrappers between Java and Go.
<a href="https://godoc.org/golang.org/x/mobile/cmd/gobind"><code class="language-plaintext highlighter-rouge">gobind</code></a> will generate Java
bindings for exported functions in a Go module. You could start there and build
up the necessary tools to make this work. Sadly, it’s not at all trivial to get
it into a shape that will build nicely.</p>

<p>After messing around with it for a while, I found a tool called
<a href="https://github.com/sridharv/gojava"><code class="language-plaintext highlighter-rouge">gojava</code></a> that wraps all the hard parts
from gobind into a single, easy to use tool. It’s not perfect: it does not
appear to be under active development, and does not support Go modules. But, it
makes life so much easier, and because none of this stuff changes that often,
the lack of active development isn’t much of a hindrance here. The ease of use
makes it worth it for us. Getting a working Java jar file is a single step:</p>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><span class="nv">JAVA_HOME</span><span class="o">=</span>&lt;your_java_home&gt; <span class="se">\</span>
        <span class="nv">GO111MODULE</span><span class="o">=</span>off <span class="se">\</span>
        gojava <span class="nt">-v</span> <span class="nt">-o</span> <span class="sb">`</span><span class="nb">pwd</span><span class="sb">`</span>/eventdecoder.jar build github.com/&lt;your_module_path&gt;</code></pre></figure>

<p>This will generate a file called <code class="language-plaintext highlighter-rouge">eventdecoder.jar</code> that contains your Go code
and the Java wrappers to call it. Great, right? If you are using Go modules,
just use <code class="language-plaintext highlighter-rouge">go mod vendor</code> before running <code class="language-plaintext highlighter-rouge">gojava</code> to make sure that you have all
your dependencies in a form that <code class="language-plaintext highlighter-rouge">gojava</code> can handle.</p>

<h3 id="adding-the-right-interface">Adding the Right Interface</h3>

<p>But we are not done yet. The Go code in the jar we built does not have the
right interface for a Spark UDF. So we need a little code to wrap it. You could
do this in Scala, but for us Java is more accessible, so I used that. Spark
UDFs can take up to 22 arguments and there are different interfaces defined for
each set of arguments, named <code class="language-plaintext highlighter-rouge">UDF1</code> through <code class="language-plaintext highlighter-rouge">UDF22</code>. In our case we only want
one input: the raw binary. So that means we’ll use the <code class="language-plaintext highlighter-rouge">UDF1</code> interface. Here’s
what the Java wrapper looks like:</p>

<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="kn">package</span> <span class="nn">com.community.EventsUDF</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">org.apache.spark.sql.api.java.UDF1</span><span class="o">;</span>
<span class="kn">import</span> <span class="nn">go.eventdecoder.Eventdecoder</span><span class="o">;</span>

<span class="kd">public</span> <span class="kd">class</span> <span class="nc">Decoder</span> <span class="kd">implements</span> <span class="no">UDF1</span><span class="o">&lt;</span><span class="kt">byte</span><span class="o">[],</span> <span class="kt">byte</span><span class="o">[]&gt;</span> <span class="o">{</span>
        <span class="kd">private</span> <span class="kd">static</span> <span class="kd">final</span> <span class="kt">long</span> <span class="n">serialVersionUID</span> <span class="o">=</span> <span class="mi">1L</span><span class="o">;</span>

        <span class="nd">@Override</span>
        <span class="kd">public</span> <span class="kt">byte</span><span class="o">[]</span> <span class="nf">call</span><span class="o">(</span><span class="kt">byte</span><span class="o">[]</span> <span class="n">input</span><span class="o">)</span> <span class="kd">throws</span> <span class="nc">Exception</span> <span class="o">{</span>
                <span class="c1">// Call our Go code</span>
                <span class="k">return</span> <span class="nc">Eventdecoder</span><span class="o">.</span><span class="na">Decode</span><span class="o">(</span><span class="n">input</span><span class="o">);</span>
        <span class="o">}</span>
<span class="o">}</span></code></pre></figure>

<p>We stick that in our path following the expected Java layout. So if
<code class="language-plaintext highlighter-rouge">spark_udf.go</code> is in the current directory, below it we put the above file in
<code class="language-plaintext highlighter-rouge">java/com/community/EventsUDF/Decoder.java</code>. Note that this needs to match the
package name inside the Java source file.</p>

<h3 id="assembling-it">Assembling It</h3>

<p>We’re almost there! But, we need the Spark jar files that we’ll compile this
against. Our project has a <code class="language-plaintext highlighter-rouge">Makefile</code> (which I’ll share at the end of this
post) that downloads the correct jars and sticks them in <code class="language-plaintext highlighter-rouge">./spark_jars</code>. With
those in place, we can compile the Java code:</p>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell">javac <span class="nt">-cp</span> <span class="se">\</span>
	spark_jars/spark-core_<span class="si">$(</span>VERSION<span class="si">)</span>.jar:eventdecoder.jar:spark_jars/spark-sql_<span class="si">$(</span>VERSION<span class="si">)</span>.jar <span class="se">\</span>
	java/com/community/EventsUDF/Decoder.java</code></pre></figure>

<p>With that compiled we can just add it to our Jar file and we’re ready to go!</p>

<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><span class="nb">cd </span>java <span class="o">&amp;&amp;</span> jar uf ../eventdecoder.jar com/community/EventsUDF/<span class="k">*</span>.class</code></pre></figure>

<p>That will update the jar file and insert our class. You can load that up into
Spark and call it as shown at the start of this post.</p>

<h2 id="important-notes">Important Notes</h2>

<p>You need to build this for the architecture where you will run your Spark job!
Spark may be in Scala, and we may have wrapped this stuff up into a Java jar,
but the code inside is still native binary. You may have some success with
<code class="language-plaintext highlighter-rouge">GOOS</code> and <code class="language-plaintext highlighter-rouge">GOARCH</code> settings, but it has not been repeatable for us with
<code class="language-plaintext highlighter-rouge">gojava</code>. We’ve built our UDFs that we use on Databricks’ platform under Ubuntu
18.04 with Go 1.14 and they work great.</p>

<p>Another gotcha that <a href="https://github.com/andrijaa">Aki Colović</a> found when
working with this stuff is that your Go package <em>cannot contain an underscore</em>
or it makes the Java class loader unhappy. So don’t do that.</p>

<h2 id="a-makefile-to-make-it-simpler">A Makefile to Make it Simpler</h2>

<p>There were a few finicky steps above and this needs to be repeatable. So, I
built a <code class="language-plaintext highlighter-rouge">Makefile</code> to facilitate these UDF builds. It does all the important
steps in one go. As you will see below, we have cached the right versions of
the Spark jars in S3, but you can pull them from wherever you like. If you are
into Java build tools, feel free to use those.</p>

<p>The <code class="language-plaintext highlighter-rouge">Makefile</code> I wrote looks like this. You will likely have to make little
changes to make it work for you.</p>

<figure class="highlight"><pre><code class="language-makefile" data-lang="makefile"><span class="c"># Makefile to build and test the eventdecoder.jar containing the
# Apache Spark UDF for decoding the events files in the Community
# event store.
</span><span class="nv">SCALA_VERSION</span> <span class="o">:=</span> 2.11
<span class="nv">SPARK_VERSION</span> <span class="o">:=</span> 2.4.5
<span class="nv">VERSION</span> <span class="o">:=</span> <span class="p">$(</span>SCALA_VERSION<span class="p">)</span>-<span class="p">$(</span>SPARK_VERSION<span class="p">)</span>
<span class="nv">BUCKET_PATH</span> <span class="o">?=</span> somewhere-in-s3/spark
<span class="nv">JAVA_HOME</span> <span class="o">?=</span> /usr/lib/jvm/java-8-openjdk-amd64
<span class="nv">TEMPFILE</span> <span class="o">:=</span> <span class="p">$(</span>shell <span class="nb">mktemp</span><span class="p">)</span>

<span class="nl">all</span><span class="o">:</span> <span class="nf">../vendor udf</span>

<span class="nl">../vendor</span><span class="o">:</span>
	go mod vendor

<span class="nl">.PHONY</span><span class="o">:</span> <span class="nf">gojava</span>
<span class="nl">gojava</span><span class="o">:</span>
	go get <span class="nt">-u</span> github.com/sridharv/gojava
	go <span class="nb">install </span>github.com/sridharv/gojava

<span class="nl">eventdecoder.jar</span><span class="o">:</span> <span class="nf">gojava</span>
	<span class="nv">JAVA_HOME</span><span class="o">=</span><span class="p">$(</span>JAVA_HOME<span class="p">)</span> <span class="se">\</span>
		<span class="nv">GO111MODULE</span><span class="o">=</span>off <span class="se">\</span>
		gojava <span class="nt">-v</span> <span class="nt">-o</span> <span class="sb">`</span><span class="nb">pwd</span><span class="sb">`</span>/eventdecoder.jar build github.com/my-package/eventdecoder

<span class="nl">spark_jars/spark-sql_$(VERSION).jar</span><span class="o">:</span>
	<span class="nb">mkdir</span> <span class="nt">-p</span> spark_jars
	aws s3 <span class="nb">cp </span>s3://<span class="p">$(</span>BUCKET_PATH<span class="p">)</span>/spark-sql_<span class="p">$(</span>VERSION<span class="p">)</span>.jar spark_jars

<span class="nl">spark_jars/spark-core_$(VERSION).jar</span><span class="o">:</span>
	<span class="nb">mkdir</span> <span class="nt">-p</span> spark_jars
	aws s3 <span class="nb">cp </span>s3://<span class="p">$(</span>BUCKET_PATH<span class="p">)</span>/spark-core_<span class="p">$(</span>VERSION<span class="p">)</span>.jar spark_jars

<span class="nl">spark-binaries</span><span class="o">:</span> <span class="nf">spark_jars/spark-core_$(VERSION).jar spark_jars/spark-sql_$(VERSION).jar</span>

<span class="c"># Build the UDF code and insert into the jar
</span><span class="nl">udf</span><span class="o">:</span> <span class="nf">spark-binaries eventdecoder.jar</span>
	javac <span class="nt">-cp</span> spark_jars/spark-core_<span class="p">$(</span>VERSION<span class="p">)</span>.jar:eventdecoder.jar:spark_jars/spark-sql_<span class="p">$(</span>VERSION<span class="p">)</span>.jar java/com/community/EventsUDF/Decoder.java
	<span class="nb">cd </span>java <span class="o">&amp;&amp;</span> jar uf ../eventdecoder.jar com/community/EventsUDF/<span class="k">*</span>.class

<span class="nl">.PHONY</span><span class="o">:</span> <span class="nf">clean</span>
<span class="nl">clean</span><span class="o">:</span>
	<span class="nb">rm</span> <span class="nt">-f</span> eventdecoder.jar</code></pre></figure>

<h2 id="conclusion">Conclusion</h2>

<p>What may seem like a slightly crazy idea turned out to be super useful, and
we’ve now built a few UDFs based on this original one. Re-using Go libraries in
your Spark UDF is not only possible, it can be pretty productive. Give it a
shot if it makes sense for you.</p>

<h2 id="credits">Credits</h2>

<p>The original UDF project at Community was done by <a href="https://github.com/alecrubin">Alec
Rubin</a> and me, while improvements were made by
<a href="https://github.com/andrijaa">Aki Colović</a>, and <a href="https://github.com/patoms">Tom
Patterer</a>.</p>]]></content><author><name>Karl Matthias</name></author><category term="articles" /><category term="go" /><category term="java" /><category term="spark" /><category term="community" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Hello World on the W65C265 (65816) from macOS</title><link href="https://relistan.com/wdc-w65c265-mensch-hello-world" rel="alternate" type="text/html" title="Hello World on the W65C265 (65816) from macOS" /><published>2020-05-31T00:00:00-04:00</published><updated>2020-05-31T00:00:00-04:00</updated><id>https://relistan.com/wdc-w65c265-mensch-hello-world</id><content type="html" xml:base="https://relistan.com/wdc-w65c265-mensch-hello-world"><![CDATA[<p><img src="./images/Mensch-sxb.jpg" alt="Mensch SXB" /></p>

<p>I have been lucky enough to be healthy so far and have been using lockdown time
to get back into electronics and microcontrollers, a hobby that is very
compatible with being stuck at home. I have a bunch of Arduino, STM32, ESP8266,
and ESP32 boards, but something was calling me more to the retro side of
things. My first computer was an Apple //e and I learned to do some 6502
assembly when I was a kid. I later upgraded to an Apple IIgs which was based on
the 65816 (which also powered the Super Nintendo). When I was about 13, I saved
up and bought myself Orca/M which was an assembler targeting the 65816 on the
IIgs. I learned a small amount then before moving on to Orca/C. But I always
regretted not learning more. So I wanted something along those lines to play
with now.</p>

<p>There are a few options here. There is always emulation. This is a reasonable
option, but then I don’t get to play with the electronics of the computer. I
could buy a vintage IIgs but they are expensive and take up a bunch of space.
Neither of these appealed heavily, though I’m not ruling out a IIgs in the
future.</p>

<p>After a little research, I ordered a Mensch single-board computer from Western
Design Center, the keepers of the 6502 flame. Bill Mensch was on the original
6502 design team and is the force behind the 65C02 and the 65816. His company
still supplies the 65C02, 65C816, and a few other chips. One of those is the
W65C265, which is a microcontroller based on the 65816. This is not like the
Arduino or STM32 blue pill or other microcontrollers in the sense that it has
no on board flash. What it <em>does</em> have is an on board machine code monitor and
a library of code in ROM that makes accessing the peripherals pretty easy. It
has an interesting mix of peripherals, too, including eight 16-bit timers, four
UARTs, and two tone generators that emit digital sine waves. Also, quite unlike
many microcontrollers, it is set up to easily add off-board RAM. You do kind of
need to if you’re doing much, because it has only 520 bytes of RAM, about a
quarter of which is occupied by the monitor. The board is pretty inexpensive
fun at $18 plus shipping and they happily shipped it to me in Ireland. You can
find one for yourself <a href="https://wdc65xx.com/shop/">on the WDC site</a>.</p>

<p>I started looking around and found that there is actually decent tool support
for this board even though not a lot of people seem to be playing with it or
documenting it recently. The chip has been around since the early 90’s, though,
and the programming manuals for the 65816 and 65C02 all apply so there is a lot
of documentation to start from. WDC supplies tools (compilers, assemblers) for
the chip but I was unable to get the download link to work from email on their
site. I have notified them. In the meantime I started looking for alternatives,
and it turns out there are a few good options. I’ll document the route I chose
here to get my first program working.</p>

<h2 id="getting-started">Getting Started</h2>

<p>I found a series of <a href="https://www.mikekohn.net/micro/modern_6502.php">blog
posts</a> by Mike Kohn talking
about getting started with the board. I ended up following along with what he
did to make things work. There were a few bumps along the way to get it running
on macOS.</p>

<p>First, to get code onto the board you need to be able to upload to the board
over serial. Some microcontroller prototyping boards have USB and some have
serial interfaces. The Mensch is in the second category. I have had an FTDI
cable for a long time so I just hooked that up and got going. If you don’t have
one, you’ll want to pick up one of the many options available. <a href="https://www.sparkfun.com/products/9716">Here’s a
Sparkfun example</a>, an <a href="https://www.adafruit.com/product/70">Adafruit
example</a>, or one from
<a href="https://www.banggood.com/FT232RL-FTDI-USB-To-TTL-Serial-Converter-Adapter-Module-p-917226.html">Banggood</a>.</p>

<p>Note that the ground pin is on the left when facing the board, so the FTDI
cable goes on upside down. See the photo at the top of this post.</p>

<p>Next, we need a way to get things to the board. A gentleman named Joe Davisson
wrote a tool called <a href="https://github.com/JoeDavisson/EasySXB">EasySXB</a> that
makes it easy to load code into the Mensch. I was not able to find available
binaries for macOS, however. So I ended up pulling the source and then having
to patch and upgrade dependencies to get it to compile under clang on macOS. I
have opened a PR to get the changes upstreamed, but in the meantime, macOS High
Sierra and newer binaries are <a href="https://github.com/relistan/EasySXB/releases/tag/v0.1.5">available on my
fork</a>.</p>

<p>A good <a href="https://wdc65xx.com/getting-started/mensch-getting-started/">“Getting Started”
page</a> on the WDC
site walks you through using this to upload code, so I won’t repeat that here.
It works the same on macOS as on Windows once you have the native binaries.
Note that when starting EasySXB on the command line, you can supply the port
with the <code class="language-plaintext highlighter-rouge">--port</code> option which makes it ever so much easier to use since the
tool doesn’t support point-and-click to find the serial port. In my case I ran
it with:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ./easysxb --port /dev/cu.usbserial-FTDOMLSO
</code></pre></div></div>

<p>Once the UI starts you can use the menus to connect.</p>

<h2 id="running-some-code">Running Some Code</h2>

<p>Now we need to write some code (or grab it from somewhere) and assemble it. I
started with the hello world blinking LED program from Mike’s post. He and Joe
Davisson maintain an assembler that can target the 65816 called <code class="language-plaintext highlighter-rouge">naken_asm</code>. I
initially tried to use the <code class="language-plaintext highlighter-rouge">acme</code> assembler but it uses different syntax than
Mike’s code, so I took the easy route and used
<a href="https://github.com/mikeakohn/naken_asm"><code class="language-plaintext highlighter-rouge">naken_asm</code></a>. It compiles cleanly on
macOS and was ready to use in a few seconds.</p>

<p>The <a href="https://github.com/JoeDavisson/EasySXB/blob/master/samples/led_blink/led_blink_65c816.asm">code from Mike Kohn’s
post</a>
doesn’t quite run out of the box on the Mensch. He has been using external
memory on his board, and therefore the base address in the code is above the
range available on a stock Mensch like mine. I read through <a href="http://www.westerndesigncenter.com/wdc/documentation/w65c265s.pdf">the datasheet for the
W65C265</a> and
found the cleanest contiguous block of memory available and modified the program
to load itself there rather than <code class="language-plaintext highlighter-rouge">0x1000</code>. Line 5 now reads:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>.org 0x00B6
</code></pre></div></div>

<p>This is easily assembled with a single command:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ naken_asm -h led_blink_65c816.asm

naken_asm

Authors: Michael Kohn
         Joe Davisson
    CPU: 1802, 4004, 6502, 65C816, 68HC08, 6809, 68000, 8048, 8051, 86000,
         ARM, AVR8, Cell BE, Copper, CP1610, dsPIC, Epiphany, Java, LC-3,
         MIPS, MSP430, PIC14, PIC24, PIC32, Playstation 2 EE, PowerPC,
         Propeller, PSoC M8C, RISC-V, SH-4, STM8, SuperFX, SWEET16,
         THUMB, TMS1000, TMS1100, TMS9900, WebAssembly, Xtensa, Z80
    Web: http://www.mikekohn.net/
  Email: mike@mikekohn.net
Version: April 25, 2020

 Input file: led_blink_65c816.asm
Output file: out.hex

Pass 1...
Pass 2...

Program Info:
Include Paths: .
               /usr/local/share/naken_asm/include
               include
 Instructions: 29
   Code Bytes: 64
   Data Bytes: 0
  Low Address: 00b6 (182)
 High Address: 00f5 (245)
</code></pre></div></div>

<p>This generates an Intel hex format output that is suitable for upload to the
Mensch board.</p>
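<p>If you’re curious what’s in that output: each Intel HEX record is a colon
followed by hex-encoded fields, ending with a checksum byte chosen so that all
the bytes sum to zero modulo 256. Here’s a small Go sketch that validates
records (the data record shown is a made-up example, not output from the
program above):</p>

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// checkRecord validates one Intel HEX record: every byte after the
// leading ':', including the trailing checksum, must sum to zero mod 256.
func checkRecord(rec string) error {
	if !strings.HasPrefix(rec, ":") {
		return fmt.Errorf("record must start with ':'")
	}
	data, err := hex.DecodeString(rec[1:])
	if err != nil {
		return err
	}
	var sum byte
	for _, b := range data {
		sum += b
	}
	if sum != 0 {
		return fmt.Errorf("bad checksum in %q", rec)
	}
	return nil
}

func main() {
	// A one-byte data record at 0x00B6 (hypothetical) and the standard
	// end-of-file record every Intel HEX file finishes with.
	for _, rec := range []string{":0100B600EA5F", ":00000001FF"} {
		fmt.Println(rec, "ok:", checkRecord(rec) == nil)
	}
}
```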

<p>We can now use the menu in EasySXB to upload the program to the board. If the
board is not responsive, you may need to hit the reset button before trying
again. You can validate that things are working by loading the registers from
the monitor with the <code class="language-plaintext highlighter-rouge">Get</code> button. Once you’ve successfully uploaded the code,
you can enter <code class="language-plaintext highlighter-rouge">0x00B6</code> into the <code class="language-plaintext highlighter-rouge">Address:</code> field and hit <code class="language-plaintext highlighter-rouge">JML</code> (jump long).
You should see the LED on P72 begin to blink!</p>

<p>This program really is the most basic thing you could do. I took a little time
and enhanced it to manage the PCS7 (chip select) register and the control
register to get all 8 of the LEDs flashing in groups of 4, and two external
LEDs flashing as well. I added some defines from the datasheet to make it a lot
clearer what we’re doing here and to prevent having to remember all the memory
addresses.</p>

<p>It took a little bit to figure out that I needed to set the PCS7 register up
properly to get all the LEDs working. The datasheet is very complete, however,
and with a little head scratching I got it working. This is the first board
I’ve ever had with a monitor built in and I found it incredibly helpful here
for debugging the program. I was able to manipulate registers by hand in the
monitor until I understood all the necessary settings to do what I needed. Very
handy!</p>

<p>You can find my improved Hello World <a href="https://github.com/relistan/w65c265/blob/master/led_blink_65c816.asm">in this
repo</a>.</p>

<p>Note that if you hook up external LEDs, they should be on Port 0, and for
best effect you should choose any two pins next to each other.</p>

<h2 id="wrap-up">Wrap Up</h2>

<p>Hello world is not the most exciting thing in the world, but to get there you
have to get the whole toolset working. Now that we have that, we can start to
explore more things with the board. It took the better part of a day for me
to make all this work on my MacBook Pro. Hopefully this post saves another retro
enthusiast some time.</p>

<p>I’ve got some Cypress SRAM chips now and I’m looking forward to hooking them up
to this board.</p>]]></content><author><name>Karl Matthias</name></author><category term="articles" /><category term="assembly" /><category term="microcontroller" /><category term="65c02" /><category term="65816" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">The Kernel Change That May Be Slowing Down Your App</title><link href="https://relistan.com/the-kernel-may-be-slowing-down-your-app" rel="alternate" type="text/html" title="The Kernel Change That May Be Slowing Down Your App" /><published>2019-12-22T00:00:00-05:00</published><updated>2019-12-22T00:00:00-05:00</updated><id>https://relistan.com/the-kernel-may-be-slowing-down-your-app</id><content type="html" xml:base="https://relistan.com/the-kernel-may-be-slowing-down-your-app"><![CDATA[<p>A kernel “bug fix” that happened at the end of last year may be killing the
performance of your Kubernetes- or Mesos-hosted applications. Here’s how we
discovered that it was affecting us and what we did about it at
<a href="https://community.com">Community</a>.</p>

<p><img style="float: left; display: block; margin-bottom: 1em;" src="/images/NewRelic-PythonImprovement.png" alt="Huge Improvement" /></p>

<h2 id="bad-90th-percentile">Bad 90th Percentile</h2>

<p>For most of 2019 we were seeing some outlying bad performance across our apps.
We run our stack on top of Mesos and Docker. Performance in production seemed
worse than outside it, but we are a busy startup and only devoted a bit of time
here and there to trying to understand the problem.</p>

<p>In the late summer of 2019 we started to notice that a few of our simplest
apps were performing in a noticeably strange way. These applications should
have had highly predictable performance, but were seeing 90th-percentile
response times hugely out of line with our expectations. We looked into it,
but while busy ramping up our platform we didn’t take the time to fix it.</p>

<p>The application we first noticed this with accepts an HTTP request, looks in a
Redis cache, and publishes on RabbitMQ. It is written in Elixir, our main app
stack. This application’s median response times were somewhere around 7-8ms.
Its 90th percentile was 45-50ms. That is a huge slowdown! Appsignal, which we
use for Elixir application monitoring, was showing a strange issue where for
slow requests, either Redis <em>or</em> RabbitMQ would be slow, and that external call
would take almost all of the 45-50ms.  Redis and RabbitMQ both deliver very
reliable performance until they are under huge, swamping load, so this is a
really strange pattern.</p>

<p>We started to look at other applications across our stack in closer detail and
found that many of them were seeing the same pattern. It was often just harder
to pick out the issue because they were doing more complex operations and had
less predictable behavior.</p>

<p>One of those applications accepts a request, talks to Redis, and returns a
response. Its median response times were sub-millisecond. We were seeing 90th
percentile response times around… 45-50ms! Do those numbers seem familiar? That
application was written in Go.</p>

<p>For those not familiar with how response time percentiles work: a bad 90th
percentile means 1 in 10 requests is getting terrible performance.  Even having
5% of our requests getting bad performance is completely unacceptable. For
complex applications it seemed to affect us less, but we rely on low latency
from a couple of apps and it was hurting us.</p>
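<p>As a refresher, here is a minimal sketch of the percentile calculation using
the simple nearest-rank method (monitoring tools may interpolate differently,
so treat this as an illustration rather than what Appsignal or New Relic
actually compute):</p>

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the pth percentile of samples using the
// nearest-rank method: the smallest sample such that at least p% of
// all samples are less than or equal to it.
func percentile(samples []float64, p float64) float64 {
	s := append([]float64(nil), samples...)
	sort.Float64s(s)
	rank := int(math.Ceil(p / 100 * float64(len(s))))
	if rank < 1 {
		rank = 1
	}
	return s[rank-1]
}

func main() {
	// 17 quick requests around 7ms plus 3 slow outliers: the median
	// looks healthy while the 90th percentile is terrible.
	var latencies []float64
	for i := 0; i < 17; i++ {
		latencies = append(latencies, 7)
	}
	latencies = append(latencies, 48, 49, 50)

	fmt.Println("p50:", percentile(latencies, 50)) // 7
	fmt.Println("p90:", percentile(latencies, 90)) // 48
}
```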

<h2 id="digging-in">Digging In</h2>

<p>We use Appsignal for monitoring our Elixir applications and New Relic for
monitoring Go and Python apps. Our main stack is Elixir, and the first app we
saw the issue with was in Elixir, so Appsignal is where most of the
troubleshooting happened. We started by looking at our own settings that might
be affecting applications. We weren’t doing anything crazy. Co-troubleshooting
with me at this point was <a href="https://github.com/whatyouhide/">Andrea
Leopardi</a>, a member of our team and an Elixir
core team member. He and I evaluated the application side and couldn’t find
anything wrong. We then started to look at all the underlying components that
could be affecting it (network, individual hosts, load balancers, etc.),
eliminating them one by one.</p>

<p>We isolated the application in our dev environment (just like prod, but no
load) and gave it an extremely low, but predictable throughput of 1 request per
second. Amazingly, we could still see the 90th percentile as a huge outlier.
To eliminate any network considerations, we generated the load from the same
host. Same behavior! At this point I started to think it was probably a kernel
issue but kernel issues are pretty rare in situations like this with basically
no load.</p>

<p><img style="float: right" src="/images/Appsignal-kernel-bug-2.png" alt="Kernel issue" /></p>

<p>Andrea and I talked through all the issues and decided we had better look at
Docker settings as a culprit. We decided to turn off all CPU and memory limits
and deploy the application to our dev environment again. Keep in mind there is
<em>no load</em> on this application. It is seeing 1 request per second in this
synthetic environment. Deployed without limits, it suddenly behaved as
expected! To prove that it was CPU limits, my hunch, we re-enabled memory
limits, re-deployed, ran the same scenario and we were back to bad performance.
The chart at right shows Appsignal sampling the slowest transactions each
minute. This is really the max outlier rather than the 90th percentile, but we ought to be
able to improve that more easily. You can see in the output that without limits
(green) it performed fine and with limits (red) it was a mess. Remember, there
is basically no load here so the CPU limits shouldn’t much change app
performance if they are in place.</p>
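<p>For context, CFS-based CPU limits work by converting a fractional CPU count
into a time quota per scheduling period; the kernel throttles the cgroup once
the quota for a period is used up. A sketch of the arithmetic (100ms is
Docker’s default <code class="language-plaintext highlighter-rouge">cfs_period_us</code>):</p>

```go
package main

import "fmt"

// cfsQuota converts a fractional CPU limit (e.g. docker run --cpus=0.5)
// into the CFS quota in microseconds for a given scheduling period. A
// cgroup that burns through its quota is throttled until the next period.
func cfsQuota(cpus float64, periodUS int64) int64 {
	return int64(cpus * float64(periodUS))
}

func main() {
	const period = 100000 // 100ms in microseconds
	fmt.Println(cfsQuota(0.5, period)) // 50000µs of CPU time per period
	fmt.Println(cfsQuota(2.0, period)) // 200000µs: two full cores' worth
}
```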

<p>We then thought there might be an interaction between the BEAM scheduler
(Erlang VM that Elixir runs on) and CPU limits. We tried various BEAM settings
with the CPU limits enabled and got little reward for our efforts. We run all
of our apps under a <a href="https://github.com/Nitro/sidecar-executor">custom Mesos
executor</a> that I co-wrote with
co-workers at <a href="https://gonitro.com">Nitro</a> over the last few years before
joining <a href="https://community.com">Community</a>. With our executor it was easy to
switch our enforcement from the Linux Completely Fair Scheduler style, which
works best for most service workloads, to the older CPU shares style. This
would be a compromise position to let us run some applications in production
with better performance without turning off limits entirely. Not great, but a
start, if it worked. I did that and we measured performance again.  Even
ramping the throughput up slightly to 5 requests per second, the application
performed as expected. The following chart shows all this experimentation:</p>

<p><img src="/images/Appsignal-kernel-bug.png" alt="All of our experiments" /></p>

<h2 id="jackpot">Jackpot</h2>

<p>Moving beyond our Elixir core stack, we discovered that we were seeing the same
behavior in a Go application and a Python app as well. The Python app’s New
Relic chart is the one at the start of this article. Once Andrea and I realized
it was affecting non-Elixir apps and was in no way an interaction specific to
the BEAM, I started looking for other reports on the Internet of issues with
Completely Fair Scheduler CPU limits.</p>

<p>Jackpot! After a bit of work, I came across <a href="https://lkml.org/lkml/2019/5/17/581">this
issue</a> on the kernel mailing list. Take a
quick second to read the opening two paragraphs of that issue. What is
happening is that when you have apps that aren’t completely CPU-bound, when you
have CPU limits applied, the kernel is effectively penalizing you for CPU time
you did not use, as if you were a busy application. In highly threaded
environments, this leads to thread starvation for certain workloads. The key
here is that the issue says this only affects non-CPU-bound workloads. We found
that in practice, it’s really affecting all of our workloads. Contrary to my
expectations that it would affect Go and BEAM schedulers the worst, in fact,
one of the most affected workloads was a Python webapp.</p>

<p>The other detail that might not be clear from a first read of that kernel issue
is that the behavior that is killing our performance is how the kernel was
<em>supposed</em> to work originally. But in 2014, a patch was applied that
effectively disabled the slice expiration. It has been broken (according to the
design) for 4.5 years. Everything was good until that “bug” was fixed and the
original, intended behavior was unblocked. The recent work to fix this issue
involves actually returning to the previous status quo by removing the
expiration “feature” entirely.</p>

<p>If you run on K8s or Mesos, or another platform that uses the Linux CFS CPU
limits, you are almost certainly affected by this issue as well, at least on
all the blends of Linux I looked at, unless your kernel is more than about a
year old.</p>
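<p>One quick way to check whether CFS throttling is hitting you is the
<code class="language-plaintext highlighter-rouge">cpu.stat</code> file in your container’s CPU cgroup. Here’s a small Go sketch that
parses its counters (cgroup v1 layout assumed; the sample numbers are made
up):</p>

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCPUStat extracts the throttling counters from the contents of a
// cgroup v1 cpu.stat file (e.g. /sys/fs/cgroup/cpu/cpu.stat inside a
// container). nr_throttled counts periods in which the cgroup ran out
// of quota; throttled_time is total nanoseconds spent throttled.
func parseCPUStat(content string) map[string]int64 {
	stats := map[string]int64{}
	for _, line := range strings.Split(strings.TrimSpace(content), "\n") {
		fields := strings.Fields(line)
		if len(fields) != 2 {
			continue
		}
		if v, err := strconv.ParseInt(fields[1], 10, 64); err == nil {
			stats[fields[0]] = v
		}
	}
	return stats
}

func main() {
	// Example contents; in practice, read the real file from the cgroup
	// filesystem. A large nr_throttled relative to nr_periods is the red flag.
	sample := "nr_periods 43230\nnr_throttled 18201\nthrottled_time 402000000000\n"
	stats := parseCPUStat(sample)
	fmt.Printf("throttled in %d of %d periods\n",
		stats["nr_throttled"], stats["nr_periods"])
}
```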

<h2 id="fixing-it">Fixing It</h2>

<p>So, what to do about it? We run Ubuntu 16.04 LTS and looking at the available
kernels on the LTS stream, there was nothing that included the fix. This was a
month ago. To get a kernel that had the right patches, <a href="https://github.com/dselans/">Dan
Selans</a> and I had to enable the <code class="language-plaintext highlighter-rouge">proposed</code> package
stream for Ubuntu and upgrade to the latest proposed kernel.  This is not a
small jump and you should consider carefully if this will be the right thing to
do on your platform. In our case we found over the last month that this has
been a very reliable update. And most importantly, <strong>it fixed the performance
issue!</strong> Things may have changed in the intervening period and this patch may
be back-ported to the LTS stream. I did not look into patching for other
distributions.</p>

<p>What we’ve seen is a pretty dramatic improvement in a number of applications.
I think the description in the kernel issue undersells the effect, at least
on the various workloads in our system. Here are some more charts, this time
from New Relic:</p>

<p><strong>Patching the system, one application’s view</strong>:
<img src="/images/NewRelic-kernel-patching.png" alt="Kernel Patching" /></p>

<p><strong>Same application, before and after:</strong>
<img src="/images/NewRelic-post-patching.png" alt="Before and After" /></p>

<p><strong>The most dramatically affected:</strong>
<img src="/images/NewRelic-PythonImprovement.png" alt="Dramatic improvement" /></p>

<h2 id="wrap-up">Wrap Up</h2>

<p>Is this affecting your apps? Maybe. Is it worth fixing? Maybe. For us both of
those are definitely true. You should at least look into it. We’ve seen very
noticeable improvements in latency, but also in throughput for the same amount
of CPU for certain applications. It has stabilized throughput and latency in a
few parts of the system that were a lot less predictable before. Applying the
<code class="language-plaintext highlighter-rouge">proposed</code> stream to our production boxes and getting all of our environments
patched was a big step, but it has paid off. We at
<a href="https://community.com">Community</a> all hope this article helps someone else!</p>]]></content><author><name>Karl Matthias</name></author><category term="articles" /><category term="linux" /><category term="kernel" /><category term="ops" /><category term="go" /><category term="elixir" /><category term="community" /><summary type="html"><![CDATA[A kernel “bug fix” that happened at the end of last year may be killing the performance of your Kubernetes- or Mesos-hosted applications. Here’s how we discovered that it was affecting us and what we did about it at Community.]]></summary></entry><entry><title type="html">Dynamic Nginx Router… in Go!</title><link href="https://relistan.com/dynamic-nginx-router-in-go" rel="alternate" type="text/html" title="Dynamic Nginx Router… in Go!" /><published>2018-05-07T00:00:00-04:00</published><updated>2018-05-07T00:00:00-04:00</updated><id>https://relistan.com/dynamic-nginx-router-in-go</id><content type="html" xml:base="https://relistan.com/dynamic-nginx-router-in-go"><![CDATA[<p><span style="font-size: x-small; float: left; text-align: left; width: 30%"><img src="/images/nginxlogo.png" alt="Nginx logo" /></span></p>

<p>We needed a specialized load balancer at <a href="https://gonitro.com">Nitro</a>. After
some study, <a href="https://twitter.com/MihaiTodor">Mihai Todor</a> and I built a
solution that leverages Nginx, the Redis protocol, and a Go-based request
router where Nginx does all the heavy lifting and the router carries no traffic
itself. This solution has worked great in production for the last year. Here’s
what we did and why we did it.</p>

<h2 id="why">Why?</h2>

<p>The new service we were building would be behind a pool of load balancers and
was going to do some expensive calculations—and therefore do some local
caching.  To optimize for the cache, we wanted to try to send requests for the
same resources to the same host if it were available.</p>

<p>There are a number of off-the-shelf ways to solve this problem. A
non-exhaustive list of possibilities includes:</p>

<p><span style="font-size: x-small; float: right; text-align: right; width: 25ex"><img src="/images/go-gopher2.jpg" alt="Go Gopher" /><br />Go Gopher by Renee French.</span></p>

<ul>
  <li>Using cookies to maintain session stickiness</li>
  <li>Using a header to do the same</li>
  <li>Stickiness based on source IP</li>
  <li>HTTP redirects to the correct instance</li>
</ul>

<p>This service will be hit several times per page load and so HTTP redirects are
not viable for performance reasons. The rest of those solutions all work well
if all the inbound requests are passing through the same load balancer. If, on
the other hand, your frontend is a pool of load balancers, you need to be able
to either share state between them or implement more sophisticated routing
logic. We weren’t interested in the design changes needed to share state
between load balancers at the moment and so opted for more sophisticated
routing logic for this service.</p>

<h2 id="our-architecture">Our Architecture</h2>

<p>To understand our motivation a little better, it probably helps to know a
bit about our architecture.</p>

<p>We have a pool of frontend load balancers and instances of the service are
deployed on Mesos so they may come and go depending on scale and resource
availability. Getting a list of hosts and ports into the load balancer is not
an issue; that’s already core to our platform.</p>

<p>Because everything is running on Mesos, and we have a <a href="https://github.com/Nitro/nmesos">simple way to define and
deploy services</a>, adding any new service is a
trivial task.</p>

<p>On top of Mesos, we run gossip-based
<a href="https://github.com/Nitro/sidecar">Sidecar</a> everywhere to manage service
discovery. Our frontend load balancers are Lyft’s
<a href="https://envoyproxy.io">Envoy</a> backed by Sidecar’s Envoy integration. For most
services that is enough. The Envoy hosts run on dedicated instances but the
services all move between hosts as needed, directed by Mesos and the
<a href="https://github.com/HubSpot/Singularity">Singularity</a> scheduler.</p>

<p>The Mesos nodes for the service under consideration here would have disks for local
caching.</p>

<h2 id="design">Design</h2>

<p>Looking at the problem we decided we really wanted a consistent hash ring. We
could have nodes come and go as needed and only the requests being served by
those nodes would be re-routed. All the remaining nodes would continue to serve
any open sessions. We could easily back the consistent hash ring with data from
Sidecar (you could substitute Mesos or K8s here). Sidecar health checks nodes
and so we could rely on nodes being available if they are alive in Sidecar.</p>
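<p>The idea, roughly sketched in Go (this is an illustration of the technique,
not Ringman’s actual API), is that nodes and keys hash onto the same circle of
values, and each key belongs to the first node clockwise from its hash, so
when a node leaves only its keys move:</p>

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
)

const vnodes = 100 // virtual nodes per host smooth out the key distribution

// Ring is a minimal consistent hash ring: each node is hashed onto a
// circle of 32-bit values at several points, and a key is routed to the
// first node at or after its own hash, wrapping around at the top.
type Ring struct {
	points map[uint32]string
	sorted []uint32
}

func NewRing(nodes []string) *Ring {
	r := &Ring{points: map[uint32]string{}}
	for _, n := range nodes {
		for i := 0; i < vnodes; i++ {
			h := crc32.ChecksumIEEE([]byte(fmt.Sprintf("%s-%d", n, i)))
			r.points[h] = n
			r.sorted = append(r.sorted, h)
		}
	}
	sort.Slice(r.sorted, func(i, j int) bool { return r.sorted[i] < r.sorted[j] })
	return r
}

// Get returns the node responsible for key.
func (r *Ring) Get(key string) string {
	h := crc32.ChecksumIEEE([]byte(key))
	i := sort.Search(len(r.sorted), func(i int) bool { return r.sorted[i] >= h })
	if i == len(r.sorted) {
		i = 0 // wrap around the circle
	}
	return r.points[r.sorted[i]]
}

func main() {
	// Hypothetical endpoints; in our system these come from Sidecar.
	ring := NewRing([]string{"10.10.10.5:23453", "10.10.10.6:31002"})
	fmt.Println(ring.Get("/documents/12345"))
}
```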

<p>We needed to then somehow bolt the consistent hash to something that could
direct traffic to the right node. It would need to receive each request,
identify the resource in question, and then pass the request to the exact
instance of the service that was prepped to handle that resource.</p>

<p>Of course, the resource identification is easily handled by a URL and any load
balancer can take those apart to handle simple routing. So we just needed to tie
that to the consistent hash and we’d have a solution.</p>

<p>You could do this in Lua in Nginx, possibly in HAproxy with Lua as well. No one
at Nitro is a Lua expert and libraries to implement the pieces we needed were
not obviously available. Ideally the routing logic would be in Go, which is
already a critical language in our stack and well supported.</p>

<p>Nginx has a rich ecosystem, though, and a little thinking outside the box
turned up a couple of interesting Nginx plugins. The first of these is
the <a href="https://github.com/vkholodkov/nginx-eval-module"><code class="language-plaintext highlighter-rouge">nginx-eval-module</code></a> by
Valery Kholodkov. This allows you to make a call from Nginx to an endpoint and
then evaluate the result into an Nginx variable. Among other possible uses, the
significance of that for us is that it allows you to dynamically decide which
endpoint should receive a proxy-pass. That’s what we wanted to do. You make a
call from Nginx to somewhere, you get a result, and then your make a routing
decision based on that value.</p>

<p>You <em>could</em> implement the recipient of that request with an HTTP service that
returns only a string with the hostname and port of the destination service
endpoint. That service would maintain the consistent hash and then tell Nginx
where to route the traffic for each request. But making a separate HTTP
request, even if it were always contained on the same node, is a bit heavy. The
whole expected body of the reply would be something like the string
<code class="language-plaintext highlighter-rouge">10.10.10.5:23453</code>. With HTTP, we’d be passing headers in both directions that
would vastly exceed the size of the response.</p>

<p>So I started to look at other protocols supported by Nginx. Memcache protocol
and Redis protocol are both supported. Of those, the best supported from a Go
service is Redis. So that was where we turned.</p>

<p>There are two Redis modules for Nginx. One of them is suitable for use with the
<code class="language-plaintext highlighter-rouge">nginx-eval-module</code>. The best Go library for Redis is
<a href="https://github.com/bsm/redeo">Redeo</a>. It implements a really simple handler
mechanism much like the stdlib <code class="language-plaintext highlighter-rouge">http</code> package. Any Redis protocol command will
invoke a handler function, and they are really simple to write. Alas, it only
supports a newer Redis protocol than the Nginx plugin can handle. So, I dusted
off my C skills and <a href="https://github.com/relistan/ngx_http_redis">patched the Nginx
plugin</a> to use the newest Redis
protocol encoding.</p>
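<p>For reference, the unified Redis protocol encodes a <code class="language-plaintext highlighter-rouge">GET</code> reply as a bulk string: a <code class="language-plaintext highlighter-rouge">$</code>, the byte length, CRLF, the payload, CRLF. A minimal sketch of that encoding, using the example endpoint from above:</p>

```go
package main

import "fmt"

// respBulkString encodes a value as a RESP bulk string, the unified
// Redis protocol framing for a GET reply.
func respBulkString(s string) string {
	return fmt.Sprintf("$%d\r\n%s\r\n", len(s), s)
}

func main() {
	// The entire wire reply for a routing lookup is about 22 bytes.
	fmt.Printf("%q\n", respBulkString("10.10.10.5:23453"))
}
```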

<p>So the solution we ended up with is:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text"> [Internet] -&gt; [Envoy] -&gt; [Nginx] -(2)--&gt; [Service endpoint]
                             \
                          (1) \ (redis proto)
                               \
                                -&gt; [Go router]</code></pre></figure>

<p>The call comes in from the Internet, hits an Envoy node, then an Nginx node.
The Nginx node (1) asks the router where to send it, and then (2) Nginx passes
the request to the endpoint.</p>

<h2 id="implementation">Implementation</h2>

<p>We built a library in Go to manage our consistent hash backed by Sidecar or by Hashicorp’s Memberlist library. We called that library <a href="https://github.com/Nitro/ringman">Ringman</a>. We then bolted that library into a service which serves Redis protocol requests via <a href="https://github.com/bsm/redeo">Redeo</a>.</p>
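<p>If consistent hashing is unfamiliar, here is a minimal ring with virtual nodes in plain Go. It illustrates the idea behind Ringman; the names and structure are mine, not Ringman’s actual API:</p>

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
)

// Ring is a minimal consistent hash ring with virtual nodes.
type Ring struct {
	keys  []uint32          // sorted hash points on the ring
	nodes map[uint32]string // hash point -> node address
}

// NewRing places vnodes hash points on the ring for each node.
func NewRing(vnodes int, nodes ...string) *Ring {
	r := &Ring{nodes: make(map[uint32]string)}
	for _, n := range nodes {
		for i := 0; i < vnodes; i++ {
			h := crc32.ChecksumIEEE([]byte(fmt.Sprintf("%s-%d", n, i)))
			r.keys = append(r.keys, h)
			r.nodes[h] = n
		}
	}
	sort.Slice(r.keys, func(i, j int) bool { return r.keys[i] < r.keys[j] })
	return r
}

// GetNode returns the node owning key: the first hash point at or
// after the key's hash, wrapping around the ring.
func (r *Ring) GetNode(key string) string {
	h := crc32.ChecksumIEEE([]byte(key))
	i := sort.Search(len(r.keys), func(i int) bool { return r.keys[i] >= h })
	if i == len(r.keys) {
		i = 0
	}
	return r.nodes[r.keys[i]]
}

func main() {
	ring := NewRing(50, "10.10.10.5:23453", "10.10.10.6:23453")
	// The same document URL always maps to the same endpoint.
	fmt.Println(ring.GetNode("/documents/abc123.pdf"))
}
```

<p>The virtual nodes spread each real node across many points on the ring, so when a node joins or leaves, only the keys adjacent to its points move.</p>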

<p>Only two Redis commands are required: <code class="language-plaintext highlighter-rouge">GET</code> and <code class="language-plaintext highlighter-rouge">SELECT</code>. We chose to implement a few more commands for debugging purposes, including <code class="language-plaintext highlighter-rouge">INFO</code>, which can reply with any server state you’d like. Of the two required commands, we can safely ignore <code class="language-plaintext highlighter-rouge">SELECT</code>, which selects the Redis DB to use for subsequent calls; we just accept it and do nothing. <code class="language-plaintext highlighter-rouge">GET</code>, which does all the work, was easy to implement. Here’s the entire function to serve the Ringman endpoint over Redis with Redeo. Nginx passes the URL it received, and we return the endpoint from the hash ring.</p>

<figure class="highlight"><pre><code class="language-go" data-lang="go"><span class="n">srv</span><span class="o">.</span><span class="n">HandleFunc</span><span class="p">(</span><span class="s">"get"</span><span class="p">,</span> <span class="k">func</span><span class="p">(</span><span class="n">out</span> <span class="o">*</span><span class="n">redeo</span><span class="o">.</span><span class="n">Responder</span><span class="p">,</span> <span class="n">req</span> <span class="o">*</span><span class="n">redeo</span><span class="o">.</span><span class="n">Request</span><span class="p">)</span> <span class="kt">error</span> <span class="p">{</span>
	<span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">req</span><span class="o">.</span><span class="n">Args</span><span class="p">)</span> <span class="o">!=</span> <span class="m">1</span> <span class="p">{</span>
		<span class="k">return</span> <span class="n">req</span><span class="o">.</span><span class="n">WrongNumberOfArgs</span><span class="p">()</span>
	<span class="p">}</span>
	<span class="n">node</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">ringman</span><span class="o">.</span><span class="n">GetNode</span><span class="p">(</span><span class="n">req</span><span class="o">.</span><span class="n">Args</span><span class="p">[</span><span class="m">0</span><span class="p">])</span>
	<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
		<span class="n">log</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"Error fetching key '%s': %s"</span><span class="p">,</span> <span class="n">req</span><span class="o">.</span><span class="n">Args</span><span class="p">[</span><span class="m">0</span><span class="p">],</span> <span class="n">err</span><span class="p">)</span>
		<span class="k">return</span> <span class="n">err</span>
	<span class="p">}</span>

	<span class="n">out</span><span class="o">.</span><span class="n">WriteString</span><span class="p">(</span><span class="n">node</span><span class="p">)</span>
	<span class="k">return</span> <span class="no">nil</span>
<span class="p">})</span></code></pre></figure>

<p>That is called by Nginx using the following config:</p>

<figure class="highlight"><pre><code class="language-nginx" data-lang="nginx"><span class="c1"># NGiNX configuration for Go router proxy.</span>
<span class="c1"># Relies on the ngx_http_redis, nginx-eval modules,</span>
<span class="c1"># and http_stub_status modules.</span>

<span class="k">error_log</span> <span class="n">/dev/stderr</span><span class="p">;</span>
<span class="k">pid</span>       <span class="n">/tmp/nginx.pid</span><span class="p">;</span>
<span class="k">daemon</span>    <span class="no">off</span><span class="p">;</span>

<span class="k">worker_processes</span> <span class="mi">1</span><span class="p">;</span>

<span class="k">events</span> <span class="p">{</span>
  <span class="kn">worker_connections</span>  <span class="mi">1024</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">http</span> <span class="p">{</span>
  <span class="kn">access_log</span>   <span class="n">/dev/stdout</span><span class="p">;</span>

  <span class="kn">include</span>     <span class="s">mime.types</span><span class="p">;</span>
  <span class="kn">default_type</span>  <span class="nc">application/octet-stream</span><span class="p">;</span>

  <span class="kn">sendfile</span>       <span class="no">off</span><span class="p">;</span>
  <span class="kn">keepalive_timeout</span>  <span class="mi">65</span><span class="p">;</span>

  <span class="kn">upstream</span> <span class="s">redis_servers</span> <span class="p">{</span>
    <span class="kn">keepalive</span> <span class="mi">10</span><span class="p">;</span>

    <span class="c1"># Local (on-box) instance of our Go router</span>
    <span class="kn">server</span> <span class="nf">services.nitro.us</span><span class="p">:</span><span class="mi">10109</span><span class="p">;</span>
  <span class="p">}</span>

  <span class="kn">server</span> <span class="p">{</span>
    <span class="kn">listen</span>      <span class="mi">8010</span><span class="p">;</span>
    <span class="kn">server_name</span> <span class="s">localhost</span><span class="p">;</span>

    <span class="kn">resolver</span> <span class="mf">127.0</span><span class="s">.0.1</span><span class="p">;</span>

    <span class="c1"># Grab the filename/path and then rewrite to /proxy. Can't do the</span>
    <span class="c1"># eval in this block because it can't handle a regex path.</span>
    <span class="kn">location</span> <span class="p">~</span><span class="sr">*</span> <span class="n">/documents/</span><span class="s">(.*)</span> <span class="p">{</span>
      <span class="kn">set</span> <span class="nv">$key</span> <span class="nv">$1</span><span class="p">;</span>

      <span class="kn">rewrite</span> <span class="s">^</span> <span class="n">/proxy</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="c1"># Take the $key we set, do the Redis lookup and then set</span>
    <span class="c1"># $target_host as the return value. Finally, proxy_pass</span>
    <span class="c1"># to the URL formed from the pieces.</span>
    <span class="kn">location</span> <span class="n">/proxy</span> <span class="p">{</span>
      <span class="kn">eval</span> <span class="nv">$target_host</span> <span class="p">{</span>
        <span class="kn">set</span> <span class="nv">$redis_key</span> <span class="nv">$key</span><span class="p">;</span>
        <span class="kn">redis_pass</span> <span class="s">redis_servers</span><span class="p">;</span>
      <span class="p">}</span>

      <span class="c1">#add_header "X-Debug-Proxy" "$uri -- $key -- $target_host";</span>

      <span class="kn">proxy_pass</span> <span class="s">"http://</span><span class="nv">$target_host</span><span class="n">/documents/</span><span class="nv">$key</span><span class="s">?</span><span class="nv">$args</span><span class="s">"</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="c1"># Used to health check the service and to report basic statistics</span>
    <span class="c1"># on the current load of the proxy service.</span>
    <span class="kn">location</span> <span class="p">~</span> <span class="sr">^/(status|health)$</span> <span class="p">{</span>
      <span class="kn">stub_status</span> <span class="no">on</span><span class="p">;</span>
      <span class="kn">access_log</span>  <span class="no">off</span><span class="p">;</span>
      <span class="kn">allow</span> <span class="mf">10.0</span><span class="s">.0.0/8</span><span class="p">;</span>    <span class="c1"># Allow anyone on private network</span>
      <span class="kn">allow</span> <span class="mf">172.16</span><span class="s">.0.0/12</span><span class="p">;</span> <span class="c1"># Allow anyone on Docker bridge network</span>
      <span class="kn">allow</span> <span class="mf">127.0</span><span class="s">.0.0/8</span><span class="p">;</span>   <span class="c1"># Allow localhost</span>
      <span class="kn">deny</span> <span class="s">all</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="kn">error_page</span>   <span class="mi">500</span> <span class="mi">502</span> <span class="mi">503</span> <span class="mi">504</span>  <span class="n">/50x.html</span><span class="p">;</span>
    <span class="kn">location</span> <span class="p">=</span> <span class="n">/50x.html</span> <span class="p">{</span>
      <span class="kn">root</span>   <span class="s">html</span><span class="p">;</span>
    <span class="p">}</span>
  <span class="p">}</span>
<span class="p">}</span></code></pre></figure>

<p>We deploy Nginx and the router in containers on the same hosts, so the call
overhead between them is very low.</p>

<p>We build Nginx like this:</p>

<figure class="highlight"><pre><code class="language-bash" data-lang="bash">./configure <span class="nt">--add-module</span><span class="o">=</span>plugins/nginx-eval-module <span class="se">\</span>
      <span class="nt">--add-module</span><span class="o">=</span>plugins/ngx_http_redis <span class="se">\</span>
      <span class="nt">--with-cpu-opt</span><span class="o">=</span>generic <span class="se">\</span>
      <span class="nt">--with-http_stub_status_module</span> <span class="se">\</span>
      <span class="nt">--with-cc-opt</span><span class="o">=</span><span class="s2">"-static -static-libgcc"</span> <span class="se">\</span>
      <span class="nt">--with-ld-opt</span><span class="o">=</span><span class="s2">"-static"</span>

make <span class="nt">-j8</span></code></pre></figure>

<h2 id="performance">Performance</h2>

<p>We’ve tested the performance of this extensively and in our environment we see
about 0.2-0.3ms response times on average for a round trip from Nginx to the Go
router over Redis protocol. Since the median response time from the upstream
service is about 70ms, this is a negligible delay.</p>

<p>A more complex Nginx config might be able to do more sophisticated error handling.
Reliability after a year in service is extremely good and performance has been
constant.</p>

<h2 id="wrap-up">Wrap-Up</h2>

<p>If you have a similar need, you can re-use most of the components. Just follow
the links above to the actual source code. If you are interested in adding support
for K8s or Mesos directly to Ringman, that would be welcome.</p>

<p>This solution started out sounding a bit like a hack and in the end has been a
great addition to our infrastructure. Hopefully it helps someone else solve a
similar problem.</p>]]></content><author><name>Karl Matthias</name></author><category term="articles" /><category term="go" /><category term="programming" /><category term="nginx" /><category term="http" /><summary type="html"><![CDATA[]]></summary></entry></feed>