<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://code.deepinspace.net/feed.xml" rel="self" type="application/atom+xml" /><link href="https://code.deepinspace.net/" rel="alternate" type="text/html" /><updated>2025-04-27T07:33:28+00:00</updated><id>https://code.deepinspace.net/feed.xml</id><title type="html">On Code, and Other Things</title><subtitle></subtitle><author><name>Hrishikesh Barua</name></author><entry><title type="html">Skipping Optional Fields in Prisma 5.x</title><link href="https://code.deepinspace.net/posts/2025/04/26/skipping-optional-fields-in-prisma-5.x/" rel="alternate" type="text/html" title="Skipping Optional Fields in Prisma 5.x" /><published>2025-04-26T00:00:00+00:00</published><updated>2025-04-26T00:00:00+00:00</updated><id>https://code.deepinspace.net/posts/2025/04/26/skipping-optional-fields-in-prisma-5.x</id><content type="html" xml:base="https://code.deepinspace.net/posts/2025/04/26/skipping-optional-fields-in-prisma-5.x/"><![CDATA[<h2 id="introduction">Introduction</h2>

<p>I use Prisma ORM for database access in my TypeScript projects. I recently ran into an issue where I needed to skip optional fields in an upsert query.
Setting a field to undefined is <a href="https://www.prisma.io/docs/orm/prisma-client/special-fields-and-types/null-and-undefined">not allowed</a> in Prisma, to avoid
 unexpected results. The prescribed solution - to set Prisma.skip - does not work in the version I’m using, and upgrading Prisma was not an option as I was in the middle of a feature.</p>

<p>So here’s what worked for me.</p>

<p><strong>TL;DR:</strong> I wrote a function that creates args objects which have only the fields that are present in the incoming data object.</p>

<h3 id="the-schema">The schema</h3>
<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">model</span> <span class="nx">PSP</span> <span class="p">{</span>
  <span class="nx">id</span>           <span class="nx">Int</span>      <span class="p">@</span><span class="nd">id</span> <span class="p">@</span><span class="nd">default</span><span class="p">(</span><span class="nx">autoincrement</span><span class="p">())</span>
  <span class="nx">userId</span>       <span class="nb">String</span>
  <span class="nx">pageId</span>       <span class="nb">String</span>   <span class="p">@</span><span class="nd">unique</span>
  <span class="nx">pageTitle</span>    <span class="nb">String</span>
  <span class="nx">companyURL</span>   <span class="nb">String</span><span class="p">?</span>
  <span class="nx">supportEmail</span> <span class="nb">String</span><span class="p">?</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The last 2 fields (companyURL and supportEmail) are optional.</p>

<h3 id="the-query">The query</h3>
<p>A Prisma query for upserting in the case where none of the fields are optional would look like:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
    <span class="kd">const</span> <span class="nx">pspData</span> <span class="o">=</span> <span class="p">....</span><span class="c1">//Data object with values from the user</span>
    <span class="p">....</span>

    <span class="kd">const</span> <span class="nx">psp</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">db</span><span class="p">.</span><span class="nx">pSP</span><span class="p">.</span><span class="nx">upsert</span><span class="p">({</span>
        <span class="na">where</span><span class="p">:</span> <span class="p">{</span>
            <span class="na">pageId</span><span class="p">:</span> <span class="nx">pspData</span><span class="p">.</span><span class="nx">pageId</span><span class="p">,</span>
            <span class="na">userId</span><span class="p">:</span> <span class="nx">pspData</span><span class="p">.</span><span class="nx">userId</span>
        <span class="p">},</span>
        <span class="na">create</span><span class="p">:</span> <span class="p">{</span>
            <span class="na">pageTitle</span><span class="p">:</span> <span class="nx">pspData</span><span class="p">.</span><span class="nx">pageTitle</span><span class="p">,</span>
            <span class="na">companyURL</span><span class="p">:</span> <span class="nx">pspData</span><span class="p">.</span><span class="nx">companyURL</span><span class="p">,</span><span class="c1">//Optional</span>
            <span class="na">supportEmail</span><span class="p">:</span> <span class="nx">pspData</span><span class="p">.</span><span class="nx">supportEmail</span><span class="p">,</span><span class="c1">//Optional</span>
        <span class="p">},</span>
        <span class="na">update</span><span class="p">:</span> <span class="p">{</span>
            <span class="na">pageTitle</span><span class="p">:</span> <span class="nx">pspData</span><span class="p">.</span><span class="nx">pageTitle</span><span class="p">,</span>
            <span class="na">companyURL</span><span class="p">:</span> <span class="nx">pspData</span><span class="p">.</span><span class="nx">companyURL</span><span class="p">,</span><span class="c1">//Optional</span>
            <span class="na">supportEmail</span><span class="p">:</span> <span class="nx">pspData</span><span class="p">.</span><span class="nx">supportEmail</span><span class="p">,</span><span class="c1">//Optional</span>
        <span class="p">}</span>
    <span class="p">});</span>
</code></pre></div></div>

<p>This does not work when the optional fields are missing from the incoming data: their values would be undefined, and Prisma does not allow that.</p>

<p>My first approach was brute force: I built the args objects based on which fields were present in the incoming data object.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">getCreateUpdateArgs</span> <span class="o">=</span> <span class="p">(</span><span class="nx">update</span><span class="p">:</span> <span class="nx">PSPUpdate</span><span class="p">):</span> <span class="p">{</span> <span class="nl">createArgs</span><span class="p">:</span> <span class="kr">any</span><span class="p">,</span> <span class="nx">updateArgs</span><span class="p">:</span> <span class="kr">any</span> <span class="p">}</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="na">createArgs</span><span class="p">:</span> <span class="kr">any</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="kd">const</span> <span class="na">updateArgs</span><span class="p">:</span> <span class="kr">any</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="k">if</span> <span class="p">(</span><span class="nx">update</span><span class="p">.</span><span class="nx">pageTitle</span><span class="p">)</span> <span class="p">{</span>
        <span class="nx">createArgs</span><span class="p">.</span><span class="nx">pageTitle</span> <span class="o">=</span> <span class="nx">update</span><span class="p">.</span><span class="nx">pageTitle</span><span class="p">;</span>
        <span class="nx">updateArgs</span><span class="p">.</span><span class="nx">pageTitle</span> <span class="o">=</span> <span class="nx">update</span><span class="p">.</span><span class="nx">pageTitle</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="k">if</span> <span class="p">(</span><span class="nx">update</span><span class="p">.</span><span class="nx">companyURL</span><span class="p">)</span> <span class="p">{</span>
        <span class="nx">createArgs</span><span class="p">.</span><span class="nx">companyURL</span> <span class="o">=</span> <span class="nx">update</span><span class="p">.</span><span class="nx">companyURL</span><span class="p">;</span>
        <span class="nx">updateArgs</span><span class="p">.</span><span class="nx">companyURL</span> <span class="o">=</span> <span class="nx">update</span><span class="p">.</span><span class="nx">companyURL</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="k">if</span> <span class="p">(</span><span class="nx">update</span><span class="p">.</span><span class="nx">supportEmail</span><span class="p">)</span> <span class="p">{</span>
        <span class="nx">createArgs</span><span class="p">.</span><span class="nx">supportEmail</span> <span class="o">=</span> <span class="nx">update</span><span class="p">.</span><span class="nx">supportEmail</span><span class="p">;</span>
        <span class="nx">updateArgs</span><span class="p">.</span><span class="nx">supportEmail</span> <span class="o">=</span> <span class="nx">update</span><span class="p">.</span><span class="nx">supportEmail</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="nx">createArgs</span><span class="p">.</span><span class="nx">pageId</span> <span class="o">=</span> <span class="nx">update</span><span class="p">.</span><span class="nx">pageId</span><span class="p">;</span>
    <span class="nx">createArgs</span><span class="p">.</span><span class="nx">userId</span> <span class="o">=</span> <span class="nx">update</span><span class="p">.</span><span class="nx">userId</span><span class="p">;</span>

    <span class="k">return</span> <span class="p">{</span> <span class="nx">createArgs</span><span class="p">,</span> <span class="nx">updateArgs</span> <span class="p">};</span>
<span class="p">}</span>
</code></pre></div></div>

<p>I then used the function’s return values in the upsert query.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="kd">const</span> <span class="p">{</span> <span class="nx">createArgs</span><span class="p">,</span> <span class="nx">updateArgs</span> <span class="p">}</span> <span class="o">=</span> <span class="nx">getCreateUpdateArgs</span><span class="p">(</span><span class="nx">pspData</span><span class="p">);</span>

    <span class="kd">const</span> <span class="nx">psp</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">db</span><span class="p">.</span><span class="nx">pSP</span><span class="p">.</span><span class="nx">upsert</span><span class="p">({</span>
        <span class="na">where</span><span class="p">:</span> <span class="p">{</span>
            <span class="na">pageId</span><span class="p">:</span> <span class="nx">pspData</span><span class="p">.</span><span class="nx">pageId</span><span class="p">,</span>
            <span class="na">userId</span><span class="p">:</span> <span class="nx">pspData</span><span class="p">.</span><span class="nx">userId</span>
        <span class="p">},</span>
        <span class="na">create</span><span class="p">:</span> <span class="p">{</span>
            <span class="p">...</span><span class="nx">createArgs</span>
        <span class="p">},</span>
        <span class="na">update</span><span class="p">:</span> <span class="p">{</span>
            <span class="p">...</span><span class="nx">updateArgs</span>
        <span class="p">}</span>
    <span class="p">});</span>
</code></pre></div></div>

<p>This works, but I did not like the idea of checking each field in the incoming data object. My JavaScript skills are not great, so after a bit of searching this is what I 
came up with:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="kd">const</span> <span class="nx">getCreateUpdateArgs2</span> <span class="o">=</span> <span class="p">(</span><span class="nx">update</span><span class="p">:</span> <span class="nx">PSPUpdate</span><span class="p">):</span> <span class="p">{</span> <span class="nl">createArgs</span><span class="p">:</span> <span class="kr">any</span><span class="p">,</span> <span class="nx">updateArgs</span><span class="p">:</span> <span class="kr">any</span> <span class="p">}</span> <span class="o">=&gt;</span> <span class="p">{</span>
    <span class="kd">const</span> <span class="na">createArgs</span><span class="p">:</span> <span class="kr">any</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="kd">const</span> <span class="na">updateArgs</span><span class="p">:</span> <span class="kr">any</span> <span class="o">=</span> <span class="p">{}</span>

    <span class="k">if</span> <span class="p">(</span><span class="k">typeof</span> <span class="nx">update</span> <span class="o">!==</span> <span class="dl">'</span><span class="s1">object</span><span class="dl">'</span> <span class="o">||</span> <span class="nx">update</span> <span class="o">===</span> <span class="kc">null</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">throw</span> <span class="k">new</span> <span class="nb">Error</span><span class="p">(</span><span class="dl">"</span><span class="s2">Invalid update object</span><span class="dl">"</span><span class="p">);</span><span class="c1">//Unexpected</span>
    <span class="p">}</span>

    <span class="k">if</span> <span class="p">(</span><span class="nb">Object</span><span class="p">.</span><span class="nx">getOwnPropertyNames</span><span class="p">(</span><span class="nx">update</span><span class="p">).</span><span class="nx">length</span> <span class="o">===</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">return</span> <span class="p">{</span> <span class="nx">createArgs</span><span class="p">,</span> <span class="nx">updateArgs</span> <span class="p">};</span>
    <span class="p">}</span>

    <span class="nb">Object</span><span class="p">.</span><span class="nx">entries</span><span class="p">(</span><span class="nx">update</span><span class="p">).</span><span class="nx">forEach</span><span class="p">(([</span><span class="nx">key</span><span class="p">,</span> <span class="nx">value</span><span class="p">])</span> <span class="o">=&gt;</span> <span class="p">{</span>
        <span class="k">if</span> <span class="p">(</span><span class="nx">value</span> <span class="o">&amp;&amp;</span> <span class="o">!</span><span class="nx">immutables</span><span class="p">.</span><span class="nx">includes</span><span class="p">(</span><span class="nx">key</span><span class="p">))</span> <span class="p">{</span>
            <span class="nx">updateArgs</span><span class="p">[</span><span class="nx">key</span><span class="p">]</span> <span class="o">=</span> <span class="nx">value</span><span class="p">;</span>
            <span class="nx">createArgs</span><span class="p">[</span><span class="nx">key</span><span class="p">]</span> <span class="o">=</span> <span class="nx">value</span><span class="p">;</span>
        <span class="p">}</span>
    <span class="p">});</span>

    <span class="k">return</span> <span class="p">{</span> <span class="nx">createArgs</span><span class="p">,</span> <span class="nx">updateArgs</span> <span class="p">};</span>
<span class="p">}</span>

</code></pre></div></div>

<p>This assumes a few things:</p>
<ul>
  <li>There are no nested objects.</li>
  <li>There are no fields we want to exclude from the update.</li>
</ul>
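
<p>One edge case worth sketching: the truthiness check above also drops empty strings and <code class="language-plaintext highlighter-rouge">null</code>, and <code class="language-plaintext highlighter-rouge">immutables</code> is not shown. The variant below is a minimal sketch under assumptions, not the exact code from my project: the <code class="language-plaintext highlighter-rouge">PSPUpdate</code> shape and the contents of <code class="language-plaintext highlighter-rouge">immutables</code> are guesses for illustration, only <code class="language-plaintext highlighter-rouge">undefined</code> values are skipped, and the immutable keys are kept in <code class="language-plaintext highlighter-rouge">createArgs</code> (a create needs them) but left out of <code class="language-plaintext highlighter-rouge">updateArgs</code>.</p>

```typescript
// Hypothetical shape of the incoming data; the post does not show PSPUpdate.
type PSPUpdate = {
    pageId: string;
    userId: string;
    pageTitle?: string;
    companyURL?: string | null;
    supportEmail?: string | null;
};

// Assumption: keys that identify the row and should never be part of an update.
const immutables: readonly string[] = ["pageId", "userId"];

type Args = { [key: string]: unknown };

const buildUpsertArgs = (update: PSPUpdate): { createArgs: Args; updateArgs: Args } => {
    const createArgs: Args = {};
    const updateArgs: Args = {};

    for (const [key, value] of Object.entries(update)) {
        // Skip only undefined, so empty strings and null are passed through.
        if (value === undefined) {
            continue;
        }
        // A create needs every field that is present, including the identifying ones.
        createArgs[key] = value;
        // An update leaves the identifying fields untouched.
        if (!immutables.includes(key)) {
            updateArgs[key] = value;
        }
    }

    return { createArgs, updateArgs };
};
```

<p>Passing <code class="language-plaintext highlighter-rouge">null</code> through matters if clearing an optional column should be possible, since Prisma treats <code class="language-plaintext highlighter-rouge">null</code> as a value to store.</p>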

<p>I’m sure there are edge cases I’m not considering, but it works for now.</p>]]></content><author><name>Hrishikesh Barua</name></author><category term="typescript" /><category term="orm" /><category term="prisma" /><summary type="html"><![CDATA[Introduction]]></summary></entry><entry><title type="html">Summarizing SRE/Ops Podcasts Using an LLM</title><link href="https://code.deepinspace.net/posts/2025/02/03/summarizing-sre-ops-podcasts/" rel="alternate" type="text/html" title="Summarizing SRE/Ops Podcasts Using an LLM" /><published>2025-02-03T00:00:00+00:00</published><updated>2025-02-03T00:00:00+00:00</updated><id>https://code.deepinspace.net/posts/2025/02/03/summarizing-sre-ops-podcasts</id><content type="html" xml:base="https://code.deepinspace.net/posts/2025/02/03/summarizing-sre-ops-podcasts/"><![CDATA[<h2 id="introduction">Introduction</h2>

<p>There are plenty of good SRE/Ops related podcasts out there. I follow a few of them and listen to episodes whose titles sound interesting. 
The problem with podcasts is that some episodes focus on one topic, and other episodes deal with a host of topics. In between there is
filler and things that are not relevant to the topic but are necessary to carry on a conversation. Spending 30-60 minutes listening to podcasts is not always a great use of time.</p>

<p>A while ago I decided to create a tool that summarizes podcasts for me using an LLM. If I find the summary interesting enough or there is something that I want to learn about, I go and listen to the entire
episode. Such a tool might be useful to others also, so I made a website for it - <a href="https://www.srenews.info">https://www.srenews.info</a>.</p>

<p>I encourage you to listen to the complete episodes if you find a summary interesting - they are linked from each summary page.</p>

<p>My personal favorites include Google’s SRE Prodcast, Incidentally Reliable, and Slight Reliability.</p>

<h2 id="architecture">Architecture</h2>

<p>The architecture is pretty simple.</p>

<p><img src="/assets/images/podcast-summary-architecture.png" alt="Podcast Summarizer Architecture" /></p>

<p>Behind the scenes:</p>
<ul>
  <li>The feed checker is not automatic yet. I run it manually for now. The code runs on Cloudflare Workers.</li>
  <li>The checker triggers a TypeScript program which fetches YouTube metadata and the transcript. I chose TypeScript as Python packages are not available on Cloudflare Workers <a href="https://developers.cloudflare.com/workers/languages/python/">as of this writing</a>; otherwise Python is my first choice for such things.</li>
  <li>The transcript is fed into OpenAI which generates a summary.</li>
  <li>The summary and the metadata are used to generate a post based on a Hugo template and pushed to Git.</li>
  <li>Netlify deploys the site automatically on Git push.</li>
</ul>
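
<p>As an illustration of the post-generation step, here is a minimal sketch of turning a summary plus metadata into a Hugo content file. The <code class="language-plaintext highlighter-rouge">EpisodeMeta</code> fields and the <code class="language-plaintext highlighter-rouge">podcast</code> and <code class="language-plaintext highlighter-rouge">source</code> front matter keys are hypothetical, not the ones the actual tool uses; <code class="language-plaintext highlighter-rouge">title</code> and <code class="language-plaintext highlighter-rouge">date</code> are standard Hugo front matter.</p>

```typescript
// Hypothetical metadata shape; the real tool's fields are not shown in this post.
type EpisodeMeta = {
    title: string;
    podcast: string;
    publishedAt: string; // ISO 8601 date, e.g. "2025-02-03"
    videoUrl: string;
};

// Quote a string for YAML front matter. JSON string escaping produces
// a double-quoted scalar that YAML parsers accept.
const yamlString = (s: string): string => JSON.stringify(s);

// Build the full Markdown file: YAML front matter, a blank line, then the summary.
const renderPost = (meta: EpisodeMeta, summary: string): string => {
    const frontMatter = [
        "---",
        `title: ${yamlString(meta.title)}`,
        `date: ${meta.publishedAt}`,
        `podcast: ${yamlString(meta.podcast)}`,
        `source: ${yamlString(meta.videoUrl)}`,
        "---",
    ].join("\n");

    return `${frontMatter}\n\n${summary.trim()}\n`;
};
```

<p>Quoting the title matters because episode titles often contain colons or quotes, which would otherwise break the front matter.</p>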

<p>So far the only costs I’m incurring for this are for the domain name and OpenAI’s API - and it’s worth it.</p>]]></content><author><name>Hrishikesh Barua</name></author><category term="podcast" /><category term="sre" /><category term="ops" /><category term="llm" /><summary type="html"><![CDATA[Introduction]]></summary></entry><entry><title type="html">Today I Learned - How to Add Different Passport Bearer Auth Methods for Different Routes</title><link href="https://code.deepinspace.net/posts/2025/01/23/til-how-to-add-different-passport-bearer-auth-routes/" rel="alternate" type="text/html" title="Today I Learned - How to Add Different Passport Bearer Auth Methods for Different Routes" /><published>2025-01-23T00:00:00+00:00</published><updated>2025-01-23T00:00:00+00:00</updated><id>https://code.deepinspace.net/posts/2025/01/23/til-how-to-add-different-passport-bearer-auth-routes</id><content type="html" xml:base="https://code.deepinspace.net/posts/2025/01/23/til-how-to-add-different-passport-bearer-auth-routes/"><![CDATA[<p>I have a route in ExpressJS that is protected by Passport bearer auth. The docs have a straightforward example which works if that is the only strategy you need.</p>

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">passport</span><span class="p">.</span><span class="nx">use</span><span class="p">(</span><span class="k">new</span> <span class="nx">Strategy</span><span class="p">(</span>
    <span class="kd">function</span> <span class="p">(</span><span class="nx">token</span><span class="p">:</span> <span class="nx">any</span><span class="p">,</span> <span class="nx">cb</span><span class="p">:</span> <span class="nx">any</span><span class="p">)</span> <span class="p">{</span>
        <span class="nx">validateToken</span><span class="p">(</span><span class="nx">token</span><span class="p">).</span><span class="nx">then</span><span class="p">(</span><span class="nx">userid</span> <span class="o">=&gt;</span> <span class="p">{</span>
            <span class="k">if</span> <span class="p">(</span><span class="nx">userid</span><span class="p">)</span> <span class="p">{</span>
                <span class="k">return</span> <span class="nx">cb</span><span class="p">(</span><span class="kc">null</span><span class="p">,</span> <span class="nx">userid</span><span class="p">)</span>
            <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
                <span class="k">return</span> <span class="nx">cb</span><span class="p">(</span><span class="kc">null</span><span class="p">,</span> <span class="kc">false</span><span class="p">)</span>
            <span class="p">}</span>
        <span class="p">}).</span><span class="k">catch</span><span class="p">(</span><span class="nx">error</span> <span class="o">=&gt;</span> <span class="p">{</span>
            <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">"</span><span class="s2">[EXT_ROUTES]Failed bearer auth</span><span class="dl">"</span><span class="p">,</span> <span class="nx">error</span><span class="p">);</span>
            <span class="k">return</span> <span class="nx">cb</span><span class="p">(</span><span class="nx">error</span><span class="p">);</span>
        <span class="p">})</span>
    <span class="p">}</span>
<span class="p">));</span>
</code></pre></div></div>
<p>which is used later like this:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">app</span><span class="p">.</span><span class="nx">use</span><span class="p">(</span><span class="dl">'</span><span class="s1">/api/v1/ext</span><span class="dl">'</span><span class="p">,</span>
    <span class="nx">passport</span><span class="p">.</span><span class="nx">authenticate</span><span class="p">(</span><span class="dl">'</span><span class="s1">bearer</span><span class="dl">'</span><span class="p">,</span> <span class="p">{</span> <span class="na">session</span><span class="p">:</span> <span class="kc">false</span> <span class="p">}),</span>
    <span class="nx">externalRoutes</span><span class="p">,</span>
<span class="p">)</span>
</code></pre></div></div>

<p>Now I need to add a route that is protected by a different bearer auth strategy. The <a href="https://www.passportjs.org/packages/passport-http-bearer/">official docs</a> are not clear on how to do this.</p>

<p>It turns out that the string ‘bearer’ in the passport.authenticate call is an identifier for the strategy. Defining a new strategy then becomes:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">passport</span><span class="p">.</span><span class="nx">use</span><span class="p">(</span><span class="dl">"</span><span class="s2">cached-bearer</span><span class="dl">"</span><span class="p">,</span> <span class="k">new</span> <span class="nx">Strategy</span><span class="p">(</span>
    <span class="kd">function</span> <span class="p">(</span><span class="nx">token</span><span class="p">,</span> <span class="nx">cb</span><span class="p">)</span> <span class="p">{</span>
        <span class="nx">validateCachedToken</span><span class="p">(</span><span class="nx">token</span><span class="p">).</span><span class="nx">then</span><span class="p">(</span><span class="nx">userid</span> <span class="o">=&gt;</span> <span class="p">{</span>
            <span class="k">if</span> <span class="p">(</span><span class="nx">userid</span><span class="p">)</span> <span class="p">{</span>
                <span class="k">return</span> <span class="nx">cb</span><span class="p">(</span><span class="kc">null</span><span class="p">,</span> <span class="nx">userid</span><span class="p">);</span>
            <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
                <span class="k">return</span> <span class="nx">cb</span><span class="p">(</span><span class="kc">null</span><span class="p">,</span> <span class="kc">false</span><span class="p">);</span>
            <span class="p">}</span>
        <span class="p">}).</span><span class="k">catch</span><span class="p">(</span><span class="nx">error</span> <span class="o">=&gt;</span> <span class="p">{</span>
            <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">"</span><span class="s2">[EXT_ROUTES]Failed cached bearer auth</span><span class="dl">"</span><span class="p">,</span> <span class="nx">error</span><span class="p">);</span>
            <span class="k">return</span> <span class="nx">cb</span><span class="p">(</span><span class="nx">error</span><span class="p">);</span>
        <span class="p">})</span>
    <span class="p">}</span>
<span class="p">));</span>
</code></pre></div></div>
<p>and it can be used as:</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">app</span><span class="p">.</span><span class="nx">use</span><span class="p">(</span><span class="dl">'</span><span class="s1">/api/v1/inbound</span><span class="dl">'</span><span class="p">,</span>
    <span class="nx">passport</span><span class="p">.</span><span class="nx">authenticate</span><span class="p">(</span><span class="dl">'</span><span class="s1">cached-bearer</span><span class="dl">'</span><span class="p">,</span> <span class="p">{</span> <span class="na">session</span><span class="p">:</span> <span class="kc">false</span> <span class="p">}),</span>
    <span class="nx">inboundRoutes</span><span class="p">,</span>
<span class="p">);</span>
</code></pre></div></div>
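
<p>To make the lookup concrete, here is a toy model of a name-to-strategy registry. It is not Passport’s code and every class name in it is made up; it only mirrors the two call shapes used above, <code class="language-plaintext highlighter-rouge">use(strategy)</code> and <code class="language-plaintext highlighter-rouge">use(name, strategy)</code>, assuming an unnamed strategy is registered under its own default name.</p>

```typescript
// Toy model only: illustrates lookup by name, not Passport's implementation.
type Verify = (token: string) => string | false;

class ToyBearerStrategy {
    // Assumption: a bearer strategy carries the default name "bearer".
    readonly name = "bearer";
    constructor(readonly verify: Verify) {}
}

class ToyRegistry {
    private strategies: { [name: string]: ToyBearerStrategy } = {};

    // use(strategy) registers under the strategy's default name;
    // use(name, strategy) registers under the explicit name.
    use(nameOrStrategy: string | ToyBearerStrategy, strategy?: ToyBearerStrategy): void {
        if (typeof nameOrStrategy === "string") {
            if (!strategy) {
                throw new Error("A strategy is required when a name is given");
            }
            this.strategies[nameOrStrategy] = strategy;
        } else {
            this.strategies[nameOrStrategy.name] = nameOrStrategy;
        }
    }

    // Look up the strategy by name and run its verify function.
    authenticate(name: string, token: string): string | false {
        const strategy = this.strategies[name];
        if (!strategy) {
            throw new Error(`Unknown authentication strategy "${name}"`);
        }
        return strategy.verify(token);
    }
}

const registry = new ToyRegistry();
registry.use(new ToyBearerStrategy(t => (t === "db-token" ? "user-1" : false)));
registry.use("cached-bearer", new ToyBearerStrategy(t => (t === "cache-token" ? "user-2" : false)));
```

<p>Registering a second unnamed bearer strategy in this model would overwrite the first, which is why the second one needs its own name.</p>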

<p>This <a href="https://stackoverflow.com/a/57047241">Stack Overflow answer</a> pointed me towards the solution.</p>

<p>It’s curious how many official docs don’t go beyond the simplest cases, and also don’t explain the basics. The docs are the first thing I look at when doing something with a new library, and it’s often a struggle
when they are not sufficient.</p>]]></content><author><name>Hrishikesh Barua</name></author><category term="today-I-learned" /><summary type="html"><![CDATA[I have a route in ExpressJS that is protected by Passport bearer auth. The docs have a straightforward example which works if that is the only strategy you need.]]></summary></entry><entry><title type="html">Today I Learned - How to Recover a GCP Instance With 0 Boot Disk Space</title><link href="https://code.deepinspace.net/posts/2025/01/06/today-I-learned-how-to-recover-gcp-instance-boot-disk-space/" rel="alternate" type="text/html" title="Today I Learned - How to Recover a GCP Instance With 0 Boot Disk Space" /><published>2025-01-06T00:00:00+00:00</published><updated>2025-01-06T00:00:00+00:00</updated><id>https://code.deepinspace.net/posts/2025/01/06/today-I-learned-how-to-recover-gcp-instance-boot-disk-space-</id><content type="html" xml:base="https://code.deepinspace.net/posts/2025/01/06/today-I-learned-how-to-recover-gcp-instance-boot-disk-space/"><![CDATA[<p>I have a GCP instance that I bring up periodically using <a href="https://cloud.google.com/compute/docs/instances/schedule-instance-start-stop">instance schedules</a> to run database backups. 
My database provider has backups of its own but I have an additional backup in place. The GCP instance on boot runs a user script which:</p>
<ul>
  <li>Creates a database dump into a temporary file</li>
  <li>Compresses the dump file (tar + gz)</li>
  <li>Uploads it to a secure bucket</li>
</ul>

<p>If this backup fails, I get alerted on Slack.</p>

<p>The Slack alert fired today. I checked the instance and it seemed to have booted up and shut down as expected. I booted it up manually, tried to SSH in, and got a public key error. 
Attempting to SSH using the browser from the GCP cloud console also failed with the same error.</p>

<p>I enabled serial port logging - and the logs showed that the instance booted up but failed to write the temporary backup files to disk due to lack of space.</p>

<p>Increasing the boot disk size seemed like one option. However, increasing the disk size from the GCP console only increases the size of the disk, not the partition. So I needed a way to either increase the partition size or 
delete some files to free up space. But since I could not log in to the instance, I could not do either.</p>

<p>The generally prescribed option here is to:</p>
<ul>
  <li>Create a snapshot of the boot disk</li>
  <li>Create a new disk from the snapshot with a larger size</li>
  <li>Boot with the new disk (or create a new instance with the new disk)</li>
</ul>

<p>There seemed to be a shortcut - which was to “detach the boot disk” from the instance and attach it to a new temporary instance, and then:</p>
<ul>
  <li>Boot into the temporary instance</li>
  <li>Delete files from the disk</li>
  <li>Delete the temporary instance</li>
  <li>Reattach the disk to the original instance</li>
</ul>

<p>This worked.</p>

<p>The reason behind the disk going out of space was that my backup script was not deleting the temporary files after the backup was complete, which I fixed.</p>]]></content><author><name>Hrishikesh Barua</name></author><category term="today-I-learned" /><summary type="html"><![CDATA[I have a GCP instance that I bring up periodically using instance schedules to run database backups. My database provider has backups of its own but I have an additional backup in place. The GCP instance on boot runs a user script which: Creates a database dump into a temporary file tar + gz the dump file Uploads it to a secure bucket If this backup fails, I get alerted on Slack.]]></summary></entry><entry><title type="html">On Customer Service</title><link href="https://code.deepinspace.net/posts/2024/11/17/On-customer-service/" rel="alternate" type="text/html" title="On Customer Service" /><published>2024-11-17T00:00:00+00:00</published><updated>2024-11-17T00:00:00+00:00</updated><id>https://code.deepinspace.net/posts/2024/11/17/On-customer-service</id><content type="html" xml:base="https://code.deepinspace.net/posts/2024/11/17/On-customer-service/"><![CDATA[<p>This is a ruminative post, and I won’t be using bullet points.</p>

<p>I also learnt the difference between ruminative and ruminating while writing this.</p>

<p>Recently, I got one of my early customers for my SaaS <a href="https://incidenthub.cloud">IncidentHub</a>. In our back and forth during the trial period,
they happened to mention that one of the deciding factors for them in choosing to go with IncidentHub against a similar product was our “excellent support”.</p>

<p>But what exactly is support? For a software product, or any other product?</p>

<p>For me personally, the key element of support is being meaningfully responsive.</p>

<p>Being responsive is to let the customer know that I am aware of whatever it is that they need help with. 
Once that initial assertion is made, it is completely up to <em>me</em> - and not the customer - to let the customer know about progress with their problems, or any further roadblocks, until
they are satisfied. Even if the end result is that I cannot solve their problem, I have to let them know.</p>

<p>I want to go back a bit since I’m musing anyway. Who is a customer? Are all users - paying and non-paying - also customers?</p>

<p>A customer is somebody to whom I am providing something - a product, a service, an assurance that something will be done in a specific time period. By this definition, if I’m in a team,
all <a href="https://www.linkedin.com/pulse/your-first-customer-team-hrishikesh-barua/">my team members are my customers</a>. If I have a product that users are using, they are my customers.</p>

<p>My thoughts on customer service have been shaped by various people throughout my career, and I wanted to capture these thoughts in one place for two reasons. 
One - to provide clarity to myself. Two - to express gratefulness to those people even though I won’t be naming names.</p>

<p>My first job out of college was in a middleware company called <a href="https://pramati.com/">Pramati</a>. The product team I joined was building a J2EE appserver - which incidentally was also the first to be J2EE 1.3 certified in the world. 
Someday I would love to talk about that in another post. I don’t know what piece of luck landed me in that team after several failed interviews in other companies. I joined an experienced engineering team along with a few other 
fresh-out-of-college folks. It was a new, exciting experience for all of us. Perhaps what set the tone for the rest of my career was the way we used to work. A focus on technical excellence founded on depth of understanding,
a friendly, easy-going atmosphere, and a company led by a <a href="https://blogs.iiit.ac.in/monthly_news/16367/">visionary CEO</a> and a co-founder who were far ahead of their time.</p>

<p>It was a small team. Engineering would often work closely with customer support. Every customer was valuable. We got to see firsthand how the support team interacted with customers, and we sometimes got on calls ourselves.
Now that I look back at it, this experience - so early on in my career - created an awareness that many engineers seem to lack. It’s an eye-opening moment when you see the impact - bad or good - that your code
has in an actual customer deployment.</p>

<p>It would still take me many years before I understood the full value of what I had learnt then.</p>

<p>Years later, in a different team, a boss taught me that the commonly understood definition of <code class="language-plaintext highlighter-rouge">customer</code> as somebody who pays you for your product or service is severely restrictive. 
A customer is actually anybody who depends on you for something, and thus includes your team members. My boss was trying to make us work better as a team - and I think he succeeded to some extent.</p>

<p>The importance of engineering folks working with the support team - or even as part of the support team for a few weeks - cannot be overstated. I think it should be part of every junior engineer’s training.</p>]]></content><author><name>Hrishikesh Barua</name></author><category term="customer-service" /><category term="philosophy" /><summary type="html"><![CDATA[This is a ruminative post, and I won’t be using bullet points.]]></summary></entry><entry><title type="html">Today I Learned - How To Setup GitHub Actions for a Node Monorepo</title><link href="https://code.deepinspace.net/posts/2024/06/27/Today-I-Learned-GitHub-Actions-Node-Monorepo/" rel="alternate" type="text/html" title="Today I Learned - How To Setup GitHub Actions for a Node Monorepo" /><published>2024-06-27T00:00:00+00:00</published><updated>2024-06-27T00:00:00+00:00</updated><id>https://code.deepinspace.net/posts/2024/06/27/Today-I-Learned-GitHub-Actions-Node-Monorepo</id><content type="html" xml:base="https://code.deepinspace.net/posts/2024/06/27/Today-I-Learned-GitHub-Actions-Node-Monorepo/"><![CDATA[<p>A monorepo is a repository with more than one logical project. I was trying to set up GitHub Actions to build my repository
with two distinct projects inside it. GitHub Actions runs your workflow YAML whenever you push a commit.
It turns out that getting it to build each subdirectory is not so straightforward.</p>

<p>This is what my project structure looks like</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>project-root
  - frontend
    - package-lock.json
    - src/
    - ... &lt;etc&gt;
  - backend
    - package-lock.json
    - src/
    - ... &lt;etc&gt;
</code></pre></div></div>

<p>The autogenerated node.js.yml looks like this</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">name</span><span class="pi">:</span> <span class="s">Node.js CI</span>

<span class="na">on</span><span class="pi">:</span>
  <span class="na">push</span><span class="pi">:</span>
    <span class="na">branches</span><span class="pi">:</span> <span class="pi">[</span> <span class="s2">"</span><span class="s">main"</span> <span class="pi">]</span>
  <span class="na">pull_request</span><span class="pi">:</span>
    <span class="na">branches</span><span class="pi">:</span> <span class="pi">[</span> <span class="s2">"</span><span class="s">main"</span> <span class="pi">]</span>

<span class="na">jobs</span><span class="pi">:</span>
  <span class="na">build</span><span class="pi">:</span>
    <span class="na">runs-on</span><span class="pi">:</span> <span class="s">ubuntu-latest</span>
    <span class="na">strategy</span><span class="pi">:</span>
      <span class="na">matrix</span><span class="pi">:</span>
        <span class="na">node-version</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">14.x</span><span class="pi">,</span> <span class="nv">16.x</span><span class="pi">,</span> <span class="nv">18.x</span><span class="pi">]</span>
        <span class="c1"># See supported Node.js release schedule at https://nodejs.org/en/about/releases/</span>
    <span class="na">steps</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/checkout@v4</span>
    <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Use Node.js ${{ matrix.node-version }}</span>
      <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/setup-node@v3</span>
      <span class="na">with</span><span class="pi">:</span>
        <span class="na">node-version</span><span class="pi">:</span> <span class="s">${{ matrix.node-version }}</span>
        <span class="na">cache</span><span class="pi">:</span> <span class="s1">'</span><span class="s">npm'</span>
    <span class="pi">-</span> <span class="na">run</span><span class="pi">:</span> <span class="s">npm ci</span>
    <span class="pi">-</span> <span class="na">run</span><span class="pi">:</span> <span class="s">npm run build --if-present</span>
</code></pre></div></div>

<p>How do I specify that the jobs have to run inside each subdirectory? There is also no top-level package-lock.json, which is where
some of the proposed solutions break.</p>

<p>A combination of <a href="https://stackoverflow.com/a/71281459/10996">SO</a> and <a href="https://medium.com/@owumifestus/configuring-github-actions-in-a-multi-directory-repository-structure-c4d2b04e6312">Medium</a> 
posts and a <a href="https://github.com/actions/setup-node/issues/706#issuecomment-1557983832">GitHub issue comment</a> helped me to get to a working configuration. The key points are</p>
<ul>
  <li>Use the “matrix” keyword to declare the equivalent of “run this for d in directories” where directories is the list of your subdirectories</li>
  <li>Specify a working-directory under defaults, with the value being $d from the matrix</li>
  <li>Specify the cache-dependency-path with the same $d/package-lock.json</li>
  <li>Add an npm install before doing anything</li>
</ul>
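
<p>To make the first three points concrete, here is a small Python sketch of how I think about the matrix: every combination of values becomes one job, and the working directory and cache path are derived from that job’s <code class="language-plaintext highlighter-rouge">dir</code> value. This only models the expansion and is not how GitHub implements it. The function names are made up for illustration.</p>

```python
from itertools import product

# Mirror of the workflow's matrix; the keys match the YAML used in this post.
matrix = {
    "dir": ["./backend", "./frontend"],
    "node-version": ["20.x"],
}


def expand_matrix(matrix):
    """Return one dict per job: the cartesian product of all matrix values."""
    keys = list(matrix)
    return [dict(zip(keys, combo)) for combo in product(*(matrix[k] for k in keys))]


def describe_job(job):
    """Derive the per-job settings that the workflow takes from the matrix entry."""
    return {
        "working-directory": job["dir"],
        "cache-dependency-path": job["dir"] + "/package-lock.json",
        "node-version": job["node-version"],
    }


jobs = [describe_job(job) for job in expand_matrix(matrix)]
for job in jobs:
    print(job)
```

<p>With two directories and one Node version this yields two jobs, one per subproject. Adding a second Node version would double that.</p>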

<p>The final working YAML looks like this</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="c1"># This workflow will do a clean installation of node dependencies, cache/restore them, build the source code and run tests across different versions of node</span>
<span class="c1"># For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-nodejs</span>

<span class="na">name</span><span class="pi">:</span> <span class="s">Node.js CI</span>

<span class="na">on</span><span class="pi">:</span>
  <span class="na">push</span><span class="pi">:</span>
    <span class="na">branches</span><span class="pi">:</span> <span class="pi">[</span> <span class="s2">"</span><span class="s">main"</span> <span class="pi">]</span>
  <span class="na">pull_request</span><span class="pi">:</span>
    <span class="na">branches</span><span class="pi">:</span> <span class="pi">[</span> <span class="s2">"</span><span class="s">main"</span> <span class="pi">]</span>

<span class="na">jobs</span><span class="pi">:</span>
  <span class="na">build</span><span class="pi">:</span>
    <span class="na">runs-on</span><span class="pi">:</span> <span class="s">ubuntu-latest</span>
    <span class="na">strategy</span><span class="pi">:</span>
      <span class="na">matrix</span><span class="pi">:</span> <span class="pi">{</span> <span class="nv">dir</span><span class="pi">:</span> <span class="pi">[</span><span class="s1">'</span><span class="s">./backend'</span><span class="pi">,</span> <span class="s1">'</span><span class="s">./frontend'</span><span class="pi">],</span> <span class="nv">node-version</span><span class="pi">:</span> <span class="pi">[</span><span class="s1">'</span><span class="s">20.x'</span><span class="pi">]</span> <span class="pi">}</span>
        <span class="c1"># See supported Node.js release schedule at https://nodejs.org/en/about/releases/</span>
    <span class="na">defaults</span><span class="pi">:</span>
      <span class="na">run</span><span class="pi">:</span>
        <span class="na">working-directory</span><span class="pi">:</span> <span class="s">${{ matrix.dir }}</span>
    <span class="na">steps</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/checkout@v4</span>
    <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Use Node.js ${{ matrix.node-version }}</span>
      <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/setup-node@v3</span>
      <span class="na">with</span><span class="pi">:</span>
        <span class="na">node-version</span><span class="pi">:</span> <span class="s">${{ matrix.node-version }}</span>
        <span class="na">cache</span><span class="pi">:</span> <span class="s1">'</span><span class="s">npm'</span>
        <span class="na">cache-dependency-path</span><span class="pi">:</span> <span class="s1">'</span><span class="s">${{ matrix.dir }}/package-lock.json'</span>
    <span class="pi">-</span> <span class="na">run</span><span class="pi">:</span> <span class="s">npm install</span>
    <span class="pi">-</span> <span class="na">run</span><span class="pi">:</span> <span class="s">npm ci</span>
    <span class="pi">-</span> <span class="na">run</span><span class="pi">:</span> <span class="s">npm run build --if-present</span>
<span class="c1">#    - run: npm run test</span>
</code></pre></div></div>
<p>Note that I did not have to include the node-version in the matrix - it is just to illustrate the syntax.</p>

<p>Thank you for reading, and do reach out via comments or on <a href="https://twitter.com/talonx" target="_blank">Twitter</a> if you want to chat or share your thoughts.</p>]]></content><author><name>Hrishikesh Barua</name></author><category term="today-I-learned" /><category term="javascript" /><category term="github-actions" /><summary type="html"><![CDATA[A monorepo is a repository with more than one logical project. I was trying to setup GitHub Actions to build my repository with two distinct projects inside it. GitHub Actions will run your YAML configuration whenever you commit. Turns out it’s not so straightforward.]]></summary></entry><entry><title type="html">Today I Learned - Disabling Object Listing for Google Storage Buckets with Public Access</title><link href="https://code.deepinspace.net/posts/2024/06/22/Today-I-Learned-Google-Storage-Legacy/" rel="alternate" type="text/html" title="Today I Learned - Disabling Object Listing for Google Storage Buckets with Public Access" /><published>2024-06-22T00:00:00+00:00</published><updated>2024-06-22T00:00:00+00:00</updated><id>https://code.deepinspace.net/posts/2024/06/22/Today-I-Learned-Google-Storage-Legacy</id><content type="html" xml:base="https://code.deepinspace.net/posts/2024/06/22/Today-I-Learned-Google-Storage-Legacy/"><![CDATA[<p>Using a Google Cloud Storage (GCS) bucket for static storage is a very easy way to serve static content over HTTPS. For this to work,
public access has to be enabled on the bucket’s objects. The access should be read only at the public level and can be set
using one of Google IAM’s predefined roles.</p>

<p>At first glance, the role for this would seem to be <code class="language-plaintext highlighter-rouge">Storage Object Viewer</code>, and that’s what I went with when setting up a bucket 
recently to serve images. However, this role also exposes a listing of the bucket’s contents as XML, which is not something you want.</p>

<p>It turns out that the appropriate role is <code class="language-plaintext highlighter-rouge">Storage Legacy Object Reader</code>. The difference between the roles can be seen
in their permissions.</p>

<p>Storage Object Viewer has both list and get permissions:</p>

<p><img src="/assets/images/viewer-gcs.png" alt="Storage Object Viewer" /></p>

<p>whereas Storage Legacy Object Reader has just a get permission:</p>

<p><img src="/assets/images/legacy-gcs.png" alt="Storage Legacy Object Reader" /></p>
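
<p>If you prefer to set this from code rather than the console, the following is a minimal sketch using the bucket object from the google-cloud-storage Python package. I am assuming the role ID <code class="language-plaintext highlighter-rouge">roles/storage.legacyObjectReader</code> for Storage Legacy Object Reader and <code class="language-plaintext highlighter-rouge">allUsers</code> as the public member. The function name and the bucket name in the usage comment are hypothetical.</p>

```python
PUBLIC_MEMBER = "allUsers"
READ_ONLY_ROLE = "roles/storage.legacyObjectReader"  # get only, no list


def grant_public_read(bucket):
    """Add a binding that lets anyone fetch objects, without the list permission."""
    policy = bucket.get_iam_policy(requested_policy_version=3)
    policy.bindings.append({"role": READ_ONLY_ROLE, "members": {PUBLIC_MEMBER}})
    bucket.set_iam_policy(policy)
    return policy


# Usage with real credentials (needs the google-cloud-storage package):
#   from google.cloud import storage
#   grant_public_read(storage.Client().bucket("my-static-assets"))
```

<p>The function takes the bucket as an argument so that it can be tried against a stub before pointing it at a real bucket.</p>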

<p>Thank you for reading, and do reach out via comments or on <a href="https://twitter.com/talonx" target="_blank">Twitter</a> if you want to chat or share your thoughts.</p>]]></content><author><name>Hrishikesh Barua</name></author><category term="today-I-learned" /><category term="google-cloud-platform" /><category term="iam" /><summary type="html"><![CDATA[Using a Google Cloud Storage (GCS) bucket for static storage is a very easy way to serve static content over HTTPS. For this to work, public access has to be enabled on the bucket’s objects. The access should be read only at the public level and can be set using one of Google IAM’s predefined roles.]]></summary></entry><entry><title type="html">Software Defined Networking - A Short Introduction</title><link href="https://code.deepinspace.net/posts/2024/04/19/Software-Defined-Networking-a-Short-Introduction/" rel="alternate" type="text/html" title="Software Defined Networking - A Short Introduction" /><published>2024-04-19T00:00:00+00:00</published><updated>2024-04-19T00:00:00+00:00</updated><id>https://code.deepinspace.net/posts/2024/04/19/Software-Defined-Networking-a-Short-Introduction</id><content type="html" xml:base="https://code.deepinspace.net/posts/2024/04/19/Software-Defined-Networking-a-Short-Introduction/"><![CDATA[<h2 id="programmable-networks">Programmable Networks</h2>

<p>Let’s take a simple example. To set up a simple network switch for your home network, you have to connect it using LAN cables to your router, and then to the individual 
devices - laptops, desktops, your NAS, and so on. You “define” the routes between the devices using cables and sockets. Switches are “smart” in the sense they learn routes 
to the devices connected to them. You cannot change the way the routes are defined - your switch has 6 or 8 or more ports, and it knows how to route packets between the devices connected to them. The routing logic is hardcoded in the device.</p>

<p>But what if you wanted to change the routing logic to something custom? And what if you had multiple switches, and you wanted to control the routing logic from one place? Traditional networking devices won’t allow you to do this.</p>

<h2 id="why-should-you-an-ops-engineer-know-anything-about-sdn">Why Should You, an Ops Engineer, Know Anything About SDN?</h2>

<p>The entire foundation of a cloud is based on the virtualization of resources - storage, compute, and network. The concepts of SDN make it easier to enable network virtualization. 
Clouds are fundamentally multi-tenanted, and having an idea of how things work behind the scenes can give us a better appreciation of what goes on under the hood. As a cloud user or admin,
you don’t need to know how SDN works because it’s never exposed to you. Nevertheless, it can be fascinating especially if you are interested in networking, like I am.</p>

<h2 id="the-control-plane-and-the-data-plane">The Control Plane and the Data Plane</h2>

<p>Before going ahead we need to understand these two terms. There are two primary functions where data transfer is concerned - transferring the data itself and communicating the routing 
messages that define how the data should be transferred.</p>

<p><strong>Control plane</strong> - The part of a networking infrastructure that decides how to handle traffic.</p>

<p><strong>Data plane</strong> - The part of a networking infrastructure that forwards traffic according to the control plane’s decisions.</p>

<p>In addition, there is a “controller” which sits on top of the control plane and is the gateway to configure the data plane devices. A single software controller can control multiple data planes using APIs.</p>
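
<p>A toy Python model may help to picture the split. The class and method names below are invented for illustration and do not correspond to any real SDN API. The switches only look things up in a table, and a single controller decides what goes into the tables of several switches.</p>

```python
class Switch:
    """Data plane: forwards packets using whatever table it has been given."""

    def __init__(self, name):
        self.name = name
        self.table = {}  # destination address to output port

    def forward(self, dst):
        # No decision-making here: a lookup either hits or the packet is dropped.
        return self.table.get(dst, "drop")


class Controller:
    """Control plane: decides the routes and installs them on the switches it manages."""

    def __init__(self):
        self.switches = []

    def register(self, switch):
        self.switches.append(switch)

    def install_route(self, switch_name, dst, port):
        for switch in self.switches:
            if switch.name == switch_name:
                switch.table[dst] = port


controller = Controller()
s1, s2 = Switch("s1"), Switch("s2")
controller.register(s1)
controller.register(s2)

# One controller programs both switches from a single place.
controller.install_route("s1", "10.0.0.2", "s1-eth2")
controller.install_route("s2", "10.0.0.2", "s2-eth1")

print(s1.forward("10.0.0.2"))
print(s2.forward("10.0.0.2"))
print(s1.forward("10.0.0.9"))
```

<p>Changing the routing behaviour means changing what the controller installs. The switch code itself never changes.</p>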

<h2 id="a-bit-of-history">A Bit of History</h2>

<p>Networking equipment used to be configurable using vendor-specific interfaces on individual devices. The evolution of SDN was driven by several factors, and happened in roughly three stages,
as described in Feamster et al.’s paper [1].</p>

<h4 id="active-networks"><strong>Active networks</strong></h4>

<p>Active networks were an initiative where network devices (we will refer to each one as a node) exposed internal resources using a network API. Packets passing through a node could undergo custom
processing (in the data plane). This was born out of frustration with long timeframes to deploy new network services, the difficulty of customizability for specific applications, and researchers’ desire for the ability to experiment at a large scale.</p>

<h4 id="control-and-data-plane-separation"><strong>Control and data plane separation</strong></h4>

<p>As commodity hardware became cheaper and more powerful, and ISPs had to manage larger networks, research projects started to focus on the control plane, rather than data plane (which was the 
case in active networking) programmability. Visibility into the entire network also became a requirement. There was some initial skepticism around <em>not</em> having a single point of failure (e.g. 
the control plane failing and the data plane continuing to work) but similar problems already existed in existing hardware.</p>

<h4 id="the-openflow-api"><strong>The OpenFlow API</strong></h4>

<p>OpenFlow is both a specification of data plane functionality and a protocol between the controllers and the data plane devices in a setup where the two planes are separated.
OpenFlow allowed packet forwarding rules to match on much more than the destination IP address, which was all that traditional devices matched on.</p>
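
<p>Here is a rough sketch of what matching on more than the destination IP can look like. This is plain Python, not the OpenFlow wire format, and the field names (<code class="language-plaintext highlighter-rouge">ip_dst</code>, <code class="language-plaintext highlighter-rouge">tcp_dst</code>, and so on) and actions are illustrative assumptions. Each rule names the header fields it cares about, and the highest-priority matching rule wins.</p>

```python
# Each rule: (priority, match fields, action). A field absent from the match is a wildcard.
flow_table = [
    (200, {"ip_dst": "10.0.0.2", "tcp_dst": 22}, "drop"),
    (100, {"ip_dst": "10.0.0.2"}, "output:2"),
    (50, {"in_port": 1, "eth_type": 0x0806}, "flood"),
]


def matches(match, packet):
    """A rule matches when every field it names equals the packet's value."""
    return all(packet.get(field) == value for field, value in match.items())


def lookup(table, packet):
    """Return the action of the highest-priority matching rule, or a table-miss marker."""
    for _priority, match, action in sorted(table, key=lambda rule: rule[0], reverse=True):
        if matches(match, packet):
            return action
    return "table-miss"


ssh_packet = {"in_port": 1, "eth_type": 0x0800, "ip_dst": "10.0.0.2", "tcp_dst": 22}
web_packet = {"in_port": 1, "eth_type": 0x0800, "ip_dst": "10.0.0.2", "tcp_dst": 80}
arp_packet = {"in_port": 1, "eth_type": 0x0806}

print(lookup(flow_table, ssh_packet))
print(lookup(flow_table, web_packet))
print(lookup(flow_table, arp_packet))
```

<p>The SSH and web packets have the same destination IP but end up with different actions, which a destination-only forwarding table could not express.</p>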

<p>So what identifies a network as software-defined?</p>

<ul>
  <li>
    <p>Separate control and data planes.</p>
  </li>
  <li>
    <p>The control plane is a centralized controller or set of controllers that can view and control the entire network or networks and is implemented as software that can run on commodity hardware.</p>
  </li>
  <li>
    <p>Data plane devices are “dumb” forwarding devices.</p>
  </li>
  <li>
    <p>Well-known (public) interfaces exist between the control plane devices (controllers) and the data plane devices.</p>
  </li>
  <li>
    <p>Other software can program the network using the SDN controllers to suit their needs.</p>
  </li>
</ul>

<h2 id="the-openflow-protocol">The OpenFlow Protocol</h2>

<p>The initial OpenFlow spec was born out of the idea that the already existing flow tables (or access control lists) in network devices could be used to describe newer packet forwarding behaviour [2].
There was also a need to divide (“slice”) the flow tables so that researchers could run experiments on production networks without impacting real-world traffic. </p>

<p>After separating the planes, admins could program the control plane from any operating system remotely, and thus define the flow tables the way they wanted without programming the device itself.</p>

<h2 id="virtualizing-the-network">Virtualizing the Network</h2>

<p>Network virtualization (NV) allows one or more virtual networks to exist on top of a shared physical network. Virtual networks predate the idea of SDN. The concept is related to SDN because
SDN enables NV more easily. </p>

<p>If you use any cloud provider, you already use network virtualization. The VPC in AWS, or in GCP - are just virtual (and your personal) networks built on top of shared physical ones.</p>

<h2 id="setup-a-virtual-network-on-your-own">Setup a Virtual Network on Your Own</h2>

<p><a href="https://mininet.org/" target="_blank">Mininet</a> is software that can create one or more virtual networks on a single machine, letting you play around with various network topologies on your 
laptop. The network devices are emulated in software.</p>

<p>You can run Mininet</p>

<ul>
  <li>
    <p>From the CLI</p>
  </li>
  <li>
    <p>Programmatically</p>
  </li>
  <li>
    <p>Using miniedit - a rudimentary GUI. I don’t recommend this as it’s easy to trip on some bugs.</p>
  </li>
</ul>

<p>We’ll take the programmatic approach.</p>

<p>Installing Mininet is simple using your package manager if you’re on Linux.</p>

<p>On Debian-based distros, run</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt install mininet
</code></pre></div></div>

<p>Although you can create complex topologies (Ring, Tree, and so on), we will create simple host and switch-based ones here. You can refer to the <a href="http://mininet.org/api/annotated.html" target="_blank">Mininet docs</a> for more information.</p>

<h3 id="a-simple-two-host-network-topology"><strong>A Simple Two-Host Network Topology</strong></h3>

<p>Source: <a href="https://github.com/talonx/mininet-demo/blob/main/basic_topology.py">https://github.com/talonx/mininet-demo/blob/main/basic_topology.py</a></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from mininet.topo import Topo
from mininet.net import Mininet
from mininet.cli import CLI

class Basic(Topo):
    """Two hosts attached to a single switch."""

    def __init__(self):
        Topo.__init__(self)

        h1 = self.addHost("h1")
        h2 = self.addHost("h2")

        s1 = self.addSwitch("s1")

        # bw/delay/loss are traffic-control options; as far as I can tell they are
        # only enforced when the network is created with link=TCLink.
        self.addLink(h1, s1, bw=1, delay="10ms", loss=0, max_queue_size=1000, use_htb=True)
        self.addLink(h2, s1, bw=1, delay="10ms", loss=0, max_queue_size=1000, use_htb=True)

topo = Basic()
net = Mininet(topo=topo) # Uses the default reference controller
net.start()
CLI(net)
net.stop()
</code></pre></div></div>

<p>This defines two hosts h1 and h2 and a switch connecting them using the addLink method. This forms the core of any Mininet topology definition.</p>

<p>There are different ways to start the network simulator, but the easiest is to invoke</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>net.start()
</code></pre></div></div>

<p>and then move into interactive mode with</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CLI(net)
</code></pre></div></div>

<p>which will drop you into a command prompt (mininet&gt;) where you can run various commands to interact with your virtual network.</p>

<p>Run the Python script by invoking</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo python3 basic_topology.py
</code></pre></div></div>

<p>The dump command shows you the hosts and devices</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mininet&gt; dump
&lt;Host h1: h1-eth0:10.0.0.1 pid=885505&gt;
&lt;Host h2: h2-eth0:10.0.0.2 pid=885507&gt;
&lt;OVSSwitch s1: lo:127.0.0.1,s1-eth1:None,s1-eth2:None pid=885512&gt;
&lt;OVSController c0: 127.0.0.1:6653 pid=885498&gt;
</code></pre></div></div>

<p>Note that Mininet created a default SDN controller since we did not provide one. This is sufficient for simple topologies.</p>

<p>Now try pinging h2 from h1</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mininet&gt; h1 ping h2 -c 3
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=4.45 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.280 ms
64 bytes from 10.0.0.2: icmp_seq=3 ttl=64 time=0.081 ms

--- 10.0.0.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2023ms
rtt min/avg/max/mdev = 0.081/1.602/4.445/2.011 ms
</code></pre></div></div>

<p>Type quit or exit to exit from the CLI.</p>

<h3 id="running-commands-inside-the-hosts"><strong>Running Commands Inside the “Hosts”</strong></h3>

<p>Source: <a href="https://github.com/talonx/mininet-demo/blob/main/basic_topology_run_cmd.py">https://github.com/talonx/mininet-demo/blob/main/basic_topology_run_cmd.py</a></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from mininet.topo import Topo
from mininet.net import Mininet

class Basic(Topo):
    """Two hosts attached to a single switch."""

    def __init__(self):
        Topo.__init__(self)

        h1 = self.addHost("h1")
        h2 = self.addHost("h2")

        s1 = self.addSwitch("s1")

        self.addLink(h1, s1, bw=1, delay="10ms", loss=0, max_queue_size=1000, use_htb=True)
        self.addLink(h2, s1, bw=1, delay="10ms", loss=0, max_queue_size=1000, use_htb=True)

topo = Basic()
net = Mininet(topo=topo) # Uses the default reference controller
net.start()

# Look up the emulated host by name and run a shell command inside it
h1 = net.get("h1")
res = h1.cmd("route -n")
print(res)

net.stop()
</code></pre></div></div>

<p>This illustrates running a command from inside one of the hosts.</p>

<h3 id="a-multi-switch-topology"><strong>A Multi-Switch Topology</strong></h3>

<p>Source: <a href="https://github.com/talonx/mininet-demo/blob/main/multi_switch.py">https://github.com/talonx/mininet-demo/blob/main/multi_switch.py</a></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from mininet.topo import Topo
from mininet.net import Mininet
from mininet.cli import CLI

class Lan(Topo):
    """Two switches linked together, with one host on each."""

    def __init__(self):
        Topo.__init__(self)

        s1 = self.addSwitch("s1")
        s2 = self.addSwitch("s2")
        h1 = self.addHost("h1")
        h2 = self.addHost("h2")

        # Link host 1 to switch 1
        self.addLink(h1, s1, bw=1, delay="10ms", loss=0, max_queue_size=1000, use_htb=True)
        # Link host 2 to switch 2
        self.addLink(h2, s2, bw=1, delay="10ms", loss=0, max_queue_size=1000, use_htb=True)
        # Link switch 1 to switch 2
        self.addLink(s1, s2, bw=1, delay="10ms", loss=0, max_queue_size=1000, use_htb=True)

topo = Lan()
net = Mininet(topo=topo) # Uses the default reference controller
net.start()
CLI(net)
net.stop()
</code></pre></div></div>

<p>This illustrates two switches, with one host connected to each. Check the links created once you’re inside the CLI</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mininet&gt; links
h1-eth0&lt;-&gt;s1-eth1 (OK OK)
h2-eth0&lt;-&gt;s2-eth1 (OK OK)
s1-eth2&lt;-&gt;s2-eth2 (OK OK)
</code></pre></div></div>

<p>And then ping h2 from h1 (they are not connected to the same switch)</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mininet&gt; h1 ping h2 -c 3
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.077 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.077 ms
64 bytes from 10.0.0.2: icmp_seq=3 ttl=64 time=0.080 ms

--- 10.0.0.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2025ms
rtt min/avg/max/mdev = 0.077/0.078/0.080/0.001 ms
</code></pre></div></div>

<p>Mininet uses the reference implementation of an SDN controller if you don’t specify anything.</p>

<p>There are other software controllers available</p>

<ol>
  <li>
    <p>Pox - <a href="https://github.com/noxrepo/pox">https://github.com/noxrepo/pox</a></p>
  </li>
  <li>
    <p>Ryu - <a href="https://ryu-sdn.org/">https://ryu-sdn.org/</a></p>
  </li>
  <li>
    <p>ONOS - <a href="https://opennetworking.org/onos/">https://opennetworking.org/onos/</a></p>
  </li>
  <li>
    <p>OpenVSwitch - <a href="https://www.openvswitch.org/">https://www.openvswitch.org/</a></p>
  </li>
  <li>
    <p>OpenDaylight - <a href="https://www.opendaylight.org/">https://www.opendaylight.org/</a></p>
  </li>
</ol>

<h4 id="references">References</h4>

<ol>
  <li>
    <p>The Road to SDN - An intellectual history of programmable networks - <a href="https://queue.acm.org/detail.cfm?id=2560327">https://queue.acm.org/detail.cfm?id=2560327</a></p>
  </li>
  <li>
    <p>Cloud Native Data Center Networking: Architecture, Protocols, and Tools - Dinesh G. Dutt - <a href="https://www.oreilly.com/library/view/cloud-native-data/9781492045595/">https://www.oreilly.com/library/view/cloud-native-data/9781492045595/</a></p>
  </li>
  <li>
    <p>Foundations of Modern Networking - William Stallings - <a href="https://www.oreilly.com/library/view/foundations-of-modern/9780134175478/">https://www.oreilly.com/library/view/foundations-of-modern/9780134175478/</a></p>
  </li>
</ol>

<p>Thank you for reading, and do reach out via comments or on <a href="https://twitter.com/talonx" target="_blank">Twitter</a> if you want to chat or share your thoughts.</p>]]></content><author><name>Hrishikesh Barua</name></author><category term="softwaredefinednetworking" /><category term="computernetworking" /><summary type="html"><![CDATA[Programmable Networks]]></summary></entry><entry><title type="html">How the Domain Name System Uses Anycast for Low Latency</title><link href="https://code.deepinspace.net/posts/2024/02/29/How-the-Domain-Name-System-Uses-Anycast-for-Low-Latency/" rel="alternate" type="text/html" title="How the Domain Name System Uses Anycast for Low Latency" /><published>2024-02-29T00:00:00+00:00</published><updated>2024-02-29T00:00:00+00:00</updated><id>https://code.deepinspace.net/posts/2024/02/29/How-the-Domain-Name-System-Uses-Anycast-for-Low-Latency</id><content type="html" xml:base="https://code.deepinspace.net/posts/2024/02/29/How-the-Domain-Name-System-Uses-Anycast-for-Low-Latency/"><![CDATA[<p>In this article, I will explore what Anycast is in internetworking and how it is used to reduce latency.</p>

<p>Anycast is a concept that involves a group of servers that share the same IP address, and the server that is closest to
the client gets to serve the request. This definition raises some questions — won’t there be IP conflicts? How is “closest”
determined? How does the request reach the “closest” server?</p>

<p>We will take the example of DNS throughout this article. All major DNS providers use Anycast.</p>

<p>First, let’s look at how DNS resolution works.</p>

<h2 id="the-mechanics-of-dns-resolution">The Mechanics of DNS Resolution</h2>

<p>A request for the domain <em>google.com</em> from your browser goes to a DNS resolver, which resolves it to an IP address. 
This resolution happens by querying nameservers recursively. Why recursively? Each query in the recursive process resolves 
one part of the domain and the process starts from the tail-end.</p>

<p>The resolver is usually on your laptop (e.g. unbound or systemd-resolved on Linux). It contacts one of the 
<a href="https://www.iana.org/domains/root/servers" target="_blank">13 root nameservers</a> first. From there, it fetches the IP
of the nameserver that knows about the TLD (top-level domain), in this case, <em>.com</em>. Next, it contacts the <em>.com</em> nameserver 
to ask who knows about <em>google.com</em>. The response is the IP of another nameserver — called an authoritative nameserver. 
The authoritative nameserver responds with the IP address of <em>google.com</em>. If we had queried for a subdomain of <em>google.com</em> 
(www.google.com or images.google.com) the query would have continued similarly.</p>
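
<p>The walk from the root down to the authoritative nameserver can be sketched with a toy, in-memory model. This is only an illustration under my own assumptions: the server names (<code>root-ns</code>, <code>com-tld-ns</code>, <code>google-auth-ns</code>), the tables and the documentation-range IP addresses are all made up, and no real DNS traffic is sent.</p>

```python
# Toy delegation data: each "nameserver" either refers us to the next server
# for a zone suffix, or holds the final answer. All names and addresses are invented.
ROOT = "root-ns"

DELEGATIONS = {
    "root-ns": {"com": "com-tld-ns"},
    "com-tld-ns": {"google.com": "google-auth-ns"},
}

ANSWERS = {
    "google-auth-ns": {"google.com": "192.0.2.10", "www.google.com": "192.0.2.11"},
}


def resolve(name):
    """Walk the delegation chain from the root, returning (ip, servers_asked)."""
    labels = name.split(".")
    server = ROOT
    asked = [server]
    while True:
        # An authoritative server answers directly if it holds the record.
        if name in ANSWERS.get(server, {}):
            return ANSWERS[server][name], asked
        # Otherwise look for the longest suffix this server can refer us onward for.
        referral = None
        for i in range(len(labels)):
            suffix = ".".join(labels[i:])
            if suffix in DELEGATIONS.get(server, {}):
                referral = DELEGATIONS[server][suffix]
                break
        if referral is None:
            raise LookupError(f"{server} has no referral for {name}")
        server = referral
        asked.append(server)
```

<p>Calling <code>resolve("www.google.com")</code> visits the root, then the <em>.com</em> server, then the authoritative server, which mirrors the order described above.</p>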

<p>What’s important to note here is the server that responds at each step.</p>

<p>What if one or all of the 13 nameservers, or the other servers in the chain, were down or unreachable because of hardware failure,
a damaged undersea cable, or a Distributed Denial-of-Service (DDoS) attack?</p>

<p>In reality, the 13 servers are 13 IP addresses, each backed by multiple actual servers. So are the TLD and 
authoritative nameservers. So which server actually responds to our DNS resolver query? That is where Anycast comes in.</p>

<p>We have to dive a bit into how routing works on the internet to understand this.</p>

<h2 id="routing-on-the-internet">Routing on the Internet</h2>

<p>The client (the resolver) gets the IP address of a root nameserver and it sends out a query — let’s call it Q. 
How is Q routed?</p>

<p>The internet is made up of Autonomous Systems (ASs): networks operated by different entities,
many of them ISPs. Each AS knows how to route packets within its own network, and it advertises the network prefixes to
which it can route. These include its own prefixes as well as the prefixes of other ASs it connects to and to which it
can forward packets. Routers in one AS send these advertisements to routers in other ASs using the Border Gateway Protocol (BGP).
This is how routers know how to send a packet originating in one AS to its destination. BGP handles most of the routing between ASs on the internet.</p>
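
<p>To make the advertisement step concrete, here is a heavily simplified sketch of how a prefix announcement might spread between ASs. It is not how BGP is implemented: the topology, the AS numbers (taken from the documentation range) and the <code>propagate</code> function are my own assumptions, and the only rules modelled are "skip paths that already contain me" and "prefer the shorter AS path".</p>

```python
from collections import deque

# Hypothetical AS-level topology: AS number -> directly connected neighbour ASs.
NEIGHBORS = {
    64500: [64501],
    64501: [64500, 64502, 64503],
    64502: [64501, 64503],
    64503: [64501, 64502],
}


def propagate(origin_as, prefix):
    """Spread an advertisement for `prefix`; return {asn: (prefix, as_path)}."""
    best = {origin_as: [origin_as]}      # AS -> path from that AS to the origin
    queue = deque([origin_as])
    while queue:
        sender = queue.popleft()
        for receiver in NEIGHBORS[sender]:
            if receiver in best[sender]:
                continue                  # loop prevention: receiver is already on the path
            candidate = [receiver] + best[sender]
            # Accept the advertisement if it is the first one or a shorter path.
            if receiver not in best or len(best[receiver]) > len(candidate):
                best[receiver] = candidate
                queue.append(receiver)
    return {asn: (prefix, path) for asn, path in best.items()}
```

<p>For example, <code>propagate(64500, "198.51.100.0/24")</code> leaves AS 64503 with the path 64503, 64501, 64500, because the alternative through AS 64502 is one AS longer.</p>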

<p>Our Q will also rely on routes learnt through BGP. If we had just one physical server for a root nameserver IP (let’s say 198.41.0.4),
Q would get routed through various ASs until it reached the root nameserver. BGP would use its path selection, which favours the shortest AS path, to send the packet to 198.41.0.4.</p>

<p>But what if</p>
<ul>
  <li>The client and 198.41.0.4 are far apart (as measured by BGP).</li>
  <li>There is a damaged undersea cable, further lengthening the path Q has to take.</li>
  <li>198.41.0.4’s data center has a power failure. The client will never receive a response to Q. Even if 198.41.0.4 were replicated
behind a load balancer across multiple data centers, a single network disruption could still make it unreachable.</li>
</ul>

<p>Anycast is used to mitigate such issues.</p>

<h2 id="anycast-in-a-nutshell">Anycast in a Nutshell</h2>

<p>Multiple servers in different locations (ASs) announce the same address (198.41.0.4 in this example) to their routing device.
This is possible because internet routes for a particular prefix can come from 
<a href="https://bgp.potaroo.net/as6447/bgp-multi-org-prefix.txt" target="_blank">multiple ASs</a>. BGP uses this information to 
create routes. When Q reaches its first routing point in Q’s AS, BGP calculates the shortest path to 198.41.0.4 from Q’s AS.
This router might have multiple paths to 198.41.0.4, which in reality points to different servers - but the router thinks 
they are the same endpoint through different routes. Based on the client’s location, and thus the path, the actual server to
which 198.41.0.4 is mapped might be different. The packet gets routed to the closest server which answers to 198.41.0.4.
A client in a different location might get routed to another server which also answers to 198.41.0.4.</p>
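
<p>The effect described above, where clients in different places land on different servers behind one address, can be sketched as a breadth-first search that starts from every announcing site at once. The AS graph, the site names and the <code>catchments</code> helper are hypothetical, and "closest" here means fewest AS hops only.</p>

```python
from collections import deque

# Hypothetical chain of ASs (undirected links between AS numbers).
LINKS = {
    64496: [64497],
    64497: [64496, 64498],
    64498: [64497, 64499],
    64499: [64498, 64500],
    64500: [64499],
}

# Two sites announce the same anycast prefix from different ASs.
ANYCAST_SITES = {64496: "site-west", 64500: "site-east"}


def catchments(links, sites):
    """Return {asn: (site, as_hops)}: which site each AS reaches, and how far away it is."""
    result = {asn: (site, 0) for asn, site in sites.items()}
    queue = deque(sites)
    while queue:
        current = queue.popleft()
        site, hops = result[current]
        for neighbor in links[current]:
            if neighbor not in result:   # first arrival means fewest AS hops
                result[neighbor] = (site, hops + 1)
                queue.append(neighbor)
    return result
```

<p>With this topology, a client in AS 64497 is served by <code>site-west</code> and a client in AS 64499 by <code>site-east</code>, even though both send packets to the same address. AS 64498 sits at equal distance from both sites, so the sketch breaks the tie arbitrarily; real routers apply further tie-breaking rules.</p>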

<p>The servers are geographically distributed, which helps to prevent or minimize disruptions from outages.</p>

<p>If you think about this for a moment, some things stand out:</p>
<ul>
  <li>All servers that answer to 198.41.0.4 must have the same information.</li>
  <li>The admin of 198.41.0.4 should be able to announce the same IP address into the routing system at multiple points.</li>
</ul>

<p>Anycast is not a different protocol, does not need any different hardware, and does not require any special capabilities.</p>

<h2 id="anycast-for-stateful-applications">Anycast for Stateful Applications</h2>

<p>The original RFC <a href="https://datatracker.ietf.org/doc/html/rfc1546" target="_blank">describing</a> Anycast raised some interesting points.</p>

<p>What stops a packet from reaching multiple servers since the Internet Protocol (IP) is allowed to misroute and duplicate packets?</p>

<p>Imagine that Q reaches its target but the ACK is lost. The packet will get redelivered, but will it reach the same server as
before or a different server? What if BGP’s shortest path algorithm determines a new path in the time between the redelivery 
attempts because of a change? Does it even matter?</p>

<p>At the network level, these issues matter for stateful protocols like TCP. TCP connections will get reset
if the destination server changes mid-connection due to routing changes.</p>

<p>DNS queries mostly use UDP, so these issues don’t arise. However, if other applications using UDP attempt to maintain stateful
connections, they might run into such issues when using Anycast. The RFC goes on to say:</p>
<blockquote>
  <p><em>“The obvious solutions to these issues are to require applications which wish to maintain state to learn the unicast address of their peer on the first exchange of UDP datagrams or during the first TCP connection and use the unicast address in future conversations.”</em></p>
</blockquote>
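
<p>The approach the RFC suggests can be sketched as a toy simulation. Everything here is an assumption made for illustration: the addresses come from documentation ranges, and the <code>Network</code> and <code>Client</code> classes stand in for real routing and a real application.</p>

```python
# All addresses below are documentation-range placeholders, not real services.
ANYCAST_ADDR = "192.0.2.1"

# Each site answers on the shared anycast address and also has its own unicast address.
SITES = {
    "site-west": "198.51.100.10",
    "site-east": "203.0.113.10",
}


class Network:
    """Toy network: anycast traffic goes to whichever site routing currently prefers."""

    def __init__(self, preferred_site):
        self.preferred_site = preferred_site

    def deliver(self, dest_addr):
        if dest_addr == ANYCAST_ADDR:
            return self.preferred_site
        for site, unicast in SITES.items():
            if dest_addr == unicast:
                return site
        raise LookupError(f"no route to {dest_addr}")


class Client:
    def __init__(self, network):
        self.network = network
        self.peer_addr = ANYCAST_ADDR    # first contact uses the anycast address

    def send(self):
        site = self.network.deliver(self.peer_addr)
        # The reply carries the site's unicast address; remember it for later packets.
        self.peer_addr = SITES[site]
        return site
```

<p>Once a client has learnt the unicast address, a later routing change (setting <code>preferred_site</code> to another site) no longer moves its conversation, while a new client starting from the anycast address follows the new route.</p>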

<p>What about the application level? Can we use Anycast?</p>

<p>We can, in theory, but it would restrict application capabilities and raise new challenges:</p>
<ul>
  <li>Transactions will have to be short-lived enough to get routed to the same server.</li>
  <li>Servers will have to synchronize data between themselves.</li>
</ul>

<p>But wait, many Content Delivery Networks (CDNs) also use Anycast for data transfer over HTTP. HTTP is a stateless protocol 
but uses TCP which is stateful. So how does it work?</p>

<p>CDNs mostly serve short-lived, static content that can be served with a single request, so this issue is of
low importance in such cases. There are also some studies indicating that the actual “switching” of servers due to routing
changes is <a href="https://archive.nanog.org/meetings/nanog37/presentations/matt.levine.pdf" target="_blank">very rare</a> (PDF link).
For longer-lived connections, some services use a strategy where the initial address 
reached using Anycast is used to redirect the client to a nearby server, which is possibly co-located in the same data center. 
All subsequent communication happens using that.</p>

<p>Anycast is not a load-balancing mechanism. Also, BGP’s path selection algorithm compares the AS_PATH attribute of competing routes and
prefers the path that traverses the fewest ASs. It does not take into account network delays or capacity.</p>
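
<p>A small sketch shows why fewest ASs and lowest latency are not the same thing. The routes, AS numbers and latencies below are invented, and the two helper functions are only a stand-in for the single AS path length comparison, not for the full BGP decision process.</p>

```python
# Hypothetical candidate routes to the same anycast prefix, as seen by one router.
# Latencies are invented to illustrate the point.
ROUTES = [
    {"site": "site-far", "as_path": [64501, 64510], "latency_ms": 180},
    {"site": "site-near", "as_path": [64502, 64503, 64511], "latency_ms": 25},
]


def pick_by_as_path(routes):
    """Mimic the AS path length comparison: fewest ASs wins, latency is ignored."""
    return min(routes, key=lambda r: len(r["as_path"]))


def pick_by_latency(routes):
    """What a latency-aware choice would have picked instead."""
    return min(routes, key=lambda r: r["latency_ms"])
```

<p>Here the route with two ASs wins the AS path comparison even though the three-AS route is much faster, so the “closest” Anycast server is not guaranteed to be the one with the lowest latency.</p>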

<h2 id="anycast-in-ddos-mitigation">Anycast in DDoS Mitigation</h2>

<p>Anycast is also used to mitigate DDoS attacks. Eliminating a single point of failure improves the resiliency of the service.
In the case of DNS, the root nameservers are replicated. During an attack, the bulk of the DDoS traffic is absorbed by the servers
closest to its sources, so it stays localized to specific regions instead of taking down the entire service.</p>

<p>Anycast remains a key mechanism for global internet services like DNS and CDNs to reduce latency, and most big providers have put it into operation.</p>

<h3 id="references">References</h3>
<ul>
  <li>List of Anycast-related RFCs
    <ul>
      <li>Host Anycasting Service <a href="https://www.rfc-editor.org/info/rfc1546" target="_blank">https://www.rfc-editor.org/info/rfc1546</a></li>
      <li>Distributing Authoritative Name Servers via Shared Unicast Addresses <a href="https://www.rfc-editor.org/info/rfc3258" target="_blank">https://www.rfc-editor.org/info/rfc3258</a></li>
      <li>Operation of Anycast Services <a href="https://www.rfc-editor.org/info/rfc4786" target="_blank">https://www.rfc-editor.org/info/rfc4786</a></li>
      <li>Architectural Considerations of IP Anycast <a href="https://www.rfc-editor.org/info/rfc7094" target="_blank">https://www.rfc-editor.org/info/rfc7094</a></li>
    </ul>
  </li>
  <li>Submarine Cable Map <a href="https://www.submarinecablemap.com/" target="_blank">https://www.submarinecablemap.com/</a></li>
</ul>

<p>Thank you for reading, and do reach out via comments or on <a href="https://twitter.com/talonx" target="_blank">Twitter</a> if you want to chat or share your thoughts.</p>]]></content><author><name>Hrishikesh Barua</name></author><category term="anycast" /><category term="dns" /><category term="computernetworking" /><summary type="html"><![CDATA[In this article, I will explore what Anycast is in internetworking and how it is used to reduce latency.]]></summary></entry><entry><title type="html">Some New Tech I Learnt in 2023</title><link href="https://code.deepinspace.net/posts/2024/02/05/Some-New-Tech-I-Learned-In-2023/" rel="alternate" type="text/html" title="Some New Tech I Learnt in 2023" /><published>2024-02-05T00:00:00+00:00</published><updated>2024-02-05T00:00:00+00:00</updated><id>https://code.deepinspace.net/posts/2024/02/05/Some-New-Tech-I-Learned-In-2023</id><content type="html" xml:base="https://code.deepinspace.net/posts/2024/02/05/Some-New-Tech-I-Learned-In-2023/"><![CDATA[<p>I took a career break in the middle of last year to pursue some of my other interests - both technical and non-technical.</p>

<p>In the process I ended up tinkering with quite a few new things in tech, and I’ve made some of them part of my ongoing and future work.</p>

<ul>
  <li>I explored the fast.ai API using Jeremy Howard’s fantastic set of videos. I completed most of Part 1 of his <a href="https://course.fast.ai/" target="_blank">Practical Deep Learning course</a>. I learnt some of
the fast.ai API, a few deep learning concepts, and how to use Jupyter notebooks on Kaggle.</li>
  <li>I wrote a <a href="/posts/2023/10/07/Writing-a-Twitter-bot-with-Google-Cloud-Functions" target="_blank">Twitter bot</a> that tweets classical music trivia. This was just for fun. <br />
It’s built with Python, Google Cloud Functions, and Firebase. It does not cost me a single penny as it runs entirely on GCP’s free tier. I learnt about Twitter’s bot types and auth API, and a bit of Firebase.</li>
  <li>While working on a frontend for a personal project, I tried out <a href="https://hotwired.dev/" target="_blank">Hotwire</a>, which a friend had mentioned. I did a few tutorials which used
Ruby on Rails (RoR). My RoR knowledge is essentially zero, but Hotwire is not tied to RoR - it can be used with any backend.</li>
  <li>For the same project I had to use <a href="https://jte.gg/" target="_blank">Java Template Engine</a>. It’s a good thing that it comes with Spring Boot integration.</li>
  <li>Spring Boot persistence with JPA/Hibernate - This was for a personal project that I’m still working on. JPA is a beast, and I’m thinking of switching to something simpler, if it exists.</li>
  <li>Rust. This is a whole new way of thinking. It is also a lot of struggling with the compiler for the first few months.</li>
</ul>]]></content><author><name>Hrishikesh Barua</name></author><category term="learning" /><summary type="html"><![CDATA[I took a career break in the middle of last year to pursue some of my other interests - both technical and non-technical.]]></summary></entry></feed>