<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>S Anand</title>
    <link>https://www.s-anand.net/blog/</link>
    <description>Recent content on S Anand</description>
    <generator>Hugo -- 0.156.0</generator>
    <language>en-us</language>
    <lastBuildDate>Fri, 05 Jun 2026 09:41:33 +0800</lastBuildDate>
    <atom:link href="https://www.s-anand.net/blog/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>My changing AI opinions</title>
      <link>https://www.s-anand.net/blog/my-changing-ai-opinions/</link>
      <pubDate>Fri, 05 Jun 2026 09:41:33 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/my-changing-ai-opinions/</guid>
      <description>&lt;p&gt;I asked Claude about my AI opinions.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Based on my transcripts and blog posts, find the three claims I make most consistently, the three I&amp;rsquo;ve quietly reversed, and the one assumption I&amp;rsquo;ve never questioned but everything depends on.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;!-- https://claude.ai/chat/1d41b5a3-7963-4e87-aade-c44a37d5f355 --&gt;
&lt;p&gt;Here are things I&amp;rsquo;ve changed my opinion on:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;THEN&lt;/strong&gt;: One frontier model will win - not specialization. &lt;strong&gt;NOW&lt;/strong&gt;: Gemini for media, Claude for strategy/style, GPT for rigor. SLMs as tools.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;THEN&lt;/strong&gt;: Carefully curate my course content. &lt;strong&gt;NOW&lt;/strong&gt;: Give students prompts directly.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;THEN&lt;/strong&gt;: Web apps are differentiated artifacts. &lt;strong&gt;NOW&lt;/strong&gt;: HTML is easier to generate than PPT - a signal of slop, not craft.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;THEN&lt;/strong&gt;: Human in the loop. &lt;strong&gt;NOW&lt;/strong&gt;: Human NOT in the loop, bottlenecking it. On-the-loop, etc. is fine.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;THEN&lt;/strong&gt;: Minimal single-agent loop, avoid sub-agents. &lt;strong&gt;NOW&lt;/strong&gt;: Multi-agent, sub-agent, and agent teams.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;THEN&lt;/strong&gt;: Avoid MCP, prefer SKILLS.md. &lt;strong&gt;NOW&lt;/strong&gt;: Use MCP because integrating with Claude / ChatGPT / &amp;hellip; is easy.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;These are the top contradictions in my opinions.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&amp;ldquo;Vibe code everything, end-to-end&amp;rdquo; &lt;em&gt;vs&lt;/em&gt; &amp;ldquo;Don&amp;rsquo;t commit based on vibe-coding, it&amp;rsquo;s not fully reliable&amp;rdquo;.&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Experience is a liability&amp;rdquo; &lt;em&gt;vs&lt;/em&gt; &amp;ldquo;Domain expertise is an edge&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Paid plans are fine for private data&amp;rdquo; &lt;em&gt;vs&lt;/em&gt; &amp;ldquo;Don&amp;rsquo;t assume paid = private&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Get out of the way / delegate everything&amp;rdquo; &lt;em&gt;vs&lt;/em&gt; &amp;ldquo;Verification is the whole job&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Build, don&amp;rsquo;t plan&amp;rdquo; &lt;em&gt;vs&lt;/em&gt; &amp;ldquo;Plan → Correct → Execute&amp;rdquo;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Some things, I have never questioned.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;For numbers, math, or correctness, make AI write and run code &amp;ndash; never trust prose arithmetic.&lt;/li&gt;
&lt;li&gt;Judge AI against human accuracy, never against perfection &amp;ndash; because experts disagree among themselves.&lt;/li&gt;
&lt;li&gt;Use AI heavily &amp;ndash; reach for it first, high volume.&lt;/li&gt;
&lt;li&gt;Don&amp;rsquo;t build a foundation model from scratch &amp;ndash; steer existing general models instead.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Things I usually say, &lt;strong&gt;but&lt;/strong&gt; there are exceptions.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&amp;ldquo;Always verify&amp;rdquo;. &lt;strong&gt;But&lt;/strong&gt; blindly trust AI in non-core areas where you are unskilled (such as personal finance).&lt;/li&gt;
&lt;li&gt;&amp;ldquo;AI is improving fast&amp;rdquo;. &lt;strong&gt;But&lt;/strong&gt; there is a jagged edge, so verify before upgrading.&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Don&amp;rsquo;t build models&amp;rdquo;. &lt;strong&gt;But&lt;/strong&gt; go ahead if it&amp;rsquo;s easy and beats LLMs clearly.&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Models keep getting cheaper&amp;rdquo;. &lt;strong&gt;But&lt;/strong&gt; not in May 2026.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-03-05-my-changing-ai-opinions.avif&#34;&gt;&lt;/p&gt;
</description>
    </item>
    <item>
      <title>My most memorable anniversary</title>
      <link>https://www.s-anand.net/blog/my-most-memorable-anniversary/</link>
      <pubDate>Wed, 03 Jun 2026 09:27:14 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/my-most-memorable-anniversary/</guid>
      <description>&lt;p&gt;At 9:30 pm, I checked my calendar for tomorrow&amp;rsquo;s appointments, alt-tabbed frantically into ChatGPT, and started typing:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Tomorrow is my 24th anniversary. It&amp;rsquo;s a bit late for me to buy anything (except maybe an online service) or prepare something.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;!-- https://chatgpt.com/c/6a1d867e-47bc-83ec-b25f-54d858826d24 --&gt;
&lt;p&gt;This has become a habit &amp;ndash; leaving things to the last minute &lt;em&gt;and&lt;/em&gt; asking ChatGPT to save my day.&lt;/p&gt;
&lt;p&gt;I &lt;em&gt;did&lt;/em&gt; give it good context, though.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You remember the OCBC expenses treemap you created by analyzing my transactions? That will give you a good guessable idea of the kinds of things she spends on and hopefully, therefore, what she likes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I was analyzing my bank transactions over the weekend, so that came in handy.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;What can I do (either create for her, like I made a &lt;a href=&#34;https://www.s-anand.net/blog/birthday-sandwich-cake/&#34;&gt;peanut jelly nutella sandwich birthday for Dhyeya&lt;/a&gt; or buy - if it&amp;rsquo;s an online service / gift voucher kind of thing) in the next hour or so?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;ChatGPT began with gyaan - which I promptly ignored:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Your best move is not &amp;ldquo;buy a gift&amp;rdquo;. It is &amp;ldquo;prove that I have noticed your life, then attach a redeemable escape hatch&amp;hellip;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;hellip; and moved on to analyzing her personality and recommending from different perspectives:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Practical person&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;No-decision day&lt;/strong&gt;: &amp;ldquo;Tomorrow, you decide nothing. I handle transport, food, booking, payment, and exit plan.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Service relief voucher&lt;/strong&gt;: 24 hours where she forwards any household/admin nuisance to you.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Personal-care float&lt;/strong&gt;: ClassPass/spa/salon voucher plus the promise: &amp;ldquo;I will book the slot and taxi, not just give you a coupon.&amp;rdquo;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;A romance playwright&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;24 tiny vows&lt;/strong&gt;: not grand vows; operational vows: &amp;ldquo;I will stop asking where the charger is.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The ordinary-places pilgrimage&lt;/strong&gt;: one route across 3-5 mundane but meaningful places.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Private audio guide&lt;/strong&gt;: 8 short voice notes she unlocks at home / taxi / restaurant / temple.&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;hellip; and so on.&lt;/p&gt;
&lt;p&gt;For some of these, I&amp;rsquo;m absolutely the worst person to execute. ClassPass? Spa? Salon? I don&amp;rsquo;t even know the difference between them.&lt;/p&gt;
&lt;p&gt;But 1.2 and 2.4 were interesting. So, after a few iterations with ChatGPT, I woke up the next morning with:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Today&amp;rsquo;s our 24th anniversary. For the next 24 hours, I&amp;rsquo;m going to knock off 24 things from your TODO list.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I rescheduled my meetings. I spent the entire day doing nothing but what was on her list. Water purifier, doctor appointments, toothbrush shopping, investments, drying clothes, duct-taping, &amp;hellip; and we dined out.&lt;/p&gt;
&lt;p&gt;At 10:00 pm: &amp;ldquo;I think it was my most memorable anniversary. Thank you.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-06-03-my-most-memorable-anniversary.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;Thank you, ChatGPT. Thank you, habit of setting low expectations.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>It&#39;s who you know</title>
      <link>https://www.s-anand.net/blog/it-s-who-you-know/</link>
      <pubDate>Tue, 02 Jun 2026 09:47:12 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/it-s-who-you-know/</guid>
      <description>&lt;p&gt;&lt;a href=&#34;https://www.linkedin.com/in/dharmendrasingh17/&#34;&gt;Dharmendra Singh&lt;/a&gt; shared how they built an app with AI.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-06-02-paymentpulse.webp&#34;&gt;&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s normal. I&amp;rsquo;m just thrilled they used client transcripts as the source.&lt;/p&gt;
&lt;p&gt;Basically, they converted the &amp;ldquo;voice of the client&amp;rdquo; to working software. To quote them: &amp;ldquo;A strong spoken business narrative can be converted into a usable product brief quickly when the capture step is disciplined.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;You know what this means? &lt;strong&gt;Interviewing is a skill to hire for&lt;/strong&gt;. Better questions = better answers = better apps.&lt;/p&gt;
&lt;p&gt;&amp;hellip; until AI starts interviewing better than us (which it might be already). At that point, picking &lt;em&gt;whom to interview&lt;/em&gt; becomes important.&lt;/p&gt;
&lt;p&gt;You know what that means? &lt;strong&gt;People management is a skill to hire for&lt;/strong&gt;. Better stakeholders = better interviews = better apps.&lt;/p&gt;
&lt;p&gt;&amp;hellip; until AI understands people by mining signals better than us (which it might be already). At which point, &lt;em&gt;stuff you can&amp;rsquo;t capture or express&lt;/em&gt; (body language, trust, um&amp;hellip; &lt;a href=&#34;https://en.wikipedia.org/wiki/Nepotism?&#34;&gt;nepotism&lt;/a&gt;) becomes more important.&lt;/p&gt;
&lt;p&gt;You know what that means? &lt;strong&gt;It&amp;rsquo;s who you know&lt;/strong&gt;, not what you know.&lt;/p&gt;
&lt;p&gt;But wait&amp;hellip; isn&amp;rsquo;t there supposed to be something wrong with that?&lt;/p&gt;
&lt;p&gt;Sigh&amp;hellip; time to review &lt;a href=&#34;https://straive.com/&#34;&gt;Straive&lt;/a&gt; hiring policies.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>AI Coding Agent Subscription ROI</title>
      <link>https://www.s-anand.net/blog/ai-coding-agent-subscription-roi/</link>
      <pubDate>Sat, 30 May 2026 23:19:34 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/ai-coding-agent-subscription-roi/</guid>
      <description>&lt;p&gt;I ran &lt;a href=&#34;https://github.com/ryoppippi/ccusage&#34;&gt;&lt;code&gt;npx -y ccusage monthly --compact&lt;/code&gt;&lt;/a&gt; to get the following break-up of my AI coding agent costs.&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Month&lt;/th&gt;
          &lt;th style=&#34;text-align: right&#34;&gt;Codex&lt;/th&gt;
          &lt;th style=&#34;text-align: right&#34;&gt;Claude&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;2025-09&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$37.47&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$2.29&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2025-10&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$106.79&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$9.13&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2025-11&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$100.35&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$14.24&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2025-12&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$240.69&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$24.88&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2026-01&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$100.89&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$20.28&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2026-02&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$323.21&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$29.46&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2026-03&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$1996.32&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$134.87&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2026-04&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$401.36&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$47.07&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2026-05&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$378.20&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$45.13&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;This shows the ROI of my $20 subscriptions to each. I get ~$35 worth of API calls for my $20 Claude Pro subscription and ~$400 of API calls for my $20 ChatGPT Plus subscription (on top of my ChatGPT chats.)&lt;/p&gt;
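&lt;p&gt;Here is a rough sketch of that arithmetic, assuming a plain mean over the nine months in the table (an assumption; a median or a recent-months average would give different figures). The names &lt;code&gt;usage&lt;/code&gt;, &lt;code&gt;mean&lt;/code&gt; and &lt;code&gt;SUBSCRIPTION&lt;/code&gt; are illustrative.&lt;/p&gt;

```javascript
// Monthly API-equivalent cost in USD, copied from the table above.
const usage = [
  { month: "2025-09", codex: 37.47, claude: 2.29 },
  { month: "2025-10", codex: 106.79, claude: 9.13 },
  { month: "2025-11", codex: 100.35, claude: 14.24 },
  { month: "2025-12", codex: 240.69, claude: 24.88 },
  { month: "2026-01", codex: 100.89, claude: 20.28 },
  { month: "2026-02", codex: 323.21, claude: 29.46 },
  { month: "2026-03", codex: 1996.32, claude: 134.87 },
  { month: "2026-04", codex: 401.36, claude: 47.07 },
  { month: "2026-05", codex: 378.20, claude: 45.13 },
];

// Assumption: each subscription is a flat $20 per month.
const SUBSCRIPTION = 20;

// Plain arithmetic mean of a list of numbers.
const mean = (values) => values.reduce((sum, v) => sum + v, 0) / values.length;

const claudeMean = mean(usage.map((row) => row.claude));
const codexMean = mean(usage.map((row) => row.codex));

// Report the monthly average and how many times over it covers the subscription.
console.log(`Claude: $${claudeMean.toFixed(2)}/month, ${(claudeMean / SUBSCRIPTION).toFixed(1)}x the subscription`);
console.log(`Codex: $${codexMean.toFixed(2)}/month, ${(codexMean / SUBSCRIPTION).toFixed(1)}x the subscription`);
```

&lt;p&gt;With the table&amp;rsquo;s numbers, this comes to roughly $36 a month for Claude and $409 a month for Codex. The March 2026 spike pulls both means up, so treat them as ballpark figures.&lt;/p&gt;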
&lt;p&gt;I end up using Codex a lot more - partly because it&amp;rsquo;s a bit more diligent, but mostly because it&amp;rsquo;s a lot cheaper.&lt;/p&gt;
&lt;p&gt;Clearly, subscriptions are a good deal for individuals. Codex, especially.&lt;/p&gt;
&lt;p&gt;This may not be true for corporates. &lt;a href=&#34;https://simonwillison.net/2026/May/27/product-market-fit/&#34;&gt;Simon Willison&lt;/a&gt; says that Anthropic and OpenAI both changed &lt;em&gt;enterprise&lt;/em&gt; pricing to align with token prices. That means the cost of enterprise AI security is ~2-20 &lt;em&gt;times&lt;/em&gt; their token budget - which is growing rapidly.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;BTW, my moment of &lt;a href=&#34;https://en.wikipedia.org/wiki/Chatbot_psychosis&#34;&gt;AI psychosis&lt;/a&gt; was in March 2026. The coding agents had increased their limits and I was tokenmaxxing. I&amp;rsquo;m far from that limit today, but the symptoms linger.&lt;/p&gt;
&lt;noscript&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-30-ai-coding-agent-subscription-roi.avif&#34;&gt;&lt;/p&gt;
&lt;/noscript&gt;
&lt;p&gt;&lt;canvas id=&#34;ai-coding-agent-usage&#34;&gt;&lt;/canvas&gt;&lt;/p&gt;
&lt;script&gt;
  (async function () {
    const rows = [
      { month: &#39;2025-09&#39;, claude: 2.29, codex: 37.47 },
      { month: &#39;2025-10&#39;, claude: 9.13, codex: 106.79 },
      { month: &#39;2025-11&#39;, claude: 14.24, codex: 100.35 },
      { month: &#39;2025-12&#39;, claude: 24.88, codex: 240.69 },
      { month: &#39;2026-01&#39;, claude: 20.28, codex: 100.89 },
      { month: &#39;2026-02&#39;, claude: 29.46, codex: 323.21 },
      { month: &#39;2026-03&#39;, claude: 134.87, codex: 1996.32 },
      { month: &#39;2026-04&#39;, claude: 47.07, codex: 401.36 },
      { month: &#39;2026-05&#39;, claude: 45.13, codex: 378.20 }
    ];

    const theme = {
      ink: &#39;#231f20&#39;,
      muted: &#39;#6b625c&#39;,
      grid: &#39;rgba(35, 31, 32, 0.11)&#39;,
      axis: &#39;rgba(35, 31, 32, 0.22)&#39;,
      claude: &#39;#b96d3a&#39;,
      codex: &#39;#2d5f87&#39;,
      tooltip: &#39;rgba(35, 31, 32, 0.94)&#39;
    };

    const canvas = document.getElementById(&#39;ai-coding-agent-usage&#39;);

    Object.assign(canvas.style, {
      display: &#39;block&#39;,
      width: &#39;100%&#39;,
      height: &#39;100%&#39;,
      minHeight: &#39;480px&#39;
    });

    function loadChartJs() {
      if (window.Chart) return Promise.resolve();
      return new Promise((resolve, reject) =&gt; {
        const script = document.createElement(&#39;script&#39;);
        script.src = &#39;https://cdn.jsdelivr.net/npm/chart.js@4.4.1/dist/chart.umd.min.js&#39;;
        script.onload = resolve;
        script.onerror = reject;
        document.head.appendChild(script);
      });
    }

    await loadChartJs();

    const usd = new Intl.NumberFormat(&#39;en-US&#39;, {
      style: &#39;currency&#39;,
      currency: &#39;USD&#39;,
      maximumFractionDigits: 2
    });

    const compactUsd = new Intl.NumberFormat(&#39;en-US&#39;, {
      style: &#39;currency&#39;,
      currency: &#39;USD&#39;,
      notation: &#39;compact&#39;,
      maximumFractionDigits: 1
    });

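    // body.style only reflects inline styles, so read the computed font to match the page.
    Chart.defaults.font.family = getComputedStyle(document.body).fontFamily;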
    Chart.defaults.color = theme.muted;

    new Chart(canvas, {
      type: &#39;line&#39;,
      data: {
        labels: rows.map(d =&gt; d.month),
        datasets: [
          {
            label: &#39;Claude&#39;,
            data: rows.map(d =&gt; d.claude),
            borderColor: theme.claude,
            backgroundColor: theme.claude,
            pointBackgroundColor: &#39;#ffffff&#39;,
            pointBorderColor: theme.claude,
            pointBorderWidth: 2.5,
            pointRadius: 4,
            pointHoverRadius: 7,
            borderWidth: 3,
            tension: 0.22
          },
          {
            label: &#39;Codex&#39;,
            data: rows.map(d =&gt; d.codex),
            borderColor: theme.codex,
            backgroundColor: theme.codex,
            pointBackgroundColor: &#39;#ffffff&#39;,
            pointBorderColor: theme.codex,
            pointBorderWidth: 2.5,
            pointRadius: 4,
            pointHoverRadius: 7,
            borderWidth: 3,
            tension: 0.22
          }
        ]
      },
      options: {
        responsive: true,
        maintainAspectRatio: false,
        interaction: { mode: &#39;index&#39;, intersect: false },
        layout: { padding: { top: 12, right: 18, bottom: 4, left: 8 } },
        plugins: {
          legend: {
            position: &#39;bottom&#39;,
            labels: {
              usePointStyle: true,
              pointStyle: &#39;circle&#39;,
              boxWidth: 8,
              boxHeight: 8,
              padding: 22,
              color: theme.muted,
              font: { size: 13, weight: &#39;650&#39; }
            }
          },
          tooltip: {
            enabled: true,
            mode: &#39;index&#39;,
            intersect: false,
            backgroundColor: theme.tooltip,
            bodyFont: { size: 13, weight: &#39;650&#39; },
            padding: 13,
            displayColors: true,
            callbacks: {
              title: items =&gt; items[0].label,
              label: item =&gt; `${item.dataset.label}: ${usd.format(item.parsed.y)}`,
              afterBody: items =&gt; {
                const i = items[0].dataIndex;
                const total = rows[i].claude + rows[i].codex;
                return `Combined: ${usd.format(total)}`;
              }
            }
          }
        },
        scales: {
          x: {
            grid: { color: &#39;rgba(35,31,32,0.07)&#39;, drawTicks: false },
            border: { color: theme.axis },
            ticks: { maxRotation: 0, autoSkip: false, color: theme.muted, font: { size: 12 } }
          },
          y: {
            beginAtZero: true,
            suggestedMax: 2200,
            grid: { color: theme.grid },
            border: { display: false },
            ticks: {
              padding: 8,
              color: theme.muted,
              callback: value =&gt; value &gt;= 1000 ? compactUsd.format(value) : &#39;$&#39; + value
            },
            title: {
              display: true,
              text: &#39;Cost (USD)&#39;,
              color: theme.muted,
              font: { size: 12, weight: &#39;650&#39; }
            }
          }
        }
      }
    });
  })();
&lt;/script&gt;
</description>
    </item>
    <item>
      <title>Retire the Verify Button</title>
      <link>https://www.s-anand.net/blog/retire-the-verify-button/</link>
      <pubDate>Sat, 30 May 2026 16:25:35 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/retire-the-verify-button/</guid>
      <description>&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-30-retire-the-verify-button.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;My post &lt;a href=&#34;https://www.s-anand.net/blog/add-a-verify-button/&#34;&gt;&amp;ldquo;Add a Verify Button&amp;rdquo;&lt;/a&gt; has a problem. When &lt;a href=&#34;https://www.linkedin.com/in/rohitsaran/&#34;&gt;Rohit&lt;/a&gt; requested hyperlocal news for every PIN code in Mumbai, we&amp;rsquo;d need a &amp;ldquo;verify&amp;rdquo; button on &lt;em&gt;every&lt;/em&gt; &lt;a href=&#34;https://sanand0.github.io/journalists/statnostics/&#34;&gt;Statoistics card&lt;/a&gt; - hundreds of PIN codes, &lt;em&gt;every day&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Verifying every output introduces a new bottleneck: a person inspecting every unit. &lt;strong&gt;That&amp;rsquo;s 100% inspection - which you do when you don&amp;rsquo;t yet trust the process.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Manufacturing solved this a century ago. At Western Electric&amp;rsquo;s Hawthorne Works (famous for the &lt;a href=&#34;https://en.wikipedia.org/wiki/Hawthorne_effect&#34;&gt;Hawthorne Effect&lt;/a&gt;), quality control meant inspecting finished products and pulling the defective ones. &lt;a href=&#34;https://en.wikipedia.org/wiki/Walter_A._Shewhart&#34;&gt;Walter Shewhart&lt;/a&gt; sent his boss a &lt;a href=&#34;https://deming.org/the-first-control-chart/&#34;&gt;one-page memo&lt;/a&gt;; about a third of it was a control chart.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://deming.org/wp-content/uploads/2021/04/Screen-Shot-2021-04-18-at-7.30.33-PM.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/W._Edwards_Deming&#34;&gt;Deming&lt;/a&gt; turned this approach into his third point: &lt;em&gt;&amp;ldquo;Stop relying on inspection for quality.&amp;rdquo;&lt;/em&gt; Build quality in from the start instead of inspecting defects out at the end.&lt;/p&gt;
&lt;p&gt;His process tells us what to do with a verify button as volume climbs.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Measure how often it&amp;rsquo;s right.&lt;/strong&gt; Don&amp;rsquo;t retire inspection until you know your defect rate. For example, on &lt;a href=&#34;https://sanand0.github.io/llmevals/double-checking/&#34;&gt;one classification task I benchmarked&lt;/a&gt;, the average model error was about 14%. Until we know that number, &amp;ldquo;it&amp;rsquo;s probably fine&amp;rdquo; is just a feeling.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Stratify.&lt;/strong&gt; &amp;ldquo;The garden has 18 plants&amp;rdquo; is easy to validate and low-risk if wrong. &amp;ldquo;This loan is denied&amp;rdquo; is neither. Verify the risky things carefully, let the cheap things through with low effort. Equal effort on both is waste.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sample.&lt;/strong&gt; Nobody inspected every artillery shell in the war. Shewhart&amp;rsquo;s Bell Labs colleagues &lt;a href=&#34;https://en.wikipedia.org/wiki/Harold_F._Dodge&#34;&gt;Harold Dodge&lt;/a&gt; and Harry Romig put sampling inspection on a statistical basis. Check a sample at known confidence; watch whether the process drifts. The equivalent: verify a random sample of cards, track the rate, and react when the rate moves, not when one card looks off.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Augment with other models.&lt;/strong&gt; When I &lt;a href=&#34;https://sanand0.github.io/llmevals/double-checking/&#34;&gt;correlated two models&amp;rsquo; errors&lt;/a&gt;, the correlation was about 20%. If one gets a case wrong, the other usually doesn&amp;rsquo;t miss the same one. So a second model is a cheap, imperfect inspector. Asking AI to generate verifiable output lets another model spot obvious errors.&lt;/li&gt;
&lt;/ul&gt;
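&lt;p&gt;A minimal sketch of the sampling step, assuming a Shewhart-style p-chart with 3-sigma limits. The 14% baseline reuses the benchmark figure above purely as an illustration; the sample size of 100 and the names &lt;code&gt;controlLimits&lt;/code&gt; and &lt;code&gt;hasDrifted&lt;/code&gt; are hypothetical.&lt;/p&gt;

```javascript
// 3-sigma control limits for a defect rate (p-chart).
// baselineRate: long-run defect rate; sampleSize: cards verified per sample.
function controlLimits(baselineRate, sampleSize) {
  const sigma = Math.sqrt((baselineRate * (1 - baselineRate)) / sampleSize);
  return {
    lower: Math.max(0, baselineRate - 3 * sigma),
    upper: Math.min(1, baselineRate + 3 * sigma),
  };
}

// True when a sample's defect rate falls outside the control limits,
// i.e. the process has probably moved and is worth investigating.
function hasDrifted(defects, sampleSize, baselineRate) {
  const { lower, upper } = controlLimits(baselineRate, sampleSize);
  const rate = defects / sampleSize;
  return rate > upper || lower > rate;
}

// Illustrative numbers: 14% baseline error, 100 cards verified per sample.
const BASELINE = 0.14;
const SAMPLE_SIZE = 100;
console.log(controlLimits(BASELINE, SAMPLE_SIZE));
console.log(hasDrifted(20, SAMPLE_SIZE, BASELINE)); // 20% sits inside the limits
console.log(hasDrifted(30, SAMPLE_SIZE, BASELINE)); // 30% is above the upper limit
```

&lt;p&gt;One wrong card moves nothing here. Only a sample whose error rate lands outside the limits is a signal to react.&lt;/p&gt;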
&lt;p&gt;Also, it&amp;rsquo;s best to avoid overreacting to defects. Deming called this (re-tuning the process after every defect) &lt;em&gt;tampering&lt;/em&gt;. It makes the variation worse. It&amp;rsquo;s worth collecting data and finding the real causes before changing the process.&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s what &lt;a href=&#34;https://www.linkedin.com/in/ankorrai&#34;&gt;Ankor&lt;/a&gt; calls the &lt;a href=&#34;https://sanand0.github.io/talks/2026-03-18-verifiable-agents/&#34;&gt;future of verifiable autonomy&lt;/a&gt;. It starts with:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;we are going to have to move beyond testing correctness to standard testing… if we test the pipeline once before deployment, we can trust that every single output produced by that pipeline, unless we make any adjustment to it, can be trusted.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;His analogy is software. Verification becomes a standard layer in the production loop, like how CI/CD is a standard step before you ship. Over a few years the need for human validation drops, and programmatic checks plus triage take over.&lt;/p&gt;
&lt;p&gt;Regulated finance has a lot of experience with this. After the GFC, the Fed and OCC issued &lt;a href=&#34;https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm&#34;&gt;SR 11-7&lt;/a&gt; in April 2011. Every quantitative model going into production needs independent validation by people separate from the developers, plus ongoing monitoring, before it ships. &amp;ldquo;Retire the verify button&amp;rdquo; doesn&amp;rsquo;t mean stop checking. &lt;strong&gt;It means have an independent validation layer with an owner.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Of course, this incurs cost - at scale. For us, it led to concerns from the Finance team that the token cost overhead was climbing. But, to quote &lt;a href=&#34;https://www.linkedin.com/in/srinivasankg/&#34;&gt;KG&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Token cost cannot be overhead. Token cost is direct cost because you&amp;rsquo;re replacing people.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So I now &lt;a href=&#34;https://sanand0.github.io/llmpricing/&#34;&gt;benchmark cost alongside accuracy&lt;/a&gt;. A contract-validation demo I run checks a contract against a clause checklist, citing where each clause sits, for about 3 cents and 6 seconds. Pricing it lets me decide whether a reviewer&amp;rsquo;s half-hour is worth more than 3 cents. Usually it is. Sometimes it isn&amp;rsquo;t.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Sometimes, this isn&amp;rsquo;t good enough. A client wanted PII scrubbed from 3 million user images with &lt;em&gt;zero&lt;/em&gt; leaks. I did the arithmetic out loud:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;with 99.9%, we&amp;rsquo;re talking about 3,000 images with personally identifiable information potentially slipping through. Is that OK?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;He said, &amp;ldquo;No.&amp;rdquo; I told him we couldn&amp;rsquo;t do it. It needs more technology than we had. (Our sales team nearly had a heart attack.) &lt;strong&gt;A critical output of measuring is to check if it&amp;rsquo;s even possible.&lt;/strong&gt;&lt;/p&gt;
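&lt;p&gt;The same arithmetic as a rough sketch. It assumes every image is an independent trial with the same accuracy, which is a simplification; &lt;code&gt;expectedLeaks&lt;/code&gt; and &lt;code&gt;accuracyForAtMostOneLeak&lt;/code&gt; are illustrative names.&lt;/p&gt;

```javascript
// Expected number of images that slip through at a given accuracy.
const expectedLeaks = (images, accuracy) => images * (1 - accuracy);

// Accuracy needed so that the expected number of leaks is at most one.
const accuracyForAtMostOneLeak = (images) => 1 - 1 / images;

const IMAGES = 3_000_000;

// At 99.9% accuracy, about 3,000 of the 3 million images leak.
console.log(Math.round(expectedLeaks(IMAGES, 0.999)));

// Accuracy required to expect no more than a single leak, as a percentage.
console.log((accuracyForAtMostOneLeak(IMAGES) * 100).toFixed(5) + "%");
```

&lt;p&gt;Even one expected leak in three million needs roughly 99.99997% accuracy, and no accuracy below 100% can promise zero.&lt;/p&gt;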
&lt;hr&gt;
&lt;p&gt;I still manually verify AI output for new stuff. I don&amp;rsquo;t trust every pipeline yet. But when the scale becomes unwieldy, this is the process I switch to.&lt;/p&gt;
&lt;!-- https://claude.ai/chat/36780e30-48ca-4f84-af7a-4308e0880ce4 --&gt;
</description>
    </item>
    <item>
      <title>Add a Verify Button</title>
      <link>https://www.s-anand.net/blog/add-a-verify-button/</link>
      <pubDate>Sat, 30 May 2026 11:39:10 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/add-a-verify-button/</guid>
      <description>&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-30-add-a-verify-button.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://www.linkedin.com/in/rohitsaran/&#34;&gt;Rohit Saran&lt;/a&gt; looked at the &lt;a href=&#34;https://sanand0.github.io/journalists/statnostics/&#34;&gt;Statoistics cards&lt;/a&gt; my AI agents are generating for &lt;a href=&#34;https://x.com/hashtag/STATOISTICS&#34;&gt;The Times of India&lt;/a&gt;, and asked about a small button under each one.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://sanand0.github.io/journalists/statnostics/2026-04-27-citizen-survey/03-family-doctor-everyone-wants-nobody-has.svg&#34;&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-30-statoistics-card.avif&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In the list of Statoistics that you had put, I saw there&amp;rsquo;s a button called &amp;lsquo;Verify.&amp;rsquo; What was that meant to be or will do in future?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That verify button explains the claim, mentions the sources, and shows how to check the claim.&lt;/p&gt;
&lt;p&gt;One card said &amp;ldquo;9 in 10 Indians want a family doctor and barely 1 in 35 has one&amp;rdquo;. The button breaks that down:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&amp;ldquo;87% want a family doctor, 2.8% outpatient visits were to an Asha worker…&amp;rdquo;&lt;/li&gt;
&lt;li&gt;It identifies which columns in the source document we looked at and which numbers it verified.&lt;/li&gt;
&lt;li&gt;It links to the program that it wrote to do the verification.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I said, &amp;ldquo;it lets humans check if the numbers are right - by giving them steps &amp;ndash; where exactly to check, how to check if it is correct.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://www.linkedin.com/in/sajeev-kumarapuram-205ba933&#34;&gt;Sajeev&lt;/a&gt; pushed back: &lt;em&gt;&amp;ldquo;It&amp;rsquo;s more &amp;rsquo;explain&amp;rsquo; than &amp;lsquo;verify&amp;rsquo; really.&amp;rdquo;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;True. &lt;a href=&#34;https://timesofindia.indiatimes.com/toireporter/author-Saurabh-Banerjee-479202560.cms&#34;&gt;Saurabh&lt;/a&gt; had asked for exactly this earlier: while a person is checking by hand, give them something that shows how the AI got to its answer. &lt;strong&gt;A verify button&amp;rsquo;s first job is not to prove the AI is right. It&amp;rsquo;s to let a nervous journalist check, cheaply, until they stop being nervous.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This instinct is old. The Royal Society took &lt;a href=&#34;https://royalsociety.org/about-us/who-we-are/history/&#34;&gt;&lt;em&gt;nullius in verba&lt;/em&gt;&lt;/a&gt; as its motto around 1662, &amp;ldquo;take nobody&amp;rsquo;s word for it.&amp;rdquo; They didn&amp;rsquo;t print claims and ask you to trust the author. In 1662 they made &lt;a href=&#34;https://en.wikipedia.org/wiki/Robert_Hooke&#34;&gt;Robert Hooke&lt;/a&gt; their Curator of Experiments, whose job was to re-run the demonstration in front of the Fellows. A verify button is that, without Hooke.&lt;/p&gt;
&lt;p&gt;(Merchants got there two centuries earlier: double-entry bookkeeping, codified by &lt;a href=&#34;https://en.wikipedia.org/wiki/Luca_Pacioli&#34;&gt;Pacioli&lt;/a&gt; in 1494, means every entry has a counter-entry and the books either balance or they don&amp;rsquo;t.)&lt;/p&gt;
&lt;p&gt;Rohit&amp;rsquo;s reason for liking it went somewhere I hadn&amp;rsquo;t fully thought through. He went to brand.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It&amp;rsquo;s like why a product with 10-year guarantee is likely to be made better than a product with 2-year warranty, because the company has confidence to tell the customer, &amp;lsquo;Look, I am standing behind this product for 10 years.&amp;rsquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And later:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Any brand that is saying, &amp;lsquo;Whatever I write is verifiable,&amp;rsquo; is so much more in this age of misinformation.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;His version of why this matters for a newspaper: &lt;em&gt;&amp;ldquo;a brand is only about trust. Rest is news is anyway a commodity.&amp;rdquo;&lt;/em&gt; &lt;strong&gt;A verify button is a public claim that you&amp;rsquo;re willing to be checked.&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Here&amp;rsquo;s how I actually build &amp;ldquo;Verify&amp;rdquo; buttons, in increasing order of effort.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Link plus a searchable string.&lt;/strong&gt; A hyperlink alone may still be wrong. I want a link &lt;em&gt;and&lt;/em&gt; a short quote I can paste into the page&amp;rsquo;s search box and find. &lt;em&gt;&amp;ldquo;When I click on that link, I should be able to literally search for and find that piece of text, verifying that it did not hallucinate.&amp;rdquo;&lt;/em&gt; Then even a plain program (not even an LLM) can open every link and confirm the text is there.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;For numbers, the SQL query.&lt;/strong&gt; If it&amp;rsquo;s data, the SQL query (or Python script) that fetches that particular result is the closest equivalent. The button should just run the query against live data and show the number. The user doesn&amp;rsquo;t need to know SQL - they just see that the number matches.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The procedure as a checklist.&lt;/strong&gt; The button breaks the card into steps: this is the claim, this is the number, this is the column it came from, check that the D1A value matches. A person ticks their way down it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Verify with an AI agent.&lt;/strong&gt; Add a link that opens the claim in Google AI mode with a pre-filled prompt asking it to fact-check the claim. For example: &lt;a href=&#34;https://tools.s-anand.net/askai/?q=Fact-check+with+step-by-step+evidence%3A+According+to+Citizen+Survey+2022-23%2C+87%25+of+Indians+want+a+dedicated+family+doctor+but+only+2.8%25+actually+use+one.&#34;&gt;Fact-check with step-by-step evidence: According to Citizen Survey 2022-23, 87% of Indians want a dedicated family doctor but only 2.8% actually use one. How might it have changed since the publication?&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Rohit framed verification as three jobs, not one: &lt;em&gt;&amp;ldquo;Verification has sourcing, verification, and updation.&amp;rdquo;&lt;/em&gt; The last clause of the prompt above, &amp;ldquo;How might it have changed since the publication?&amp;rdquo;, handles updation: it also asks whether the number has gone stale since you published it.&lt;/p&gt;
&lt;p&gt;Getting the source right is not the same as getting the conclusion right. Rohit said: &lt;em&gt;&amp;ldquo;you are asking AI not only to get right source and right data, but now we are asking to interpret.&amp;rdquo;&lt;/em&gt; And interpretation is subjective on both ends. The button can confirm the number is real but not &lt;em&gt;prove&lt;/em&gt; the argument is sound.&lt;/p&gt;
&lt;p&gt;Of course, the sources could be wrong. &amp;ldquo;Check the source&amp;rdquo; assumes good data quality. Luckily, data is more often right than wrong, and verification can shine a light on bad data.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;We can start simple. The cheapest version: &lt;em&gt;every&lt;/em&gt; AI output has a &amp;ldquo;Verify&amp;rdquo; link to a search query the user can easily inspect. That changes their question from &amp;ldquo;can I trust this?&amp;rdquo; to &amp;ldquo;let me check.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;If this can establish trust and a brand for India&amp;rsquo;s largest newspaper, enterprise AI apps might do well to follow.&lt;/p&gt;
&lt;!-- https://claude.ai/chat/36780e30-48ca-4f84-af7a-4308e0880ce4 --&gt;
</description>
    </item>
    <item>
      <title>One extra push-up every day</title>
      <link>https://www.s-anand.net/blog/one-extra-push-up-every-day/</link>
      <pubDate>Fri, 29 May 2026 09:57:03 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/one-extra-push-up-every-day/</guid>
      <description>&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-29-one-extra-push-up-every-day.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;m doing one extra push-up every day.&lt;/p&gt;
&lt;p&gt;One of my &lt;a href=&#34;https://www.s-anand.net/blog/my-year-in-2025/&#34;&gt;2026 goals&lt;/a&gt; is to build muscles. I hadn&amp;rsquo;t done anything about it until May.&lt;/p&gt;
&lt;p&gt;This month, I figured I would do the absolute minimum, at least to get started, because I seem to have starting trouble more than anything else. &lt;a href=&#34;https://chatgpt.com/share/6a18f311-dcf0-83ec-aa24-9ecc82053f37&#34;&gt;I asked ChatGPT&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I want to build muscles. What&amp;rsquo;s the most effective thing that I can do that would take no more than one minute that I can practice every day without any equipment and I can do this anywhere and will have the most impact on building muscles? Research, give me the top five options and recommend one for me.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It suggested push-ups. &lt;a href=&#34;https://claude.ai/share/d6138707-2e80-48b7-9c9f-8ff07d424d9f&#34;&gt;So did Claude&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Since I do yoga every day, I decided to do push-ups after that. I kept forgetting, so I decided to do push-ups &lt;em&gt;before&lt;/em&gt; that. (This worked.)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lesson&lt;/strong&gt;: &lt;a href=&#34;https://jamesclear.com/habit-stacking&#34;&gt;Habit stacking&lt;/a&gt; works. Habit pre-stacking works better.&lt;/p&gt;
&lt;p&gt;I remember the story of a boy who carried a young bull every day. As they both grew up, he grew into a man strong enough to carry an adult bull. I am applying a similar practice.&lt;/p&gt;
&lt;p&gt;I started with 10 push-ups a day. Every day, I&amp;rsquo;m adding one push-up to it. I just finished 23. That is, I have spent the last ~16 days (with 3 misses in between) adding one push-up each day to my routine.&lt;/p&gt;
&lt;p&gt;This seems to be just the right level of incremental difficulty. Every day feels as miserable as the previous one. I began absolutely unable to do a single push-up beyond 10. I just finished my 23 push-up routine, absolutely unable to do even one more. And it&amp;rsquo;s felt exactly the same way every day.&lt;/p&gt;
&lt;p&gt;Maybe it&amp;rsquo;s because I &lt;em&gt;know&lt;/em&gt; the quota and the brain decides that&amp;rsquo;s &lt;em&gt;exactly&lt;/em&gt; the limit of what&amp;rsquo;s possible. But still, it feels like one extra push-up a day is a reasonable progression.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lesson&lt;/strong&gt;: &lt;a href=&#34;https://jamesclear.com/quotes/habits-are-the-compound-interest-of-self-improvement&#34;&gt;Compounding habits&lt;/a&gt; seems to work. I&amp;rsquo;ll keep you posted.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>ChatGPT is about FIDE 1600</title>
      <link>https://www.s-anand.net/blog/chatgpt-is-about-fide-1600/</link>
      <pubDate>Thu, 28 May 2026 16:04:51 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/chatgpt-is-about-fide-1600/</guid>
      <description>&lt;p&gt;I asked ChatGPT to play chess with &lt;a href=&#34;https://stockfishchess.org/&#34;&gt;Stockfish&lt;/a&gt;. Stockfish is a &amp;ldquo;strong open-source chess engine&amp;rdquo;. It has 8 levels of difficulty, which &lt;a href=&#34;https://share.google/aimode/yA9NvnPcsZ1TFtmna&#34;&gt;roughly map to these FIDE levels&lt;/a&gt;:&lt;/p&gt;
&lt;section ai-disclosure=&#34;ai-generated&#34; data-ai-model=&#34;gemini-3.5-flash&#34; data-ai-provider=&#34;Google&#34;&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Stockfish&lt;/th&gt;
          &lt;th&gt;FIDE&lt;/th&gt;
          &lt;th&gt;Player Level &amp;amp; Description&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 1&lt;/td&gt;
          &lt;td&gt;~1000&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Beginner&lt;/strong&gt;: Constantly blunders, hangs pieces deliberately.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 2&lt;/td&gt;
          &lt;td&gt;~1100&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Advanced Beginner&lt;/strong&gt;: Fewer obvious tactical mistakes, plays completely aimlessly.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 3&lt;/td&gt;
          &lt;td&gt;~1200&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Early Intermediate&lt;/strong&gt;: Punishes very basic errors but regularly drops pieces.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 4&lt;/td&gt;
          &lt;td&gt;~1350&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Intermediate&lt;/strong&gt;: Plays standard opening moves; requires solid, blunder-free play to beat.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 5&lt;/td&gt;
          &lt;td&gt;~1450&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Advanced Intermediate&lt;/strong&gt;: Rarely hangs single pieces; you need positional advantages.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 6&lt;/td&gt;
          &lt;td&gt;~1650&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Strong Club Player&lt;/strong&gt;: Highly tactical. Aggressively exploits your mistakes.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 7&lt;/td&gt;
          &lt;td&gt;~1950&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Expert&lt;/strong&gt;: Exceptionally strong. Requires precise positional mastery and deep calculation.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 8&lt;/td&gt;
          &lt;td&gt;~2400&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;Grandmaster&lt;/strong&gt;: Invincible for most humans. Plays with ruthless perfection.&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Full Engine&lt;/td&gt;
          &lt;td&gt;~3600&lt;/td&gt;
          &lt;td&gt;Out of human reach completely, &amp;ldquo;like a smart ant trying to debate physics with a human.&amp;rdquo;&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;/section&gt;
&lt;p&gt;In the &lt;a href=&#34;https://chatgpt.com/share/6a17f88a-dd74-83ec-b6e6-b42fac198d9c&#34;&gt;first iteration&lt;/a&gt;, here were the results:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Stockfish&lt;/th&gt;
          &lt;th&gt;Result&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 0&lt;/td&gt;
          &lt;td&gt;Win&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 1&lt;/td&gt;
          &lt;td&gt;Win&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 2&lt;/td&gt;
          &lt;td&gt;Stalemate&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 3&lt;/td&gt;
          &lt;td&gt;Stalemate&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 4&lt;/td&gt;
          &lt;td&gt;Win&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 5&lt;/td&gt;
          &lt;td&gt;Loss&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Level 6&lt;/td&gt;
          &lt;td&gt;Loss&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&amp;hellip; etc.&lt;/td&gt;
          &lt;td&gt;Loss&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;When I asked ChatGPT how it played, it said something like &amp;ldquo;I wrote a Python program that plays chess using a fixed policy.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s crazy! So I told it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Rather than use a fixed policy, get the move that Stockfish made, analyze it, and return your next move. See if you can win at level 6.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;After a few attempts, it &lt;a href=&#34;https://chatgpt.com/share/6a17f740-0424-83ec-b298-5bf6056a3905&#34;&gt;won&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://lichess.org/l9vffWVr&#34;&gt;Here&amp;rsquo;s the game&lt;/a&gt;:&lt;/p&gt;
&lt;video controls=&#34;&#34; width=&#34;534&#34; height=&#34;542&#34; style=&#34;max-width: 100%; height: auto;&#34;&gt;
  &lt;source src=&#34;https://files.s-anand.net/images/2026-05-28-chatgpt-vs-stockfish-chess-game.webm&#34; type=&#34;video/webm&#34;&gt;&lt;a href=&#34;https://lichess.org/l9vffWVr&#34;&gt;ChatGPT vs Stockfish Level 6&lt;/a&gt;
&lt;/video&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-pgn&#34; data-lang=&#34;pgn&#34;&gt;[White &amp;#34;ChatGPT&amp;#34;]
[Black &amp;#34;Stockfish Skill Level 6&amp;#34;]
[Termination &amp;#34;White won by checkmate&amp;#34;]
[FinalFEN &amp;#34;4Q3/2qrkp2/4pN2/1pp1P3/7P/p1P3P1/P5K1/4R3 b - - 5 39&amp;#34;]

1. d4 e6 2. c4 Nf6 3. Nf3 Be7 4. g3 O-O 5. Bg2 a5
6. O-O c6 7. Qc2 d5 8. Rd1 Ne4 9. Nc3 Nxc3 10. bxc3 a4
11. e4 h6 12. Bf4 Re8 13. e5 b6 14. Nd2 Ba6 15. h4 Qc7
16. Be3 Bb7 17. f4 Na6 18. Rf1 Rad8 19. f5 Bf8 20. f6 Nb8
21. fxg7 Bxg7 22. Qd1 Nd7 23. Qg4 Nxe5 24. dxe5 c5
25. Bf4 Re7 26. Re1 Kf8 27. Qh5 a3 28. Bh6 dxc4 29. Nxc4 Bxg2
30. Kxg2 Rd3 31. Bxg7+ Kxg7 32. Rf4 Rd2+ 33. Nxd2 Rd7
34. Ne4 b5 35. Rg4+ Kf8 36. Rg8+ Kxg8 37. Nf6+ Kf8
38. Qh8+ Ke7 39. Qe8# 1-0
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;So, having beaten Level 6 (~1650) only after a few attempts, I guess ChatGPT (GPT-5.5, extended thinking) is at around the 1600 FIDE level right now.&lt;/p&gt;
&lt;p&gt;What&amp;rsquo;s impressive is that it wasn&amp;rsquo;t specifically trained on chess. It&amp;rsquo;s just something it picked up along the way.&lt;/p&gt;
&lt;p&gt;If it starts beating level 8 (grandmaster), will we finally acknowledge AGI? (Me? I think &lt;a href=&#34;https://marginalrevolution.com/marginalrevolution/2025/04/o3-and-agi-is-april-16th-agi-day.html&#34;&gt;we achieved AGI on 16 Apr 2025&lt;/a&gt;.)&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Wikipedia Citation Impact</title>
      <link>https://www.s-anand.net/blog/wikipedia-citation-impact/</link>
      <pubDate>Thu, 28 May 2026 10:00:58 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/wikipedia-citation-impact/</guid>
      <description>&lt;p&gt;Imagine you&amp;rsquo;re an information anarchist. You undermine Wikipedia pages by nuking references.&lt;/p&gt;
&lt;p&gt;A genie has granted you a wish: you can &lt;strong&gt;nuke one entire domain&lt;/strong&gt;. Just one.&lt;/p&gt;
&lt;p&gt;As a data-driven decision maker (who is &lt;em&gt;also&lt;/em&gt; an information anarchist 🤷), which would you pick?&lt;/p&gt;
&lt;p&gt;A common choice is &lt;a href=&#34;https://archive.org/&#34;&gt;The Internet Archive&lt;/a&gt;. 2.9 &lt;strong&gt;million&lt;/strong&gt; Wikipedia pages reference it.&lt;/p&gt;
&lt;p&gt;But, you&amp;rsquo;re sneakier than that. A page isn&amp;rsquo;t undermined just because some references are gone. It&amp;rsquo;s undermined when &lt;em&gt;all&lt;/em&gt; the references are gone.&lt;/p&gt;
&lt;p&gt;In that case, the most devastating domain to nuke is &lt;a href=&#34;https://stat.gov.pl/en&#34;&gt;Statistics Poland&lt;/a&gt;. Over 45,000 Wikipedia pages cite &lt;em&gt;only&lt;/em&gt; Statistics Poland as their reference.&lt;/p&gt;
&lt;p&gt;Or, if you&amp;rsquo;re particularly fond of the Polish, destroy &lt;a href=&#34;https://www.sports-reference.com/&#34;&gt;sports-reference.com&lt;/a&gt;. Over 37,000 pages cite it as their &lt;em&gt;only&lt;/em&gt; reference.&lt;/p&gt;
&lt;p&gt;If you prefer hurting scientists, go for &lt;a href=&#34;https://www.biolib.cz/&#34;&gt;biolib.cz&lt;/a&gt; - an online encyclopedia of plants, animals, and, very importantly, fungi. (But then, you don&amp;rsquo;t &lt;em&gt;need&lt;/em&gt; to nuke it - the &amp;ldquo;server is experiencing high traffic&amp;rdquo; quite often.) In any case, this is where you&amp;rsquo;ll find the most satisfaction, as more pages depend solely on biodiversity and natural history archives like &lt;a href=&#34;https://marinespecies.org/&#34;&gt;marinespecies.org (WoRMS)&lt;/a&gt;, &lt;a href=&#34;https://www.nhm.ac.uk/&#34;&gt;Natural History Museum&lt;/a&gt;, and &lt;a href=&#34;https://www.iucnredlist.org/&#34;&gt;IUCN Red List&lt;/a&gt; than on any other category.&lt;/p&gt;
&lt;p&gt;For detailed research on which site you&amp;rsquo;d like to nuke, see &lt;a href=&#34;https://sanand0.github.io/datastories/wikipedia-citation-impact/&#34;&gt;What If a Website Just Died?&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://sanand0.github.io/datastories/wikipedia-citation-impact/&#34;&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://sanand0.github.io/datastories/wikipedia-citation-impact/screenshot.avif&#34;&gt;&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Erdos Unit Distance Problem</title>
      <link>https://www.s-anand.net/blog/erdos-unit-distance-problem/</link>
      <pubDate>Tue, 26 May 2026 22:36:06 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/erdos-unit-distance-problem/</guid>
      <description>&lt;p&gt;An OpenAI model &lt;a href=&#34;https://openai.com/index/model-disproves-discrete-geometry-conjecture/&#34;&gt;solved&lt;/a&gt; the &lt;a href=&#34;https://mathworld.wolfram.com/ErdosUnitDistanceProblem.html&#34;&gt;Erdos unit distance problem&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Erdos roughly said, &amp;ldquo;The number of edges of the same distance between N points can&amp;rsquo;t compound faster than close to 0%.&amp;rdquo; That is, the count of equal-distance pairs should grow only barely faster than N itself.&lt;/p&gt;
&lt;p&gt;The model found a method of placing points so that it compounds at about 1.4%.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://sanand0.github.io/datastories/erdos-planar-unit/&#34;&gt;This visualization&lt;/a&gt; is a crude way of showing how that works.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://sanand0.github.io/datastories/erdos-planar-unit/&#34;&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://sanand0.github.io/datastories/erdos-planar-unit/screenshot.avif&#34;&gt;&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Longest repeated paragraph on Wikipedia</title>
      <link>https://www.s-anand.net/blog/longest-repeated-paragraph-on-wikipedia/</link>
      <pubDate>Tue, 26 May 2026 22:20:17 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/longest-repeated-paragraph-on-wikipedia/</guid>
      <description>&lt;p&gt;What is the longest paragraph repeated across Wikipedia? ANS: A 213-word paragraph about &lt;a href=&#34;https://en.wikipedia.org/wiki/Meanings_of_minor-planet_names&#34;&gt;how minor planets are named&lt;/a&gt;, which appears in 418 Wikipedia articles, word-for-word!&lt;/p&gt;
&lt;p&gt;There are ~380,000 asteroids. Wikipedia has 418 pages for these - including one for each thousand-range of asteroids.&lt;/p&gt;
&lt;p&gt;Every single one of these pages includes this passage:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;As &lt;a href=&#34;https://en.wikipedia.org/wiki/Minor%5Fplanet&#34; title=&#34;Minor planet&#34;&gt;minor planet&lt;/a&gt; discoveries are confirmed, they are given a permanent number by the &lt;a href=&#34;https://en.wikipedia.org/wiki/International%5FAstronomical%5FUnion&#34; title=&#34;International Astronomical Union&#34;&gt;IAU&lt;/a&gt;&amp;rsquo;s &lt;a href=&#34;https://en.wikipedia.org/wiki/Minor%5FPlanet%5FCenter&#34; title=&#34;Minor Planet Center&#34;&gt;Minor Planet Center&lt;/a&gt; (MPC), and the discoverers can then submit names for them, following the IAU&amp;rsquo;s &lt;a href=&#34;https://en.wikipedia.org/wiki/Astronomical%5Fnaming%5Fconventions&#34; title=&#34;Astronomical naming conventions&#34;&gt;naming conventions&lt;/a&gt;. The list below concerns those minor planets in the specified number-range that have received names, and explains the meanings of those names.&lt;/p&gt;
&lt;p&gt;Official naming citations of newly named &lt;a href=&#34;https://en.wikipedia.org/wiki/Small%5FSolar%5FSystem%5Fbodies&#34; title=&#34;Small Solar System bodies&#34;&gt;small Solar System bodies&lt;/a&gt; are approved and published in a bulletin by IAU&amp;rsquo;s &lt;a href=&#34;https://en.wikipedia.org/wiki/Working%5FGroup%5Ffor%5FSmall%5FBodies%5FNomenclature&#34; title=&#34;Working Group for Small Bodies Nomenclature&#34;&gt;Working Group for Small Bodies Nomenclature&lt;/a&gt; (WGSBN).&lt;a href=&#34;https://en.wikipedia.org/wiki/Meanings%5Fof%5Fminor-planet%5Fnames:%5F213001%E2%80%93214000#cite%5Fnote-WGSBN-Bulletin-Archive-1&#34;&gt;[1]&lt;/a&gt; Before May 2021, citations were published in MPC&amp;rsquo;s &lt;em&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/Minor%5FPlanet%5FCirculars&#34; title=&#34;Minor Planet Circulars&#34;&gt;Minor Planet Circulars&lt;/a&gt;&lt;/em&gt; for many decades.&lt;a href=&#34;https://en.wikipedia.org/wiki/Meanings%5Fof%5Fminor-planet%5Fnames:%5F213001%E2%80%93214000#cite%5Fnote-MPC-Circulars-Archive-2&#34;&gt;[2]&lt;/a&gt; Recent citations can also be found on the &lt;a href=&#34;https://en.wikipedia.org/wiki/JPL%5FSmall-Body%5FDatabase&#34; title=&#34;JPL Small-Body Database&#34;&gt;JPL Small-Body Database&lt;/a&gt; (SBDB).&lt;a href=&#34;https://en.wikipedia.org/wiki/Meanings%5Fof%5Fminor-planet%5Fnames:%5F213001%E2%80%93214000#cite%5Fnote-JPL-Discovery-3&#34;&gt;[3]&lt;/a&gt; Until his death in 2016, German astronomer &lt;a href=&#34;https://en.wikipedia.org/wiki/Lutz%5FD.%5FSchmadel&#34; title=&#34;Lutz D. Schmadel&#34;&gt;Lutz D. 
Schmadel&lt;/a&gt; compiled these citations into the &lt;em&gt;Dictionary of Minor Planet Names&lt;/em&gt; (DMP) and regularly updated the collection.&lt;a href=&#34;https://en.wikipedia.org/wiki/Meanings%5Fof%5Fminor-planet%5Fnames:%5F213001%E2%80%93214000#cite%5Fnote-DoMPN-4&#34;&gt;[4]&lt;/a&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/Meanings%5Fof%5Fminor-planet%5Fnames:%5F213001%E2%80%93214000#cite%5Fnote-DoMPN-Addendum-5&#34;&gt;[5]&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Based on &lt;a href=&#34;https://en.wikipedia.org/wiki/Paul%5FHerget&#34; title=&#34;Paul Herget&#34;&gt;Paul Herget&lt;/a&gt;&amp;rsquo;s &lt;em&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/The%5FNames%5Fof%5Fthe%5FMinor%5FPlanets&#34; title=&#34;The Names of the Minor Planets&#34;&gt;The Names of the Minor Planets&lt;/a&gt;&lt;/em&gt;,&lt;a href=&#34;https://en.wikipedia.org/wiki/Meanings%5Fof%5Fminor-planet%5Fnames:%5F213001%E2%80%93214000#cite%5Fnote-Herget-6&#34;&gt;[6]&lt;/a&gt; Schmadel also researched the unclear origin of numerous asteroids, most of which had been named prior to World War II.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Check out these pages
| &lt;a href=&#34;https://en.wikipedia.org/wiki/Meanings_of_minor-planet_names:_85001%E2%80%9386000&#34;&gt;85001-86000&lt;/a&gt;
| &lt;a href=&#34;https://en.wikipedia.org/wiki/Meanings_of_minor-planet_names:_213001%E2%80%93214000&#34;&gt;213001-214000&lt;/a&gt;
| &lt;a href=&#34;https://en.wikipedia.org/wiki/Meanings_of_minor-planet_names:_269001%E2%80%93270000&#34;&gt;269001-270000&lt;/a&gt;
| &lt;a href=&#34;https://en.wikipedia.org/wiki/Meanings_of_minor-planet_names:_380001%E2%80%93381000&#34;&gt;380001-381000&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This is not the only such common sentence. There are several more.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://sanand0.github.io/datastories/longest-wikipedia-string/screenshot.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s the Slovakia census note: 81 words that appear across &lt;strong&gt;2,920 Wikipedia pages&lt;/strong&gt;, like
&lt;a href=&#34;https://en.wikipedia.org/wiki/Sabinov%5FDistrict&#34;&gt;Sabinov District&lt;/a&gt;,
&lt;a href=&#34;https://en.wikipedia.org/wiki/Smolenice&#34;&gt;Smolenice&lt;/a&gt;,
&lt;a href=&#34;https://en.wikipedia.org/wiki/Ilija,%5FSlovakia&#34;&gt;Ilija, Slovakia&lt;/a&gt;,
&lt;a href=&#34;https://en.wikipedia.org/wiki/Balo%C5%88&#34;&gt;Baloň&lt;/a&gt;, &amp;hellip; and thousands more!&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Note on population: The difference between the population numbers above and in the census (here and below) is that the population numbers above are mostly made up of permanent residents, etc.; and the census should indicate the place where people actually mainly live. For example, a student is a citizen of a village because they have permanent residence there (they lived there as a child and has parents), but most of the time he studies at a university in the city&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: As of 26 May 2026, this has been shortened to:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Note on population: The difference values of population numbers in the table &amp;ldquo;Population statistic&amp;rdquo; and in the sections &amp;ldquo;Ethnicity&amp;rdquo; &amp;amp; &amp;ldquo;Religion&amp;rdquo; is caused by the use of various statistical methods.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;p&gt;There are several more such passages that you can read about in &lt;a href=&#34;https://sanand0.github.io/datastories/longest-wikipedia-string/&#34;&gt;The Paragraph That Appears 418 Times&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;That also includes how &lt;a href=&#34;https://openai.com/codex/&#34;&gt;Codex&lt;/a&gt; analyzed the &lt;a href=&#34;https://huggingface.co/datasets/wikimedia/structured-wikipedia&#34;&gt;Wikipedia structured dataset on Hugging Face&lt;/a&gt; and what else you can do with the data.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Correcting instruction debt</title>
      <link>https://www.s-anand.net/blog/correcting-instruction-debt/</link>
      <pubDate>Mon, 25 May 2026 16:30:02 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/correcting-instruction-debt/</guid>
      <description>&lt;p&gt;Here&amp;rsquo;s another AI-generated post, with &lt;em&gt;&lt;strong&gt;Anand&lt;/strong&gt;&lt;/em&gt; editor notes. But I&amp;rsquo;ve also added my own version of the post below.&lt;/p&gt;
&lt;!--
RESEARCH: https://chatgpt.com/c/6a13dc2a-4d28-83ec-ae9b-016cd00b13a9
BLOG: https://claude.ai/chat/17173628-db14-42a4-b9c4-1aff78d5edf3
--&gt;
&lt;section ai-disclosure=&#34;ai-generated&#34; data-ai-model=&#34;claude-sonnet-4.6&#34; data-ai-provider=&#34;Anthropic&#34;&gt;
&lt;p&gt;I told my &amp;ldquo;&lt;a href=&#34;https://github.com/sanand0/scripts/blob/48a095e0a956841a8e395345d31f79b1ae99d4a6/freeslots.py&#34;&gt;find a free calendar slot&lt;/a&gt;&amp;rdquo; script to &amp;ldquo;&lt;a href=&#34;https://github.com/sanand0/scripts/commit/a8f5fb9eb5640b828618a33f2c89252e67664487#r186420170&#34;&gt;Avoid weekends and holidays&lt;/a&gt;&amp;rdquo;. Wednesday vanished. Turns out it&amp;rsquo;s a Singapore holiday (&lt;em&gt;&lt;strong&gt;Anand&lt;/strong&gt;: It&amp;rsquo;s &lt;a href=&#34;https://en.wikipedia.org/wiki/Eid_al-Adha&#34;&gt;Eid al-Adha&lt;/a&gt;&lt;/em&gt;), — irrelevant for the people I was meeting in other zones. &lt;strong&gt;I&amp;rsquo;d debugged my own helpful rule.&lt;/strong&gt; (&lt;em&gt;&lt;strong&gt;Anand&lt;/strong&gt;: What? What does &amp;ldquo;debugged my own helpful rule&amp;rdquo; even mean?&lt;/em&gt;)&lt;/p&gt;
&lt;p&gt;Annoying. But revealing. I went to ChatGPT — not to fix the script, but to think: (&lt;em&gt;&lt;strong&gt;Anand&lt;/strong&gt;: True. I had no clue what to do.&lt;/em&gt;)&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Could you maybe interview me to figure out what direction I might want to take this train of thought in&amp;hellip;? Just ask me two or three questions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(The whole point was to &lt;em&gt;not&lt;/em&gt; take on more work myself.)&lt;/p&gt;
&lt;p&gt;Two questions in, it named it: &lt;strong&gt;instruction debt.&lt;/strong&gt; (&lt;em&gt;&lt;strong&gt;Anand&lt;/strong&gt;: which is such a cool term that I&amp;rsquo;ll keep it.&lt;/em&gt;)&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Not &amp;ldquo;bad instructions,&amp;rdquo; because the original instruction was reasonable. The debt is created when a rule that once reduced cognitive load later creates invisible work, missed options, brittle behavior, or debugging cost.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That hit. The script obeyed too literally. I got no warning. Worst of all, I&amp;rsquo;d scored a self-goal — given my future self an instruction that would bother me, while believing I was being helpful.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;I asked it to research further — and to mine my own agent logs as evidence. (&lt;a href=&#34;https://www.s-anand.net/blog/how-i-use-local-mcp/&#34;&gt;Local MCP&lt;/a&gt; runs bash; ChatGPT can read &lt;code&gt;~/.codex&lt;/code&gt;, &lt;code&gt;~/.claude&lt;/code&gt;, &lt;code&gt;~/.copilot&lt;/code&gt; and run &lt;code&gt;~/code/scripts/agentlog.py&lt;/code&gt; directly.) It came back with a taxonomy. I asked it to stress-test against more correction turns and &lt;strong&gt;discard what didn&amp;rsquo;t survive&lt;/strong&gt;. (&lt;em&gt;&lt;strong&gt;Anand&lt;/strong&gt;: Basically, I said, analyze my logs.&lt;/em&gt;)&lt;/p&gt;
&lt;p&gt;It did. The robust categories, each grounded in an actual correction I&amp;rsquo;d made:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Objective framing&lt;/strong&gt; — &amp;ldquo;don&amp;rsquo;t base teachability on scores… base it on the pattern of errors.&amp;rdquo; Wrong proxy. (&lt;em&gt;&lt;strong&gt;Anand&lt;/strong&gt;: Oh, yeah, I was trying to find patterns of errors in student submissions.&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Evidence/modeling&lt;/strong&gt; — Ticketmaster classifier overfit on &lt;code&gt;venue_name&lt;/code&gt;. Predictive, not causal. (&lt;em&gt;&lt;strong&gt;Anand&lt;/strong&gt;: True. Stupid model said, &amp;ldquo;tickets in this stadium sell more&amp;rdquo; as if it were actionable.&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Constraint semantics&lt;/strong&gt; — the Singapore holiday. Hard filter where a warning would do.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;State/action&lt;/strong&gt; — Darwinbox: &amp;ldquo;Click Clockin&amp;rdquo; clocked me &lt;em&gt;out&lt;/em&gt;. No pre/post-state check. (&lt;em&gt;&lt;strong&gt;Anand&lt;/strong&gt;: The button said &amp;ldquo;Clock in OR out&amp;rdquo;. I was clocked in. It clicked, thinking that&amp;rsquo;ll clock me in, without seeing that the button was already pressed.&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Representation/path&lt;/strong&gt; — blog migration: &amp;ldquo;ALL LINKS relative&amp;rdquo; broke nested URLs. (&lt;em&gt;&lt;strong&gt;Anand&lt;/strong&gt;: Yeah, relative links in my blog have been problematic for 20 years.&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Validation&lt;/strong&gt; — OCBC PDF: row balances passed, totals failed by SGD 6.9M. (&lt;em&gt;&lt;strong&gt;Anand&lt;/strong&gt;: I&amp;rsquo;m nowhere near this rich. Codex just messed up &lt;em&gt;badly&lt;/em&gt;.&lt;/em&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;ChatGPT&amp;rsquo;s own self-critique was the best part:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&amp;ldquo;Lack of carefulness&amp;rdquo; should not be a category. It is not actionable. (&lt;em&gt;&lt;strong&gt;Anand&lt;/strong&gt;: No idea what this means!&lt;/em&gt;)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr&gt;
&lt;p&gt;Then the pivot. It proposed a &lt;strong&gt;60-line &amp;ldquo;Operating Contract&amp;rdquo;&lt;/strong&gt; for my &lt;code&gt;AGENTS.md&lt;/code&gt;. I pushed back:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The operating contract is WAY too long. I was thinking 1 line, not 60&amp;hellip; fast and frugal heuristics that cover the majority of the scenarios, rather than hard-coding everything, is what we&amp;rsquo;re suggesting coding agents do in the first place.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It came back with three lines. &lt;a href=&#34;https://github.com/sanand0/scripts/commit/7be44855ac6063a364163181585b8eb5721fc469#r186421964&#34;&gt;I pasted them in verbatim&lt;/a&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;For non-trivial tasks, define the user-visible invariant: &amp;#34;done means ___&amp;#34;; verify that invariant before claiming success.
Treat constraints as soft preferences unless safety, privacy, data loss, credentials, or the current request makes them hard; surface any constraint that filters, skips, blocks, or deletes.
Prefer simple, rerunnable changes: inspect real inputs/state first, use existing tools/libs, log counts/examples, and call out uncertainty.
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Line 1 catches OCBC and clock-in. Line 2 catches the Singapore holiday. Line 3 catches the toil.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Don&amp;rsquo;t add the rule to your head. Add it to the file the agent already reads.&lt;/strong&gt; (&lt;em&gt;&lt;strong&gt;Anand&lt;/strong&gt;: Oh, so totally true!&lt;/em&gt;)&lt;/p&gt;
&lt;/section&gt;
&lt;hr&gt;
&lt;p&gt;Actually, the first half of the above AI-generated post didn&amp;rsquo;t really resonate with me. So let me explain in my own words what I did.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I found that, for some reason, this Wednesday never appears when I ask to &lt;a href=&#34;https://github.com/sanand0/scripts/blob/48a095e0a956841a8e395345d31f79b1ae99d4a6/freeslots.py&#34;&gt;find a free calendar slot&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;I asked Codex, &amp;ldquo;Why on earth is this happening?&amp;rdquo; It said, &lt;a href=&#34;https://github.com/sanand0/scripts/commit/a8f5fb9eb5640b828618a33f2c89252e67664487#r186420170&#34;&gt;because you told me to exclude holidays&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;That got me thinking, where am I giving instructions that shoot me in the foot? And ChatGPT did a long, detailed analysis of my coding agent logs and came up with a bunch of examples and categorization.&lt;/li&gt;
&lt;li&gt;I didn&amp;rsquo;t bother reading it. I told it in &lt;a href=&#34;https://www.google.com/search?q=henry+kissinger+is+that+the+best+you+can+do&#34;&gt;Henry Kissinger style&lt;/a&gt;: can you do better?&lt;/li&gt;
&lt;li&gt;I didn&amp;rsquo;t bother reading it again. I told it, &amp;ldquo;Just tell me what to put into AGENTS.md&amp;rdquo;. I don&amp;rsquo;t want to do the work every time. &lt;strong&gt;YOU&lt;/strong&gt; do the work. Automate it!&lt;/li&gt;
&lt;li&gt;It gave me 60 lines. I said, &amp;ldquo;What rubbish! I can&amp;rsquo;t review 60. Just 3, max.&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/sanand0/scripts/commit/7be44855ac6063a364163181585b8eb5721fc469#r186421964&#34;&gt;I copied that into AGENTS.md&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;ol&gt;
&lt;li&gt;For non-trivial tasks, define the user-visible invariant: &amp;ldquo;done means ___&amp;rdquo;; verify that invariant before claiming success.&lt;/li&gt;
&lt;li&gt;Treat constraints as soft preferences unless safety, privacy, data loss, credentials, or the current request makes them hard; surface any constraint that filters, skips, blocks, or deletes.&lt;/li&gt;
&lt;li&gt;Prefer simple, rerunnable changes: inspect real inputs/state first, use existing tools/libs, log counts/examples, and call out uncertainty.&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
&lt;p&gt;The first makes &lt;em&gt;total&lt;/em&gt; sense. Define &amp;ldquo;done&amp;rdquo;.&lt;br&gt;
The second makes &lt;em&gt;some&lt;/em&gt; sense - that&amp;rsquo;s exactly what I did wrong with the calendar.&lt;br&gt;
The third is supposed to &amp;ldquo;handle my recurring style&amp;rdquo; - and &lt;em&gt;kind of&lt;/em&gt; makes sense, so I&amp;rsquo;ll let it be.&lt;/p&gt;
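&lt;p&gt;A minimal sketch of the second rule applied to the calendar case. The &lt;code&gt;free_slots&lt;/code&gt; helper and the sample dates are hypothetical (this is not the actual &lt;code&gt;freeslots.py&lt;/code&gt;); the point is that a holiday is surfaced as a warning rather than silently filtered out.&lt;/p&gt;

```python
from datetime import date

def free_slots(days, holidays, busy):
    """Return (day, warning) pairs for every day that is not busy.

    Holidays are a soft constraint: they stay in the result, with a
    warning, instead of being dropped without a trace.
    """
    slots = []
    for day in days:
        if day in busy:
            continue  # hard constraint: the day is already booked
        warning = "holiday" if day in holidays else None
        slots.append((day, warning))
    return slots

# Hypothetical data: three days, one booked, one public holiday.
days = [date(2026, 5, 26), date(2026, 5, 27), date(2026, 5, 28)]
holidays = {date(2026, 5, 27)}
busy = {date(2026, 5, 26)}

for day, warning in free_slots(days, holidays, busy):
    note = f" (warning: {warning})" if warning else ""
    print(f"{day.isoformat()}{note}")
```

&lt;p&gt;The holiday still shows up in the output, so the person asking gets to decide whether it is really off limits.&lt;/p&gt;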
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-25-correcting-instruction-debt.avif&#34;&gt;&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Creating comic explainers</title>
      <link>https://www.s-anand.net/blog/creating-comic-explainers/</link>
      <pubDate>Sun, 24 May 2026 16:48:58 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/creating-comic-explainers/</guid>
      <description>&lt;p&gt;&lt;a href=&#34;https://www.linkedin.com/in/lori-silverstein-b9baa03/&#34;&gt;Lori Silverstein&lt;/a&gt; shared a &lt;a href=&#34;https://www.linkedin.com/feed/update/urn:li:activity:7462864729913503744/&#34;&gt;post from Quickplay&lt;/a&gt; that featured a comic explainer, mentioning that &amp;ldquo;this could be a very impactful way for us to start being more creative &amp;hellip; and differentiate our value proposition.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-24-quickplay-comic.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;True. Comic explainers convey both creativity &lt;em&gt;and&lt;/em&gt; differentiation.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve used &lt;a href=&#34;https://www.s-anand.net/blog/gemini-sketchnotes/&#34;&gt;sketchnotes&lt;/a&gt; for the same effect, but comic explainers are easier to follow than sketchnotes.&lt;/p&gt;
&lt;p&gt;So I fed this image to ChatGPT and &lt;a href=&#34;https://chatgpt.com/share/6a12bd89-5274-83ec-827c-2446d0be19d2&#34;&gt;asked it to modify my Sketchnote prompt&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;How would I modify this prompt to draw a Scott McCloud style explainer comic page in color? I&amp;rsquo;m looking for the way in which he explained Google Chrome when it was released, but with more vibrant colors. Something like the attached image is good for me.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;Draw this as a visually rich, intricately detailed, colorful, and funny, sketchnote (square 1:1).
Use comic-style font in caps.
Keep the text to under 300 words. Prefer evocative imagery over text.
Think about the most important points, structure it logically so that the sketchnote is easy to follow, then draw it.
&lt;/code&gt;&lt;/pre&gt;&lt;/blockquote&gt;
&lt;p&gt;It gave me a prompt which I&amp;rsquo;ve iterated on a few times. This is the &lt;a href=&#34;https://github.com/sanand0/blog/blob/6e1af00d0bc593f3b88bddf57416b533d558c3a3/pages/prompts/fragments.md#comic-page&#34;&gt;comic page prompt&lt;/a&gt; I currently use:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Draw this as a full-color explainer comic page (portrait) - sequential explanation, friendly narrator, diagrams embedded inside panels, visual metaphors, self-aware captions, and clear cause-and-effect storytelling.
Style: expressive characters, comic-style ALL CAPS, vibrant modern colors, clear visual hierarchy.
Prefer pictures over words. Use recurring visual metaphors so the reader understands the idea even while skimming.
Think about the most important points, structure it as a memorable story.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Some examples of the output:&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://sanand0.github.io/talks/2026-05-23-ai-unboxed-context-engineering/&#34;&gt;What Your AI Doesn&amp;rsquo;t Know About You&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://sanand0.github.io/talks/2026-05-23-ai-unboxed-context-engineering/&#34;&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://sanand0.github.io/talks/2026-05-23-ai-unboxed-context-engineering/comic-page.avif&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://www.s-anand.net/blog/where-enterprise-ai-is-headed/&#34;&gt;Where Enterprise AI is Headed&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://www.s-anand.net/blog/where-enterprise-ai-is-headed/&#34;&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-23-where-enterprise-ai-is-headed.avif&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Though AI makes it easy to create comic explainers, sketchnotes, etc., I expect we might see &lt;em&gt;less&lt;/em&gt; of them.&lt;/p&gt;
&lt;p&gt;Why?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Excel made &lt;a href=&#34;https://en.wikipedia.org/wiki/William_Playfair&#34;&gt;Playfair&lt;/a&gt; style charts &lt;em&gt;less&lt;/em&gt; common with a deluge of bar charts.&lt;/li&gt;
&lt;li&gt;AI will make templatized slides &lt;em&gt;so much easier&lt;/em&gt; that comic explainers will be drowned out.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But creative people like &lt;a href=&#34;https://pudding.cool/&#34;&gt;The Pudding&lt;/a&gt; will likely use AI to create &lt;em&gt;even&lt;/em&gt; more innovative formats. Something I&amp;rsquo;m looking forward to.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-24-creating-comic-explainers.avif&#34;&gt;&lt;/p&gt;
&lt;!--

- Future of Comic Explainers - Creativity vs standardization with AI
  - https://chatgpt.com/c/6a12bf20-28a8-83ec-8a6f-5b20f137d4fe
  - https://claude.ai/chat/92bd7c3a-7de8-4106-a5d8-b39f92cca1be

--&gt;
</description>
    </item>
    <item>
      <title>Where Enterprise AI is headed</title>
      <link>https://www.s-anand.net/blog/where-enterprise-ai-is-headed/</link>
      <pubDate>Sat, 23 May 2026 08:24:19 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/where-enterprise-ai-is-headed/</guid>
      <description>&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-23-where-enterprise-ai-is-headed.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;A podcast host sent me eight questions. Instead of rehearsing answers in my head, I used ChatGPT with &lt;a href=&#34;https://www.s-anand.net/blog/local-mcp/&#34;&gt;Local MCP&lt;/a&gt; to read 6 months of call transcripts and find the best examples: &lt;!-- https://chatgpt.com/c/6a10e52f-155c-83ec-9930-e61350c5a72b --&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Iteration 1&lt;/strong&gt;: Here are questions I have been asked to answer in a podcast. Help me prepare with examples. For each question, go through my transcripts or emails and find examples relevant to the question and share (for each relevant example) a summary, how it&amp;rsquo;s relevant, and the relevant verbatim quotes from the transcript.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Iteration 2&lt;/strong&gt;: Mention WHO said it. &lt;em&gt;Emphasize&lt;/em&gt; the most important parts. Do a second pass. More examples. Disprove your own hypotheses with evidence to the contrary and retain what remains robust.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Iteration 3&lt;/strong&gt;: Do a third pass. Find more real-life examples. Try and disprove yourself even harder. Share the best examples for what survives - not all. Same format.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Iteration 4&lt;/strong&gt;: Ensure diversity of client examples. For example, in Q2, all three are the same client. Extend to add / replace examples - ideally with better ones.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Then I used Claude with examples of my writing style to summarize it in my voice. &lt;!-- https://claude.ai/chat/87554c18-0cb5-4667-9ef7-584a310f17fb --&gt;&lt;/p&gt;
&lt;p&gt;For the first time, I&amp;rsquo;m happy to publish an AI-written blog post, because it is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Unique to me&lt;/strong&gt;. No one else has my transcripts, and I&amp;rsquo;m in an unusual position: enterprises keep calling me with AI problems.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;More than me&lt;/strong&gt;. Step #1 takes hours of research. Step #2 takes an hour of patience. I would not do this without AI.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Not unlike me&lt;/strong&gt;. I would have written it &lt;em&gt;slightly&lt;/em&gt; differently. Better in some ways, worse in others, but it&amp;rsquo;s close to my style.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Given how comfortable I am about this, I plan to be - not just an author, but also - an editor of AI generating from &lt;em&gt;my&lt;/em&gt; content.&lt;/p&gt;
&lt;p&gt;This article uses verbatim transcripts where possible. I&amp;rsquo;ve anonymized clients and most colleagues. I&amp;rsquo;ve annotated the post with (&lt;strong&gt;Anand&lt;/strong&gt;: &amp;hellip;) commenting with my understanding.&lt;/p&gt;
&lt;section ai-disclosure=&#34;ai-generated&#34; data-ai-model=&#34;claude-sonnet-4.6&#34; data-ai-provider=&#34;Anthropic&#34;&gt;
&lt;h2 id=&#34;1-so-what-is-straive-and-what-do-i-do-there&#34;&gt;1. So what is Straive, and what do I do there?&lt;/h2&gt;
&lt;p&gt;I&amp;rsquo;ll let &lt;a href=&#34;https://www.linkedin.com/in/namit-sureka-43ab89/&#34;&gt;Namit&lt;/a&gt; explain it. He said this two weeks ago in a pitch to a European credit-insurance client:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The focus for &lt;a href=&#34;https://www.straive.com/&#34;&gt;Straive&lt;/a&gt; is helping its clients &lt;em&gt;operationalize AI&lt;/em&gt;. For that, we bring two apparently distinct capabilities together&amp;hellip; &lt;em&gt;data analytics and tech development&lt;/em&gt;&amp;hellip; and &lt;em&gt;large-scale operations&lt;/em&gt;. Where we come in is &lt;em&gt;bringing these together and bridging the gap&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That&amp;rsquo;s the company. We have about 8,000 people in India - Bangalore, Hyderabad, Chennai, Gurgaon, Noida, Mumbai, Kolkata, Pune. (&lt;strong&gt;Anand&lt;/strong&gt;: Globally it&amp;rsquo;s probably 18K.)&lt;/p&gt;
&lt;p&gt;My job is innovation. In the same call, I described it as:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I lead innovation at Straive. Most of my work involves playing around with &lt;em&gt;Large Language Models&lt;/em&gt;, trying to see how they can &lt;em&gt;accelerate our client work&lt;/em&gt; as well as deliver &lt;em&gt;new kinds of solutions&lt;/em&gt;. That includes &lt;em&gt;improving the software development life cycle&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I introduce myself as an &lt;a href=&#34;https://www.s-anand.net/blog/the-llm-psychologist/&#34;&gt;LLM psychologist&lt;/a&gt; when nobody&amp;rsquo;s watching for corporate decorum. Half of my week is demos for clients. The other half is figuring out why those demos haven&amp;rsquo;t reached production yet. (More on that in question 2.)&lt;/p&gt;
&lt;h2 id=&#34;2-why-do-so-many-enterprise-ai-pilots-stall&#34;&gt;2. Why do so many enterprise AI pilots stall?&lt;/h2&gt;
&lt;p&gt;Not for one reason. I keep a mental list of stall patterns. Three of them come up almost every week.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pattern 1: The pilot worked, but nobody is delivering it.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In a sync with Namit a few months ago, I caught myself saying:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;At a global media client, I am a little worried that &lt;em&gt;the engagement keeps growing and we haven&amp;rsquo;t delivered anything yet&lt;/em&gt;&amp;hellip; Right now, we&amp;rsquo;ve been given &lt;em&gt;proposal after proposal after proposal&lt;/em&gt;&amp;hellip; &lt;em&gt;Nothing has gone to getting deployed so that someone other than our team can use it.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Namit&amp;rsquo;s reply was a single, useful sentence: &amp;ldquo;But they are &lt;em&gt;not in the execution phase&lt;/em&gt;?&amp;rdquo; That was the gap. We had impressive demos. We had no delivery owner.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pattern 2: The data can&amp;rsquo;t move.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;For a global premium-schools group, the on-site data lead told me:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This is the data set that is at the most granular level. There are around &lt;em&gt;400,000 rows&lt;/em&gt;&amp;hellip; and around &lt;em&gt;110 columns&lt;/em&gt;&amp;hellip; They cannot export it&amp;hellip; we cannot export this outside externally.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The pilot didn&amp;rsquo;t fail. The architecture failed. We had to redesign the entire engagement around the constraint: schema, profiling stats, sample rows, hypotheses, and queries flow out; raw data stays in. (Knowledge infrastructure as a workaround for missing data infrastructure. See question 4.)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pattern 3: Teams debate frameworks instead of evals.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A private-markets investor wanted to lock the &amp;ldquo;agentic framework&amp;rdquo; by end of week. Their team was comparing LangGraph vs OpenAI Agent SDK vs Pydantic AI. I told them, more bluntly than I should have:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The technical solution may not matter too much because this is moving so fast that &lt;em&gt;anything we built will anyway be outdated in not more than a year&lt;/em&gt;&amp;hellip; It almost doesn&amp;rsquo;t matter which of these&amp;hellip; &lt;em&gt;the effort on the code is the least of our problems&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Pilots stall there too - not because the framework is wrong, but because the question is wrong. Without evals and acceptance criteria, no framework choice will rescue the project.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The thing that survives all three patterns&lt;/strong&gt;: pilots prove that a model can produce a good answer once. Production proves the operating model.&lt;/p&gt;
&lt;h2 id=&#34;3-what-operational-gaps-stop-ai-from-scaling&#34;&gt;3. What operational gaps stop AI from scaling?&lt;/h2&gt;
&lt;p&gt;Telemetry. Objective clarity. Repeatable loops. In that order.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Telemetry, not surveys.&lt;/strong&gt; Our L&amp;amp;D lead asked me how to assess AI readiness across 19,000 employees without sounding like a particular Big Consulting firm threatening people&amp;rsquo;s promotions. I suggested:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I would go to the IT team and ask them for three things&amp;hellip; using NetSkope, who has been accessing AI-related sites on &lt;em&gt;how many unique days in the past 90 days&lt;/em&gt;&amp;hellip; &lt;em&gt;Regularity matters more than volume&lt;/em&gt;&amp;hellip; LLM Foundry access. They have the logs for that. Third, Google Workspace tracks Gemini usage&amp;hellip; &lt;em&gt;These three give us a good company-wide proxy for AI usage&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;He paused, then said it was better than a survey. &lt;strong&gt;You cannot scale AI adoption without knowing who is adopting it.&lt;/strong&gt; Self-reports won&amp;rsquo;t tell you. Logs will.&lt;/p&gt;
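&lt;p&gt;A rough sketch of the &amp;ldquo;unique days in the past 90 days&amp;rdquo; metric. It assumes the logs can be reduced to hypothetical &lt;code&gt;(user, day)&lt;/code&gt; pairs; the real NetSkope, LLM Foundry, or Google Workspace exports will look different, and the user names are made up.&lt;/p&gt;

```python
from collections import defaultdict
from datetime import date, timedelta

def active_days(events, today, window_days=90):
    """Count the unique days each user appears in the trailing window.

    events: iterable of (user, day) pairs, a hypothetical log format.
    Repeated visits on the same day count once: regularity, not volume.
    """
    seen = defaultdict(set)
    for user, day in events:
        # Keep days from today back to (window_days - 1) days ago.
        if (today - day).days in range(window_days):
            seen[user].add(day)
    return {user: len(days) for user, days in seen.items()}

# Hypothetical log rows: two hits on one day count as a single active day.
events = [
    ("asha", date(2026, 5, 20)),
    ("asha", date(2026, 5, 20)),
    ("asha", date(2026, 5, 21)),
    ("ravi", date(2026, 1, 5)),  # outside the 90-day window
]
print(active_days(events, today=date(2026, 5, 22)))
```

&lt;p&gt;Counting distinct days rather than requests is what makes this a regularity measure: one heavy session and one light session score the same.&lt;/p&gt;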
&lt;p&gt;&lt;strong&gt;Objective clarity beats agent architecture.&lt;/strong&gt; (&lt;strong&gt;Anand&lt;/strong&gt;: KISS: Keep it simple &amp;amp; stupid.) A teaching assistant in my IIT Madras course built an elaborate agentic workflow tool - planner agents, executor agents, sub-agents reporting to leaders. After fifteen minutes, I said:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You&amp;rsquo;ve been speaking for 15 minutes and &lt;em&gt;I haven&amp;rsquo;t understood what you want. I don&amp;rsquo;t know if you understood what you want&lt;/em&gt;&amp;hellip; You mentioned two objectives: &lt;em&gt;learning traces and helping students learn. We should keep those as two different tools&lt;/em&gt;&amp;hellip; For the learning traces, the &lt;em&gt;minimal solution is a terminal command&lt;/em&gt;. It should authenticate them with their Google account and log all the inputs and the outputs, save it in a signed document that is tamper-proof, that we can replay.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;200 lines of Python, not a multi-agent framework. (He took it well, I think.) The operational gap was: nobody had separated the two objectives, so every solution looked too complex.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Repeatable loops beat heroics.&lt;/strong&gt; (&lt;strong&gt;Anand&lt;/strong&gt;: Iterate. Compound improvement.) An internal team complained they couldn&amp;rsquo;t ship because the developers were on other work. I told them:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You can try. &lt;em&gt;Don&amp;rsquo;t worry about what is not working. Just write it down.&lt;/em&gt; I tried this, this is working this way, this is not working in this way.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The gap wasn&amp;rsquo;t developer capacity. It was the absence of a &amp;ldquo;try, document, learn, repeat&amp;rdquo; loop that anyone could run.&lt;/p&gt;
&lt;h2 id=&#34;4-why-does-content-and-knowledge-infrastructure-matter-as-much-as-cloud&#34;&gt;4. Why does content and knowledge infrastructure matter as much as cloud?&lt;/h2&gt;
&lt;p&gt;Because the model is generic. Your business meaning is not. (&lt;strong&gt;Anand&lt;/strong&gt;: Each company has their own ways of working.)&lt;/p&gt;
&lt;p&gt;A delivery lead working at the global premium-schools client kept hitting the same wall. The bottleneck wasn&amp;rsquo;t access. It was semantics:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The real bottleneck is not access; it&amp;rsquo;s &lt;em&gt;shared semantics&lt;/em&gt;: &amp;lsquo;Acceptance date,&amp;rsquo; &amp;lsquo;account ID,&amp;rsquo; &amp;lsquo;boarding type,&amp;rsquo; &amp;lsquo;inquiry journey&amp;rsquo; - these can mean subtly different things across systems.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That is knowledge infrastructure. Definitions. Update rules. What &amp;ldquo;acceptance date&amp;rdquo; means when a stage is updated vs appended. No model knows this until you write it down.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;At the European credit-insurance pitch, we made this explicit.&lt;/strong&gt; A senior delivery architect on our side told the client:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We create a &lt;em&gt;Confluence setup&lt;/em&gt;, bring in everything that&amp;rsquo;s not already there on Confluence and create a comprehensive Confluence setup&amp;hellip; That becomes the input for our &lt;em&gt;agentic implementations&lt;/em&gt; as well. That becomes the &lt;em&gt;data room&lt;/em&gt; from where the agents draw the knowledge to perform the actions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The Confluence wasn&amp;rsquo;t the deliverable. It was the substrate that made every later agent deliverable possible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;On a CPG analytics product demo&lt;/strong&gt;, the founder explained their &amp;ldquo;definition library&amp;rdquo;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This is where we&amp;rsquo;re configuring the &lt;em&gt;DNA of the agents&lt;/em&gt;&amp;hellip; We call the &lt;em&gt;domain definitions&lt;/em&gt;. We also call it the &lt;em&gt;definition library&lt;/em&gt;&amp;hellip; It&amp;rsquo;s not just a wrapper around ChatGPT. It&amp;rsquo;s something that&amp;rsquo;s very grounded in &lt;em&gt;domain-specific definitions&lt;/em&gt; that avoids &lt;em&gt;hallucinations, non-deterministic output&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I keep coming back to this. &lt;strong&gt;Cloud is where the model runs. Knowledge infrastructure is what the model knows.&lt;/strong&gt; Skip the second, and you have a very expensive autocomplete.&lt;/p&gt;
&lt;h2 id=&#34;5-what-do-india-led-capability-centers-add&#34;&gt;5. What do India-led capability centers add?&lt;/h2&gt;
&lt;p&gt;They convert AI demos into reliable processes. That&amp;rsquo;s not a slogan. It&amp;rsquo;s the only thing that actually scales. (&lt;strong&gt;Anand&lt;/strong&gt;: You need people to operate the AI machinery.)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;On the European credit-insurance engagement&lt;/strong&gt;, the client&amp;rsquo;s IT lead described his Bangalore team. &lt;a href=&#34;https://www.linkedin.com/in/jishnu-gupta-1a3a29/&#34;&gt;Jishnu&lt;/a&gt;&amp;rsquo;s response was telling:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We also want to absolutely be open and also &lt;em&gt;retain some of that knowledge&lt;/em&gt;, because as we transition, those will be critical, the knowledge that is inherent in your people and processes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The center isn&amp;rsquo;t a labor pool. It&amp;rsquo;s a knowledge sink. Without that retention, AI workflows lose context within months.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Better example&lt;/strong&gt;: a media-intelligence client picked us because our AI model scored higher than theirs and higher than humans. The numbers were:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Their model: 40% accuracy&lt;/li&gt;
&lt;li&gt;Their human reviewers: 65% accuracy&lt;/li&gt;
&lt;li&gt;Our model: 70% accuracy&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But 30% of cases were still outliers. So we set up an operations team in India to handle those exceptions. AI plus humans, with the humans owning the exception path. We now have about 150 people doing similar work for a global short-video platform out of Hyderabad and Chennai. (&lt;strong&gt;Anand&lt;/strong&gt;: This is a claim I heard in a pitch. I don&amp;rsquo;t have evidence. So, if it&amp;rsquo;s untrue, it&amp;rsquo;s human hallucination, not AI.)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Closer to my own work&lt;/strong&gt;: we have a Hyderabad team that trains coding models. (&lt;strong&gt;Anand&lt;/strong&gt;: Actually, we don&amp;rsquo;t. &lt;a href=&#34;https://www.linkedin.com/in/rukeshreddy/&#34;&gt;Rukesh&lt;/a&gt; of &lt;a href=&#34;https://www.deccan.ai/&#34;&gt;Deccan.ai&lt;/a&gt; does. This is AI hallucination.) About 100 full-time reviewers and 200-300 contractors. The full-timers don&amp;rsquo;t build models - they look at code and rate it, &amp;ldquo;I like this, this is not so good.&amp;rdquo; They&amp;rsquo;re managing reviewers, not writing code. That&amp;rsquo;s a capability center evolving from delivery to AI ops.&lt;/p&gt;
&lt;p&gt;The thing that distinguishes India-led centers in 2026 isn&amp;rsquo;t cost. It&amp;rsquo;s the willingness to own the 30% that AI can&amp;rsquo;t handle yet.&lt;/p&gt;
&lt;h2 id=&#34;6-where-does-governance-actually-bite&#34;&gt;6. Where does governance actually bite?&lt;/h2&gt;
&lt;p&gt;Three places, all real, all from the last quarter.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Compute-to-data, not data-to-cloud.&lt;/strong&gt; (&lt;strong&gt;Anand&lt;/strong&gt;: Move code, not data.) Back to the global premium-schools client. The data could not leave. So the governance pattern became:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If you could share back the &lt;em&gt;output aggregated&lt;/em&gt; of those queries, that will be great&amp;hellip; Get the magnitude and the P-value. Which you can &lt;em&gt;dictate over a call if required&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;We export queries and import aggregated results. The schema travels; the rows don&amp;rsquo;t. &lt;strong&gt;&amp;ldquo;No export&amp;rdquo; turned out to be a product requirement, not a blocker.&lt;/strong&gt;&lt;/p&gt;
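&lt;p&gt;One way to picture the pattern, as a sketch only: the table, columns, and &lt;code&gt;run_aggregate&lt;/code&gt; helper below are hypothetical, and an in-memory SQLite database stands in for the client system. The query goes in, a handful of aggregate rows come out, and no raw row is returned.&lt;/p&gt;

```python
import sqlite3

def run_aggregate(conn, query):
    """Execute an agreed aggregate query and return its small result set."""
    return conn.execute(query).fetchall()

# Hypothetical stand-in for the client's table, which never leaves the site.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inquiries (boarding_type TEXT, accepted INTEGER)")
conn.executemany(
    "INSERT INTO inquiries VALUES (?, ?)",
    [("day", 1), ("day", 0), ("boarding", 1), ("boarding", 1)],
)

# The query travels in; only grouped counts and rates travel out.
query = (
    "SELECT boarding_type, COUNT(*), AVG(accepted) "
    "FROM inquiries GROUP BY boarding_type ORDER BY boarding_type"
)
print(run_aggregate(conn, query))
```

&lt;p&gt;A real exchange would also carry the effect sizes and p-values mentioned in the quote; this only shows the shape of what crosses the boundary.&lt;/p&gt;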
&lt;p&gt;&lt;strong&gt;Honest impossibility.&lt;/strong&gt; A global media client wanted us to scrub PII from 3 million user-uploaded images. Their senior engineering leader insisted on zero leaks. I did the math out loud:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;For 3 million images&amp;hellip; with&amp;hellip; 99%, we&amp;rsquo;re talking about &lt;em&gt;30,000 images&lt;/em&gt; with personally identifiable information potentially slipping through.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;He replied, flatly: &amp;ldquo;We have to have &lt;em&gt;zero leaks&lt;/em&gt;. Not thousands of leaks.&amp;rdquo; I said: &amp;ldquo;Then I think I can safely say &lt;em&gt;we can&amp;rsquo;t do this&lt;/em&gt;. This requires more technology than we have.&amp;rdquo; (&lt;strong&gt;Anand&lt;/strong&gt;: When I said this, our sales teams nearly had a heart attack. So did the client, I think.)&lt;/p&gt;
&lt;p&gt;Trustworthy AI sometimes means saying no. That was a governance decision, not a technical one.&lt;/p&gt;
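&lt;p&gt;The arithmetic behind that answer, as a small sketch. It assumes the worst case in the quote, where any of the 3 million images could contain PII; the &lt;code&gt;expected_leaks&lt;/code&gt; name is made up for illustration.&lt;/p&gt;

```python
def expected_leaks(total_images, detection_rate):
    """Images with PII expected to slip through at a given detection rate."""
    return round(total_images * (1 - detection_rate))

print(expected_leaks(3_000_000, 0.99))    # the 99% case from the call
print(expected_leaks(3_000_000, 0.9999))  # two more nines still leaves leaks
```

&lt;p&gt;At this volume, even a much better detection rate leaves a non-zero number, which is why &amp;ldquo;zero leaks&amp;rdquo; could not be promised.&lt;/p&gt;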
&lt;p&gt;&lt;strong&gt;Local execution for sensitive data.&lt;/strong&gt; At a clinical-data conference, I used our own finance controller (a famously cautious Chennaiite) as an example. He emailed his team:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Team, please use this opportunity to install CodeX AI as per the recorded demo. This is very powerful, yesterday I tried it for two data requests and the result was &lt;em&gt;fantabulous&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The reason he was comfortable: &lt;em&gt;the data is not going to the model. The code is coming from the model.&lt;/em&gt; Codex ran the code on his machine, on the financial records, which never left his laptop. (&lt;strong&gt;Anand&lt;/strong&gt;: Well, kind-of. &lt;em&gt;Some&lt;/em&gt; data does leave, like summaries, previews, etc.)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Three governance patterns, three different problems.&lt;/strong&gt; None of them is policy text. All of them are architecture decisions.&lt;/p&gt;
&lt;h2 id=&#34;7-how-should-we-measure-real-roi&#34;&gt;7. How should we measure real ROI?&lt;/h2&gt;
&lt;p&gt;Cycle time. Quality. New revenue. Risk avoided. Adoption. Not headcount.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cycle time, hard number&lt;/strong&gt;: on the European credit-insurance engagement, our sales lead told the client:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We brought in an &lt;em&gt;AI-based approach solution&lt;/em&gt; to accelerate that entire mapping exercise&amp;hellip; reduced the execution time by about 80%.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That&amp;rsquo;s the easiest ROI to defend. It was an actual XSLT and data-mapping workstream, not a demo.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Quality and effort, blended&lt;/strong&gt;: in a workshop with our research analytics team, an analyst said a CIM (Confidential Information Memorandum) takes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Three to four man-day effort.&lt;/em&gt; A man-day is equivalent to eight hours.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I did the demo live:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In approximately &lt;em&gt;five minutes&lt;/em&gt;, Claude will come up with a pretty solid presentation. In approximately &lt;em&gt;45 minutes&lt;/em&gt;&amp;hellip; ChatGPT will come up with an outrageously detailed presentation&amp;hellip; Those &lt;em&gt;three to four days will come down by 50%&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The 50% saving counts. The &amp;ldquo;five minutes vs three days&amp;rdquo; headline doesn&amp;rsquo;t, because review still takes time. &lt;strong&gt;Honest ROI includes the verification effort.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Revenue, not just cost&lt;/strong&gt;: one of our innovation track leads told the team:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This was the demo that we made and that resulted into these &lt;em&gt;two projects&lt;/em&gt;, both Sports Coverage and Trends to Clip.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That demo turned into part of a $1.15M week of deal movement. Demos that drive pipeline are an ROI line item too, even though no spreadsheet ever credits them. (&lt;strong&gt;Anand&lt;/strong&gt;: This was reported in an internal sales call I was not a part of, but is true.)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The fourth measurement is adoption.&lt;/strong&gt; If nobody uses the thing, the ROI is zero regardless of theoretical capability. Track NetSkope logs, not certificate completions. (See question 3.)&lt;/p&gt;
&lt;h2 id=&#34;8-where-is-enterprise-ai-going&#34;&gt;8. Where is enterprise AI going?&lt;/h2&gt;
&lt;p&gt;Three predictions, ranked by how confident I am.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Most confident: analysts stop doing research; they start managing AI researchers.&lt;/strong&gt; I told a research analytics workshop:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Stop doing research.&lt;/em&gt; Your job has now transformed into somebody who has a &lt;em&gt;team of 100 researchers under you&lt;/em&gt;&amp;hellip; Your job is no longer managing a team; it is in fact &lt;em&gt;managing a team of teams&lt;/em&gt;, perhaps.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The implication is real. Hiring shifts toward verification, judgment, and exception-handling. The org chart compresses but the supervisory layer grows. Accountability becomes the scarce skill.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Reasonably confident: agentic frameworks will commoditize within a year.&lt;/strong&gt; Back to the private-markets investor sync. I told them not to obsess about LangGraph vs Pydantic AI vs OpenAI&amp;rsquo;s SDK:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Workflows are where you say, &lt;em&gt;&amp;lsquo;do it this way.&amp;rsquo;&lt;/em&gt; Agents are where you say, &lt;em&gt;&amp;lsquo;figure it out.&amp;rsquo;&lt;/em&gt;&amp;hellip; A scalable approach is to give it an &lt;em&gt;agentic loop&lt;/em&gt;, say, &lt;em&gt;&amp;lsquo;you figure out how to solve the problem.&amp;rsquo;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The frameworks won&amp;rsquo;t matter. The loops, tools, and evals will. &lt;strong&gt;Pick something boring and move on.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Less confident, but worth saying: nobody jumps straight to autonomous.&lt;/strong&gt; Even our most ambitious proposal - the European credit-insurance one - staged it explicitly:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The &lt;em&gt;Agentic AI component&lt;/em&gt; is something which is more prominent and starts in the &lt;em&gt;Modernize phase&lt;/em&gt;. But it&amp;rsquo;s not there in phase one and phase two. However, &lt;em&gt;AI is still there&lt;/em&gt;&amp;hellip; We are going to leverage AI or LLMs for very basic functionality in a &lt;em&gt;non-intrusive manner&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Assist first. Instrument. Capture knowledge. Then automate more deeply. The companies that try to skip to step four lose three quarters re-doing step one.&lt;/p&gt;
&lt;h2 id=&#34;what-survived-three-passes-through-my-transcripts&#34;&gt;What survived three passes through my transcripts&lt;/h2&gt;
&lt;p&gt;I rebuilt this answer three times. Each time I tried to disprove my own pattern. Each time the same shape held:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Enterprise AI does not fail because the model is weak.&lt;/strong&gt; It fails when we mistake a demo for delivery. When data can&amp;rsquo;t move. When teams debate frameworks before defining evals. When the team overbuilds because nobody asked what the actual objective was. When the business case never cleared cost and timeline.&lt;/p&gt;
&lt;p&gt;The pattern that survives across clients, sectors, geographies: &lt;strong&gt;successful enterprise AI is operational AI&lt;/strong&gt;. It needs telemetry, knowledge infrastructure, governance, human accountability, and a delivery model. Not slogans, not frameworks. Logs. Schemas. Confluence pages. Codex on a controller&amp;rsquo;s laptop. A 200-line terminal recorder instead of a 10,000-line agentic platform.&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s what we sell at Straive. That&amp;rsquo;s what I&amp;rsquo;m trying to scale. Ask me again in a year.&lt;/p&gt;
&lt;/section&gt;
</description>
    </item>
    <item>
      <title>LLM Deprecations and Price Changes</title>
      <link>https://www.s-anand.net/blog/llm-deprecations-and-price-changes/</link>
      <pubDate>Thu, 21 May 2026 14:02:10 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/llm-deprecations-and-price-changes/</guid>
      <description>&lt;p&gt;A colleague told me a near-miss horror story.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;As Google began deprecating Gemini 2.0, we moved to Gemini 2.5 Pro. But reasoning is enabled by default and cannot be turned off.&lt;/p&gt;
&lt;p&gt;For our specific problem statement, reasoning was not required. Token costs increased 10x and speeds were 3-4x slower. We moved the client to Gemini 2.5 Flash Lite, which has reasoning turned off by default and offers much lower latency.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Because we track compute costs closely, we managed this without a major financial impact.&lt;/p&gt;
&lt;p&gt;But model updates clearly require careful testing on the cost and latency front as well, not just output quality.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;AI used to keep getting cheaper. But now it&amp;rsquo;s more a &amp;ldquo;convergence&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-21-llm-pricing.svg&#34;&gt;&lt;/p&gt;
&lt;p&gt;Each line traces a model family. The X-axis is its intelligence (&lt;a href=&#34;https://lmarena.ai/leaderboard&#34;&gt;LMArena ELO score&lt;/a&gt;) and the Y-axis is the input cost ($ per million tokens, log scale). Time flows roughly left to right as models improve.&lt;/p&gt;
&lt;p&gt;Three patterns emerge and each seems strategic.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Gemini Flash rockets upward (cheap -&amp;gt; expensive, moving right).&lt;/li&gt;
&lt;li&gt;GPT-4 class collapses from $30 toward $1.25, then GPT-5.5 jumps back up to $5.&lt;/li&gt;
&lt;li&gt;Claude Sonnet runs perfectly flat (cheap stays cheap as it gets smarter).&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;googles-loss-leaders&#34;&gt;Google&amp;rsquo;s Loss Leaders&lt;/h3&gt;
&lt;p&gt;Gemini 1.5 Flash was $0.075/MTok in August 2024. I told everyone to use it - it&amp;rsquo;s a fantastic, affordable model.&lt;/p&gt;
&lt;p&gt;Gemini 3.5 Flash, released this week at Google I/O, is $1.50. That&amp;rsquo;s a &lt;strong&gt;20× increase in 21 months&lt;/strong&gt;. Looks like &lt;strong&gt;&amp;ldquo;Flash&amp;rdquo; is more a brand position than a price position.&lt;/strong&gt; It migrates up-market.&lt;/p&gt;
&lt;p&gt;Same for Gemini 1.5 Flash 8b (3.75 cents) which migrated into Gemini 3.1 Flash Lite (25 cents - a 6.7× increase).&lt;/p&gt;
&lt;p&gt;Gemini Pro went the opposite direction: down from $5.00 (1.5-pro in Oct 2024) to $1.25-$2.00 today. Pro seems to be a competitive weapon against Anthropic and OpenAI at the top, while Google monetizes the middle.&lt;/p&gt;
&lt;p&gt;Of course, Gemini&amp;rsquo;s real lock-in is the Google Workspace bundling and Search AI Mode. Personally, I subscribed to Google Pro for the first time in 20 years just for these bundled capabilities.&lt;/p&gt;
&lt;p&gt;By the time people notice the Flash price, it&amp;rsquo;s hard to leave the ecosystem.&lt;/p&gt;
&lt;h3 id=&#34;openais-relaunches&#34;&gt;OpenAI&amp;rsquo;s Relaunches&lt;/h3&gt;
&lt;p&gt;The price chart speaks for itself:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Model&lt;/th&gt;
          &lt;th style=&#34;text-align: right&#34;&gt;Price&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;GPT-4&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$30&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;GPT-4 Turbo&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$10&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;GPT-4o&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$2.50&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;GPT-4.1&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$2.00&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;GPT-5&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$1.25&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;GPT-5.2&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$1.75&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;GPT-5.4&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$2.50&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;GPT-5.5&lt;/td&gt;
          &lt;td style=&#34;text-align: right&#34;&gt;$5.00&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Maybe their strategy is: scale current technology to commodity (the O-series showed the same pattern with $15 for O1 falling to $1.10 for O4-mini in 7 months), &lt;em&gt;THEN&lt;/em&gt; launch a new frontier above it, and repeat?&lt;/p&gt;
&lt;h3 id=&#34;anthropics-revisions&#34;&gt;Anthropic&amp;rsquo;s Revisions&lt;/h3&gt;
&lt;p&gt;Claude Sonnet has held at &lt;strong&gt;exactly $3.00/MTok input&lt;/strong&gt; for over two years, across four model generations and an ELO gain of nearly 200 points. Pretty unusual.&lt;/p&gt;
&lt;p&gt;Opus came down from $15 to $5 in November 2025 - likely a deliberate move to make it production-viable. Haiku crept from $0.25 to $0.80. The tiers are converging.&lt;/p&gt;
&lt;p&gt;There&amp;rsquo;s a twist, though. Anthropic restructured enterprise contracts in late 2025. Seat prices dropped to $20/seat. Token discounts (previously 10-15% off API rates) were removed. For a 100-seat team, that adds ~$15K-$40K to annual TCO. So, prices went down, but &lt;em&gt;the actual bill went up&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Also, Opus 4.7 uses a new tokenizer that may consume ~35% more tokens for the same text. It&amp;rsquo;s worth re-benchmarking prompts before assuming $5 is 1/3rd of $15.&lt;/p&gt;
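&lt;p&gt;A rough way to check that claim, as a sketch: the function name is hypothetical, the prices and the ~35% figure are the ones quoted above, and real token inflation will vary by prompt.&lt;/p&gt;

```python
def effective_price(list_price_per_mtok, token_inflation):
    """Price per million old-tokenizer tokens once the same text costs more tokens."""
    return list_price_per_mtok * (1 + token_inflation)


old_price = 15.00  # Opus input price before the cut, $/MTok (from the post)
new_price = 5.00   # Opus input price after the cut, $/MTok (from the post)
inflation = 0.35   # ~35% more tokens for the same text (the estimate quoted above)

adjusted = effective_price(new_price, inflation)
ratio = adjusted / old_price
print(f"Effective price: ${adjusted:.2f}/MTok, {ratio:.0%} of the old price")
```

&lt;p&gt;Under these assumptions the effective price is about $6.75 per million old-tokenizer tokens: closer to 45% of $15 than to a third.&lt;/p&gt;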
&lt;h3 id=&#34;what-do-we-do&#34;&gt;What Do We Do?&lt;/h3&gt;
&lt;p&gt;Model family prices change rapidly. Old models get deprecated. Best to be prepared.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Add multi-tier routing to your architecture:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Lite / Nano / Haiku for extraction, classification, &amp;hellip; &amp;ndash; tasks with clear right answers&lt;/li&gt;
&lt;li&gt;Sonnet / GPT-5.4 / Gemini Flash for most production reasoning&lt;/li&gt;
&lt;li&gt;Opus / GPT-5.5 for escalation or expert advice: complex planning, hard edge cases, high-value decisions&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Compare &lt;em&gt;completed-task costs&lt;/em&gt;&lt;/strong&gt;, not token price. A 2× more expensive model can halve the retry rate, making it cheaper per successful output.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Migrate by model capability&lt;/strong&gt;, not model family. Switch to models with similar latency, context window, output format compliance and reasoning depth.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Evaluate open-source models.&lt;/strong&gt; DeepSeek models at self-hosted inference costs can be 90% cheaper for &lt;em&gt;commodity&lt;/em&gt; (not frontier) tasks.&lt;/li&gt;
&lt;/ol&gt;
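&lt;p&gt;The completed-task comparison can be sketched with a little arithmetic. This is a simplified illustration, not a costing method from the post: the function name and all numbers are hypothetical, and it assumes each attempt succeeds independently and that failed attempts are simply retried.&lt;/p&gt;

```python
def cost_per_success(cost_per_attempt, success_rate):
    """Expected spend per successful output.

    With independent attempts, the expected number of tries until the first
    success is 1 / success_rate, so expected spend is cost / success_rate.
    """
    return cost_per_attempt / success_rate


# Hypothetical numbers: the pricier model costs twice as much per attempt
# but halves the retry rate (80% of attempts retried down to 40%).
cheap = cost_per_success(cost_per_attempt=1.00, success_rate=0.20)
pricey = cost_per_success(cost_per_attempt=2.00, success_rate=0.60)

print(f"Cheap model:   ${cheap:.2f} per completed task")
print(f"Pricier model: ${pricey:.2f} per completed task")
```

&lt;p&gt;On token cost alone, and under these assumptions, doubling the price for half the retries only pays off when the cheaper model retries more than about two-thirds of the time. Latency and human review of failures, which this sketch ignores, can shift that threshold.&lt;/p&gt;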
&lt;p&gt;&lt;strong&gt;For enterprise procurement:&lt;/strong&gt; keep a close eye on pricing changes, API token discounts, and what was quietly removed during renewal. (AI helps with this!)&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The first 18 months of most AI model families are &lt;strong&gt;discounted customer acquisition&lt;/strong&gt;. Then value extraction follows. Google started it with Flash. OpenAI is doing it with GPT-5.5. Anthropic is doing it with enterprise billing restructuring.&lt;/p&gt;
&lt;p&gt;Fair enough. Providers need to recover infra investments.&lt;/p&gt;
&lt;p&gt;And building good routing strategies helps enterprises get the most out of this.&lt;/p&gt;
&lt;p&gt;Just keep asking yourself: &amp;ldquo;what&amp;rsquo;s our plan for when this model changes or deprecates?&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-21-llm-deprecations-and-price-changes.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;PS: AI-generated image - has a few errors.&lt;/p&gt;
&lt;!--

Research: https://chatgpt.com/c/6a0e995a-0c84-83ec-953d-60db0fe1db66
Writing: https://claude.ai/chat/580de13d-9667-4697-8425-53d56bd36ff3

--&gt;
</description>
    </item>
    <item>
      <title>Agent-consumable content</title>
      <link>https://www.s-anand.net/blog/agent-consumable-content/</link>
      <pubDate>Tue, 19 May 2026 11:08:59 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/agent-consumable-content/</guid>
      <description>&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-19-agent-consumable-content.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;m making more and more of my content agent-consumable, i.e. easier for ChatGPT, Claude Code, etc. to read, in three ways.&lt;/p&gt;
&lt;p&gt;One, I &lt;strong&gt;export&lt;/strong&gt; content in an agent-friendly way.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Google email, calendar, chat&lt;/strong&gt;. I use &lt;a href=&#34;https://github.com/googleworkspace/cli&#34;&gt;&lt;code&gt;gws&lt;/code&gt;&lt;/a&gt; to &lt;a href=&#34;https://github.com/sanand0/scripts/blob/4071a2a795817177789531aeb1dd2ed8bb732199/backupgoogle.py&#34;&gt;back up&lt;/a&gt; into scannable one-line entries.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Meet recordings&lt;/strong&gt;. I &lt;a href=&#34;https://github.com/sanand0/scripts/blob/4071a2a795817177789531aeb1dd2ed8bb732199/backupmeet.py&#34;&gt;back up&lt;/a&gt; transcripts and videos (with a compact audio copy).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;WhatsApp chats&lt;/strong&gt; that I &lt;a href=&#34;https://github.com/sanand0/scripts/blob/4071a2a795817177789531aeb1dd2ed8bb732199/backupwhatsapp.py&#34;&gt;back up&lt;/a&gt; into similar one-liners.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Browsing history&lt;/strong&gt; by &lt;a href=&#34;https://github.com/sanand0/scripts/blob/4071a2a795817177789531aeb1dd2ed8bb732199/browsing_history.py&#34;&gt;exporting&lt;/a&gt; my Edge history SQLite database.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Daily activities&lt;/strong&gt; by &lt;a href=&#34;https://github.com/sanand0/scripts/blob/4071a2a795817177789531aeb1dd2ed8bb732199/activities.py&#34;&gt;integrating&lt;/a&gt; the above with my command line and commit history.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;AI conversations&lt;/strong&gt; by exporting them manually or via &lt;a href=&#34;https://tools.s-anand.net/aiscrapers/&#34;&gt;bookmarklets&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Social media records&lt;/strong&gt; like LinkedIn invites/conversations, Twitter, Hacker News, Discourse, etc. via &lt;a href=&#34;https://tools.s-anand.net/&#34;&gt;bookmarklets&lt;/a&gt; or &lt;a href=&#34;https://github.com/sanand0/scripts/&#34;&gt;scripts&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Financial records&lt;/strong&gt; like bank statements, receipts, payslips, tax filings, utility payments, rentals, property records, investments, insurance, pensions, invoices, credit scores, etc. by exporting them manually.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Medical records&lt;/strong&gt; like tests, prescriptions, doctor visits, etc. by exporting them manually.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Personal records&lt;/strong&gt; like certificates, educational records, CV, passport / visa applications, etc. by exporting them manually.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Two, I &lt;strong&gt;log&lt;/strong&gt; / &lt;strong&gt;generate&lt;/strong&gt; more content. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://til.s-anand.net/&#34;&gt;Things I learnt&lt;/a&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;a href=&#34;https://www.s-anand.net/blog/&#34;&gt;blog posts&lt;/a&gt;&lt;/strong&gt; I write.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://github.com/sanand0/blog/blob/live/pages/prompts/&#34;&gt;Prompts&lt;/a&gt;&lt;/strong&gt; I use frequently.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://tools.s-anand.net/trending-repos/&#34;&gt;Trending GitHub repos&lt;/a&gt;&lt;/strong&gt; I want to evaluate.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://sanand0.github.io/talks/&#34;&gt;Talks&lt;/a&gt;&lt;/strong&gt; I deliver and &lt;strong&gt;&lt;a href=&#34;https://sanand0.github.io/datastories/&#34;&gt;data stories&lt;/a&gt;&lt;/strong&gt; I write.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://sanand0.github.io/llmdemos/&#34;&gt;Demos&lt;/a&gt;&lt;/strong&gt; I build and &lt;strong&gt;&lt;a href=&#34;https://sanand0.github.io/&#34;&gt;code&lt;/a&gt;&lt;/strong&gt; I write.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://www.s-anand.net/blog/i-lost-22-kg-in-22-weeks/&#34;&gt;Weight&lt;/a&gt;&lt;/strong&gt; and other fitness data.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://tds.s-anand.net/&#34;&gt;Teaching material&lt;/a&gt;&lt;/strong&gt;, &lt;strong&gt;&lt;a href=&#34;https://exam.sanand.workers.dev/tds-2026-05-ga0&#34;&gt;assessments&lt;/a&gt;&lt;/strong&gt;, &lt;strong&gt;&lt;a href=&#34;https://sanand0.github.io/datastories/tds-2026-01-p1/&#34;&gt;evaluations&lt;/a&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;a href=&#34;https://sanand0.github.io/tds-2024-sep-project-2-results/similar.html&#34;&gt;analysis&lt;/a&gt;&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Coding agent logs&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sensors&lt;/strong&gt;: Location, mostly.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Daily journals&lt;/strong&gt;: Food, sleep, deeds, pains, &amp;hellip;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Media journals&lt;/strong&gt;: Books, movies, TV series, &amp;hellip;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Notes&lt;/strong&gt; (of various kinds): TODOs, app / tool ideas, people I know / meet, questions I&amp;rsquo;m asked, my beliefs, &amp;hellip;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;(Notably missing are photos / videos, which I&amp;rsquo;ve fallen out of the habit of.)&lt;/p&gt;
&lt;p&gt;Three, I &lt;strong&gt;summarize&lt;/strong&gt; the content for agents. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Adding blog frontmatter&lt;/strong&gt; by &lt;a href=&#34;https://github.com/sanand0/scripts/blob/4071a2a795817177789531aeb1dd2ed8bb732199/summarize.py&#34;&gt;summarizing&lt;/a&gt; my &lt;a href=&#34;https://github.com/sanand0/blog/&#34;&gt;blog posts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Adding transcript frontmatter&lt;/strong&gt; by &lt;a href=&#34;https://github.com/sanand0/scripts/blob/4071a2a795817177789531aeb1dd2ed8bb732199/summarize.py&#34;&gt;summarizing&lt;/a&gt; my meeting transcripts.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Identifying actions&lt;/strong&gt; &lt;a href=&#34;https://github.com/sanand0/scripts/blob/4071a2a795817177789531aeb1dd2ed8bb732199/setup.fish#L309&#34;&gt;extracted&lt;/a&gt; from transcripts.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Summarizing my code&lt;/strong&gt; as &lt;a href=&#34;https://github.com/sanand0/sanand0/&#34;&gt;podcasts&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Summarizing prompts&lt;/strong&gt; as &lt;a href=&#34;https://github.com/sanand0/scripts/tree/main/agents&#34;&gt;SKILL.md&lt;/a&gt; files.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Summarizing conversations&lt;/strong&gt; into advice for &lt;a href=&#34;https://www.s-anand.net/blog/ai-advice-for-teams/&#34;&gt;AI&lt;/a&gt;, &lt;a href=&#34;https://www.s-anand.net/blog/time-management/&#34;&gt;time management&lt;/a&gt;, etc.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Summarizing technical choices&lt;/strong&gt; into a &lt;strong&gt;&lt;a href=&#34;https://tools.s-anand.net/radar/&#34;&gt;technology radar&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Extracting transcript elements&lt;/strong&gt;, like insights, experiments to run, actions, what I missed, what they missed, etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;On my list is &lt;a href=&#34;https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f&#34;&gt;Karpathy&amp;rsquo;s LLM wiki&lt;/a&gt;, summarizing my photos, and more.&lt;/p&gt;
&lt;p&gt;Just &lt;em&gt;writing&lt;/em&gt; this post took me an hour! It also convinced me that I have &lt;em&gt;lots&lt;/em&gt; of content and there&amp;rsquo;s a lot of untapped leverage in unleashing agents on what I already have.&lt;/p&gt;
&lt;!-- https://chatgpt.com/c/6a0a60df-df40-83ec-88d3-f213441c52be --&gt;
</description>
    </item>
    <item>
      <title>I have AI psychosis</title>
      <link>https://www.s-anand.net/blog/i-have-ai-psychosis/</link>
      <pubDate>Mon, 18 May 2026 14:24:06 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/i-have-ai-psychosis/</guid>
      <description>&lt;p&gt;On this informal AI psychosis checklist, I score 16/19.&lt;/p&gt;
&lt;hr&gt;
&lt;section ai-disclosure=&#34;ai-generated&#34; data-ai-model=&#34;gpt-5.5&#34; data-ai-provider=&#34;OpenAI&#34;&gt;
&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-18-i-have-ai-psychosis.avif&#34;&gt; &lt;!-- https://chatgpt.com/c/6a0aaa46-f788-83ec-a731-4962b68da3c7 --&gt;&lt;/p&gt;
&lt;p&gt;&amp;ldquo;AI psychosis&amp;rdquo; = an informal label for cases where chatbots seem to amplify delusional or manic thinking &amp;ndash; especially in vulnerable users.&lt;/p&gt;
&lt;p&gt;Why it can happen:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;✅ &lt;strong&gt;Too human&lt;/strong&gt;: ELIZA-effect activated.&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Too agreeable&lt;/strong&gt;: Sycophant mode: ON.&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Always on&lt;/strong&gt;: 24/7. No off button. No problem! LOL.&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Lonely + late night&lt;/strong&gt;: 2 a.m. feels like eternity.&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Weaker reality checks&lt;/strong&gt;: Mirror mazes. Conspiracy boards. Vibes over evidence.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What research suggests:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;✅ At-risk users were more likely to use GenAI for social / emotional support.&lt;/li&gt;
&lt;li&gt;✅ They were 1.76×-3.08× more likely to treat AI as a companion, friend, therapist, or romantic partner.&lt;/li&gt;
&lt;li&gt;✅ Delusion-related interactions showed up in about 13%-31% of responses among at-risk users.&lt;/li&gt;
&lt;li&gt;✅ Heavy use, anthropomorphizing AI, and belief-confirming loops may raise risk.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Red flags:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;❌ AI knows the secret truth.&lt;/li&gt;
&lt;li&gt;✅ The bot really gets me.&lt;/li&gt;
&lt;li&gt;✅ I stopped checking with humans.&lt;/li&gt;
&lt;li&gt;❌ Sleep? What sleep?&lt;/li&gt;
&lt;li&gt;❌ Everything now fits the theory.&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;hr&gt;
&lt;p&gt;Case in point: I know the % consumption of my Codex and Claude Code usage better than the current day of the week.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>People skills with AI</title>
      <link>https://www.s-anand.net/blog/people-skills-with-ai/</link>
      <pubDate>Sun, 17 May 2026 22:45:57 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/people-skills-with-ai/</guid>
      <description>&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-17-iim-alumni-singapore-workshop-poster.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;I &lt;a href=&#34;https://www.s-anand.net/blog/ai-advice/&#34;&gt;advise&lt;/a&gt; people that people skills are important in the AI era.&lt;/p&gt;
&lt;p&gt;Now, I&amp;rsquo;m &lt;em&gt;using AI&lt;/em&gt; to help me with people skills.&lt;/p&gt;
&lt;p&gt;This morning, I &lt;a href=&#34;https://github.com/sanand0/scripts/blob/915c9e0b35a66e276d63f2fdaffd1af36ae77497/prompts/backupwhatsapp.md#write-bulk-scraper-17-may-2026&#34;&gt;wrote&lt;/a&gt; a &lt;a href=&#34;https://github.com/sanand0/scripts/blob/915c9e0b35a66e276d63f2fdaffd1af36ae77497/backupwhatsapp.py&#34;&gt;script&lt;/a&gt; to export my WhatsApp conversations this year. That makes it easy to feed it into AI models.&lt;/p&gt;
&lt;p&gt;Then I used my &lt;a href=&#34;https://www.s-anand.net/blog/how-i-use-local-mcp/&#34;&gt;Local MCP connector&lt;/a&gt; and asked Claude:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Who are people in my life that most deserve an unreasonable gesture of thanks and what would that be?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It went through my WhatsApp messages &amp;ndash; including threads I had &lt;em&gt;not&lt;/em&gt; read. Including a group discussing four 90-minute hands-on AI workshops I&amp;rsquo;m running for IIM alumni in Singapore on Saturday afternoons:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;23 May: Context engineering&lt;/li&gt;
&lt;li&gt;20 Jun: AI tools &amp;amp; workflows&lt;/li&gt;
&lt;li&gt;25 Jul: Agentic analysis&lt;/li&gt;
&lt;li&gt;22 Aug: AI strategy&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;a href=&#34;https://www.linkedin.com/in/nayana-jain/&#34;&gt;Nayana Jain&lt;/a&gt; prepared a series of LinkedIn / WhatsApp posts to promote this workshop series and also created a poster for the workshops - and the best part of it was they were AI-generated. AI didn&amp;rsquo;t do a great job at the logo, so she asked for and got the &lt;a href=&#34;https://files.s-anand.net/images/2026-05-17-iimpact-logo.avif&#34;&gt;IIMPACT logo&lt;/a&gt; (which isn&amp;rsquo;t public) and fed it to the model to re-generate it.&lt;/p&gt;
&lt;p&gt;All of this is something I wasn&amp;rsquo;t even aware of until Claude pointed it out.&lt;/p&gt;
&lt;p&gt;I sent a note thanking her.&lt;/p&gt;
&lt;p&gt;And, evidently, AI is teaching me how to be human.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>How I use Local MCP</title>
      <link>https://www.s-anand.net/blog/how-i-use-local-mcp/</link>
      <pubDate>Sat, 16 May 2026 22:24:32 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/how-i-use-local-mcp/</guid>
      <description>&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-16-how-i-use-local-mcp.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;d love for Claude or ChatGPT to answer questions like:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;What meetings am I not setting up that I really should be?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;or:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Based on my activities since 9 May 2026, what should I blog about?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;or:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Who in my professional life most deserves an unreasonable gesture?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;From data. My files, emails, calendar, contacts, transcripts, blogs, notes, code, browsing history, logs, random Markdown files I forgot I wrote.&lt;/p&gt;
&lt;p&gt;Hence, a &lt;a href=&#34;https://modelcontextprotocol.io/&#34;&gt;Local MCP&lt;/a&gt;.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;My Local MCP server exposes one tool: &lt;code&gt;bash&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;nd&#34;&gt;@mcp.tool&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;k&#34;&gt;async&lt;/span&gt; &lt;span class=&#34;k&#34;&gt;def&lt;/span&gt; &lt;span class=&#34;nf&#34;&gt;bash&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;commands&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;nb&#34;&gt;str&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;ctx&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;Context&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&#34;nb&#34;&gt;str&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;s2&#34;&gt;&amp;#34;&amp;#34;&amp;#34;Runs multiline bash script.&amp;#34;&amp;#34;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;n&#34;&gt;result&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;subprocess&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;run&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;commands&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;shell&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;kc&#34;&gt;True&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;executable&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;s2&#34;&gt;&amp;#34;/bin/bash&amp;#34;&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;capture_output&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;kc&#34;&gt;True&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;,&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;text&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;kc&#34;&gt;True&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &lt;span class=&#34;k&#34;&gt;return&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;result&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;stdout&lt;/span&gt; &lt;span class=&#34;o&#34;&gt;+&lt;/span&gt; &lt;span class=&#34;n&#34;&gt;result&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;n&#34;&gt;stderr&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That&amp;rsquo;s it. No vector database. No UI. No custom connectors. No &amp;ldquo;AI knowledge platform.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Just: &lt;strong&gt;run shell commands on my machine&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;I run this locally, expose it online (which is slightly scary), and give Claude and ChatGPT this &lt;a href=&#34;https://www.s-anand.net/blog/prompts/fragments/#local-mcp&#34;&gt;prompt fragment&lt;/a&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-markdown&#34; data-lang=&#34;markdown&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;Local MCP runs bash and exposes:
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;k&#34;&gt;-&lt;/span&gt; ~/code/talks/README.md - talk transcripts, slides
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;k&#34;&gt;-&lt;/span&gt; ~/code/blog/description.md - 20K files, 5K posts. Search for &amp;#34;- llm&amp;#34; for AI-related posts.
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;... (etc.)
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;sb&#34;&gt;`gws`&lt;/span&gt; can access email, calendar, etc.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In one shot, this gives EVERYTHING I have to the agents.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;A common use is meeting prep.&lt;/strong&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You are a brilliant, brutally honest Chief of Staff. You have full access via Local MCP bash tool to calendar, emails, and past transcripts. Produce a briefing card for each substantive external meeting today.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It checks the calendar via &lt;code&gt;gws&lt;/code&gt;. It searches my transcripts. My notes. My &lt;a href=&#34;https://www.s-anand.net/blog/ai-advice/&#34;&gt;AI advice&lt;/a&gt;. Then gives me a briefing card with everything I need.&lt;/p&gt;
&lt;p&gt;I can&amp;rsquo;t do this by uploading files manually. The context is not one file: it&amp;rsquo;s scattered all over.&lt;/p&gt;
&lt;p&gt;A human assistant could do this. But agents are faster, cheaper, and I trust them more.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;Another common use is relationship intelligence.&lt;/strong&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;What meetings am I not setting up that I really should be?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Claude scans transcripts, contacts, emails, and recent activity to find people I should speak to.&lt;/p&gt;
&lt;p&gt;This is where Local MCP is different from a file upload.&lt;/p&gt;
&lt;p&gt;In a file upload, I can ask &amp;ldquo;Where is X?&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;Here, I&amp;rsquo;m asking &amp;ldquo;What am I missing?&amp;rdquo; and the answer depends on recency, relationship history, frequency, how conversations felt, unresolved actions, and so much more.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;A third use is mining my own work.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I used Local MCP to ask what I should blog about. It scanned all my content and found themes I haven&amp;rsquo;t really thought about, like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://www.s-anand.net/blog/google-meet-captions-local-transcript-recorder/&#34;&gt;Google Meet captions&lt;/a&gt; - a code commit I recently made. I wrote about it.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://sanand0.github.io/talks/2026-05-15-gramener-all-hands/&#34;&gt;Agents are the new software&lt;/a&gt; - a theme I&amp;rsquo;ve been talking a lot about. I wrote about it.&lt;/li&gt;
&lt;li&gt;Local MCPs - that&amp;rsquo;s this post&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&amp;hellip; and half a dozen topics I should be writing about soon.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;A fourth use is business research.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I have transcripts from sales calls and client conversations. I don&amp;rsquo;t attend all of them. But Local MCP can.&lt;/p&gt;
&lt;p&gt;I can ask:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Which client needs have we heard repeatedly but not converted into demos?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;or:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Which solutions have we pitched to one client that another client has explicitly asked for?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is beyond a CRM search. A HubSpot search finds what people typed in. This finds what people actually said.&lt;/p&gt;
&lt;p&gt;Then an email search finds if they acted on it. Calendar search finds what we spent time on.&lt;/p&gt;
&lt;p&gt;Across these, I find opportunities that no single system has.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;BUT:&lt;/strong&gt; this is not safe by default. A bash MCP server can delete my files, run commands, read my browser sessions, send emails via &lt;code&gt;gws&lt;/code&gt;, and do all sorts of risky things.&lt;/p&gt;
&lt;p&gt;So I monitor the commands like a hawk, give it fairly controlled access, and run it only when I&amp;rsquo;m actually working on one of these use cases.&lt;/p&gt;
&lt;p&gt;I tried OAuth but setting up Auth0, dynamic client registration, callback URLs, scopes, ChatGPT connector errors, &amp;hellip; I gave up.&lt;/p&gt;
&lt;p&gt;For now, supervised local usage gives me most of the value.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;BUT #2:&lt;/strong&gt; Claude and ChatGPT use Local MCP differently.&lt;/p&gt;
&lt;p&gt;Claude uses it beautifully. Smooth. No mistakes. References memory.&lt;/p&gt;
&lt;p&gt;ChatGPT is more restrictive. No chat memory accessed, nor saved. Keeps asking for permissions.&lt;/p&gt;
&lt;p&gt;So I use ChatGPT less for Local MCP-heavy tasks. But ChatGPT is rigorous. When I want structured analysis, exhaustive lists, or better verification discipline, it is useful.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Local MCP is powerful because it lets AI &lt;em&gt;use all systems I have access to&lt;/em&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;local files - across Dropbox, Google Drive, my notes, blog posts, transcripts, slides, &amp;hellip;&lt;/li&gt;
&lt;li&gt;code - not just reading, but running, rewriting, and generating&lt;/li&gt;
&lt;li&gt;email, calendar, contacts&lt;/li&gt;
&lt;li&gt;browser history&lt;/li&gt;
&lt;li&gt;shell tools - which can be used to access even more systems&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Local MCP invites Claude / ChatGPT as a real assistant into my laptop.&lt;/p&gt;
&lt;p&gt;And into my 2,700-line TODO archive.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;You probably shouldn&amp;rsquo;t expose a bash tool to an AI. But note the direction I&amp;rsquo;m going with this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;If your work and transactions are agent-readable, your past work compounds.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;If they are trapped in apps, screenshots, and memory, your AI has amnesia.&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    <item>
      <title>Google Meet captions as a local transcript recorder</title>
      <link>https://www.s-anand.net/blog/google-meet-captions-local-transcript-recorder/</link>
      <pubDate>Sat, 16 May 2026 13:32:06 +0800</pubDate>
      <guid>https://www.s-anand.net/blog/google-meet-captions-local-transcript-recorder/</guid>
      <description>&lt;p&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://files.s-anand.net/images/2026-05-15-google-meet-captions-tool.avif&#34;&gt;&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;m a man of simple needs. All I want is this: when I&amp;rsquo;m on &lt;a href=&#34;https://meet.google.com/&#34;&gt;Google Meet&lt;/a&gt; with captions turned on, I click a bookmarklet and save those captions into a local Markdown file. (So that an AI agent can guide me from it.)&lt;/p&gt;
&lt;p&gt;Hence, &lt;a href=&#34;https://tools.s-anand.net/gmeetcaptions/&#34;&gt;Google Meet Captions&lt;/a&gt;. The code is in &lt;a href=&#34;https://github.com/sanand0/tools/tree/main/gmeetcaptions&#34;&gt;&lt;code&gt;gmeetcaptions/&lt;/code&gt;&lt;/a&gt;. Drag the button to your bookmarks bar. Join a Meet. Turn on captions. Click it.&lt;/p&gt;
&lt;p&gt;You get a tiny panel with two buttons: &lt;strong&gt;Copy&lt;/strong&gt; and &lt;strong&gt;Start Recording&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/sanand0/tools/blob/main/gmeetcaptions/gmeetcaptions.js&#34;&gt;The bookmarklet&lt;/a&gt; writes this kind of Markdown:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-markdown&#34; data-lang=&#34;markdown&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;gh&#34;&gt;# Meeting title
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;k&#34;&gt;-&lt;/span&gt; **Meeting code**: abc-defg-hij
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;k&#34;&gt;-&lt;/span&gt; **Started**: 5/15/2026, 8:00:00 AM
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;k&#34;&gt;-&lt;/span&gt; **Participants**: Alice, Bob, Carol
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;---
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;gu&#34;&gt;## Alice [0:12]
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;Good morning everyone.
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;gu&#34;&gt;## Bob [0:18]
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;Let&amp;#39;s get started with the agenda.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That&amp;rsquo;s it. No server. No extension. No login. No API. Just a &lt;a href=&#34;https://github.com/sanand0/tools/blob/main/gmeetcaptions/index.html&#34;&gt;bookmarklet page&lt;/a&gt;, a &lt;a href=&#34;https://github.com/sanand0/tools/blob/main/gmeetcaptions/script.js&#34;&gt;script&lt;/a&gt;, and local browser APIs.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;BUT&lt;/strong&gt;: Google Meet captions are live and unstable.&lt;/p&gt;
&lt;p&gt;A sentence may appear as:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-markdown&#34; data-lang=&#34;markdown&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;mic, so,
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then a second later become:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-markdown&#34; data-lang=&#34;markdown&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;mic, So that&amp;#39;s a new person. Okay.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then become:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-markdown&#34; data-lang=&#34;markdown&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;mic, So that&amp;#39;s a new person. Okay. Hey. oh, but,
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If I simply append every change, the transcript becomes garbage. So the bookmarklet keeps updating the active speaker turn until it becomes stable. The implementation uses a &lt;a href=&#34;https://developer.mozilla.org/en-US/docs/Web/API/MutationObserver&#34;&gt;&lt;code&gt;MutationObserver&lt;/code&gt;&lt;/a&gt; plus a one-second polling fallback. After four unchanged polls, it treats the turn as final.&lt;/p&gt;
&lt;p&gt;The tests are in &lt;a href=&#34;https://github.com/sanand0/tools/blob/main/gmeetcaptions/gmeetcaptions.test.js&#34;&gt;&lt;code&gt;gmeetcaptions.test.js&lt;/code&gt;&lt;/a&gt;, using an anonymized fixture at &lt;a href=&#34;https://github.com/sanand0/tools/blob/main/gmeetcaptions/__fixtures__/captions-anonymized.html&#34;&gt;&lt;code&gt;__fixtures__/captions-anonymized.html&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
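&lt;p&gt;The &amp;ldquo;update until stable&amp;rdquo; idea can be sketched roughly like this. It is a minimal illustration, not the code in the repository: &lt;code&gt;TurnStabilizer&lt;/code&gt;, &lt;code&gt;poll&lt;/code&gt;, and &lt;code&gt;flush&lt;/code&gt; are hypothetical names, and it assumes something else (a one-second timer, say) hands it the speaker and caption text currently on screen.&lt;/p&gt;

```javascript
// Hypothetical sketch: names here are illustrative, not from the bookmarklet.
class TurnStabilizer {
  constructor(stablePolls = 4) {
    this.stablePolls = stablePolls; // unchanged polls needed to call a turn final
    this.finalTurns = []; // finished { speaker, text } turns, in order
    this.active = null; // the turn that is still being revised
    this.lastFinalKey = null; // key of the last finished turn, to skip repeats
    this.unchanged = 0; // consecutive polls with identical text
  }

  // Call once per polling tick with the caption currently on screen.
  poll(speaker, text) {
    if (this.active === null) {
      // Ignore a caption that is already finalized and merely still visible.
      if (this.lastFinalKey === JSON.stringify([speaker, text])) return;
      this.start(speaker, text);
    } else if (this.active.speaker !== speaker) {
      // A new speaker means the previous turn can no longer change.
      this.flush();
      this.start(speaker, text);
    } else if (this.active.text !== text) {
      // Same speaker, revised text: overwrite rather than append.
      this.active.text = text;
      this.unchanged = 0;
    } else {
      this.unchanged += 1;
      if (this.unchanged >= this.stablePolls) this.flush();
    }
  }

  start(speaker, text) {
    this.active = { speaker, text };
    this.unchanged = 0;
  }

  // Move the active turn into the finished list.
  flush() {
    if (this.active === null) return;
    this.finalTurns.push(this.active);
    this.lastFinalKey = JSON.stringify([this.active.speaker, this.active.text]);
    this.active = null;
    this.unchanged = 0;
  }
}
```

&lt;p&gt;Only &lt;code&gt;finalTurns&lt;/code&gt; would be written to the file, so the three partial versions of the &amp;ldquo;mic, so,&amp;rdquo; sentence end up as one line instead of three.&lt;/p&gt;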
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;BUT #2&lt;/strong&gt;: Google Meet&amp;rsquo;s DOM is not a public API. Class names like &lt;code&gt;.nMcdL&lt;/code&gt;, &lt;code&gt;.NWpY1d&lt;/code&gt;, and &lt;code&gt;.ygicle&lt;/code&gt; can vanish overnight.&lt;/p&gt;
&lt;p&gt;So the scraper first tries semantic and structural selectors:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;[role=&amp;quot;region&amp;quot;][aria-label=&amp;quot;Captions&amp;quot;]&lt;/code&gt; for the captions region&lt;/li&gt;
&lt;li&gt;&lt;code&gt;img[data-iml]&lt;/code&gt; and &lt;code&gt;googleusercontent.com&lt;/code&gt; avatars to identify caption items&lt;/li&gt;
&lt;li&gt;the first &lt;code&gt;&amp;lt;span&amp;gt;&lt;/code&gt; as the speaker&lt;/li&gt;
&lt;li&gt;the last non-image &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; as the caption text&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Only then does it fall back to obfuscated class names. That selector strategy is documented in the &lt;a href=&#34;https://github.com/sanand0/tools/blob/main/gmeetcaptions/README.md&#34;&gt;&lt;code&gt;README&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
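&lt;p&gt;A minimal sketch of that ordering, under stated assumptions: &lt;code&gt;findFirst&lt;/code&gt; and &lt;code&gt;CAPTION_REGION_SELECTORS&lt;/code&gt; are hypothetical names, and pairing &lt;code&gt;.nMcdL&lt;/code&gt; with the captions region is a guess made purely for illustration. The README has the real strategy.&lt;/p&gt;

```javascript
// Hypothetical sketch: stable selectors first, obfuscated class names last.
const CAPTION_REGION_SELECTORS = [
  '[role="region"][aria-label="Captions"]', // semantic selector named in the post
  ".nMcdL", // obfuscated class; what it actually matches is an assumption
];

// Return the first selector that matches under root, plus the matched element.
// root is anything with a querySelector method, such as document.
function findFirst(root, selectors) {
  for (const selector of selectors) {
    const match = root.querySelector(selector);
    if (match) return { selector, match };
  }
  return null; // nothing matched: the page structure has probably changed
}
```

&lt;p&gt;Returning the selector that matched makes it easy to log when the scraper has dropped to a fallback, which is an early warning that Meet&amp;rsquo;s markup has moved.&lt;/p&gt;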
&lt;p&gt;Boring, but also the difference between &amp;ldquo;worked once&amp;rdquo; and &amp;ldquo;might work tomorrow.&amp;rdquo;&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The weirdest part was Chrome writing to a &lt;code&gt;.md.crswap&lt;/code&gt; file while recording. The file appears unfinished until I click &lt;strong&gt;Stop Recording&lt;/strong&gt;. Then Chrome finalizes it.&lt;/p&gt;
&lt;p&gt;This is good, actually. It means the browser is safely streaming to a local file via the &lt;a href=&#34;https://developer.mozilla.org/en-US/docs/Web/API/File_System_API&#34;&gt;File System Access API&lt;/a&gt;. But it also means: &lt;strong&gt;stop the recorder before trusting the file&lt;/strong&gt;!&lt;/p&gt;
&lt;p&gt;I captured these bugs and prompts in &lt;a href=&#34;https://github.com/sanand0/tools/blob/main/gmeetcaptions/prompts.md&#34;&gt;&lt;code&gt;prompts.md&lt;/code&gt;&lt;/a&gt;, because future-me will forget. Future-agent, too.&lt;/p&gt;
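&lt;p&gt;The stop-before-trusting rule follows from how writable file streams behave: writes are only committed to the real file when the stream is closed. Here is a rough sketch, where &lt;code&gt;saveTurns&lt;/code&gt; is a hypothetical helper and &lt;code&gt;fileHandle&lt;/code&gt; is assumed to come from something like &lt;code&gt;showSaveFilePicker()&lt;/code&gt;. It is not the bookmarklet&amp;rsquo;s actual code.&lt;/p&gt;

```javascript
// Hypothetical sketch: stream text chunks to a file handle and always close it.
// fileHandle is assumed to be a FileSystemFileHandle (e.g. from showSaveFilePicker()).
async function saveTurns(fileHandle, chunks) {
  const writable = await fileHandle.createWritable();
  try {
    for (const chunk of chunks) {
      await writable.write(chunk); // goes to a temporary swap file for now
    }
  } finally {
    await writable.close(); // only now does the target file get the contents
  }
}
```

&lt;p&gt;The &lt;code&gt;finally&lt;/code&gt; block is the design choice that matters: even if a write fails midway, closing the stream means whatever was captured so far lands in the Markdown file.&lt;/p&gt;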
&lt;hr&gt;
&lt;p&gt;Why bother? Because transcripts are not the output. They are raw material.&lt;/p&gt;
&lt;p&gt;Once a meeting is Markdown, I can ask agents to extract decisions, questions, follow-ups, contradictions, reusable prompts, and blog ideas. I can diff it. Search it. Commit it. Feed it to another workflow.&lt;/p&gt;
&lt;p&gt;Meetings now become the &amp;ldquo;context&amp;rdquo; in context engineering!&lt;/p&gt;
</description>
    </item>
  </channel>
</rss>
